Version: 2.6.0

Average Latency Feedback

Note: The following policy is based on the Feature Rollout with Average Latency Feedback blueprint.

Overview

API response time is a critical performance indicator and directly impacts user experience. As new features are released, it's crucial to maintain service latency within a defined threshold. This policy monitors the average latency of an API, using it as a feedback mechanism for controlled feature rollout.

Configuration

The service is instrumented with the Aperture SDK. A new feature, awesome-feature, is encapsulated within a control point in the code.

The load_ramp section details the rollout procedure:

awesome-feature is the control point targeted by the rollout.
The rollout begins with 1% of traffic directed to the new feature, gradually increasing to 100% over a period of 300 seconds.
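
The two steps above can be read as a ramp from 1% to 100%. The following is a minimal sketch of that schedule, assuming the accept percentage is interpolated linearly between steps (the text only says "gradually"); the function name target_accept_percentage is hypothetical and not part of Aperture.

```python
# Steps mirror the load_ramp section: (duration in seconds, target percentage).
STEPS = [(0.0, 1.0), (300.0, 100.0)]


def target_accept_percentage(elapsed_s: float) -> float:
    """Accept percentage after elapsed_s seconds of an uninterrupted rollout."""
    elapsed_s = max(elapsed_s, 0.0)
    start_t, start_pct = 0.0, STEPS[0][1]
    for duration, pct in STEPS[1:]:
        end_t = start_t + duration
        if elapsed_s <= end_t:
            # Assumption: linear interpolation within a step.
            frac = (elapsed_s - start_t) / duration
            return start_pct + frac * (pct - start_pct)
        start_t, start_pct = end_t, pct
    # Past the last step: stay at the final target.
    return STEPS[-1][1]


for t in (0, 150, 300):
    print(t, target_accept_percentage(t))
```

Under this assumption, roughly half of the traffic reaches the new feature halfway through the 300 seconds.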

The feature rollout is started manually by applying the dynamic configuration for this policy, shown in dynamic-values.yaml below.

During the rollout, the policy monitors the average latency of the /checkout API on the checkout-service.prod.svc.cluster.local service. While the average latency stays below 75ms, the awesome-feature rollout proceeds. If the average latency exceeds 75ms, the policy automatically resets the rollout to the initial 1% step.
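
The threshold rule above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the average is computed as the increase in summed latency divided by the increase in request count over a window, and treating "no traffic" or "exactly 75ms" as a hold is an assumption of this sketch.

```python
from typing import Optional

THRESHOLD_MS = 75.0  # forward and reset thresholds from the values file


def average_latency_ms(latency_sum_delta: float, request_count_delta: float) -> Optional[float]:
    """Mean latency over a window: summed latency increase / request count increase."""
    if request_count_delta <= 0:
        return None  # no traffic in the window, so there is no usable signal
    return latency_sum_delta / request_count_delta


def rollout_action(avg_ms: Optional[float]) -> str:
    """Map the average latency to the action the policy would take."""
    if avg_ms is None:
        return "hold"  # assumption: without a signal, neither criterion fires
    if avg_ms < THRESHOLD_MS:
        return "forward"
    if avg_ms > THRESHOLD_MS:
        return "reset"
    return "hold"  # exactly at the threshold: neither "below" nor "above"


print(rollout_action(average_latency_ms(3000.0, 50.0)))  # 60 ms average
print(rollout_action(average_latency_ms(4500.0, 50.0)))  # 90 ms average
```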

aperturectl values.yaml
aperturectl dynamic-values.yaml

# yaml-language-server: $schema=../../../../../../blueprints/policies/feature-rollout/average-latency/gen/definitions.json
# Generated values file for policies/feature-rollout/average-latency blueprint
# Documentation/Reference for objects and parameters can be found at:
# https://docs.fluxninja.com/reference/blueprints/policies/feature-rollout/average-latency

# Parameters for the Feature Rollout policy.
# Type: policies/feature-rollout/base:schema:rollout_policy
# Required: True
policy:
  # Name of the policy.
  # Type: string
  # Required: True
  policy_name: "feature-rollout"
  components: []
  drivers:
    average_latency_drivers:
      - criteria:
          forward:
            threshold: 75
          reset:
            threshold: 75
        selectors:
          - control_point: ingress
            service: checkout-service.prod.svc.cluster.local
            label_matcher:
              match_labels:
                http.path: /checkout
  evaluation_interval: "10s"
  load_ramp:
    sampler:
      selectors:
        - control_point: awesome-feature
          service: checkout-service.prod.svc.cluster.local
      label_key: ""
    steps:
      - duration: "0s"
        target_accept_percentage: 1.0
      - duration: "300s"
        target_accept_percentage: 100.0
  resources:
    flow_control:
      classifiers: []

# yaml-language-server: $schema=file:/Users/tgill/Work/fluxninja/aperture/blueprints/policies/feature-rollout/average-latency/gen/dynamic-config-definitions.json
# Generated values file for policies/feature-rollout/average-latency blueprint
# Documentation/Reference for objects and parameters can be found at:
# https://docs.fluxninja.com/reference/policies/bundled-blueprints/policies/feature-rollout/average-latency

# Start feature rollout. This setting can be updated at runtime without shutting down the policy. The feature rollout gets paused if this flag is set to false in the middle of a feature rollout.
# Type: bool
rollout: true

# Reset feature rollout to the first step. This setting can be updated at runtime without shutting down the policy.
# Type: bool
reset: false

Generated Policy

apiVersion: fluxninja.com/v1alpha1
kind: Policy
metadata:
  annotations:
    fluxninja.com/blueprint-name: policies/feature-rollout/average-latency
    fluxninja.com/blueprints-uri: local
    fluxninja.com/values:
      '{"policy": {"components": [ ], "drivers": {"average_latency_drivers":
      [{"criteria": {"forward": {"threshold": 75}, "reset": {"threshold": 75}}, "selectors":
      [{"control_point": "ingress", "label_matcher": {"match_labels": {"http.path":
      "/checkout"}}, "service": "checkout-service.prod.svc.cluster.local"}]}]}, "evaluation_interval":
      "1s", "load_ramp": {"regulator": {"label_key": "", "selectors": [{"control_point":
      "awesome-feature", "service": "checkout-service.prod.svc.cluster.local"}]},
      "steps": [{"duration": "0s", "target_accept_percentage": 1}, {"duration": "300s",
      "target_accept_percentage": 100}]}, "policy_name": "feature-rollout", "resources":
      {"flow_control": {"classifiers": [ ]}}}}'
  labels:
    fluxninja.com/validate: "true"
  name: feature-rollout
spec:
  circuit:
    components:
      - query:
          promql:
            evaluation_interval: 1s
            out_ports:
              output:
                signal_name: AVERAGE_LATENCY_0
            query_string:
              sum(increase(flux_meter_sum{flow_status="OK", flux_meter_name="feature-rollout/average_latency/0"}[30s]))/sum(increase(flux_meter_count{flow_status="OK",
              flux_meter_name="feature-rollout/average_latency/0"}[30s]))
      - decider:
          in_ports:
            lhs:
              signal_name: AVERAGE_LATENCY_0
            rhs:
              constant_signal:
                value: 75
          operator: lt
          out_ports:
            output:
              signal_name: FORWARD_0
      - decider:
          in_ports:
            lhs:
              signal_name: AVERAGE_LATENCY_0
            rhs:
              constant_signal:
                value: 75
          operator: gt
          out_ports:
            output:
              signal_name: RESET_0
      - bool_variable:
          config_key: rollout
          constant_output: false
          out_ports:
            output:
              signal_name: USER_ROLLOUT_CONTROL
      - bool_variable:
          config_key: reset
          constant_output: false
          out_ports:
            output:
              signal_name: USER_RESET_CONTROL
      - or:
          in_ports:
            inputs: []
          out_ports:
            output:
              signal_name: BACKWARD_INTENT
      - or:
          in_ports:
            inputs:
              - signal_name: RESET_0
              - signal_name: USER_RESET_CONTROL
          out_ports:
            output:
              signal_name: RESET
      - or:
          in_ports:
            inputs:
              - signal_name: FORWARD_0
          out_ports:
            output:
              signal_name: FORWARD_INTENT
      - inverter:
          in_ports:
            input:
              signal_name: BACKWARD_INTENT
          out_ports:
            output:
              signal_name: INVERTED_BACKWARD_INTENT
      - first_valid:
          in_ports:
            inputs:
              - signal_name: INVERTED_BACKWARD_INTENT
              - constant_signal:
                  value: 1
          out_ports:
            output:
              signal_name: NOT_BACKWARD
      - inverter:
          in_ports:
            input:
              signal_name: RESET
          out_ports:
            output:
              signal_name: INVERTED_RESET
      - first_valid:
          in_ports:
            inputs:
              - signal_name: INVERTED_RESET
              - constant_signal:
                  value: 1
          out_ports:
            output:
              signal_name: NOT_RESET
      - and:
          in_ports:
            inputs:
              - signal_name: NOT_BACKWARD
              - signal_name: NOT_RESET
              - signal_name: USER_ROLLOUT_CONTROL
              - signal_name: FORWARD_INTENT
          out_ports:
            output:
              signal_name: FORWARD
      - and:
          in_ports:
            inputs:
              - signal_name: BACKWARD_INTENT
              - signal_name: NOT_RESET
          out_ports:
            output:
              signal_name: BACKWARD
      - flow_control:
          load_ramp:
            in_ports:
              backward:
                signal_name: BACKWARD
              forward:
                signal_name: FORWARD
              reset:
                signal_name: RESET
            parameters:
              regulator:
                label_key: ""
                selectors:
                  - control_point: awesome-feature
                    service: checkout-service.prod.svc.cluster.local
              steps:
                - duration: 0s
                  target_accept_percentage: 1
                - duration: 300s
                  target_accept_percentage: 100
            pass_through_label_values_config_key: pass_through_label_values
    evaluation_interval: 1s
  resources:
    flow_control:
      classifiers: []
      flux_meters:
        feature-rollout/average_latency/0:
          selectors:
            - control_point: ingress
              label_matcher:
                match_labels:
                  http.path: /checkout
              service: checkout-service.prod.svc.cluster.local
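
The boolean wiring of the circuit above can be paraphrased in plain code. The sketch below is a simplification that assumes every signal is valid and that the OR component with no inputs behaves as false; the names RampSignals and combine are hypothetical and only mirror the signal names in the generated policy.

```python
from typing import NamedTuple


class RampSignals(NamedTuple):
    forward: bool
    backward: bool
    reset: bool


def combine(forward_0: bool, reset_0: bool, user_rollout: bool, user_reset: bool) -> RampSignals:
    """Combine driver outputs and dynamic config flags into load_ramp inputs."""
    # No backward drivers are configured, so BACKWARD_INTENT has no inputs.
    # Assumption: an OR with no inputs acts as false.
    backward_intent = False
    # RESET = RESET_0 OR USER_RESET_CONTROL
    reset = reset_0 or user_reset
    # FORWARD_INTENT is the OR of all forward drivers; there is only one here.
    forward_intent = forward_0
    # FORWARD = NOT_BACKWARD AND NOT_RESET AND USER_ROLLOUT_CONTROL AND FORWARD_INTENT
    forward = (not backward_intent) and (not reset) and user_rollout and forward_intent
    # BACKWARD = BACKWARD_INTENT AND NOT_RESET
    backward = backward_intent and not reset
    return RampSignals(forward, backward, reset)


# Latency below threshold, rollout enabled, no manual reset: ramp moves forward.
print(combine(forward_0=True, reset_0=False, user_rollout=True, user_reset=False))
# Latency above threshold: reset wins regardless of the rollout flag.
print(combine(forward_0=False, reset_0=True, user_rollout=True, user_reset=False))
```

Read this way, the rollout flag only gates forward progress, while either the latency driver or the reset flag can send the ramp back to its first step.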

Policy in Action

In this scenario, the new awesome-feature causes a performance regression in the service, increasing response times. As the rollout percentage increases, the average latency exceeds the 75ms threshold, and the policy automatically resets the rollout to the initial 1% step. Latency then returns to normal levels.
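
The scenario can be reproduced with a toy simulation. Everything about the latency model here is hypothetical (a 50ms baseline plus 0.5ms per percent of traffic on the new feature), the ramp is assumed to be linear, and the policy is evaluated once per simulated second on the instantaneous latency rather than a windowed average.

```python
THRESHOLD_MS = 75.0
START_PCT, END_PCT, RAMP_SECONDS = 1.0, 100.0, 300.0


def simulate(ticks: int = 400) -> list:
    """Return (second, accept percentage, latency in ms) for each simulated second."""
    pct = START_PCT
    history = []
    for t in range(ticks):
        # Hypothetical regression: latency grows with the share of traffic
        # that is served by the new feature.
        latency_ms = 50.0 + 0.5 * pct
        history.append((t, pct, latency_ms))
        if latency_ms > THRESHOLD_MS:
            pct = START_PCT  # reset to the first step
        elif latency_ms < THRESHOLD_MS:
            # Advance along the (assumed linear) 1% -> 100% ramp.
            pct = min(END_PCT, pct + (END_PCT - START_PCT) / RAMP_SECONDS)
    return history


history = simulate()
peak_pct = max(pct for _, pct, _ in history)
resets = sum(
    1 for prev, cur in zip(history, history[1:]) if cur[1] < prev[1]
)
print(f"peak rollout: {peak_pct:.1f}%, resets: {resets}")
```

In this toy model the rollout never gets far past the point where latency crosses 75ms: each time the threshold is exceeded, the accept percentage drops back to 1% and the latency falls with it.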

Info: Circuit Diagram for this policy.
