API reference

Review the API reference documentation for kgateway with the agentgateway data plane.

Looking for the Envoy data plane APIs instead? See the kgateway with Envoy API docs.

Packages

agentgateway.dev/v1alpha1

Resource Types

AIBackend

AIBackend specifies the AI backend configuration.

Validation:

  • ExactlyOneOf: [provider groups]

Appears in:

Field | Description | Default | Validation

provider LLMProvider
provider specifies configuration for how to reach the configured LLM provider.
ExactlyOneOf: [openai azureopenai anthropic gemini vertexai bedrock]
Optional: {}

groups PriorityGroup array
groups specifies a list of groups in priority order, where each group defines a set of LLM providers. The priority determines the priority of the backend endpoints chosen.
Note: provider names must be unique across all providers in all priority groups. Backend policies may target a specific provider by name using targetRefs[].sectionName.
Example configuration with two priority groups:

    groups:
    - providers:
      - azureopenai:
          deploymentName: gpt-4o-mini
          apiVersion: 2024-02-15-preview
          endpoint: ai-gateway.openai.azure.com
    - providers:
      - azureopenai:
          deploymentName: gpt-4o-mini-2
          apiVersion: 2024-02-15-preview
          endpoint: ai-gateway-2.openai.azure.com
        policies:
          auth:
            secretRef:
              name: azure-secret

MaxItems: 8
MinItems: 1
Optional: {}
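
For contrast with the groups form, the following is a minimal sketch of a backend that sets the single provider field instead. It assumes the AzureOpenAIConfig fields and the policies.auth.secretRef field described elsewhere on this page; the resource name, namespace, and Secret name are hypothetical.

```yaml
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: azure-openai              # hypothetical name
  namespace: agentgateway-system  # hypothetical namespace
spec:
  ai:
    # Exactly one of provider or groups may be set.
    provider:
      azureopenai:
        endpoint: ai-gateway.openai.azure.com
        deploymentName: gpt-4o-mini
        apiVersion: 2024-02-15-preview
  policies:
    auth:
      secretRef:
        name: azure-secret        # hypothetical Secret holding the credential
```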

AIPromptEnrichment

AIPromptEnrichment defines the config to enrich requests sent to the LLM provider by appending and prepending system prompts.

Prompt enrichment allows you to add additional context to the prompt before sending it to the model. Unlike RAG or other dynamic context methods, prompt enrichment is static and is applied to every request.

Note: Some providers, including Anthropic, do not support SYSTEM role messages, and instead have a dedicated system field in the input JSON. In this case, use the defaults setting to set the system field.

The following example prepends a system prompt of "Answer all questions in French." and appends "Describe the painting as if you were a famous art critic from the 17th century." to each request that is sent to the openai HTTPRoute.

metadata:
  name: openai-opt
  namespace: agentgateway-system
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai
  ai:
    promptEnrichment:
      prepend:
      - role: SYSTEM
        content: "Answer all questions in French."
      append:
      - role: USER
        content: "Describe the painting as if you were a famous art critic from the 17th century."

Appears in:

Field | Description | Default | Validation

prepend Message array
A list of messages to be prepended to the prompt sent by the client.
Optional: {}

append Message array
A list of messages to be appended to the prompt sent by the client.
Optional: {}
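
The field tables on this page place prompt enrichment under the backend.ai.prompt field of an AgentgatewayPolicy (AgentgatewayPolicySpec.backend is a BackendFull, its ai field is a BackendAI, and BackendAI.prompt is an AIPromptEnrichment). As a sketch under that reading, the same enrichment might be written as follows. The resource name, namespace, and role values are carried over from the example above and have not been verified against a running cluster.

```yaml
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: openai-opt
  namespace: agentgateway-system
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai
  backend:
    ai:
      prompt:
        # Messages added before the client's prompt.
        prepend:
        - role: SYSTEM
          content: "Answer all questions in French."
        # Messages added after the client's prompt.
        append:
        - role: USER
          content: "Describe the painting as if you were a famous art critic from the 17th century."
```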

AIPromptGuard

AIPromptGuard configures prompt guards to block unwanted requests to the LLM provider and mask sensitive data. Prompt guards can reject requests based on the content of the prompt, and mask responses based on the content of the response.

This example rejects any request prompts that contain the string “credit card”, and masks any credit card numbers in the response.

promptGuard:
  request:
  - response:
      message: "Rejected due to inappropriate content"
    regex:
      action: Reject
      matches:
      - pattern: "credit card"
        name: "CC"
  response:
  - regex:
      builtins:
      - CREDIT_CARD
      action: Mask

Appears in:

Field | Description | Default | Validation

request PromptguardRequest array
Prompt guards to apply to requests sent by the client.
ExactlyOneOf: [regex webhook openAIModeration bedrockGuardrails googleModelArmor]
MaxItems: 8
MinItems: 1
Optional: {}

response PromptguardResponse array
Prompt guards to apply to responses returned by the LLM provider.
ExactlyOneOf: [regex webhook bedrockGuardrails googleModelArmor]
MaxItems: 8
MinItems: 1
Optional: {}

APIKeyAuthentication

Validation:

  • ExactlyOneOf: [secretRef secretSelector]

Appears in:

Field | Description | Default | Validation

mode APIKeyAuthenticationMode
mode is the validation mode for API key authentication.
Default: Strict
Enum: [Strict Optional]
Optional: {}

secretRef LocalObjectReference
secretRef references a Kubernetes Secret storing a set of API keys. If there are many keys, secretSelector can be used instead.
Each entry in the Secret represents one API key. The key is an arbitrary identifier. The value can either be:
* A string representing the API key.
* A JSON object with two fields, key and metadata. key contains the API key. metadata contains arbitrary JSON metadata associated with the key, which may be used by other policies. For example, you may write an authorization policy allowing apiKey.group == 'sales'.
Example:

    apiVersion: v1
    kind: Secret
    metadata:
      name: api-key
    stringData:
      client1: |
        {
          "key": "k-123",
          "metadata": {
            "group": "sales",
            "created_at": "2024-10-01T12:00:00Z"
          }
        }
      client2: "k-456"

Optional: {}

secretSelector SecretSelector
secretSelector selects multiple Secret resources containing API keys. If the same key is defined in multiple secrets, the behavior is undefined.
Each entry in the Secret represents one API key. The key is an arbitrary identifier. The value can either be:
* A string representing the API key.
* A JSON object with two fields, key and metadata. key contains the API key. metadata contains arbitrary JSON metadata associated with the key, which may be used by other policies. For example, you may write an authorization policy allowing apiKey.group == 'sales'.
Example:

    apiVersion: v1
    kind: Secret
    metadata:
      name: api-key
    stringData:
      client1: |
        {
          "key": "k-123",
          "metadata": {
            "group": "sales",
            "created_at": "2024-10-01T12:00:00Z"
          }
        }
      client2: "k-456"

Optional: {}
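
As a minimal sketch, an APIKeyAuthentication value that reads keys from a single Secret and tolerates requests without a key could look like the fragment below. The parent field that holds this value is not shown in this section, so only the fields of the type itself appear; the Secret name matches the example in the table.

```yaml
# Fragment of an APIKeyAuthentication value (parent field omitted).
mode: Optional      # validate a key if one is sent; requests without a key are allowed
secretRef:          # exactly one of secretRef or secretSelector may be set
  name: api-key
```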

APIKeyAuthenticationMode

Underlying type: string

Validation:

  • Enum: [Strict Optional]

Appears in:

Field | Description

Strict
A valid API Key must be present. This is the default option.

Optional
If an API Key exists, validate it. Warning: this allows requests without an API Key!

AWSGuardrailConfig

Appears in:

Field | Description | Default | Validation

identifier ShortString
GuardrailIdentifier is the identifier of the Guardrail policy to use for the backend.
MaxLength: 256
MinLength: 1
Required: {}

version ShortString
GuardrailVersion is the version of the Guardrail policy to use for the backend.
MaxLength: 256
MinLength: 1
Required: {}

AccessLog

AccessLog specifies how per-request access logs are emitted.

Appears in:

Field | Description | Default | Validation

filter CELExpression
filter specifies a CEL expression that is used to filter logs. A log will only be emitted if the expression evaluates to true.
Optional: {}

attributes LogTracingAttributes
attributes specifies customizations to the key-value pairs that are logged.
Optional: {}

otlp OtlpAccessLog
otlp configures OTLP access log export to an OpenTelemetry-compatible backend.
Optional: {}
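
To illustrate the filter field, here is a hedged fragment of an AccessLog value that only emits logs for requests under a given path prefix. It assumes the request.path variable mentioned elsewhere on this page is available in this CEL context; the prefix is a hypothetical value.

```yaml
# Fragment of an AccessLog value (parent field omitted).
# A log entry is emitted only when the expression evaluates to true.
filter: 'request.path.startsWith("/v1/")'
```

The expression is wrapped in single quotes so that the double quotes inside the CEL string literal reach the gateway unchanged.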

Action

Underlying type: string

Action to take if a regex pattern is matched in a request or response. This setting applies only to request matches. PromptguardResponse matches are always masked by default.

Validation:

  • Enum: [Mask Reject]

Appears in:

Field | Description

Mask
Mask the matched data in the request.

Reject
Reject the request if the regex matches content in the request.

AgentExtAuthGRPC

Appears in:

Field | Description | Default | Validation

contextExtensions object (keys:string, values:string)
contextExtensions specifies additional arbitrary key-value pairs to send to the authorization server in the context_extensions field.
MaxProperties: 64
Optional: {}

requestMetadata object (keys:string, values:CELExpression)
requestMetadata specifies metadata to be sent to the authorization server. This maps to the metadata_context.filter_metadata field of the request, and allows dynamic CEL expressions. If unset, the envoy.filters.http.jwt_authn key is set by default when the JWT policy is also used, for compatibility.
MaxProperties: 64
Optional: {}

AgentExtAuthHTTP

Appears in:

Field | Description | Default | Validation

path CELExpression
path specifies the path to send to the authorization server. If unset, this defaults to the original request path.
This is a CEL expression, which allows customizing the path based on the incoming request. For example, to add a prefix, use "/prefix/" + request.path.
Optional: {}

redirect CELExpression
redirect defines an optional expression to determine a path to redirect to on authorization failure. This is useful to redirect to a sign-in page.
Optional: {}

allowedRequestHeaders ShortString array
allowedRequestHeaders specifies what additional headers from the client request will be sent to the authorization server.
If unset, the following headers are sent by default: Authorization.
MaxItems: 64
MaxLength: 256
MinLength: 1
Optional: {}

addRequestHeaders object (keys:string, values:CELExpression)
addRequestHeaders specifies what additional headers to add to the request to the authorization server. While allowedRequestHeaders just passes the original headers through, addRequestHeaders allows defining custom headers based on CEL expressions.
MaxProperties: 64
Optional: {}

allowedResponseHeaders ShortString array
allowedResponseHeaders specifies what headers from the authorization response will be copied into the request to the backend.
MaxItems: 64
MaxLength: 256
MinLength: 1
Optional: {}

responseMetadata object (keys:string, values:CELExpression)
responseMetadata specifies what metadata fields should be constructed from the authorization response. These will be included under the extauthz variable in future CEL expressions. Setting this is useful for things like logging usernames, without needing to include them as headers to the backend, as allowedResponseHeaders would.
MaxProperties: 64
Optional: {}
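
The fragment below sketches how the CEL-valued and header-list fields of AgentExtAuthHTTP might be combined. The header names and the sign-in path are hypothetical; the path expression is the prefix example from the table.

```yaml
# Fragment of an AgentExtAuthHTTP value (parent field omitted).
# CEL expressions are single-quoted so their inner double quotes survive YAML parsing.
path: '"/prefix/" + request.path'   # path sent to the authorization server
redirect: '"/sign-in"'              # hypothetical sign-in path used on authorization failure
allowedRequestHeaders:
- x-tenant-id                       # hypothetical client header passed through unchanged
addRequestHeaders:
  x-original-path: 'request.path'   # hypothetical header computed from the incoming request
allowedResponseHeaders:
- x-user-id                         # hypothetical header copied into the backend request
```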

AgentgatewayBackend

Field | Description | Default | Validation

apiVersion string
agentgateway.dev/v1alpha1

kind string
AgentgatewayBackend

kind string
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
Optional: {}

apiVersion string
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
Optional: {}

metadata ObjectMeta
Refer to Kubernetes API documentation for fields of metadata.
Optional: {}

spec AgentgatewayBackendSpec
spec defines the desired state of AgentgatewayBackend.
ExactlyOneOf: [ai static dynamicForwardProxy mcp aws]
Required: {}

status AgentgatewayBackendStatus
status defines the current state of AgentgatewayBackend.
Optional: {}

AgentgatewayBackendSpec

Validation:

  • ExactlyOneOf: [ai static dynamicForwardProxy mcp aws]

Appears in:

Field | Description | Default | Validation

static StaticBackend
static represents a static hostname.
Optional: {}

ai AIBackend
ai represents an LLM backend.
ExactlyOneOf: [provider groups]
Optional: {}

mcp MCPBackend
mcp represents an MCP backend.
Optional: {}

dynamicForwardProxy DynamicForwardProxyBackend
dynamicForwardProxy configures the proxy to dynamically send requests to the destination based on the incoming request HTTP host header, or TLS SNI for TLS traffic.
Note: this Backend type enables users to trigger the proxy to send requests to arbitrary destinations. Proper access controls must be put in place when using this backend type.
Optional: {}

aws AwsBackend
aws represents an AWS service backend (AgentCore, etc.).
ExactlyOneOf: [agentCore]
Optional: {}

policies BackendFull
policies controls policies for communicating with this backend. Policies may also be set in AgentgatewayPolicy; policies are merged on a field-level basis, with policies on the Backend (this field) taking precedence.
Optional: {}

AgentgatewayBackendStatus

AgentgatewayBackendStatus defines the observed state of AgentgatewayBackend.

Appears in:

Field | Description | Default | Validation

conditions Condition array
Conditions is the list of conditions for the backend.
MaxItems: 8
Optional: {}

AgentgatewayParameters

AgentgatewayParameters is the configuration used to dynamically provision the agentgateway data plane. Labels and annotations that apply to all resources may be specified at a higher level; see https://gateway-api.sigs.k8s.io/reference/spec/#gatewayinfrastructure

Field | Description | Default | Validation

apiVersion string
agentgateway.dev/v1alpha1

kind string
AgentgatewayParameters

kind string
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
Optional: {}

apiVersion string
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
Optional: {}

metadata ObjectMeta
Refer to Kubernetes API documentation for fields of metadata.
Optional: {}

spec AgentgatewayParametersSpec
spec defines the desired state of AgentgatewayParameters.
Required: {}

status AgentgatewayParametersStatus
status defines the current state of AgentgatewayParameters.
Optional: {}

AgentgatewayParametersConfigs

Appears in:

Field | Description | Default | Validation

logging AgentgatewayParametersLogging
logging configuration for Agentgateway. By default, all logs are set to info level.
Optional: {}

rawConfig JSON
rawConfig provides an opaque mechanism to configure the agentgateway config file. The agentgateway binary has a -f option to specify a config file, and this field supplies that file. This will be merged with configuration derived from typed fields like logging.format, and those typed fields will take precedence.
Example:

    rawConfig:
      binds:
      - port: 3000
        listeners:
        - routes:
          - policies:
              cors:
                allowOrigins:
                - "*"
                allowHeaders:
                - mcp-protocol-version
                - content-type
                - cache-control
            backends:
            - mcp:
                targets:
                - name: everything
                  stdio:
                    cmd: npx
                    args: ["@modelcontextprotocol/server-everything"]

Type: object
Optional: {}

image Image
The agentgateway container image. See https://kubernetes.io/docs/concepts/containers/images for details.
Default values, which may be overridden individually:

    registry: cr.agentgateway.dev
    repository: agentgateway
    tag:
    pullPolicy: <omitted, relying on Kubernetes defaults which depend on the tag>

Optional: {}

env EnvVar array
The container environment variables. These override any existing values. If you want to delete an environment variable entirely, use $patch: delete with AgentgatewayParametersOverlays instead.
Note that variable expansion does apply, but is highly discouraged: you can use $(VAR_NAME) to set dependent environment variables, whereas $$(VAR_NAME) avoids expansion and results in a literal $(VAR_NAME).
If SESSION_KEY is specified, it takes precedence over the controller-managed per-Gateway session key Secret.
Optional: {}

resources ResourceRequirements
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details.
Optional: {}

shutdown ShutdownSpec
Shutdown delay configuration. How gracefully planned or unplanned data plane changes happen is in tension with how quickly rollouts of the data plane complete. How long a data plane pod must wait for shutdown to be perfectly graceful depends on how you have configured your Gateway resources.
Optional: {}

istio IstioSpec
Configure Istio integration. If enabled, Agentgateway can natively connect to Istio-enabled pods with mTLS.
Optional: {}

AgentgatewayParametersLogging

Appears in:

Field | Description | Default | Validation

level string
Logging level in standard RUST_LOG syntax, for example info (the default), or a comma-separated per-module setting such as rmcp=warn,hickory_server::server::server_future=off,typespec_client_core::http::policies::logging=warn.
Optional: {}

format AgentgatewayParametersLoggingFormat
Enum: [json text]
Optional: {}

AgentgatewayParametersLoggingFormat

Underlying type: string

The default logging format is text.

Validation:

  • Enum: [json text]

Appears in:

Field | Description

json

text

AgentgatewayParametersOverlays

Appears in:

Field | Description | Default | Validation

deployment KubernetesResourceOverlay
deployment allows specifying overrides for the generated Deployment resource.
Optional: {}

service KubernetesResourceOverlay
service allows specifying overrides for the generated Service resource.
Optional: {}

serviceAccount KubernetesResourceOverlay
serviceAccount allows specifying overrides for the generated ServiceAccount resource.
Optional: {}

podDisruptionBudget KubernetesResourceOverlay
podDisruptionBudget allows creating a PodDisruptionBudget for the agentgateway proxy. If absent, no PDB is created. If present, a PDB is created with its selector automatically configured to target the agentgateway proxy Deployment. The metadata and spec fields from this overlay are applied to the generated PDB.
Optional: {}

horizontalPodAutoscaler KubernetesResourceOverlay
horizontalPodAutoscaler allows creating a HorizontalPodAutoscaler for the agentgateway proxy. If absent, no HPA is created. If present, an HPA is created with its scaleTargetRef automatically configured to target the agentgateway proxy Deployment. The metadata and spec fields from this overlay are applied to the generated HPA.
Optional: {}
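
As a sketch of how the PDB and HPA overlays might be used, the fragment below sets only the spec fields that the standard Kubernetes PodDisruptionBudget and HorizontalPodAutoscaler resources define. The replica counts are hypothetical, and the selector and scaleTargetRef are left out because the descriptions above say they are configured automatically.

```yaml
# Fragment of an AgentgatewayParametersOverlays value.
podDisruptionBudget:
  spec:
    minAvailable: 1        # keep at least one proxy pod during voluntary disruptions
horizontalPodAutoscaler:
  spec:
    minReplicas: 2         # hypothetical lower bound
    maxReplicas: 5         # hypothetical upper bound
```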

AgentgatewayParametersSpec

Appears in:

Field | Description | Default | Validation

logging AgentgatewayParametersLogging
logging configuration for Agentgateway. By default, all logs are set to info level.
Optional: {}

rawConfig JSON
rawConfig provides an opaque mechanism to configure the agentgateway config file. The agentgateway binary has a -f option to specify a config file, and this field supplies that file. This will be merged with configuration derived from typed fields like logging.format, and those typed fields will take precedence.
Example:

    rawConfig:
      binds:
      - port: 3000
        listeners:
        - routes:
          - policies:
              cors:
                allowOrigins:
                - "*"
                allowHeaders:
                - mcp-protocol-version
                - content-type
                - cache-control
            backends:
            - mcp:
                targets:
                - name: everything
                  stdio:
                    cmd: npx
                    args: ["@modelcontextprotocol/server-everything"]

Type: object
Optional: {}

image Image
The agentgateway container image. See https://kubernetes.io/docs/concepts/containers/images for details.
Default values, which may be overridden individually:

    registry: cr.agentgateway.dev
    repository: agentgateway
    tag:
    pullPolicy: <omitted, relying on Kubernetes defaults which depend on the tag>

Optional: {}

env EnvVar array
The container environment variables. These override any existing values. If you want to delete an environment variable entirely, use $patch: delete with AgentgatewayParametersOverlays instead.
Note that variable expansion does apply, but is highly discouraged: you can use $(VAR_NAME) to set dependent environment variables, whereas $$(VAR_NAME) avoids expansion and results in a literal $(VAR_NAME).
If SESSION_KEY is specified, it takes precedence over the controller-managed per-Gateway session key Secret.
Optional: {}

resources ResourceRequirements
The compute resources required by this container. See https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ for details.
Optional: {}

shutdown ShutdownSpec
Shutdown delay configuration. How gracefully planned or unplanned data plane changes happen is in tension with how quickly rollouts of the data plane complete. How long a data plane pod must wait for shutdown to be perfectly graceful depends on how you have configured your Gateway resources.
Optional: {}

istio IstioSpec
Configure Istio integration. If enabled, Agentgateway can natively connect to Istio-enabled pods with mTLS.
Optional: {}

deployment KubernetesResourceOverlay
deployment allows specifying overrides for the generated Deployment resource.
Optional: {}

service KubernetesResourceOverlay
service allows specifying overrides for the generated Service resource.
Optional: {}

serviceAccount KubernetesResourceOverlay
serviceAccount allows specifying overrides for the generated ServiceAccount resource.
Optional: {}

podDisruptionBudget KubernetesResourceOverlay
podDisruptionBudget allows creating a PodDisruptionBudget for the agentgateway proxy. If absent, no PDB is created. If present, a PDB is created with its selector automatically configured to target the agentgateway proxy Deployment. The metadata and spec fields from this overlay are applied to the generated PDB.
Optional: {}

horizontalPodAutoscaler KubernetesResourceOverlay
horizontalPodAutoscaler allows creating a HorizontalPodAutoscaler for the agentgateway proxy. If absent, no HPA is created. If present, an HPA is created with its scaleTargetRef automatically configured to target the agentgateway proxy Deployment. The metadata and spec fields from this overlay are applied to the generated HPA.
Optional: {}
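
Putting several of these fields together, a minimal AgentgatewayParameters resource might look like the sketch below. The resource name, namespace, environment variable, and resource requests are hypothetical; the logging level uses the RUST_LOG syntax described for AgentgatewayParametersLogging.

```yaml
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayParameters
metadata:
  name: agentgateway-params       # hypothetical name
  namespace: agentgateway-system  # hypothetical namespace
spec:
  logging:
    level: info,rmcp=warn         # global info level, with the rmcp module at warn
    format: json
  image:
    pullPolicy: IfNotPresent      # overrides only this image default
  env:
  - name: EXAMPLE_SETTING         # hypothetical variable
    value: "enabled"
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
```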

AgentgatewayParametersStatus

The current conditions of the AgentgatewayParameters. This is not currently implemented.

Appears in:

AgentgatewayPolicy

Field | Description | Default | Validation

apiVersion string
agentgateway.dev/v1alpha1

kind string
AgentgatewayPolicy

kind string
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
Optional: {}

apiVersion string
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
Optional: {}

metadata ObjectMeta
Refer to Kubernetes API documentation for fields of metadata.
Optional: {}

spec AgentgatewayPolicySpec
spec defines the desired state of AgentgatewayPolicy.
ExactlyOneOf: [targetRefs targetSelectors]
Required: {}

status PolicyStatus
status defines the current state of AgentgatewayPolicy.
Optional: {}

AgentgatewayPolicySpec

Validation:

  • ExactlyOneOf: [targetRefs targetSelectors]

Appears in:

Field | Description | Default | Validation

targetRefs LocalPolicyTargetReferenceWithSectionName array
targetRefs specifies the target resources by reference to attach the policy to.
MaxItems: 16
MinItems: 1
Optional: {}

targetSelectors LocalPolicyTargetSelectorWithSectionName array
targetSelectors specifies the target selectors used to select resources to attach the policy to.
MaxItems: 16
MinItems: 1
Optional: {}

frontend Frontend
frontend defines settings for how to handle incoming traffic.
A frontend policy can only target a Gateway. Listener and ListenerSet are not valid targets.
When multiple policies are selected for a given request, they are merged on a field-level basis, but not a deep merge. For example, policy A sets tcp and tls, and policy B sets tls; the effective policy would be tcp from policy A, and tls from policy B.
Optional: {}

traffic Traffic
traffic defines settings for how to process traffic.
A traffic policy can target a Gateway (optionally, with a sectionName indicating the listener), ListenerSet, or Route (optionally, with a sectionName indicating the route rule).
When multiple policies are selected for a given request, they are merged on a field-level basis, but not a deep merge. Precedence is given to more precise policies: Gateway < Listener < Route < Route Rule. For example, policy A sets timeouts and retries, and policy B sets retries; the effective policy would be timeouts from policy A, and retries from policy B.
Optional: {}

backend BackendFull
backend defines settings for how to connect to destination backends.
A backend policy can target a Gateway (optionally, with a sectionName indicating the listener), ListenerSet, Route (optionally, with a sectionName indicating the route rule), or a Service or Backend (optionally, with a sectionName indicating the port for Service, or sub-backend for Backend).
Note that a backend policy applies when connecting to a specific destination backend. Targeting a higher level resource, like Gateway, is just a way to easily apply a policy to a group of backends.
When multiple policies are selected for a given request, they are merged on a field-level basis, but not a deep merge. Precedence is given to more precise policies: Gateway < Listener < Route < Route Rule < Backend or Service. For example, if a Gateway policy sets tcp and tls, and a Backend policy sets tls, the effective policy would be tcp from the Gateway, and tls from the Backend.
Optional: {}
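
The following sketch shows a backend policy attached to a Route by reference. It assumes the backend.auth.secretRef field of BackendFull described on this page; the policy name, namespace, route name, and Secret name are hypothetical.

```yaml
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayPolicy
metadata:
  name: openai-backend-auth       # hypothetical name
  namespace: agentgateway-system  # hypothetical namespace
spec:
  # Exactly one of targetRefs or targetSelectors may be set.
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai                  # hypothetical route
  backend:
    auth:
      secretRef:
        name: openai-secret       # hypothetical Secret holding the credential
```

Because the policy targets a Route, it applies to every backend reached through that route; a policy attached directly to a Backend or Service would take precedence for any field both policies set.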

AnthropicConfig

AnthropicConfig settings for the Anthropic LLM provider.

Appears in:

Field | Description | Default | Validation

model ShortString
Optional: Override the model name, such as gpt-4o-mini. If unset, the model name is taken from the request.
MaxLength: 256
MinLength: 1
Optional: {}

AttributeAdd

Appears in:

Field | Description | Default | Validation

name ShortString
MaxLength: 256
MinLength: 1
Required: {}

expression CELExpression
Required: {}

AwsAgentCoreBackend

AwsAgentCoreBackend configures Amazon Bedrock AgentCore.

Appears in:

Field | Description | Default | Validation

agentRuntimeArn string
agentRuntimeArn is the ARN of the AgentCore runtime.
Required: {}

qualifier string
qualifier optionally specifies the alias or version qualifier.
Optional: {}
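
A hedged sketch of an AgentgatewayBackend that points at an AgentCore runtime is shown below. The ARN, qualifier, resource name, and Secret name are placeholders for illustration, not values taken from this reference; the auth block assumes the aws field of BackendAuth described on this page.

```yaml
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: agentcore                 # hypothetical name
  namespace: agentgateway-system  # hypothetical namespace
spec:
  aws:
    agentCore:
      # Placeholder ARN; substitute the ARN of your own runtime.
      agentRuntimeArn: arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/example-runtime
      qualifier: DEFAULT          # placeholder alias or version qualifier
  policies:
    auth:
      aws:
        secretRef:
          name: aws-credentials   # hypothetical Secret with AWS credentials
```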

AwsAuth

AwsAuth specifies the authentication method to use for the backend.

Appears in:

Field | Description | Default | Validation

secretRef LocalObjectReference
secretRef references a Kubernetes Secret containing the AWS credentials. The Secret must have keys accessKey, secretKey, and optionally sessionToken.
Required: {}
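
A Secret with the key names listed above might be laid out as in this sketch. The Secret name, namespace, and all values are placeholders.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: aws-credentials           # hypothetical name, referenced from secretRef
  namespace: agentgateway-system  # hypothetical namespace
stringData:
  accessKey: "REPLACE_WITH_ACCESS_KEY_ID"
  secretKey: "REPLACE_WITH_SECRET_ACCESS_KEY"
  sessionToken: "REPLACE_WITH_SESSION_TOKEN"   # optional; omit for long-lived credentials
```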

AwsBackend

AwsBackend configures an AWS service backend.

Validation:

  • ExactlyOneOf: [agentCore]

Appears in:

Field | Description | Default | Validation

agentCore AwsAgentCoreBackend
agentCore configures Amazon Bedrock AgentCore as a backend.
Optional: {}

AzureAuth

Appears in:

Field | Description | Default | Validation

secretRef LocalObjectReference
secretRef references a Kubernetes Secret containing the Azure credentials. The Secret must have keys clientId, tenantId, and clientSecret.
Optional: {}

managedIdentity AzureManagedIdentity
Details for managed identity authentication.
Optional: {}

AzureManagedIdentity

Appears in:

Field | Description | Default | Validation

clientId string
Required: {}

objectId string
Required: {}

resourceId string
Required: {}

AzureOpenAIConfig

AzureOpenAIConfig settings for the Azure OpenAI LLM provider.

Appears in:

Field | Description | Default | Validation

endpoint ShortString
The endpoint for the Azure OpenAI API to use, such as my-endpoint.openai.azure.com. If the scheme is included, it is stripped.
MaxLength: 256
MinLength: 1
Required: {}

deploymentName ShortString
The name of the Azure OpenAI model deployment to use. For more information, see the Azure OpenAI model docs.
This is required if apiVersion is not v1. For v1, the model can be set in the request.
MaxLength: 256
MinLength: 1
Optional: {}

apiVersion TinyString
The version of the Azure OpenAI API to use. For more information, see the Azure OpenAI API version reference.
If unset, defaults to v1.
MaxLength: 64
MinLength: 1
Optional: {}
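
The interaction between apiVersion and deploymentName can be sketched with two alternative provider fragments. The endpoint, deployment name, and API version are example values taken from elsewhere on this page.

```yaml
# Variant 1: apiVersion is unset, so it defaults to v1 and
# deploymentName may be omitted (the model can be set in the request).
azureopenai:
  endpoint: my-endpoint.openai.azure.com
---
# Variant 2: a non-v1 API version, which requires deploymentName.
azureopenai:
  endpoint: my-endpoint.openai.azure.com
  deploymentName: gpt-4o-mini
  apiVersion: 2024-02-15-preview
```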

BackendAI

Appears in:

Field | Description | Default | Validation

prompt AIPromptEnrichment
Enrich requests sent to the LLM provider by appending and prepending system prompts. This can be configured only for LLM providers that use the CHAT or CHAT_STREAMING API route type.
Optional: {}

promptGuard AIPromptGuard
promptGuard enables adding guardrails to LLM requests and responses.
Optional: {}

defaults FieldDefault array
Provide defaults to merge with user input fields. If the field is already set, the field in the request is used.
MaxItems: 64
MinItems: 1
Optional: {}

overrides FieldDefault array
Provide overrides to merge with user input fields. If the field is already set, the field will be overwritten.
MaxItems: 64
MinItems: 1
Optional: {}

transformations FieldTransformation array
Provide CEL transformations to compute and set fields in the request body. The expression result overwrites any existing value for that field. This has a higher priority than overrides if both are set for the same key.
MaxItems: 64
MinItems: 1
Optional: {}

modelAliases object (keys:string, values:string)
ModelAliases maps friendly model names to actual provider model names.
Example: {"fast": "gpt-3.5-turbo", "smart": "gpt-4-turbo"}.
Note: This field is only applicable when using the agentgateway data plane.
MaxProperties: 64
Optional: {}

promptCaching PromptCachingConfig
promptCaching enables automatic prompt caching for supported providers, currently AWS Bedrock. Reduces API costs by caching static content like system prompts and tool definitions. Only applicable for Bedrock Claude 3+ and Nova models.
Optional: {}

routes object (keys:string, values:RouteType)
routes defines how to identify the type of traffic to handle. The keys are URL path suffixes matched using ends-with comparison, for example "/v1/chat/completions". The special * wildcard matches any path. If not specified, all traffic defaults to completions type.
Optional: {}
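
As a small sketch, the modelAliases example from the table can be written in YAML as a fragment of a backend policy. It assumes the ai field of BackendFull holds this BackendAI value; the alias and model names are the ones from the table.

```yaml
# Fragment of an AgentgatewayPolicy spec.
backend:
  ai:
    modelAliases:
      fast: gpt-3.5-turbo    # a request for model "fast" is sent as gpt-3.5-turbo
      smart: gpt-4-turbo     # a request for model "smart" is sent as gpt-4-turbo
```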

BackendAuth

Validation:

  • ExactlyOneOf: [key secretRef passthrough aws azure gcp]

Appears in:

Field | Description | Default | Validation

key string
key provides an inline key to use as the value of the Authorization header. This option is the least secure; usage of a Secret is preferred.
MaxLength: 2048
Optional: {}

secretRef LocalObjectReference
secretRef references a Kubernetes Secret storing the key to use as the authorization value. This must be stored in the Authorization key.
Optional: {}

passthrough BackendAuthPassthrough
passthrough passes through an existing token that has been sent by the client and validated. Other policies, like JWT and API key authentication, will strip the original client credentials. Passthrough backend authentication causes the original token to be added back into the request. If there are no client authentication policies on the request, the original token would be unchanged, so this would have no effect.
Optional: {}

aws AwsAuth
Auth specifies an explicit AWS authentication method for the backend. When omitted, we will try to use the default AWS SDK authentication methods.
Optional: {}

azure AzureAuth
Azure specifies an Azure authentication method for the backend.
Optional: {}

gcp GcpAuth
Auth specifies to use a Google authentication method for the backend. When omitted, we will try to use the default AWS SDK authentication methods.
Optional: {}
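
For the secretRef option, the referenced Secret stores the credential under the Authorization key, as in the sketch below. The Secret name, namespace, and value are placeholders, and whether the value needs a scheme prefix depends on the backend and is not specified in this section.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: openai-secret             # hypothetical name
  namespace: agentgateway-system  # hypothetical namespace
stringData:
  Authorization: "REPLACE_WITH_API_KEY"
---
# Fragment of a BackendAuth value that references the Secret above.
# Exactly one of key, secretRef, passthrough, aws, azure, or gcp may be set.
secretRef:
  name: openai-secret
```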

BackendAuthPassthrough

Appears in:

BackendEviction

BackendEviction defines settings for evicting unhealthy backends.

Appears in:

FieldDescriptionDefaultValidation
duration DurationDuration specifies the base time a backend should be evicted after being marked unhealthy.
Subsequent evictions use multiplicative backoff (duration * times_evicted).
If all endpoints are evicted, the load balancer falls back to returning evicted endpoints
rather than failing entirely.
If unset, defaults to 3s.
3sOptional: {}
restoreHealth integerRestoreHealth is the health score (0–100) assigned to a backend when it returns from eviction.
For gradual recovery, set below 100; for full recovery immediately, set 100.
If unset, the backend resumes with the health it had when evicted.
Maximum: 100
Minimum: 0
Optional: {}
consecutiveFailures integerConsecutiveFailures is the number of consecutive unhealthy responses required before the backend is evicted.
For example, a value of 5 means the backend must receive 5 unhealthy responses in a row before being evicted.
When both consecutiveFailures and healthThreshold are set, the backend is evicted when either condition is met.
When neither is set, a single unhealthy response can trigger eviction.
Minimum: 0
Optional: {}
healthThreshold integerHealthThreshold is the EWMA (exponentially-weighted moving average) health score threshold, expressed as 0–100.
When set, a backend is only evicted if its computed health drops below this value after an unhealthy response.
For example, 50 means the backend is evicted when its EWMA health falls below 50% following failures.
Unlike consecutiveFailures (which counts consecutive failures), this uses a weighted moving average,
so a single success in a stream of failures can delay eviction.
When both consecutiveFailures and healthThreshold are set, the backend is evicted when either condition is met.
When neither is set, a single unhealthy response triggers eviction.
Maximum: 100
Minimum: 0
Optional: {}
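
As a rough illustration, the following fragment sets all four eviction fields. The values are arbitrary, and the placement under health.eviction follows the eviction field of the Health type. The enclosing policy resource is not shown here and is left as an assumption.

```yaml
# Illustrative fragment only; all values are hypothetical.
health:
  eviction:
    # Base eviction time. With the multiplicative backoff
    # (duration * times_evicted), a second eviction would last 10s
    # and a third 15s.
    duration: 5s
    # Return from eviction at half health for gradual recovery.
    restoreHealth: 50
    # Evict after 3 unhealthy responses in a row...
    consecutiveFailures: 3
    # ...or when the EWMA health score drops below 40,
    # whichever condition is met first.
    healthThreshold: 40
```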

BackendFull

Appears in:

FieldDescriptionDefaultValidation
tcp BackendTCPtcp defines settings for managing TCP connections to the backend.Optional: {}
tls BackendTLStls defines settings for managing TLS connections to the backend.
If this field is set, TLS will be initiated to the backend; the system trusted CA certificates will be used to
validate the server, and the SNI will automatically be set based on the destination.
AtMostOneOf: [verifySubjectAltNames insecureSkipVerify]
Optional: {}
http BackendHTTPhttp defines settings for managing HTTP requests to the backend.Optional: {}
tunnel BackendTunneltunnel defines settings for managing tunnel connections (with behavior like HTTPS_PROXY) to the backend.Optional: {}
transformation Transformationtransformation is used to mutate and transform requests and responses sent to and from the backend.Optional: {}
auth BackendAuthauth defines settings for managing authentication to the backend.ExactlyOneOf: [key secretRef passthrough aws azure gcp]
Optional: {}
health Healthhealth defines settings for passive and active health checking.Optional: {}
ai BackendAIai specifies settings for AI workloads. This is only applicable when
connecting to a Backend of type ai.
Optional: {}
mcp BackendMCPmcp specifies settings for MCP workloads. This is only applicable when
connecting to a Backend of type mcp.
This field is deprecated; prefer to use traffic policy jwtAuthentication.mcp, which ensures authentication runs before
other policies such as transformation and rate limiting.
Optional: {}

BackendHTTP

Appears in:

FieldDescriptionDefaultValidation
version HTTPVersionversion specifies the HTTP protocol version to use when connecting to
the backend.
If not specified, the version is automatically determined:
* Service types can specify it with appProtocol on the Service
port.
* If traffic is identified as gRPC, HTTP2 is used.
* If the incoming traffic was plaintext HTTP, the original protocol will
be used.
* If the incoming traffic was HTTPS, HTTP1 will be used. This is
because most clients will transparently upgrade HTTPS traffic to
HTTP2, even if the backend doesn’t support it.
Enum: [HTTP1 HTTP2]
Optional: {}
requestTimeout DurationrequestTimeout specifies the deadline for receiving a response from the backend.Optional: {}
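
A minimal sketch of these two fields, assuming they are set under the http field of a backend policy such as BackendFull. The timeout value is arbitrary.

```yaml
# Illustrative fragment only; the enclosing resource is not shown.
http:
  # Force HTTP/2 instead of relying on automatic detection.
  version: HTTP2
  # Hypothetical deadline for receiving a response from the backend.
  requestTimeout: 30s
```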

BackendMCP

Appears in:

FieldDescriptionDefaultValidation
authorization Authorizationauthorization defines MCPBackend level authorization. Unlike authorization at the HTTP level, which will reject
unauthorized requests with a 403 error, this policy works at the
MCPBackend level.
List operations, such as list_tools, will have each item evaluated.
Items that do not meet the rule will be filtered.
Get or call operations, such as call_tool, will evaluate the specific
item and reject requests that do not meet the rule.
Optional: {}
authentication MCPAuthenticationauthentication defines MCPBackend-specific authentication rules.Optional: {}

BackendSimple

Appears in:

FieldDescriptionDefaultValidation
tcp BackendTCPtcp defines settings for managing TCP connections to the backend.Optional: {}
tls BackendTLStls defines settings for managing TLS connections to the backend.
If this field is set, TLS will be initiated to the backend; the system trusted CA certificates will be used to
validate the server, and the SNI will automatically be set based on the destination.
AtMostOneOf: [verifySubjectAltNames insecureSkipVerify]
Optional: {}
http BackendHTTPhttp defines settings for managing HTTP requests to the backend.Optional: {}
tunnel BackendTunneltunnel defines settings for managing tunnel connections (with behavior like HTTPS_PROXY) to the backend.Optional: {}
transformation Transformationtransformation is used to mutate and transform requests and responses sent to and from the backend.Optional: {}
auth BackendAuthauth defines settings for managing authentication to the backend.ExactlyOneOf: [key secretRef passthrough aws azure gcp]
Optional: {}
health Healthhealth defines settings for passive and active health checking.Optional: {}

BackendTCP

Appears in:

FieldDescriptionDefaultValidation
keepalive Keepalivekeepalive defines settings for enabling TCP keepalives on the
connection.
Optional: {}
connectTimeout DurationconnectTimeout defines the deadline for establishing a connection to
the destination.
Optional: {}

BackendTLS

Validation:

  • AtMostOneOf: [verifySubjectAltNames insecureSkipVerify]

Appears in:

FieldDescriptionDefaultValidation
mtlsCertificateRef LocalObjectReference arraymtlsCertificateRef enables mutual TLS to the backend, using the
specified key (tls.key) and cert (tls.crt) from the referenced
Secret.
An optional ca.cert field, if present, will be used to verify the
server certificate. If caCertificateRefs is also specified, the
caCertificateRefs field takes priority.
If unspecified, no client certificate will be used.
MaxItems: 1
Optional: {}
caCertificateRefs LocalObjectReference arraycaCertificateRefs defines the CA certificate ConfigMap to use to
verify the server certificate.
If unset, the system’s trusted certificates are used.
MaxItems: 1
Optional: {}
insecureSkipVerify InsecureTLSModeinsecureSkipVerify originates TLS but skips verification of the backend’s certificate.
WARNING: This is an insecure option that should only be used if the risks are understood.
There are two modes:
* All disables all TLS verification.
* Hostname verifies the CA certificate is trusted, but ignores any
mismatch of hostname or SANs. Note that this method is still insecure;
prefer setting verifySubjectAltNames to customize the valid hostnames
if possible.
Enum: [All Hostname]
Optional: {}
sni SNIsni specifies the Server Name Indication (SNI) to be used in the TLS
handshake. If unset, the SNI is automatically set based on the
destination hostname.
MaxLength: 253
MinLength: 1
Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$
Optional: {}
verifySubjectAltNames ShortString arrayverifySubjectAltNames specifies the Subject Alternative Names (SAN)
to verify in the server certificate.
If not present, the destination hostname is automatically used.
MaxItems: 16
MaxLength: 256
MinItems: 1
MinLength: 1
Optional: {}
alpnProtocols TinyString arrayalpnProtocols sets the Application-Layer Protocol Negotiation (ALPN)
value to use in the TLS handshake.
If not present, defaults to ["h2", "http/1.1"].
MaxItems: 16
MaxLength: 64
MinItems: 1
MinLength: 1
Optional: {}
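
The fragment below sketches one way these fields might be combined, assuming it sits under the tls field of a backend policy. The ConfigMap name and hostnames are hypothetical. Because verifySubjectAltNames and insecureSkipVerify are mutually exclusive (AtMostOneOf), only the former is set.

```yaml
# Illustrative fragment only; names and hostnames are hypothetical.
tls:
  # ConfigMap holding the CA certificate used to verify the server.
  caCertificateRefs:
  - name: backend-ca
  # Override the SNI sent in the TLS handshake.
  sni: api.example.com
  # Accept only certificates that carry this SAN.
  verifySubjectAltNames:
  - api.example.com
  # Offer only HTTP/2 through ALPN.
  alpnProtocols:
  - h2
```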

BackendTunnel

Appears in:

FieldDescriptionDefaultValidation
backendRef BackendObjectReferencebackendRef references the proxy server to reach.
Supported types: Service and Backend.
Required: {}

BackendWithAI

Appears in:

FieldDescriptionDefaultValidation
tcp BackendTCPtcp defines settings for managing TCP connections to the backend.Optional: {}
tls BackendTLStls defines settings for managing TLS connections to the backend.
If this field is set, TLS will be initiated to the backend; the system trusted CA certificates will be used to
validate the server, and the SNI will automatically be set based on the destination.
AtMostOneOf: [verifySubjectAltNames insecureSkipVerify]
Optional: {}
http BackendHTTPhttp defines settings for managing HTTP requests to the backend.Optional: {}
tunnel BackendTunneltunnel defines settings for managing tunnel connections (with behavior like HTTPS_PROXY) to the backend.Optional: {}
transformation Transformationtransformation is used to mutate and transform requests and responses sent to and from the backend.Optional: {}
auth BackendAuthauth defines settings for managing authentication to the backend.ExactlyOneOf: [key secretRef passthrough aws azure gcp]
Optional: {}
health Healthhealth defines settings for passive and active health checking.Optional: {}
ai BackendAIai specifies settings for AI workloads. This is only applicable when
connecting to a Backend of type ai.
Optional: {}

BasicAuthentication

Validation:

  • ExactlyOneOf: [users secretRef]

Appears in:

FieldDescriptionDefaultValidation
mode BasicAuthenticationModemode is the validation mode for basic authentication.StrictEnum: [Strict Optional]
Optional: {}
realm stringrealm specifies the realm to return in the WWW-Authenticate
header for failed authentication requests. If unset, Restricted will
be used.
Optional: {}
users string arrayusers provides an inline list of username and password pairs that will
be accepted. Each entry represents one line of the htpasswd format:
https://httpd.apache.org/docs/2.4/programs/htpasswd.html.
Note: passwords should be the hash of the password, not the raw password. Use the htpasswd or similar commands
to generate a hash. MD5, bcrypt, crypt, and SHA-1 are supported.
Example:
users:
- "user1:$apr1$ivPt0D4C$DmRhnewfHRSrb3DQC.WHC."
- "user2:$2y$05$r3J4d3VepzFkedkd/q1vI.pBYIpSqjfN0qOARV3ScUHysatnS0cL2"
MaxItems: 256
MinItems: 1
Optional: {}
secretRef LocalObjectReferencesecretRef references a Kubernetes Secret storing the .htaccess
file. The Secret must have a key named .htaccess, and should contain
the complete .htaccess file.
Note: passwords should be the hash of the password, not the raw password. Use the htpasswd or similar commands
to generate a hash. MD5, bcrypt, crypt, and SHA-1 are supported.
Example:
apiVersion: v1
kind: Secret
metadata:
  name: basic-auth
stringData:
  .htaccess: |
    alice:$apr1$3zSE0Abt$IuETi4l5yO87MuOrbSE4V.
    bob:$apr1$Ukb5LgRD$EPY2lIfY.A54jzLELNIId/
Optional: {}

BasicAuthenticationMode

Underlying type: string

Validation:

  • Enum: [Strict Optional]

Appears in:

FieldDescription
StrictA valid username and password must be present.
This is the default option.
OptionalIf a username and password exist, validate them.
Warning: this allows requests without a username!

BedrockConfig

Appears in:

FieldDescriptionDefaultValidation
region stringRegion is the AWS region to use for the backend.
Defaults to us-east-1 if not specified.
us-east-1MaxLength: 63
MinLength: 1
Pattern: ^[a-z0-9-]+$
Optional: {}
model ShortStringOptional: Override the model name, such as gpt-4o-mini.
If unset, the model name is taken from the request.
MaxLength: 256
MinLength: 1
Optional: {}
guardrail AWSGuardrailConfigguardrail configures the Guardrail policy to use for the backend. See
https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html.
If not specified, the AWS Guardrail policy will not be used.
Optional: {}

BedrockGuardrails

Appears in:

FieldDescriptionDefaultValidation
identifier ShortStringGuardrailIdentifier is the identifier of the Guardrail policy to use for the backend.MaxLength: 256
MinLength: 1
Required: {}
version ShortStringGuardrailVersion is the version of the Guardrail policy to use for the backend.MaxLength: 256
MinLength: 1
Required: {}
region ShortStringRegion is the AWS region where the guardrail is deployed (for example,
us-west-2).
MaxLength: 256
MinLength: 1
Required: {}
policies BackendSimplepolicies controls policies for communicating with AWS Bedrock Guardrails.Optional: {}
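
A hedged sketch of a Bedrock provider with a guardrail, assuming the guardrail field of BedrockConfig accepts the identifier, version, and region fields listed in this table. The model and guardrail identifiers are placeholders, not real resources.

```yaml
# Illustrative fragment only; identifiers are hypothetical placeholders.
bedrock:
  region: us-west-2
  # Optional model override; omit to take the model from the request.
  model: example-model-id
  guardrail:
    identifier: example-guardrail-id
    # version is a string, so quote numeric values.
    version: "1"
    region: us-west-2
```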

BuiltIn

Underlying type: string

Built-in regex patterns for specific types of strings in prompts. For example, if you specify CreditCard, any credit card numbers in the request or response are matched.

Validation:

  • Enum: [Ssn CreditCard PhoneNumber Email CaSin]

Appears in:

FieldDescription
SsnDefault regex matching for Social Security numbers.
CreditCardDefault regex matching for credit card numbers.
PhoneNumberDefault regex matching for phone numbers.
EmailDefault regex matching for email addresses.
CaSinDefault regex matching for Canadian Social Insurance Numbers.

CORS

Appears in:

CSRF

Appears in:

FieldDescriptionDefaultValidation
additionalOrigins ShortString arrayadditionalOrigins specifies additional source origins that will be
allowed in addition to the destination origin. The Origin consists of
a scheme and a host, with an optional port, and takes the form
<scheme>://<host>(:<port>).
MaxItems: 16
MaxLength: 256
MinItems: 1
MinLength: 1
Optional: {}

CipherSuite

Underlying type: string

Validation:

  • Enum: [TLS13_AES_256_GCM_SHA384 TLS13_AES_128_GCM_SHA256 TLS13_CHACHA20_POLY1305_SHA256 TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256 TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256]

Appears in:

FieldDescription
TLS13_AES_256_GCM_SHA384TLS 1.3 cipher suites
TLS13_AES_128_GCM_SHA256
TLS13_CHACHA20_POLY1305_SHA256
TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384TLS 1.2 cipher suites
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256

CustomResponse

CustomResponse configures a response to return to the client if request content is matched against a regex pattern and the action is REJECT.

Appears in:

FieldDescriptionDefaultValidation
message stringA custom response message to return to the client. If not specified, defaults to
The request was rejected due to inappropriate content.
The request was rejected due to inappropriate contentOptional: {}
statusCode integerThe status code to return to the client. Defaults to 403.403Maximum: 599
Minimum: 200
Optional: {}

DirectResponse

DirectResponse defines the policy to send a direct response to the client.

Appears in:

FieldDescriptionDefaultValidation
status integerStatusCode defines the HTTP status code to return for this route.Maximum: 599
Minimum: 200
Required: {}
body stringBody defines the content to be returned in the HTTP response body.
The maximum length of the body is restricted to prevent excessively large responses.
If this field is omitted, no body is included in the response.
MaxLength: 4096
MinLength: 1
Optional: {}

DynamicForwardProxyBackend

Appears in:

ExtAuth

Validation:

  • ExactlyOneOf: [grpc http]

Appears in:

FieldDescriptionDefaultValidation
backendRef BackendObjectReferencebackendRef references the External Authorization server to reach.
Supported types: Service and Backend.
Required: {}
failureMode FailureModeFailureMode controls behavior when the external authorization service is
unavailable or returns an error. “FailOpen” allows the request to continue.
“FailClosed” (default) denies the request.
Enum: [FailOpen FailClosed]
Optional: {}
grpc AgentExtAuthGRPCgrpc specifies that the gRPC External Authorization
protocol should be used.
Optional: {}
http AgentExtAuthHTTPhttp specifies that the HTTP protocol should be used for connecting to
the authorization server. The authorization server must return a 200
status code, otherwise the request is considered an authorization
failure.
Optional: {}
forwardBody ExtAuthBodyforwardBody configures whether to include the HTTP body in the request.
If enabled, the request body will be buffered.
Optional: {}

ExtAuthBody

Appears in:

FieldDescriptionDefaultValidation
maxSize integermaxSize specifies, in bytes, the largest body that will be buffered
and sent to the authorization server. If the body size is larger than
maxSize, then the request will be rejected with a response.
Minimum: 1
Required: {}
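
The following fragment sketches an external authorization setup that uses the gRPC protocol and forwards request bodies. It assumes that backendRef takes name and port like a Gateway API backend reference, and that an empty grpc object is enough to select the gRPC protocol. Neither detail is confirmed in this section, and the Service name and port are hypothetical.

```yaml
# Illustrative fragment only; the enclosing policy field is not shown.
backendRef:
  name: ext-authz   # hypothetical Service name
  port: 9000        # hypothetical port
# Deny requests when the authorization server is unavailable.
failureMode: FailClosed
# Exactly one of grpc or http may be set.
grpc: {}
forwardBody:
  # Buffer and forward at most 8192 bytes of the request body.
  maxSize: 8192
```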

ExtProc

Appears in:

FieldDescriptionDefaultValidation
backendRef BackendObjectReferencebackendRef references the External Processor server to reach.
Supported types: Service and Backend.
Required: {}

FailureMode

Underlying type: string

Validation:

  • Enum: [FailOpen FailClosed]

Appears in:

FieldDescription
FailClosedFailClosed fails the entire MCP session if any target fails.
FailOpenFailOpen skips failed targets and continues serving from healthy ones.

FieldDefault

FieldDefault provides default values for specific fields in the JSON request body sent to the LLM provider. These defaults are merged with the user-provided request to ensure missing fields are populated.

User input fields here refer to the fields in the JSON request body that a client sends when making a request to the LLM provider. Defaults set here do not override those user-provided values unless you explicitly set override to true.

Example: Setting a default system field for Anthropic, which does not support system role messages:

defaults:
  - field: "system"
    value: "answer all questions in French"

Example: Setting a default temperature and overriding max_tokens:

defaults:
  - field: "temperature"
    value: "0.5"
  - field: "max_tokens"
    value: "100"
    override: true

Example: Setting custom list fields:

defaults:
  - field: "custom_integer_list"
    value: [1,2,3]

overrides:
  - field: "custom_string_list"
    value: ["one","two","three"]

Note: The field values correspond to keys in the JSON request body, not fields in this CRD.

Appears in:

FieldDescriptionDefaultValidation
field ShortStringThe name of the field.MaxLength: 256
MinLength: 1
Required: {}
value JSONThe field default value, which can be any JSON Data Type.Required: {}

FieldTransformation

FieldTransformation maps a request JSON field to a CEL expression string. The expression is evaluated against the current request body and its result is assigned to the configured field.

Appears in:

FieldDescriptionDefaultValidation
field ShortStringThe name of the field to set.MaxLength: 256
MinLength: 1
Required: {}
expression CELExpressionCEL expression used to compute the field value.Required: {}

Frontend

Appears in:

FieldDescriptionDefaultValidation
tcp FrontendTCPtcp defines settings for managing incoming TCP connections.Optional: {}
networkAuthorization AuthorizationnetworkAuthorization defines CEL authorization on downstream network connections.
This runs before protocol handling and is intended for L4 access control,
for example using source.address with cidr(...).containsIP(...).
Optional: {}
tls FrontendTLStls defines settings for managing incoming TLS connections.Optional: {}
http FrontendHTTPhttp defines settings for managing incoming HTTP requests.Optional: {}
accessLog AccessLogaccessLog contains access logging configuration.Optional: {}
tracing Tracingtracing contains various settings for the OpenTelemetry tracer.Optional: {}

FrontendHTTP

Appears in:

FieldDescriptionDefaultValidation
maxBufferSize integermaxBufferSize defines the maximum HTTP body size that will be buffered
into memory.
Bodies will only be buffered for policies which require buffering.
If unset, this defaults to 2mb.
Minimum: 1
Optional: {}
http1MaxHeaders integerhttp1MaxHeaders defines the maximum number of headers that are allowed
in HTTP/1.1 requests.
If unset, this defaults to 100.
Maximum: 4096
Minimum: 1
Optional: {}
http1IdleTimeout Durationhttp1IdleTimeout defines the timeout before an unused connection is
closed.
If unset, this defaults to 10 minutes.
Optional: {}
http2WindowSize integerhttp2WindowSize indicates the initial window size for stream-level flow
control for received data.
Minimum: 1
Optional: {}
http2ConnectionWindowSize integerhttp2ConnectionWindowSize indicates the initial window size for
connection-level flow control for received data.
Minimum: 1
Optional: {}
http2FrameSize integerhttp2FrameSize sets the maximum frame size to use.
If unset, this defaults to 16kb.
Maximum: 1677215
Minimum: 16384
Optional: {}
http2KeepaliveInterval DurationOptional: {}
http2KeepaliveTimeout DurationOptional: {}
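
As a sketch, the fragment below raises a few of these limits under the http field of Frontend. It assumes that maxBufferSize is expressed in bytes, which the table does not state explicitly. All values are arbitrary.

```yaml
# Illustrative fragment only; values are hypothetical.
http:
  # Assumed to be bytes: 4194304 = 4 * 1024 * 1024.
  maxBufferSize: 4194304
  # Allow up to 200 headers on HTTP/1.1 requests.
  http1MaxHeaders: 200
  # Close unused HTTP/1.1 connections after 5 minutes.
  http1IdleTimeout: 5m
  # Must stay within the documented minimum (16384) and maximum.
  http2FrameSize: 32768
```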

FrontendTCP

Appears in:

FieldDescriptionDefaultValidation
keepalive Keepalivekeepalive defines settings for enabling TCP keepalives on the connection.Optional: {}

FrontendTLS

Appears in:

FieldDescriptionDefaultValidation
handshakeTimeout DurationhandshakeTimeout specifies the deadline for a TLS handshake to
complete. If unset, this defaults to 15s.
Optional: {}
alpnProtocols TinyString arrayalpnProtocols sets the Application-Layer Protocol Negotiation (ALPN)
value to use in the TLS handshake.
If not present, defaults to ["h2", "http/1.1"].
MaxItems: 16
MaxLength: 64
MinItems: 1
MinLength: 1
Optional: {}
minProtocolVersion TLSVersionMinTLSVersion configures the minimum TLS version to support.Enum: [1.2 1.3]
Optional: {}
maxProtocolVersion TLSVersionMaxTLSVersion configures the maximum TLS version to support.Enum: [1.2 1.3]
Optional: {}
cipherSuites CipherSuite arrayCipherSuites configures the list of cipher suites for a TLS listener.
The value is a comma-separated list of cipher suites, for example
TLS13_AES_256_GCM_SHA384,TLS13_AES_128_GCM_SHA256.
Use this in the TLS options field of a TLS listener.
Enum: [TLS13_AES_256_GCM_SHA384 TLS13_AES_128_GCM_SHA256 TLS13_CHACHA20_POLY1305_SHA256 TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256 TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256]
Optional: {}
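
A possible listener TLS configuration, assuming it is placed under the tls field of Frontend. The protocol versions are quoted so that YAML parsers treat them as strings and not as floating-point numbers. cipherSuites is written as a YAML list to match its array type.

```yaml
# Illustrative fragment only; values are hypothetical.
tls:
  handshakeTimeout: 10s
  alpnProtocols:
  - h2
  - http/1.1
  # Quote versions so they are not parsed as numbers.
  minProtocolVersion: "1.2"
  maxProtocolVersion: "1.3"
  cipherSuites:
  - TLS13_AES_256_GCM_SHA384
  - TLS13_AES_128_GCM_SHA256
  - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
```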

GcpAuth

GcpAuth specifies how to authenticate on Google Cloud Platform.

Appears in:

FieldDescriptionDefaultValidation
type GcpAuthTypeThe type of token to generate. To authenticate to GCP services,
generally an AccessToken is used. To authenticate to Cloud Run, an
IdToken is used.
Enum: [AccessToken IdToken]
Optional: {}
audience ShortStringaudience allows explicitly configuring the aud of the ID token. Only
valid with IdToken type. If not set, the aud is automatically
derived from the backend hostname.
MaxLength: 256
MinLength: 1
Optional: {}
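
For example, authenticating to a Cloud Run service might look like the following sketch. It assumes the fragment sits under the auth field of a backend policy, and the audience URL is hypothetical.

```yaml
# Illustrative fragment only; the audience is a hypothetical URL.
auth:
  gcp:
    # Cloud Run expects an ID token, not an access token.
    type: IdToken
    # Only valid with IdToken; omit to derive it from the backend hostname.
    audience: https://my-service.example.com
```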

GcpAuthType

Underlying type: string

Validation:

  • Enum: [AccessToken IdToken]

Appears in:

FieldDescription
AccessToken
IdToken

GeminiConfig

GeminiConfig settings for the Gemini LLM provider.

Appears in:

FieldDescriptionDefaultValidation
model ShortStringOptional: Override the model name, such as gemini-2.5-pro.
If unset, the model name is taken from the request.
MaxLength: 256
MinLength: 1
Optional: {}

GlobalRateLimit

Appears in:

FieldDescriptionDefaultValidation
backendRef BackendObjectReferencebackendRef references the rate limit server to reach.
Supported types: Service and Backend.
Required: {}
failureMode FailureModefailureMode controls behavior when the remote rate limit service is
unavailable or returns an error. FailOpen allows the request to continue.
FailClosed (default) denies the request.
Enum: [FailOpen FailClosed]
Optional: {}
domain ShortStringdomain specifies the domain under which this limit should apply.
This is an arbitrary string that enables a rate limit server to distinguish between different applications.
MaxLength: 256
MinLength: 1
Required: {}
descriptors RateLimitDescriptor arraydescriptors define the dimensions for rate limiting. These values are
passed to the rate limit service which applies configured limits based
on them. Each descriptor represents a single rate limit rule with one or
more entries.
MaxItems: 16
MinItems: 1
Required: {}

GoogleModelArmor

Appears in:

FieldDescriptionDefaultValidation
templateId ShortStringTemplateID is the template ID for Google Model Armor.MaxLength: 256
MinLength: 1
Required: {}
projectId ShortStringProjectID is the Google Cloud project ID.MaxLength: 256
MinLength: 1
Required: {}
location ShortStringLocation is the Google Cloud location (for example, us-central1).
Defaults to us-central1 if not specified.
us-central1MaxLength: 256
MinLength: 1
Optional: {}
policies BackendSimplepolicies controls policies for communicating with Google Model Armor.Optional: {}

HTTPVersion

Underlying type: string

Appears in:

FieldDescription
HTTP1
HTTP2

HeaderName

Underlying type: string

An HTTP Header Name.

Validation:

  • MaxLength: 256
  • MinLength: 1
  • Pattern: ^:?[A-Za-z0-9!#$%&'*+\-.^_\x60|~]+$

Appears in:

HeaderTransformation

Appears in:

FieldDescriptionDefaultValidation
name HeaderNameThe name of the header to add.MaxLength: 256
MinLength: 1
Pattern: ^:?[A-Za-z0-9!#$%&'*+\-.^_\x60|~]+$
Required: {}
value CELExpressionvalue is the CEL expression to apply to generate the output value for
the header.
Required: {}

Health

Appears in:

FieldDescriptionDefaultValidation
unhealthyCondition CELExpressionUnhealthyCondition is a CEL expression that determines whether a response indicates an unhealthy backend.
When the expression evaluates to true, the backend is considered unhealthy and may be evicted.
For example, to evict on 5xx responses: response.code >= 500.
When unset, any 5xx response, or a connection failure, is treated as unhealthy.
This default lowers the backend’s health score but does not trigger eviction on its own.
Optional: {}
eviction BackendEvictionEviction defines settings for evicting unhealthy backends.Optional: {}
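
A minimal sketch that combines a custom unhealthy condition with eviction, assuming it is placed under the health field of a backend policy. The condition extends the response.code example above to also treat 429 responses as unhealthy. The eviction values are arbitrary.

```yaml
# Illustrative fragment only; values are hypothetical.
health:
  # CEL expression: treat any 5xx or 429 response as unhealthy.
  unhealthyCondition: "response.code >= 500 || response.code == 429"
  eviction:
    duration: 10s
    consecutiveFailures: 5
```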

HostnameRewrite

Appears in:

FieldDescriptionDefaultValidation
mode HostnameRewriteModemode sets the hostname rewrite mode.
The following may be specified:
* Auto: automatically set the Host header based on the destination.
* None: do not rewrite the Host header. The original Host header
will be passed through.
This setting defaults to Auto when connecting to hostname-based
Backend types, and to None otherwise (for Service or IP-based
backends).
Enum: [Auto None]
Required: {}

HostnameRewriteMode

Underlying type: string

Appears in:

FieldDescription
Auto
None

Image

A container image. See https://kubernetes.io/docs/concepts/containers/images for details.

Appears in:

FieldDescriptionDefaultValidation
registry stringThe image registry.Optional: {}
repository stringThe image repository (name).Optional: {}
tag stringThe image tag.Optional: {}
digest stringThe hash digest of the image, e.g. sha256:12345...Optional: {}
pullPolicy PullPolicyThe image pull policy for the container. See
https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy
for details.
Optional: {}

InsecureTLSMode

Underlying type: string

Appears in:

FieldDescription
AllInsecureTLSModeInsecure disables all TLS verification
HostnameInsecureTLSModeHostname enables verifying the CA certificate, but disables verification of the hostname/SAN.
Note this is still, generally, very “insecure” as the name suggests.

IstioSpec

Appears in:

FieldDescriptionDefaultValidation
caAddress stringThe address of the Istio CA. If unset, defaults to https://istiod.istio-system.svc:15012.Optional: {}
trustDomain stringThe Istio trust domain. If not set, defaults to cluster.local.Optional: {}

JWKS

Validation:

  • ExactlyOneOf: [remote inline]

Appears in:

FieldDescriptionDefaultValidation
remote RemoteJWKSremote specifies how to reach the JSON Web Key Set from a remote
address.
Optional: {}
inline stringinline specifies an inline JSON Web Key Set used to validate the
signature of the JWT.
MaxLength: 65536
MinLength: 2
Optional: {}

JWTAuthentication

Appears in:

FieldDescriptionDefaultValidation
mode JWTAuthenticationModemode is the validation mode for JWT authentication.StrictEnum: [Strict Optional Permissive]
Optional: {}
providers JWTProvider arrayMaxItems: 64
MinItems: 1
Required: {}
mcp JWTMCPConfigmcp optionally enables MCP OAuth metadata endpoint handling
and MCP-specific authentication behavior on top of standard JWT validation.
When set, the gateway will serve the MCP OAuth metadata discovery endpoints.
Optional: {}

JWTAuthenticationMode

Underlying type: string

Validation:

  • Enum: [Strict Optional Permissive]

Appears in:

FieldDescription
StrictA valid token, issued by a configured issuer, must be present.
This is the default option.
OptionalIf a token exists, validate it.
Warning: this allows requests without a JWT token!
PermissiveRequests are never rejected. This is useful for using claims in later steps (authorization, logging, etc).
Warning: this allows requests without a JWT token!

JWTMCPConfig

JWTMCPConfig holds MCP-specific extensions for JWT authentication.

Appears in:

FieldDescriptionDefaultValidation
resourceMetadata object (keys:string, values:JSON)resourceMetadata defines the metadata to use for MCP resources,
served at the MCP OAuth metadata endpoints.
Optional: {}
provider McpIDPprovider specifies the identity provider to use for MCP authentication flows.Enum: [Auth0 Keycloak]
Optional: {}

JWTProvider

Appears in:

FieldDescriptionDefaultValidation
issuer ShortStringissuer identifies the IdP that issued the JWT. This corresponds to the
iss claim (https://tools.ietf.org/html/rfc7519#section-4.1.1).
MaxLength: 256
MinLength: 1
Required: {}
audiences string arrayaudiences specifies the list of audiences that are allowed
access. This corresponds to the aud claim
(https://datatracker.ietf.org/doc/html/rfc7519#section-4.1.3).
If unset, any audience is allowed.
MaxItems: 64
MinItems: 1
Optional: {}
jwks JWKSjwks defines the JSON Web Key Set used to validate the signature of the
JWT.
ExactlyOneOf: [remote inline]
Required: {}
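
The fragment below sketches a single-provider JWT configuration, assuming it is set under the jwtAuthentication field of a traffic policy. The issuer and audience are hypothetical, and the inline JWKS is an empty placeholder that would need to be replaced with a real key set.

```yaml
# Illustrative fragment only; issuer, audience, and JWKS are placeholders.
jwtAuthentication:
  # Reject requests that do not carry a valid token.
  mode: Strict
  providers:
  - issuer: https://idp.example.com
    audiences:
    - my-api
    jwks:
      # Exactly one of remote or inline may be set.
      inline: '{"keys":[]}'
```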

Keepalive

TCP keepalive settings.

Appears in:

FieldDescriptionDefaultValidation
retries integerretries specifies the maximum number of keep-alive probes to send before dropping the connection.
If unset, this defaults to 9.
Maximum: 64
Minimum: 1
Optional: {}
time Durationtime specifies the number of seconds a connection needs to be idle before keep-alive probes start being sent.
If unset, this defaults to 180s.
Optional: {}
interval Durationinterval specifies the number of seconds between keep-alive probes.
If unset, this defaults to 180s.
Optional: {}
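
As an illustration, the following fragment tightens the keepalive defaults, assuming it sits under the tcp field of a backend or frontend policy. The values are arbitrary.

```yaml
# Illustrative fragment only; values are hypothetical.
tcp:
  keepalive:
    # Start probing after the connection has been idle for 60s.
    time: 60s
    # Send a probe every 30s once probing has started.
    interval: 30s
    # Drop the connection after 5 probes have been sent without success.
    retries: 5
```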

LLMProvider

LLMProvider specifies the target large language model provider that the backend should route requests to.

Validation:

  • ExactlyOneOf: [openai azureopenai anthropic gemini vertexai bedrock]

Appears in:

FieldDescriptionDefaultValidation
openai OpenAIConfigOpenAI providerOptional: {}
azureopenai AzureOpenAIConfigAzure OpenAI providerOptional: {}
anthropic AnthropicConfigAnthropic providerOptional: {}
gemini GeminiConfigGemini providerOptional: {}
vertexai VertexAIConfigVertex AI providerOptional: {}
bedrock BedrockConfigBedrock providerOptional: {}
host ShortStringHost specifies the hostname to send the requests to.
If not specified, the default hostname for the provider is used.
MaxLength: 256
MinLength: 1
Optional: {}
port integerPort specifies the port to send the requests to.Maximum: 65535
Minimum: 1
Optional: {}
path LongStringPath specifies the URL path to use for the LLM provider API requests.
This is useful when you need to route requests to a different API endpoint while maintaining
compatibility with the original provider’s API structure.
If not specified, the default path for the provider is used.
MaxLength: 1024
MinLength: 1
Optional: {}
pathPrefix LongStringPathPrefix overrides the default base path prefix (e.g. “/v1”) for upstream requests.
Path translation for cross-format requests still applies using this prefix.
Only supported for OpenAI and Anthropic providers.
MaxLength: 1024
MinLength: 1
Optional: {}
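
A hedged sketch of a provider that routes Gemini traffic through a custom host, assuming it is set under the provider field of AIBackend. The hostname and path are hypothetical. pathPrefix is left out because it is documented as supported only for OpenAI and Anthropic providers.

```yaml
# Illustrative fragment only; host and path are hypothetical.
provider:
  # Exactly one provider type may be set.
  gemini:
    model: gemini-2.5-pro
  # Send requests to this host instead of the provider default.
  host: llm-proxy.example.com
  port: 443
  path: /custom/chat/completions
```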

LocalRateLimit

Policy for local rate limiting. Local rate limits are handled locally on a per-proxy basis, without coordination between instances of the proxy.

Validation:

  • ExactlyOneOf: [requests tokens]

Appears in:

FieldDescriptionDefaultValidation
requests integerrequests specifies the number of HTTP requests per unit of time that
are allowed. Requests exceeding this limit will fail with a 429
error.
Minimum: 1
Optional: {}
tokens integertokens specifies the number of LLM tokens per unit of time that are
allowed. Requests exceeding this limit will fail with a 429 error.
Both input and output tokens are counted. However, token counts are not known until the request completes. As a
result, token-based rate limits will apply to future requests only.
Minimum: 1
Optional: {}
unit LocalRateLimitUnitunit specifies the unit of time that requests are limited on.Enum: [Seconds Minutes Hours]
Required: {}
burst integerburst specifies an allowance of requests above the request-per-unit
that should be allowed within a short period of time.
Optional: {}
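
The following fragment sketches two local rate limit entries, one request-based and one token-based. Each entry sets exactly one of requests or tokens. The numbers are illustrative only, and the fragment assumes that the entries are listed in the local field of RateLimits.

```yaml
local:
  # Allow 100 requests per minute, plus a burst allowance of 20 requests.
  - requests: 100
    unit: Minutes
    burst: 20
  # Allow 50,000 LLM tokens (input and output combined) per hour.
  - tokens: 50000
    unit: Hours
```

Because token counts are known only after a request completes, the token-based entry in this sketch would affect later requests rather than the request that crosses the limit.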

LocalRateLimitUnit

Underlying type: string

Appears in:

FieldDescription
Seconds
Minutes
Hours

LogTracingAttributes

Appears in:

FieldDescriptionDefaultValidation
remove TinyString arrayremove lists the default fields that should be removed. For example,
http.method.
MaxItems: 32
MaxLength: 64
MinItems: 1
MinLength: 1
Optional: {}
add AttributeAdd arrayadd specifies additional key-value pairs to be added to each entry.
The value is a CEL expression. If the CEL expression fails to evaluate,
the pair will be excluded.
MinItems: 1
Optional: {}

MCPAuthentication

Appears in:

FieldDescriptionDefaultValidation
resourceMetadata object (keys:string, values:JSON)ResourceMetadata defines the metadata to use for MCP resources.Optional: {}
provider McpIDPprovider specifies the identity provider to use for authentication.Enum: [Auth0 Keycloak]
Optional: {}
issuer ShortStringissuer identifies the IdP that issued the JWT. This corresponds to the
iss claim (https://tools.ietf.org/html/rfc7519#section-4.1.1).
MaxLength: 256
MinLength: 1
Optional: {}
audiences string arrayaudiences specifies the list of audiences that are allowed
access. This corresponds to the aud claim
(https://datatracker.ietf.org/doc/html/rfc7519#section-4.1.3).
If unset, any audience is allowed.
MaxItems: 64
MinItems: 1
Optional: {}
jwks RemoteJWKSjwks defines the remote JSON Web Key Set (JWKS) used to validate the signature of
the JWT.
Required: {}
mode JWTAuthenticationModemode is the validation mode for JWT authentication.
Default: Strict
Enum: [Strict Optional Permissive]
Optional: {}
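
A minimal sketch of these fields for a Keycloak identity provider might look like the following. The issuer URL, audience, realm name, and Service name are hypothetical, and the name and port keys under backendRef are assumed to follow the Gateway API BackendObjectReference type.

```yaml
provider: Keycloak
issuer: https://idp.example.com/realms/demo
audiences:
  - mcp-gateway
jwks:
  # Path to the JWKS endpoint, relative to the root of the IdP.
  jwksPath: realms/demo/protocol/openid-connect/certs
  cacheDuration: 10m
  # Hypothetical Service that fronts the IdP.
  backendRef:
    name: keycloak
    port: 8080
mode: Strict
```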

MCPBackend

MCPBackend configures MCP backends.

Appears in:

FieldDescriptionDefaultValidation
targets McpTargetSelector arraytargets is a list of MCP targets to use for this backend. Policies
targeting MCP targets must use targetRefs[].sectionName to select
the target by name.
ExactlyOneOf: [selector static]
MaxItems: 32
MinItems: 1
Required: {}
sessionRouting SessionRoutingsessionRouting configures MCP session behavior for requests.
Defaults to Stateful if not set.
Enum: [Stateful Stateless]
Optional: {}
failureMode FailureModefailureMode controls behavior when MCP targets fail to initialize or
become unavailable at runtime. FailOpen skips failed targets and
continues serving from healthy ones. FailClosed (default) fails the
entire session if any target fails.
Enum: [FailOpen FailClosed]
Optional: {}

MCPProtocol

Underlying type: string

MCPProtocol defines the protocol to use for the MCPBackend target.

Validation:

  • Enum: [StreamableHTTP SSE]

Appears in:

FieldDescription
StreamableHTTP: MCPProtocolStreamableHTTP specifies that StreamableHTTP must be used as
the protocol.
SSE: MCPProtocolSSE specifies that Server-Sent Events (SSE) must be used as
the protocol.

McpIDP

Underlying type: string

Appears in:

FieldDescription
Auth0
Keycloak

McpSelector

Appears in:

FieldDescriptionDefaultValidation
namespaces LabelSelectornamespaces is the label selector for namespaces that Service
resources should be selected from. If unset, only the namespace of the
AgentgatewayBackend is searched.
Optional: {}
services LabelSelectorservices is the label selector for which Service resources should be
selected.
Optional: {}

McpTarget

McpTarget defines a single MCPBackend target configuration.

Validation:

  • ExactlyOneOf: [host backendRef]

Appears in:

FieldDescriptionDefaultValidation
host ShortStringHost is the hostname or IP address of the MCP target.MaxLength: 256
MinLength: 1
Optional: {}
backendRef LocalObjectReferencebackendRef references a namespace-local Service resource by name.
When set, this replaces host only; port, path, and protocol
remain configured on this target.
Optional: {}
port integerPort is the port number of the MCP target.Maximum: 65535
Minimum: 1
Required: {}
path LongStringPath is the URL path of the MCP target endpoint.
Defaults to "/sse" for the SSE protocol or "/mcp" for the
StreamableHTTP protocol if not specified.
MaxLength: 1024
MinLength: 1
Optional: {}
protocol MCPProtocolProtocol is the protocol to use for the connection to the MCP
target.
Enum: [StreamableHTTP SSE]
Optional: {}
policies BackendSimplepolicies controls policies for communicating with this backend.
Policies may also be set in AgentgatewayPolicy, or in the top-level
AgentgatewayBackend. Policies are merged on a field-level basis, with
order: AgentgatewayPolicy < AgentgatewayBackend < AgentgatewayBackend MCP (this field).
This field may only be used with host-based static targets, not
backendRef.
Optional: {}

McpTargetSelector

McpTargetSelector defines the MCPBackend target to use for this backend.

Validation:

  • ExactlyOneOf: [selector static]

Appears in:

FieldDescriptionDefaultValidation
name SectionNameName of the MCP target.Required: {}
selector McpSelectorselector is the label selector used to select Service resources.
If policies are needed on a per-service basis, AgentgatewayPolicy can
target the desired Service.
Optional: {}
static McpTargetstatic configures a static MCP destination. When connecting to
in-cluster Service resources, it is recommended to use selector
instead.
ExactlyOneOf: [host backendRef]
Optional: {}
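
The following fragment sketches an MCPBackend with one static target and one selector-based target. The target names, hostname, and label are hypothetical, and matchLabels is assumed to follow the standard Kubernetes LabelSelector type.

```yaml
targets:
  # Static target: exactly one of host or backendRef is set.
  - name: docs-tools
    static:
      host: mcp.example.internal
      port: 8080
      protocol: StreamableHTTP
      path: /mcp
  # Selector target: selects Service resources by label.
  - name: discovered-tools
    selector:
      services:
        matchLabels:
          app: mcp-server
sessionRouting: Stateful
failureMode: FailClosed
```

In this sketch, a policy that needs to apply to only one target would select it with targetRefs[].sectionName set to docs-tools or discovered-tools.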

Message

An entry for a message to prepend or append to each prompt.

Appears in:

FieldDescriptionDefaultValidation
role stringRole of the message. The available roles depend on the backend
LLM provider model, such as SYSTEM or USER in the OpenAI API.
Required: {}
content stringString content of the message.Required: {}

NamedLLMProvider

Appears in:

FieldDescriptionDefaultValidation
name SectionNameName of the provider. Policies can target this provider by name.Required: {}
policies BackendWithAIpolicies controls policies for communicating with this backend.
Policies may also be set in AgentgatewayPolicy, or in the top-level
AgentgatewayBackend. Policies are merged on a field-level basis, with
order: AgentgatewayPolicy < AgentgatewayBackend < AgentgatewayBackend
LLM provider (this field).
Optional: {}
openai OpenAIConfigOpenAI providerOptional: {}
azureopenai AzureOpenAIConfigAzure OpenAI providerOptional: {}
anthropic AnthropicConfigAnthropic providerOptional: {}
gemini GeminiConfigGemini providerOptional: {}
vertexai VertexAIConfigVertex AI providerOptional: {}
bedrock BedrockConfigBedrock providerOptional: {}
host ShortStringHost specifies the hostname to send the requests to.
If not specified, the default hostname for the provider is used.
MaxLength: 256
MinLength: 1
Optional: {}
port integerPort specifies the port to send the requests to.Maximum: 65535
Minimum: 1
Optional: {}
path LongStringPath specifies the URL path to use for the LLM provider API requests.
This is useful when you need to route requests to a different API endpoint while maintaining
compatibility with the original provider’s API structure.
If not specified, the default path for the provider is used.
MaxLength: 1024
MinLength: 1
Optional: {}
pathPrefix LongStringPathPrefix overrides the default base path prefix (e.g. “/v1”) for upstream requests.
Path translation for cross-format requests still applies using this prefix.
Only supported for OpenAI and Anthropic providers.
MaxLength: 1024
MinLength: 1
Optional: {}

OTLPProtocol

Underlying type: string

Appears in:

FieldDescription
HTTP
GRPC

OpenAIConfig

OpenAIConfig settings for the OpenAI LLM provider.

Appears in:

FieldDescriptionDefaultValidation
model ShortStringOptional: Override the model name, such as gpt-4o-mini.
If unset, the model name is taken from the request.
MaxLength: 256
MinLength: 1
Optional: {}

OpenAIModeration

Appears in:

FieldDescriptionDefaultValidation
model stringmodel specifies the moderation model to use. For example,
omni-moderation.
Optional: {}
policies BackendSimplepolicies controls policies for communicating with OpenAI.Optional: {}

OtlpAccessLog

OtlpAccessLog defines configuration for shipping access logs to an OpenTelemetry-compatible backend via OTLP.

Appears in:

FieldDescriptionDefaultValidation
backendRef BackendObjectReferencebackendRef references the OTLP server to send access logs to.
Supported types: Service and AgentgatewayBackend.
Required: {}
protocol OTLPProtocolprotocol specifies the OTLP protocol variant to use.
Default: GRPC
Enum: [HTTP GRPC]
Optional: {}
path LongStringpath specifies the OTLP/HTTP path to use. This is only applicable
when protocol is HTTP. If unset, this defaults to /v1/logs.
MaxLength: 1024
MinLength: 1
Optional: {}

PolicyPhase

Underlying type: string

Validation:

  • Enum: [PreRouting PostRouting]

Appears in:

FieldDescription
PreRouting
PostRouting

PriorityGroup

Appears in:

FieldDescriptionDefaultValidation
providers NamedLLMProvider arrayproviders specifies a list of LLM providers within this group. Each provider is treated equally in terms of priority,
with automatic weighting based on health.
MaxItems: 16
MinItems: 1
Required: {}

PromptCachingConfig

PromptCachingConfig configures automatic prompt caching for supported LLM providers. Currently only AWS Bedrock supports this feature (Claude 3+ and Nova models).

When enabled, the gateway automatically inserts cache points at strategic locations to reduce API costs. Bedrock charges lower rates for cached tokens (90% discount).

Example:

promptCaching:
  cacheSystem: true
  cacheMessages: true
  cacheTools: false

Cost savings example:

  • Without caching: 10,000 tokens × $3/MTok = $0.03
  • With caching (90% cached): 1,000 × $3/MTok + 9,000 × $0.30/MTok = $0.0057 (81% savings)

Appears in:

FieldDescriptionDefaultValidation
cacheSystem booleanCacheSystem enables caching for system prompts.
Inserts a cache point after all system messages.
Default: true
Optional: {}
cacheMessages booleanCacheMessages enables caching for conversation messages.
Caches all messages in the conversation for cost savings.
Default: true
Optional: {}
cacheTools booleanCacheTools enables caching for tool definitions.
Inserts a cache point after all tool specifications.
Default: false
Optional: {}
minTokens integerMinTokens specifies the minimum estimated token count
before caching is enabled. Uses a rough heuristic (word count × 1.3) to estimate tokens.
Bedrock requires at least 1,024 tokens for caching to be effective.
Default: 1024
Minimum: 0
Optional: {}
cacheMessageOffset integerCacheMessageOffset shifts the message cache point further back in the
conversation. 0 (default) places it at the second-to-last message.
Higher values move it N additional messages towards the start, clamped
to bounds.
Default: 0
Minimum: 0
Optional: {}
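
As a further sketch, the following fragment sets the two tuning fields in addition to the cache toggles. The values are illustrative only and are not recommendations.

```yaml
promptCaching:
  cacheSystem: true
  cacheMessages: true
  cacheTools: true
  # Enable caching only when the estimated token count is at least 2,048.
  minTokens: 2048
  # Move the message cache point two messages further back than the
  # default position (the second-to-last message).
  cacheMessageOffset: 2
```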

PromptguardRequest

PromptguardRequest defines the prompt guards to apply to requests sent by the client.

Validation:

  • ExactlyOneOf: [regex webhook openAIModeration bedrockGuardrails googleModelArmor]

Appears in:

FieldDescriptionDefaultValidation
response CustomResponseA custom response message to return to the client. If not specified, defaults to
"The request was rejected due to inappropriate content."
Optional: {}
regex RegexRegular expression (regex) matching for prompt guards and data masking.Optional: {}
webhook WebhookConfigure a webhook to forward requests to for prompt guarding.Optional: {}
openAIModeration OpenAIModerationopenAIModeration passes prompt data through the OpenAI Moderations
endpoint.
See https://developers.openai.com/api/reference/resources/moderations for more information.
Optional: {}
bedrockGuardrails BedrockGuardrailsbedrockGuardrails configures AWS Bedrock Guardrails for prompt
guarding.
Optional: {}
googleModelArmor GoogleModelArmorgoogleModelArmor configures Google Model Armor for prompt guarding.Optional: {}

PromptguardResponse

PromptguardResponse configures the response that the prompt guard applies to responses returned by the LLM provider.

Validation:

  • ExactlyOneOf: [regex webhook bedrockGuardrails googleModelArmor]

Appears in:

FieldDescriptionDefaultValidation
response CustomResponseA custom response message to return to the client. If not specified, defaults to
"The response was rejected due to inappropriate content."
Optional: {}
regex RegexRegular expression (regex) matching for prompt guards and data masking.Optional: {}
webhook WebhookConfigure a webhook to forward responses to for prompt guarding.Optional: {}
bedrockGuardrails BedrockGuardrailsbedrockGuardrails configures AWS Bedrock Guardrails for prompt
guarding.
Optional: {}
googleModelArmor GoogleModelArmorgoogleModelArmor configures Google Model Armor for prompt guarding.Optional: {}

RateLimitDescriptor

Appears in:

FieldDescriptionDefaultValidation
entries RateLimitDescriptorEntry arrayentries are the individual components that make up this descriptor.MaxItems: 16
MinItems: 1
Required: {}
unit RateLimitUnitunit defines what to use as the cost function. If unspecified,
Requests is used.
Enum: [Requests Tokens]
Optional: {}

RateLimitDescriptorEntry

A descriptor entry defines a single entry in a rate limit descriptor.

Appears in:

FieldDescriptionDefaultValidation
name TinyStringname specifies the name of the descriptor.MaxLength: 64
MinLength: 1
Required: {}
expression CELExpressionexpression is a Common Expression Language (CEL) expression that
defines the value for the descriptor.
For example, to rate limit based on the Client IP: source.address.
See https://agentgateway.dev/docs/standalone/latest/reference/cel/ for more info.
Required: {}
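
A single descriptor that keys on the client IP could be sketched as follows. The entry name client-ip is a hypothetical label, and the parent field that holds the descriptor is not shown because it is defined elsewhere in this reference.

```yaml
# One rate limit descriptor with a single entry keyed on the client IP.
entries:
  - name: client-ip
    expression: source.address
# Use LLM tokens instead of requests as the cost function.
unit: Tokens
```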

RateLimitUnit

Underlying type: string

Appears in:

FieldDescription
Tokens
Requests

RateLimits

Appears in:

FieldDescriptionDefaultValidation
local LocalRateLimit arrayLocal defines a local rate limiting policy.ExactlyOneOf: [requests tokens]
MaxItems: 16
MinItems: 1
Optional: {}
global GlobalRateLimitGlobal defines a global rate limiting policy using an external service.Optional: {}

Regex

Regex configures the regular expression (regex) matching for prompt guards and data masking.

Appears in:

FieldDescriptionDefaultValidation
matches LongString arrayA list of regex patterns to match against the request or response.
Matches and built-ins are additive.
MaxLength: 1024
MinLength: 1
Optional: {}
builtins BuiltIn arrayA list of built-in regex patterns to match against the request or response.
Matches and built-ins are additive.
Enum: [Ssn CreditCard PhoneNumber Email CaSin]
Optional: {}
action ActionThe action to take if a regex pattern is matched in a request or response.
This setting applies only to request matches. PromptguardResponse
matches are always masked by default.
Defaults to Mask.
Default: Mask
Enum: [Mask Reject]
Optional: {}
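
The following fragment sketches a regex prompt guard that combines built-in patterns with one custom pattern. The custom pattern is a hypothetical example, and the fragment assumes that it is nested in the regex field of a PromptguardRequest.

```yaml
regex:
  # Built-in patterns from the documented enum.
  builtins:
    - CreditCard
    - Ssn
  # Hypothetical custom pattern for internal ticket IDs.
  matches:
    - "TICKET-[0-9]+"
  # Reject matching requests instead of masking them.
  action: Reject
```

Because matches and built-ins are additive, a request in this sketch is rejected if it matches any of the three patterns.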

RemoteJWKS

Appears in:

FieldDescriptionDefaultValidation
jwksPath stringPath to the IdP jwks endpoint, relative to the root, commonly
".well-known/jwks.json".
MaxLength: 2000
MinLength: 1
Required: {}
cacheDuration Duration
Default: 5m
Optional: {}
backendRef BackendObjectReferencebackendRef references the remote JWKS server to reach.
Supported types are Service and static Backend. An
AgentgatewayPolicy containing backend TLS config can then be attached
to the Service or Backend in order to set TLS options for a
connection to the remote jwks source.
Required: {}

ResourceAdd

Appears in:

FieldDescriptionDefaultValidation
name ShortStringMaxLength: 256
MinLength: 1
Required: {}
expression CELExpressionRequired: {}

Retry

Retry defines the retry policy.

Appears in:

RouteType

Underlying type: string

RouteType specifies how the AI gateway should process incoming requests based on the URL path and the API format expected.

Validation:

  • Enum: [Completions Messages Models Passthrough Detect Responses AnthropicTokenCount Embeddings Realtime]

Appears in:

FieldDescription
Completions: RouteTypeCompletions processes OpenAI /v1/chat/completions format requests.
Messages: RouteTypeMessages processes Anthropic /v1/messages format requests.
Models: RouteTypeModels handles the /v1/models endpoint.
Passthrough: RouteTypePassthrough sends requests upstream as-is without LLM processing.
Detect: RouteTypeDetect sends requests as-is but attempts to extract
request/response metadata for telemetry and rate limiting.
Responses: RouteTypeResponses processes OpenAI /v1/responses format requests.
AnthropicTokenCount: RouteTypeAnthropicTokenCount processes Anthropic
/v1/messages/count_tokens format requests.
Embeddings: RouteTypeEmbeddings processes OpenAI /v1/embeddings format requests.
Realtime: RouteTypeRealtime processes OpenAI /v1/realtime requests.

SecretSelector

Appears in:

FieldDescriptionDefaultValidation
matchLabels object (keys:string, values:string)Label selector to select the target resource.Required: {}

SessionRouting

Underlying type: string

Validation:

  • Enum: [Stateful Stateless]

Appears in:

FieldDescription
Stateful: Stateful mode creates an MCP session (via mcp-session-id) and
internally ensures requests for that session are routed to a consistent backend replica.
Stateless

ShutdownSpec

Appears in:

FieldDescriptionDefaultValidation
min integerMinimum time (in seconds) to wait before allowing Agentgateway to
terminate. Refer to the CONNECTION_MIN_TERMINATION_DEADLINE
environment variable for details.
Maximum: 3.1536e+07
Minimum: 0
Required: {}
max integerMaximum time (in seconds) to wait before allowing Agentgateway to
terminate. Refer to the TERMINATION_GRACE_PERIOD_SECONDS
environment variable for details.
Maximum: 3.1536e+07
Minimum: 0
Required: {}

StaticBackend

Appears in:

FieldDescriptionDefaultValidation
host ShortStringhost to connect to.MaxLength: 256
MinLength: 1
Required: {}
port integerport to connect to.Maximum: 65535
Minimum: 1
Required: {}

TLSVersion

Underlying type: string

Validation:

  • Enum: [1.2 1.3]

Appears in:

FieldDescription
1.2: agentgateway currently only supports TLS 1.2 and TLS 1.3.
1.3

Timeouts

Appears in:

FieldDescriptionDefaultValidation
request Durationrequest specifies a timeout for an individual request from the gateway to a backend. This covers the time from when
the request first starts being sent from the gateway to when the full response has been received from the backend.
Optional: {}

Tracing

Appears in:

FieldDescriptionDefaultValidation
backendRef BackendObjectReferencebackendRef references the OTLP server to reach.
Supported types: Service and AgentgatewayBackend.
Required: {}
protocol OTLPProtocolprotocol specifies the OTLP protocol variant to use.
Default: GRPC
Enum: [HTTP GRPC]
Optional: {}
path LongStringpath specifies the OTLP path to use. This is only applicable when
protocol is HTTP. If unset, this defaults to /v1/traces.
MaxLength: 1024
MinLength: 1
Optional: {}
attributes LogTracingAttributesattributes specifies customizations to the key-value pairs that are
included in the trace.
Optional: {}
resources ResourceAdd arrayresources describes the entity producing telemetry and specifies the
resources to be included in the trace.
Optional: {}
randomSampling CELExpressionrandomSampling is an expression to determine the amount of random
sampling. Random sampling will initiate a new trace span if the incoming
request does not have a trace initiated already. This should evaluate to
a float between 0.0 and 1.0, or a boolean (true or false). If
unspecified, random sampling is disabled.
Optional: {}
clientSampling CELExpressionclientSampling is an expression to determine the amount of client
sampling. Client sampling determines whether to initiate a new trace
span if the incoming request does have a trace already. This should
evaluate to a float between 0.0 and 1.0, or a boolean (true or
false). If unspecified, client sampling is 100% enabled.
Optional: {}
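
A minimal sketch of a tracing configuration that sends traces over OTLP/HTTP and samples a fraction of new requests might look like the following. The Service name and the resource name and value are hypothetical, port 4318 is the conventional OTLP/HTTP port, and the name and port keys under backendRef are assumed to follow the Gateway API BackendObjectReference type.

```yaml
backendRef:
  name: otel-collector
  port: 4318
protocol: HTTP
path: /v1/traces
resources:
  # The expression is CEL; this one is a string literal.
  - name: deployment.environment
    expression: '"staging"'
# Start a new trace for roughly 10% of requests that arrive without one.
randomSampling: "0.1"
# Matches the documented default: client sampling is fully enabled.
clientSampling: "true"
```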

Traffic

Appears in:

FieldDescriptionDefaultValidation
phase PolicyPhaseThe phase to apply the traffic policy to. If the phase is PreRouting,
the targetRef must be a Gateway or a Listener. PreRouting is
typically used only when a policy needs to influence the routing
decision.
Even when using PostRouting mode, the policy can target the
Gateway or Listener. This is a helper for applying the policy to all
routes under that Gateway or Listener, and follows the merging logic
described above.
Note: PreRouting and PostRouting rules do not merge together. These
are independent execution phases. That is, all PreRouting rules will
merge and execute, then all PostRouting rules will merge and execute.
If unset, this defaults to PostRouting.
Enum: [PreRouting PostRouting]
Optional: {}
transformation Transformationtransformation is used to mutate and transform requests and responses
before forwarding them to the destination.
Optional: {}
extProc ExtProcextProc specifies the external processing configuration for the policy.Optional: {}
extAuth ExtAuthextAuth specifies the external authentication configuration for the policy.
This controls what external server to send requests to for authentication.
ExactlyOneOf: [grpc http]
Optional: {}
rateLimit RateLimitsrateLimit specifies the rate limiting configuration for the policy.
This controls the rate at which requests are allowed to be processed.
Optional: {}
cors CORScors specifies the CORS configuration for the policy.Optional: {}
csrf CSRFcsrf specifies the Cross-Site Request Forgery (CSRF) policy for this traffic policy.
The CSRF policy has the following behavior:
* Safe methods (GET, HEAD, OPTIONS) are automatically allowed.
* Requests without Sec-Fetch-Site or Origin headers are assumed to
be same-origin or non-browser requests and are allowed.
* Otherwise, the Sec-Fetch-Site header is checked, with a fallback to
comparing the Origin header to the Host header.
Optional: {}
headerModifiers HeaderModifiersheaderModifiers defines the policy to modify request and response headers.Optional: {}
hostRewrite HostnameRewritehostRewrite specifies how to rewrite the Host header for requests.
If the HTTPRoute urlRewrite filter already specifies a host rewrite,
this setting is ignored.
Optional: {}
timeouts Timeoutstimeouts defines the timeouts for requests.
It is applicable to HTTPRoute resources and ignored for other targeted
kinds.
Optional: {}
retry Retryretry defines the policy for retrying requests.Optional: {}
authorization Authorizationauthorization specifies the access rules based on roles and
permissions.
If multiple authorization rules are applied across different policies (at the same, or different, attachment points),
all rules are merged.
Optional: {}
jwtAuthentication JWTAuthenticationjwtAuthentication authenticates users based on JWT tokens.Optional: {}
basicAuthentication BasicAuthenticationbasicAuthentication authenticates users based on the Basic
authentication scheme (RFC 7617), where a username and password are
encoded in the request.
ExactlyOneOf: [users secretRef]
Optional: {}
apiKeyAuthentication APIKeyAuthenticationapiKeyAuthentication authenticates users based on a configured API
key.
ExactlyOneOf: [secretRef secretSelector]
Optional: {}
directResponse DirectResponsedirectResponse configures the policy to send a direct response to the
client.
Optional: {}

Transform

Appears in:

FieldDescriptionDefaultValidation
set HeaderTransformation arrayset is a list of headers and the value they should be set to.MaxItems: 16
MinItems: 1
Optional: {}
add HeaderTransformation arrayadd is a list of headers to add to the request and the value that each
header should be set to. If a header with the same name already exists, the
value is appended as an extra entry.
MaxItems: 16
MinItems: 1
Optional: {}
remove HeaderName arrayremove is a list of header names to remove from the request or
response.
MaxItems: 16
MaxLength: 256
MinItems: 1
MinLength: 1
Pattern: ^:?[A-Za-z0-9!#$%&'*+\-.^_\x60|~]+$
Optional: {}
body CELExpressionbody controls manipulation of the HTTP body.Optional: {}
metadata object (keys:string, values:CELExpression)Refer to Kubernetes API documentation for fields of metadata.MaxProperties: 16
MinProperties: 1
Optional: {}

Transformation

Appears in:

FieldDescriptionDefaultValidation
request Transformrequest is used to modify the request path.Optional: {}
response Transformresponse is used to modify the response path.Optional: {}

VertexAIConfig

VertexAIConfig settings for the Vertex AI LLM provider.

Appears in:

FieldDescriptionDefaultValidation
model ShortStringOptional: Override the model name, such as gpt-4o-mini.
If unset, the model name is taken from the request.
MaxLength: 256
MinLength: 1
Optional: {}
projectId TinyStringThe ID of the Google Cloud Project that you use for the Vertex AI.MaxLength: 64
MinLength: 1
Required: {}
region TinyStringThe location of the Google Cloud Project that you use for the Vertex AI.
Defaults to global if not specified.
Default: global
MaxLength: 64
MinLength: 1
Optional: {}
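
The following fragment sketches a Vertex AI provider configuration. The project ID is a hypothetical placeholder, and the region and model name are illustrative values only.

```yaml
vertexai:
  # Hypothetical Google Cloud project ID.
  projectId: my-gcp-project
  # Omit region to fall back to the default of global.
  region: us-central1
  # Illustrative model name; omit to take the model from the request.
  model: gemini-2.0-flash
```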

Webhook

Webhook configures a webhook to forward requests or responses to for prompt guarding.

Appears in:

FieldDescriptionDefaultValidation
backendRef BackendObjectReferencebackendRef references the webhook server to reach.
Supported types: Service and Backend.
Required: {}
forwardHeaderMatches HTTPHeaderMatch arrayForwardHeaderMatches defines a list of HTTP header matches that will be
used to select the headers to forward to the webhook.
Request headers are used when forwarding requests and response headers
are used when forwarding responses.
By default, no headers are forwarded.
Optional: {}
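
As a rough sketch, a webhook that receives one selected header could be configured as follows. The Service name, port, and header are hypothetical, and the type, name, and value keys are assumed to follow the Gateway API HTTPHeaderMatch type.

```yaml
webhook:
  # Hypothetical Service that runs the prompt guard webhook server.
  backendRef:
    name: prompt-guard-webhook
    port: 8000
  forwardHeaderMatches:
    # Hypothetical header match; without any matches, no headers are forwarded.
    - type: Exact
      name: x-tenant-id
      value: acme
```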

Shared Types

The following types are defined in the shared package and used across multiple APIs.

Authorization

Authorization defines the configuration for role-based access control.

Field Type Description
policy AuthorizationPolicy policy specifies the authorization rule to evaluate.
* For Allow rules: any matching policy allows the request.
* For Require rules: all policies must match for the request to be allowed.
* For Deny rules: any matching policy denies the request. Note: a CEL expression that fails to evaluate is not considered to match, which makes Deny a risky policy; prefer Require.
The presence of at least one Allow rule triggers a deny-by-default policy, requiring at least one match to allow. With no rules, all requests are allowed.
Required.
action AuthorizationPolicyAction action defines whether the rule allows, denies, or requires the request if matched. If unspecified, the default is Allow. Require policies are conjunctive across merged policies: all require policies must match.

AuthorizationPolicy

AuthorizationPolicy defines a single authorization rule.

Field Type Description
matchExpressions []CELExpression MatchExpressions defines a set of conditions that must be satisfied for the rule to match. These expressions should be in the form of a Common Expression Language (CEL) expression. Required.

AuthorizationPolicyAction

Underlying type: string

AuthorizationPolicyAction defines the action to take when the authorization policy matches.
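
A minimal sketch of an authorization rule in a traffic policy might look like the following. The action value is inferred from the Allow, Deny, and Require rules described for Authorization, and the CEL variable request.method is an assumption; check the CEL reference for the attributes that are available.

```yaml
authorization:
  # Every Require policy must match for the request to be allowed.
  action: Require
  policy:
    matchExpressions:
      # Assumed CEL attribute name, shown for illustration only.
      - 'request.method == "GET"'
```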

CELExpression

Underlying type: string

CELExpression represents a Common Expression Language (CEL) expression.

Validation:

  • MinLength=1
  • MaxLength=16384

HeaderModifiers

HeaderModifiers can be used to define the policy to modify request and response headers.

Validation:

  • AtLeastOneFieldSet

Field Type Description
request *gwv1.HTTPHeaderFilter Request modifies request headers.
response *gwv1.HTTPHeaderFilter Response modifies response headers.

KubernetesResourceOverlay

KubernetesResourceOverlay provides a mechanism to customize generated Kubernetes resources using Strategic Merge Patch semantics.

Overlay application order: Overlays are applied after all typed configuration fields have been processed. The full merge order is:

  1. GatewayClass typed configuration fields, for example replicas or image settings from parametersRef
  2. Gateway typed configuration fields from infrastructure.parametersRef
  3. GatewayClass overlays are applied
  4. Gateway overlays are applied

This ordering means Gateway-level configuration overrides GatewayClass-level configuration at each stage. For example, if both levels set the same label, the Gateway value wins.

Field Type Description
metadata *ObjectMetadata metadata defines a subset of object metadata to be customized. labels and annotations are merged with existing values. If both GatewayClass and Gateway parameters define the same label or annotation key, the Gateway value takes precedence (applied second).
spec *apiextensionsv1.JSON spec provides an opaque mechanism to configure the resource spec. This field accepts a complete or partial Kubernetes resource spec, such as PodSpec or ServiceSpec, and will be merged with the generated configuration using Strategic Merge Patch semantics.

Application order: Overlays are applied after all typed configuration fields from both levels. The full merge order is:

  1. GatewayClass typed configuration fields
  2. Gateway typed configuration fields
  3. GatewayClass overlays
  4. Gateway overlays (can override all previous values)

Strategic Merge Patch and deletion guide: This merge strategy allows you to override individual fields, merge lists, or delete items without needing to provide the entire resource definition.

  1. Replacing Values (Scalars): Simple fields (strings, integers, booleans) in your config will overwrite the generated defaults.

  2. Merging Lists (Append/Merge): Lists with “merge keys”, like containers which merges on name, or tolerations which merges on key, will append your items to the generated list, or update existing items if keys match.

  3. Deleting Fields or List Items ($patch: delete): To remove a field or list item from the generated resource, use the $patch: delete directive. This works for both map fields and list items, and is the recommended approach because it works with both client-side and server-side apply.

     spec:
       template:
         spec:
           # Delete pod-level securityContext
           securityContext:
             $patch: delete
           # Delete nodeSelector
           nodeSelector:
             $patch: delete
           containers:
             # Be sure to use the correct proxy name here or you will add a
             # container instead of modifying a container.
             - name: proxy-name
               # Delete container-level securityContext
               securityContext:
                 $patch: delete

  4. Null Values (server-side apply only): Setting a field to null can also remove it, but this ONLY works with kubectl apply --server-side or equivalent. With regular client-side kubectl apply, null values are stripped by kubectl before reaching the API server, so the deletion won’t occur. Prefer $patch: delete for consistent behavior across both apply modes.

     spec:
       template:
         spec:
           nodeSelector: null  # Removes nodeSelector (server-side apply only!)

  5. Replacing Maps Entirely ($patch: replace): To replace an entire map with your values (instead of merging), use $patch: replace. This removes all existing keys and replaces them with only your specified keys.

     spec:
       template:
         spec:
           nodeSelector:
             $patch: replace
             custom-key: custom-value

  6. Replacing Lists Entirely ($patch: replace): If you want to strictly define a list and ignore all generated defaults, use $patch: replace.

     service:
       spec:
         ports:
           - $patch: replace
           - name: http
             port: 80
             targetPort: 8080
             protocol: TCP
           - name: https
             port: 443
             targetPort: 8443
             protocol: TCP

LongString

Underlying type: string

Validation:

  • MinLength=1
  • MaxLength=1024

ObjectMetadata

ObjectMetadata contains labels and annotations for metadata overlays.

Field Type Description
labels map[string]string Map of string keys and values that can be used to organize and categorize (scope and select) objects. May match selectors of replication controllers and services. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels
annotations map[string]string Annotations is an unstructured key value map stored with a resource that may be set by external tools to store and retrieve arbitrary metadata. They are not queryable and should be preserved when modifying objects. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations

PolicyAncestorStatus

Field | Type | Description
ancestorRef | gwv1.ParentReference | AncestorRef corresponds with a ParentRef in the spec that this PolicyAncestorStatus struct describes the status of. Required.
controllerName | string | ControllerName is a domain/path string that indicates the name of the controller that wrote this status. This corresponds with the controllerName field on GatewayClass. Example: example.net/gateway-controller. The format of this field is DOMAIN "/" PATH, where DOMAIN and PATH are valid Kubernetes names (https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names). Controllers MUST populate this field when writing status. Controllers should ensure that entries in status populated with their ControllerName are cleaned up when they are no longer necessary. Required.
conditions | []metav1.Condition | Conditions describes the status of the Policy with respect to the given Ancestor.

PolicyStatus

Field | Type | Description
conditions | []metav1.Condition |
ancestors | []PolicyAncestorStatus | Required.

SNI

Underlying type: string

Validation:

  • MinLength=1
  • MaxLength=253
  • Pattern=^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$
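
As a rough way to pre-check values before applying a resource, the SNI constraints listed above can be reproduced in a few lines of Python. This is a sketch only: the helper name is_valid_sni is made up for this example, and the API server's own validation remains authoritative.

```python
import re

# Pattern copied from the SNI validation rule above.
SNI_PATTERN = re.compile(
    r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$"
)


def is_valid_sni(value: str) -> bool:
    """Check MinLength=1, MaxLength=253, and the lowercase hostname pattern."""
    return 1 <= len(value) <= 253 and SNI_PATTERN.fullmatch(value) is not None


for candidate in ["example.com", "Example.com", "-bad.example.com", "*.example.com"]:
    print(candidate, is_valid_sni(candidate))
```

Note that the pattern accepts only lowercase letters, digits, hyphens, and dots, so uppercase names and wildcard prefixes such as *.example.com are rejected.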

ShortString

Underlying type: string

Validation:

  • MinLength=1
  • MaxLength=256

TinyString

Underlying type: string

Validation:

  • MinLength=1
  • MaxLength=64
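
The three length-bounded string types in this reference (LongString, ShortString, and TinyString) differ only in their maximum length. The sketch below collects those limits in one table for a quick client-side check. The names LIMITS and fits are made up for this example, and it assumes that lengths are counted in characters rather than bytes, which is how Python's len behaves for str values.

```python
# (MinLength, MaxLength) pairs taken from the validation rules in this reference.
LIMITS = {
    "LongString": (1, 1024),
    "ShortString": (1, 256),
    "TinyString": (1, 64),
}


def fits(type_name: str, value: str) -> bool:
    """Return True if `value` satisfies the length bounds of `type_name`."""
    minimum, maximum = LIMITS[type_name]
    return minimum <= len(value) <= maximum


# A 100-character value fits ShortString and LongString, but not TinyString.
sample = "x" * 100
print({name: fits(name, sample) for name in LIMITS})
```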