GEP-1713: ListenerSets - Standard Mechanism to Merge Multiple Gateways¶
- Issue: #1713
- Status: Provisional
(See status definitions here.)
Introduction¶
The Gateway resource is a point of contention since it is the only place to attach listeners with certificates. We propose a new resource called ListenerSet to allow a shared list of listeners to be attached to a single Gateway.
Goals¶
- Define a mechanism to merge listeners into a single Gateway
Future Potential Goals (Beyond the GEP)¶
From Gateway Hierarchy Brainstorming:
- Attaching listeners to Gateways in different namespaces
- Standardize merging multiple lists of Listeners together (#1863)
- Increase the number of Gateway Listeners that are supported (#2869)
- Provide a mechanism for third party components to generate listeners and attach them to a Gateway (#1863)
- Delegate TLS certificate management to App Owners and/or different namespaces (#102, #103)
- Delegate domains to different namespaces, but allow those namespaces to define TLS and routing configuration within those namespaces with Gateway-like resources (#102, #103)
- Enable admins to delegate SNI-based routing for TLS passthrough to other teams and/or namespaces (#3177) (Remove TLSRoute)
- Simplify L4 routing by removing at least one of the required layers (Gateway -> Route -> Service)
- Delegate routing to namespaces based on path prefix (previously known as Route delegation)
- Static infrastructure attachment (#3103)
Use Cases & Motivation¶
Knative generates on-demand per-service certificates using HTTP-01 challenges.
There can be O(1000) Knative Services in the cluster, which means we have O(1000) distinct certificates.
Thus updating a single Gateway resource with this many certificates is a contention point and inhibits horizontal scaling of our controllers.
Istio Ambient, similarly, creates a listener per Kubernetes service.
More broadly, large scale gateway users often expose O(1000) domains, but are currently limited by the maximum of 64 listeners.
The spec currently has language to indicate implementations MAY merge Gateway resources but does not define any specific requirements for how that should work.
https://github.com/kubernetes-sigs/gateway-api/blob/541e9fc2b3c2f62915cb58dc0ee5e43e4096b3e2/apis/v1beta1/gateway_types.go#L76-L78
Feature Details¶
We define ListenerSet as the name of the feature outlined in this GEP.
The feature will be part of the experimental channel, which implementations can choose to support. All the MUST requirements in this document apply to implementations that choose to support this feature.
API¶
This proposal introduces a new ListenerSet resource that has the ability to attach a set of listeners to a parent Gateway.
Go¶
type GatewaySpec struct {
...
// Note: this is a list to allow future potential features
AllowedListeners []*AllowedListeners `json:"allowedListeners"`
...
}
type AllowedListeners struct {
// TODO - discuss changing this to Same in the future
// +kubebuilder:default={from: None}
Namespaces *ListenerNamespaces `json:"namespaces,omitempty"`
}
// ListenerNamespaces indicate which namespaces ListenerSets should be selected from.
type ListenerNamespaces struct {
// From indicates where ListenerSets can attach to this Gateway. Possible
// values are:
//
// * Same: Only ListenerSets in the same namespace may be attached to this Gateway.
// * None: Only listeners defined in the Gateway's spec are allowed
//
// +optional
// +kubebuilder:default=Same
// +kubebuilder:validation:Enum=Same;None
From *FromNamespaces `json:"from,omitempty"`
}
// ListenerSet defines a set of additional listeners to attach to an existing Gateway.
type ListenerSet struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
// Spec defines the desired state of ListenerSet.
Spec ListenerSetSpec `json:"spec"`
// Status defines the current state of ListenerSet.
Status ListenerSetStatus `json:"status,omitempty"`
}
// ListenerSetSpec defines the desired state of a ListenerSet.
type ListenerSetSpec struct {
// ParentRef references the Gateway that the listeners are attached to.
ParentRef ParentGatewayReference `json:"parentRef,omitempty"`
// Listeners associated with this ListenerSet. Listeners define
// logical endpoints that are bound on this referenced parent Gateway's addresses.
//
// Listeners in a `Gateway` and their attached `ListenerSets` are concatenated
// as a list when programming the underlying infrastructure.
//
// Listeners should be merged using the following precedence:
//
// 1. "parent" Gateway
// 2. ListenerSet ordered by creation time (oldest first)
// 3. ListenerSet ordered alphabetically by “{namespace}/{name}”.
//
// +listType=map
// +listMapKey=name
// +kubebuilder:validation:MinItems=1
// +kubebuilder:validation:MaxItems=64
Listeners []ListenerEntry `json:"listeners"`
}
// ListenerEntry embodies the concept of a logical endpoint where a Gateway accepts
// network connections.
type ListenerEntry struct {
// Name is the name of the Listener. This name MUST be unique within a
// Gateway.
//
// Support: Core
Name SectionName `json:"name"`
// Hostname specifies the virtual hostname to match for protocol types that
// define this concept. When unspecified, all hostnames are matched. This
// field is ignored for protocols that don't require hostname based
// matching.
//
// Implementations MUST apply Hostname matching appropriately for each of
// the following protocols:
//
// * TLS: The Listener Hostname MUST match the SNI.
// * HTTP: The Listener Hostname MUST match the Host header of the request.
// * HTTPS: The Listener Hostname SHOULD match at both the TLS and HTTP
// protocol layers as described above. If an implementation does not
// ensure that both the SNI and Host header match the Listener hostname,
// it MUST clearly document that.
//
// For HTTPRoute and TLSRoute resources, there is an interaction with the
// `spec.hostnames` array. When both listener and route specify hostnames,
// there MUST be an intersection between the values for a Route to be
// accepted. For more information, refer to the Route specific Hostnames
// documentation.
//
// Hostnames that are prefixed with a wildcard label (`*.`) are interpreted
// as a suffix match. That means that a match for `*.example.com` would match
// both `test.example.com`, and `foo.test.example.com`, but not `example.com`.
//
// Support: Core
//
// +optional
Hostname *Hostname `json:"hostname,omitempty"`
// Port is the network port. Multiple listeners may use the
// same port, subject to the Listener compatibility rules.
//
// If the port is specified as zero, the implementation will assign
// a unique port. If the implementation does not support dynamic port
// assignment, it MUST set `Accepted` condition to `False` with the
// `UnsupportedPort` reason.
//
// Support: Core
//
// +optional
Port *PortNumber `json:"port,omitempty"`
// Protocol specifies the network protocol this listener expects to receive.
//
// Support: Core
Protocol ProtocolType `json:"protocol"`
// TLS is the TLS configuration for the Listener. This field is required if
// the Protocol field is "HTTPS" or "TLS". It is invalid to set this field
// if the Protocol field is "HTTP", "TCP", or "UDP".
//
// The association of SNIs to Certificate defined in GatewayTLSConfig is
// defined based on the Hostname field for this listener.
//
// The GatewayClass MUST use the longest matching SNI out of all
// available certificates for any TLS handshake.
//
// Support: Core
//
// +optional
TLS *GatewayTLSConfig `json:"tls,omitempty"`
// AllowedRoutes defines the types of routes that MAY be attached to a
// Listener and the trusted namespaces where those Route resources MAY be
// present.
//
// Although a client request may match multiple route rules, only one rule
// may ultimately receive the request. Matching precedence MUST be
// determined in order of the following criteria:
//
// * The most specific match as defined by the Route type.
// * The oldest Route based on creation timestamp. For example, a Route with
// a creation timestamp of "2020-09-08 01:02:03" is given precedence over
// a Route with a creation timestamp of "2020-09-08 01:02:04".
// * If everything else is equivalent, the Route appearing first in
// alphabetical order (namespace/name) should be given precedence. For
// example, foo/bar is given precedence over foo/baz.
//
// All valid rules within a Route attached to this Listener should be
// implemented. Invalid Route rules can be ignored (sometimes that will mean
// the full Route). If a Route rule transitions from valid to invalid,
// support for that Route rule should be dropped to ensure consistency. For
// example, even if a filter specified by a Route rule is invalid, the rest
// of the rules within that Route should still be supported.
//
// Support: Core
// +kubebuilder:default={namespaces:{from: Same}}
// +optional
AllowedRoutes *AllowedRoutes `json:"allowedRoutes,omitempty"`
}
// ListenerSetStatus defines the observed state of a ListenerSet
type ListenerSetStatus struct {
// Listeners provide status for each unique listener port defined in the Spec.
//
// +optional
// +listType=map
// +listMapKey=name
// +kubebuilder:validation:MaxItems=64
Listeners []ListenerEntryStatus `json:"listeners,omitempty"`
// Conditions describe the current conditions of the ListenerSet.
//
// Implementations should prefer to express ListenerSet conditions
// using the `GatewayConditionType` and `GatewayConditionReason`
// constants so that operators and tools can converge on a common
// vocabulary to describe Gateway state.
//
// Known condition types are:
//
// * "Accepted"
// * "Programmed"
//
// +optional
// +listType=map
// +listMapKey=type
// +kubebuilder:validation:MaxItems=8
Conditions []metav1.Condition `json:"conditions,omitempty"`
}
// ListenerEntryStatus is the status associated with a ListenerEntry.
type ListenerEntryStatus struct {
// Name is the name of the Listener that this status corresponds to.
Name SectionName `json:"name"`
// Port is the network port that this listener is listening on.
Port PortNumber `json:"port"`
// SupportedKinds is the list indicating the Kinds supported by this
// listener. This MUST represent the kinds an implementation supports for
// that Listener configuration.
//
// If kinds are specified in Spec that are not supported, they MUST NOT
// appear in this list and an implementation MUST set the "ResolvedRefs"
// condition to "False" with the "InvalidRouteKinds" reason. If both valid
// and invalid Route kinds are specified, the implementation MUST
// reference the valid Route kinds that have been specified.
//
// +kubebuilder:validation:MaxItems=8
SupportedKinds []RouteGroupKind `json:"supportedKinds"`
// AttachedRoutes represents the total number of Routes that have been
// successfully attached to this Listener.
//
// Successful attachment of a Route to a Listener is based solely on the
// combination of the AllowedRoutes field on the corresponding Listener
// and the Route's ParentRefs field. A Route is successfully attached to
// a Listener when it is selected by the Listener's AllowedRoutes field
// AND the Route has a valid ParentRef selecting the whole Gateway
// resource or a specific Listener as a parent resource (more detail on
// attachment semantics can be found in the documentation on the various
// Route kinds ParentRefs fields). Listener or Route status does not impact
// successful attachment, i.e. the AttachedRoutes field count MUST be set
// for Listeners with condition Accepted: false and MUST count successfully
// attached Routes that may themselves have Accepted: false conditions.
//
// Uses for this field include troubleshooting Route attachment and
// measuring blast radius/impact of changes to a Listener.
AttachedRoutes int32 `json:"attachedRoutes"`
// Conditions describe the current condition of this listener.
//
// +listType=map
// +listMapKey=type
// +kubebuilder:validation:MaxItems=8
Conditions []metav1.Condition `json:"conditions"`
}
// ParentGatewayReference identifies an API object including its namespace,
// defaulting to Gateway.
type ParentGatewayReference struct {
// Group is the group of the referent.
//
// +optional
// +kubebuilder:default="gateway.networking.k8s.io"
Group *Group `json:"group"`
// Kind is kind of the referent. For example "Gateway".
//
// +optional
// +kubebuilder:default=Gateway
Kind *Kind `json:"kind"`
// Name is the name of the referent.
Name ObjectName `json:"name"`
}
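The wildcard rule described in the Hostname comment of ListenerEntry can be sketched in Go as follows. This is only an illustration under stated assumptions: `hostnameMatches` is a hypothetical helper, not part of the proposed API, and the case-insensitive comparison is an assumption based on how DNS names are normally compared.

```go
package main

import (
	"fmt"
	"strings"
)

// hostnameMatches (hypothetical helper) reports whether requestHost is
// matched by a listener hostname. An empty listener hostname matches every
// host. A "*." prefix is treated as a suffix match that needs at least one
// extra label, so "*.example.com" does not match "example.com".
func hostnameMatches(listenerHostname, requestHost string) bool {
	l := strings.ToLower(listenerHostname) // assumption: compare case-insensitively
	h := strings.ToLower(requestHost)
	if l == "" {
		return true
	}
	if strings.HasPrefix(l, "*.") {
		suffix := l[1:] // e.g. ".example.com"
		return len(h) > len(suffix) && strings.HasSuffix(h, suffix)
	}
	return l == h
}

func main() {
	hosts := []string{"test.example.com", "foo.test.example.com", "example.com"}
	for _, host := range hosts {
		fmt.Println(host, hostnameMatches("*.example.com", host))
	}
}
```

The first two hosts match the wildcard and the bare `example.com` does not, which mirrors the example given in the field comment.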
YAML¶
The following example shows a Gateway with an HTTP listener and two child HTTPS ListenerSets with unique hostnames and certificates.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: parent-gateway
spec:
  gatewayClassName: example
  listeners:
  - name: foo
    hostname: foo.com
    protocol: HTTP
    port: 80
---
apiVersion: gateway.networking.x-k8s.io/v1alpha1
kind: ListenerSet
metadata:
  name: first-workload-listeners
spec:
  parentRef:
    name: parent-gateway
    kind: Gateway
    group: gateway.networking.k8s.io
  listeners:
  - name: first
    hostname: first.foo.com
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        group: ""
        name: first-workload-cert # Provisioned via HTTP01 challenge
---
apiVersion: gateway.networking.x-k8s.io/v1alpha1
kind: ListenerSet
metadata:
  name: second-workload-listeners
spec:
  parentRef:
    name: parent-gateway
    kind: Gateway
    group: gateway.networking.k8s.io
  listeners:
  - name: second
    hostname: second.foo.com
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        group: ""
        name: second-workload-cert # Provisioned via HTTP01 challenge
ListenerEntry¶
ListenerEntry is currently a copy of the Listener struct with some changes:
1. Port is now a pointer to allow for dynamic port assignment.
Semantics¶
Gateway Changes¶
An initial experimental release of ListenerSets will have no modifications to the listener list on the Gateway resource. Using ListenerSets will require a dummy listener to be configured.
In a future (potential) release when an implementation supports ListenerSets, Gateways MUST allow the list of listeners to be empty. Thus the present minItems=1 constraint on the listener list will be removed. This allows implementations to avoid the security, cost, and other concerns that come with dummy listeners.
When there are no listeners the Gateway's status.listeners should be empty or unset. status.listeners is already an optional field.
Implementations, when creating a Gateway, may provision underlying infrastructure when there are no listeners present. The Accepted and Programmed status conditions should reflect the state of this provisioning.
Gateway <> ListenerSet Handshake¶
By default a Gateway MUST NOT allow ListenerSets to be attached. Users can enable this behaviour by configuring their Gateway to allow ListenerSet attachment:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: parent-gateway
spec:
  allowedListeners:
  - namespaces:
      from: Same
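The handshake can be sketched in Go as below. This is a minimal illustration, not the reference implementation: the types are trimmed-down stand-ins for the proposed API types (the nested namespaces struct is flattened to a single `From` pointer), `listenerSetAllowed` is a hypothetical helper name, and treating an unset `From` as `None` is an assumption that follows the "MUST NOT allow by default" rule.

```go
package main

import "fmt"

// FromNamespaces mirrors the enum proposed for ListenerNamespaces.From.
type FromNamespaces string

const (
	NamespacesFromSame FromNamespaces = "Same"
	NamespacesFromNone FromNamespaces = "None"
)

// AllowedListeners is a simplified stand-in for the proposed API type.
type AllowedListeners struct {
	From *FromNamespaces
}

// listenerSetAllowed (hypothetical helper) reports whether a ListenerSet in
// setNamespace may attach to a Gateway in gatewayNamespace. An empty list,
// or an entry without From, is treated as "None" in this sketch.
func listenerSetAllowed(allowed []AllowedListeners, gatewayNamespace, setNamespace string) bool {
	for _, a := range allowed {
		if a.From == nil {
			continue
		}
		if *a.From == NamespacesFromSame && gatewayNamespace == setNamespace {
			return true
		}
	}
	return false
}

func main() {
	same := NamespacesFromSame
	allowed := []AllowedListeners{{From: &same}}

	fmt.Println(listenerSetAllowed(nil, "infra", "infra"))      // no opt-in on the Gateway
	fmt.Println(listenerSetAllowed(allowed, "infra", "infra"))  // same namespace
	fmt.Println(listenerSetAllowed(allowed, "infra", "team-a")) // different namespace
}
```

A ListenerSet that fails this check would be the natural place to report `Accepted: False` with the `NotAllowed` reason.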
Route Attaching¶
Routes MUST be able to specify a ListenerSet as a parentRef. Routes can use the sectionName/port fields in ParentReference to help target a specific listener. If no listener is targeted (sectionName/port are unset) then the Route attaches to all the listeners in the ListenerSet.
Routes MUST be able to attach to a ListenerSet and its parent Gateway by having multiple parentRefs. A single parentRef that targets one listener in a ListenerSet looks like this:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httproute-example
spec:
  parentRefs:
  - name: second-workload-listeners
    kind: ListenerSet
    sectionName: second
A sectionName in a ListenerSet parentRef can only target listeners defined in that ListenerSet. For instance, the following HTTPRoute attempts to attach to a listener defined in the parent Gateway using the sectionName foo. This is not valid and the route's Accepted status condition should be set to False:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httproute-example
spec:
  parentRefs:
  - name: some-workload-listeners
    kind: ListenerSet
    sectionName: foo
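The lookup behind this rule can be sketched in Go. The sketch assumes a pared-down `ListenerSet` type that only tracks listener names, and `resolveSection` is a hypothetical helper; a real controller would also have to consider the `port` field and the listener's AllowedRoutes.

```go
package main

import "fmt"

// ListenerSet is a pared-down stand-in that only holds listener names.
type ListenerSet struct {
	Name      string
	Listeners []string
}

// resolveSection (hypothetical helper) returns the listeners of ls that a
// parentRef with the given sectionName targets. An empty sectionName selects
// every listener in the set. The second result is false when the name is not
// defined in this ListenerSet; listeners of the parent Gateway are never
// consulted, so the Route would not be accepted for that parentRef.
func resolveSection(ls ListenerSet, sectionName string) ([]string, bool) {
	if sectionName == "" {
		return ls.Listeners, true
	}
	for _, name := range ls.Listeners {
		if name == sectionName {
			return []string{name}, true
		}
	}
	return nil, false
}

func main() {
	ls := ListenerSet{Name: "some-workload-listeners", Listeners: []string{"first", "second"}}

	fmt.Println(resolveSection(ls, "second")) // targets one listener
	fmt.Println(resolveSection(ls, ""))       // targets all listeners in the set
	fmt.Println(resolveSection(ls, "foo"))    // "foo" lives on the parent Gateway
}
```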
To attach to listeners in both a Gateway and ListenerSet the route MUST have two parentRefs:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httproute-example
spec:
  parentRefs:
  - name: second-workload-listeners
    kind: ListenerSet
    sectionName: second
  - name: parent-gateway
    kind: Gateway
    sectionName: foo
Listener Validation¶
Implementations MUST treat the parent Gateway as having the merged list of all listeners from itself and its attached ListenerSets. See 'Listener Precedence' for more details on ordering.
Validation of this list of listeners MUST behave the same as if the list were part of a single Gateway.
From the earlier example the above resources would be equivalent to a single Gateway where the listeners are collapsed into a single list.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: parent-gateway
spec:
  gatewayClassName: example
  listeners:
  - name: foo
    hostname: foo.com
    protocol: HTTP
    port: 80
  - name: first
    hostname: first.foo.com
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        group: ""
        name: first-workload-cert # Provisioned via HTTP01 challenge
  - name: second
    hostname: second.foo.com
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        group: ""
        name: second-workload-cert # Provisioned via HTTP01 challenge
Listener Precedence¶
Listeners in a Gateway and their attached ListenerSets are concatenated as a list when programming the underlying infrastructure.
Listeners should be merged using the following precedence:
- "parent" Gateway
- ListenerSet ordered by creation time (oldest first)
- ListenerSet ordered alphabetically by “{namespace}/{name}”.
Conflicts are covered in the section 'ListenerConditions within a ListenerSet'.
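The precedence rules above can be sketched in Go as a stable sort followed by concatenation. The `ListenerSet` struct here is a simplified stand-in (listeners are reduced to their names) and `mergeListeners` is a hypothetical helper, so treat this as an illustration of the ordering rather than a prescribed implementation.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// ListenerSet is a simplified stand-in carrying only what ordering needs.
type ListenerSet struct {
	Namespace string
	Name      string
	Created   time.Time
	Listeners []string
}

// mergeListeners (hypothetical helper) returns the Gateway's own listeners
// first, followed by ListenerSet listeners ordered by creation time (oldest
// first), with ties broken alphabetically by "{namespace}/{name}".
func mergeListeners(gatewayListeners []string, sets []ListenerSet) []string {
	ordered := make([]ListenerSet, len(sets))
	copy(ordered, sets) // avoid reordering the caller's slice
	sort.SliceStable(ordered, func(i, j int) bool {
		a, b := ordered[i], ordered[j]
		if !a.Created.Equal(b.Created) {
			return a.Created.Before(b.Created)
		}
		return a.Namespace+"/"+a.Name < b.Namespace+"/"+b.Name
	})

	merged := append([]string{}, gatewayListeners...)
	for _, ls := range ordered {
		merged = append(merged, ls.Listeners...)
	}
	return merged
}

func main() {
	t0 := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	sets := []ListenerSet{
		{Namespace: "default", Name: "second-workload-listeners", Created: t0.Add(time.Minute), Listeners: []string{"second"}},
		{Namespace: "default", Name: "first-workload-listeners", Created: t0, Listeners: []string{"first"}},
	}
	fmt.Println(mergeListeners([]string{"foo"}, sets)) // prints [foo first second]
}
```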
Gateway Conditions¶
Gateway's Accepted and Programmed top-level conditions remain unchanged and reflect the status of the local configuration.
Implementations MUST support a new Gateway condition type AttachedListenerSets.
The condition's Status has the following values:
- True when Spec.AllowedListeners is set and at least one child Listener arrives from a ListenerSet
- False when Spec.AllowedListeners is set but no valid listeners are attached
- Unknown when no Spec.AllowedListeners config is present
Parent Gateways MUST NOT have ListenerSet listeners in their status.listeners conditions list.
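The three values of the AttachedListenerSets condition map onto a small decision function. The Go sketch below assumes the caller already knows whether Spec.AllowedListeners is set and how many valid listeners arrived from ListenerSets; `attachedListenerSetsStatus` and the local `ConditionStatus` type are hypothetical names used only for illustration.

```go
package main

import "fmt"

// ConditionStatus mirrors the string values used by Kubernetes conditions.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

// attachedListenerSetsStatus (hypothetical helper) derives the status of the
// AttachedListenerSets condition on a Gateway.
func attachedListenerSetsStatus(allowedListenersSet bool, attachedListeners int) ConditionStatus {
	switch {
	case !allowedListenersSet:
		return ConditionUnknown // no Spec.AllowedListeners config is present
	case attachedListeners > 0:
		return ConditionTrue // at least one listener arrived from a ListenerSet
	default:
		return ConditionFalse // allowed, but nothing valid is attached
	}
}

func main() {
	fmt.Println(attachedListenerSetsStatus(false, 0))
	fmt.Println(attachedListenerSetsStatus(true, 0))
	fmt.Println(attachedListenerSetsStatus(true, 2))
}
```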
ListenerSet Conditions¶
ListenerSets have top-level Accepted and Programmed conditions.
The Accepted condition MUST be set on every ListenerSet, and indicates that the ListenerSet is semantically valid and accepted by its parentRef.
Valid reasons for Accepted being False are:
- NotAllowed - the parentRef doesn't allow attachment
- ParentNotAccepted - the parentRef isn't accepted (e.g. invalid address)
- UnsupportedValue - a listener in the set is using an unsupported feature/value
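As a rough Go sketch, the reasons listed above could be selected as follows. The order in which the checks run and the `Accepted` reason used for the success case are assumptions of this sketch, and `acceptedCondition` is a hypothetical helper rather than part of the proposal.

```go
package main

import "fmt"

// acceptedCondition (hypothetical helper) returns the status and reason for
// the ListenerSet's top-level Accepted condition. The inputs are assumed to
// have been computed elsewhere by the controller.
func acceptedCondition(parentAllows, parentAccepted, valuesSupported bool) (bool, string) {
	switch {
	case !parentAllows:
		return false, "NotAllowed" // the parentRef doesn't allow attachment
	case !parentAccepted:
		return false, "ParentNotAccepted" // the parentRef isn't accepted
	case !valuesSupported:
		return false, "UnsupportedValue" // a listener uses an unsupported feature/value
	default:
		return true, "Accepted" // assumed reason for the success case
	}
}

func main() {
	fmt.Println(acceptedCondition(false, true, true))
	fmt.Println(acceptedCondition(true, false, true))
	fmt.Println(acceptedCondition(true, true, false))
	fmt.Println(acceptedCondition(true, true, true))
}
```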
The Programmed condition MUST be set on every ListenerSet and have a similar meaning to the Gateway Programmed condition but only reflect the listeners in this ListenerSet.
When surfacing details about listeners, the Accepted and Programmed conditions MUST only summarize the status.parents.listeners conditions that are exclusive to the ListenerSet.
An exception to this is when the parent Gateway's Accepted or Programmed conditions transition to False.
ListenerSets MUST NOT have their parent Gateway's listeners in the associated status.parents.listeners conditions list.
ListenerConditions within a ListenerSet¶
An implementation MAY reject listeners by setting the ListenerEntryStatus Accepted condition to False with the Reason TooManyListeners.
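One way an implementation might apply such a limit is sketched below in Go. It assumes the listeners have already been ordered by precedence and that the lowest-precedence listeners are the ones rejected; the GEP does not mandate which listeners to reject. `applyListenerLimit`, the `ListenerStatus` type, and the limit value are all hypothetical.

```go
package main

import "fmt"

// ListenerStatus is a minimal stand-in for ListenerEntryStatus.
type ListenerStatus struct {
	Name     string
	Accepted bool
	Reason   string
}

// applyListenerLimit (hypothetical helper) accepts the first `limit`
// listeners of an already precedence-ordered list and rejects the rest with
// the TooManyListeners reason.
func applyListenerLimit(orderedNames []string, limit int) []ListenerStatus {
	statuses := make([]ListenerStatus, 0, len(orderedNames))
	for i, name := range orderedNames {
		if i < limit {
			statuses = append(statuses, ListenerStatus{Name: name, Accepted: true, Reason: "Accepted"})
			continue
		}
		statuses = append(statuses, ListenerStatus{Name: name, Accepted: false, Reason: "TooManyListeners"})
	}
	return statuses
}

func main() {
	// The limit of 2 is arbitrary and chosen only to keep the example short.
	for _, s := range applyListenerLimit([]string{"foo", "first", "second"}, 2) {
		fmt.Printf("%s accepted=%t reason=%s\n", s.Name, s.Accepted, s.Reason)
	}
}
```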
If a listener has a conflict, this should be reported in the ListenerEntryStatus of the conflicted ListenerSet by setting the Conflicted condition to True.
Implementations SHOULD be cautious about what information from the parent or siblings is reported to avoid accidentally leaking sensitive information that the child would not otherwise have access to. This can include contents of secrets etc.
Policy Attachment¶
Policy attachment is under discussion in https://github.com/kubernetes-sigs/gateway-api/discussions/2927
Similar to Routes, a ListenerSet can inherit policy from a Gateway.
Policies that attach to a ListenerSet apply to all listeners defined in that resource, but do not impact listeners in the parent Gateway. This allows ListenerSets attached to the same Gateway to have different policies.
If the implementation cannot apply the policy to only specific listeners, it should reject the policy.
Alternatives¶
Re-using Gateway Resource¶
The first iteration of this GEP proposed re-using the Gateway resource and introducing an attachTo property in the infrastructure stanza.
The main downside of this approach is that users still require Gateway write access to create listeners. Secondly, it introduces complexity to future Gateway features as GEP authors would now have to account for merging semantics.
New 'GatewayGroup' Resource¶
This was proposed in the Gateway Hierarchy Brainstorming document (see references below). The idea is to introduce a central resource that will coalesce Gateways together and offer forms of delegation.
The issues with this approach are the complexity of status propagation, cluster vs. namespace scoping, etc. It also lacks a migration path for existing Gateways to help shard listeners.
Use of Multiple Disjointed Gateways¶
An alternative would be to encourage users to avoid overly large Gateways to minimize the blast radius of any issues. Use of disjoint Gateways could accomplish this but it has the disadvantage of consuming more resources and introducing complexity when it comes to operations work (e.g. setting up DNS records).
Increase the Listener Limit¶
Increasing the limit may help in situations where you are creating many listeners, such as adding certificates created using an ACME HTTP01 challenge. Unfortunately, this still makes the Gateway a single point of contention, and there will always be an upper bound because of etcd limitations. For workloads like Knative we can have O(1000) Services on the cluster with unique subdomains.
Expand Route Functionality¶
For workloads with many certificates one option would be to introduce a tls stanza somewhere in the Route types. These Routes would then attach to a single Gateway. Then application operators can provide their own certificates. This would probably require some ability to have a handshake agreement with the Gateway.
Somewhat related, there was a Route Delegation GEP (https://github.com/kubernetes-sigs/gateway-api/issues/1058) that was abandoned.
References¶
First Revision of the GEP
- https://github.com/kubernetes-sigs/gateway-api/pull/1863
Mentioned in Prior GEPs:
- https://github.com/kubernetes-sigs/gateway-api/pull/1757
Prior Discussions:
- https://github.com/kubernetes-sigs/gateway-api/discussions/1248
- https://github.com/kubernetes-sigs/gateway-api/discussions/1246
Gateway Hierarchy Brainstorming:
- https://docs.google.com/document/d/1qj7Xog2t2fWRuzOeTsWkabUaVeOF7_2t_7appe8EXwA/edit