Getting Envoy to pick up rotated certificates
Like many people, I use cert-manager
to automatically renew my website’s TLS certificates with
Let’s Encrypt. Unlike many people, I don’t use an
Ingress controller to get traffic into my cluster; I just have a few instances
of Envoy that terminate TLS and route traffic to the
appropriate backend. Cert-manager handles the mechanics of certificate renewal
very efficiently; it runs a controller loop that checks all my Certificate
objects for expiration, and when a certificate is close to expiring, it goes out
and renews it. It then updates a Kubernetes Secret with the new key material,
and Kubernetes then makes that new data available to pods that have mounted the
Secret as a volume. From there, it’s up to the application to notice that some
symlinks have moved around and reload the certificate. Up until very recently,
Envoy did not bother to check. So at some point, you had to do a rolling restart
of the Envoy deployment to pick up the new certificate. Because there are 30 days
between when cert-manager renews the certificate and when the old certificate
actually expires, this was rarely a problem in practice. If any of your machines
went down, or you edited the config file to add a new route, or you upgraded
Envoy itself, the pod containing Envoy would restart, pick up the new
certificates, and you’d never notice that it wasn’t automatically reloading the
certificate.
As someone who doesn’t like to leave important production operations to chance, though, I knew I needed a better system. Fortunately, Envoy added a way to reload certificates with the 1.14 release. Let’s try that.
The mechanics⌗
Everyone’s Envoy configuration is different, so I’m just going to provide a very
minimal envoy.yaml
that we’ll modify to make certs automatically reload. You
can then apply this to your own configuration. (You can experiment with this on
your workstation by building Envoy, or extracting the binary from a Docker image
with docker cp. That's what I do for all my local Envoy work. Although Envoy
does not distribute binaries, the binary from the Docker image works great on my
Ubuntu 19.10 workstation.)
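For reference, pulling the binary out of the official image looks something like
this; the tag is just an example, and I'm assuming the binary lives at
/usr/local/bin/envoy as it does in the official images:
# Create a container without running it, copy the binary out, then clean up.
docker create --name envoy-extract envoyproxy/envoy:v1.14.1
docker cp envoy-extract:/usr/local/bin/envoy ./envoy
docker rm envoy-extract
./envoy --version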
Here is a basic envoy.yaml
that serves HTTPS on port 10000 with a static
response:
static_resources:
  listeners:
  - name: test
    address:
      socket_address:
        protocol: TCP
        address: 127.0.0.1
        port_value: 10000
    listener_filters:
    - name: "envoy.listener.tls_inspector"
      typed_config: {}
    filter_chains:
    - tls_context:
        common_tls_context:
          alpn_protocols: ["h2", "http/1.1"]
          tls_certificates:
          - certificate_chain:
              filename: "/certs/tls.crt"
            private_key:
              filename: "/certs/tls.key"
      filters:
      - name: envoy.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
          stat_prefix: test
          route_config:
            virtual_hosts:
            - name: test
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                direct_response:
                  status: 200
                  body:
                    inline_string: "Hello from Envoy"
          http_filters:
          - name: envoy.router
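If you're following along on a workstation and don't have a real certificate
handy, a throwaway self-signed pair is enough for this experiment (the subject
is arbitrary):
# Generate a self-signed key and certificate for local testing only.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -subj "/CN=localhost" \
  -keyout tls.key -out tls.crt
(With a self-signed certificate, add -k to the curl command below so it skips
verification.)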
Put your TLS key and cert in /certs, and curl https://localhost:10000/ will
return “Hello from Envoy”. It works. You can do whatever you want to /certs
and Envoy will keep using the TLS configuration that it loaded at startup.
To fix that, we need to make the TLS context for our listener use SDS instead of a static configuration.
The first step is to create another config file that contains the secret
discovery information. I put my main envoy.yaml in a ConfigMap that gets
mounted into /etc/envoy, and just added a sds.yaml to that ConfigMap to store
the SDS configuration. It is simply a plaintext representation of what an SDS
API server would serve to Envoy if it were getting its configuration from an
xDS server instead of the filesystem. It looks like:
resources:
- "@type": "type.googleapis.com/envoy.api.v2.auth.Secret"
  tls_certificate:
    certificate_chain:
      filename: "/certs/tls.crt"
    private_key:
      filename: "/certs/tls.key"
While this looks almost exactly like what we put in the main envoy.yaml before,
this is what triggers the code to start watching the relevant directories for
changes with inotify, leading to the eventual refreshing of your certificate.
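For completeness, here's roughly how those files end up in the pod in a setup
like mine. This is only a sketch; the ConfigMap and Secret names are
placeholders:
# Sketch of the relevant pod spec pieces; names are placeholders.
# Don't mount the Secret with subPath: subPath mounts never receive updates,
# so the symlink swap this whole post relies on would not happen.
volumes:
- name: envoy-config
  configMap:
    name: envoy-config   # contains envoy.yaml and sds.yaml
- name: certs
  secret:
    secretName: example-com-tls   # the Secret that cert-manager keeps renewed
containers:
- name: envoy
  image: envoyproxy/envoy:v1.14.1
  volumeMounts:
  - name: envoy-config
    mountPath: /etc/envoy
  - name: certs
    mountPath: /certs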
We also need to make some changes to envoy.yaml itself. Instead of statically
configuring the listener with a certificate, we need the listener to load the
certificate from SDS. In the listener's filter_chains section, we'll change the
tls_context to a more general transport_socket, and then point it at our
sds.yaml. (It is not necessary to convert tls_context to transport_socket, but
tls_context will be gone by the end of the year, so you might as well change it
now.)
So now instead of:
listeners:
- name: test
  ...
  filter_chains:
  - tls_context: {...}
    filters: [...]
We’ll have:
listeners:
- name: test
  ...
  filter_chains:
  - transport_socket:
      name: "envoy.transport_sockets.tls"
      typed_config:
        "@type": "type.googleapis.com/envoy.api.v2.auth.DownstreamTlsContext"
        common_tls_context:
          alpn_protocols: ["h2", "http/1.1"]
          tls_certificate_sds_secret_configs:
          - sds_config:
              path: /etc/envoy/sds.yaml
    filters: [...]
Using SDS also activates other parts of Envoy's code that want Envoy to have
some identifying information associated with the node. You can supply that on
the command line, or in the bootstrap config with a node configuration at the
top level:
node:
  id: test
  cluster: test
If you omit this, you’ll get an error like:
TlsCertificateSdsApi: node 'id' and 'cluster' are required. Set it either in 'node' config or via --service-node and --service-cluster options.
(In production, I use the pod's hostname, like envoy-b958c94b7-2fbws, for the
ID, and ingress:public:https as the cluster name. That is what my
cluster discovery service calls my cluster. It doesn't matter for this, but it
does matter for other things. You probably already have this set up.)
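If you want the pod's hostname as the ID without templating the bootstrap
config, the downward API can pass it as flags; here's a sketch of the container
spec, where the image tag and cluster name are just examples:
# Pass node identity via flags instead of the bootstrap node block.
containers:
- name: envoy
  image: envoyproxy/envoy:v1.14.1
  env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  args:
  - --config-path
  - /etc/envoy/envoy.yaml
  - --service-node
  - $(POD_NAME)
  - --service-cluster
  - ingress:public:https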
The result is a final envoy.yaml
that looks like:
node:
id: test
cluster: test
static_resources:
listeners:
- name: test
address:
socket_address:
protocol: TCP
address: 127.0.0.1
port_value: 10000
listener_filters:
- name: envoy.listener.tls_inspector
typed_config: {}
filter_chains:
- transport_socket:
name: envoy.transport_sockets.tls
typed_config:
"@type": type.googleapis.com/envoy.api.v2.auth.DownstreamTlsContext
common_tls_context:
alpn_protocols: ["h2", "http/1.1"]
tls_certificate_sds_secret_configs:
sds_config:
path: /etc/envoy/sds.yaml
filters:
- name: envoy.http_connection_manager
typed_config:
"@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
stat_prefix: test
route_config:
virtual_hosts:
- name: test
domains: ["*"]
routes:
- match: { prefix: "/" }
direct_response:
status: 200
body:
inline_string: "Hello from Envoy"
http_filters:
- name: envoy.router
With that running, your certificates should be used by Envoy as soon as they are rotated!
There is a delay between a Secret being updated and the volume mount changing; the length is controlled by your cluster administrator (it's a parameter to the kubelet). If you are watching the Kubernetes event log or cert-manager's log, you might not see the new certificate as soon as you think it's ready, but it should be available on the order of 5 minutes later.
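If you want to confirm when the mounted copy actually changed, exec into the pod
and look at where the ..data symlink points (the pod name here is just an
example):
# The timestamped directory that ..data points at is created when the
# kubelet writes the updated Secret into the volume.
kubectl exec envoy-b958c94b7-2fbws -- ls -la /certs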
Envoy also prints some logs at the debug level:
[2020-04-26 18:41:04.243][23137][debug][file] [source/common/filesystem/inotify/watcher_impl.cc:72] notification: fd: 1 mask: 80 file: ..data
[2020-04-26 18:41:04.243][23137][debug][file] [source/common/filesystem/inotify/watcher_impl.cc:88] matched callback: directory: ..data
[2020-04-26 18:41:04.243][23137][debug][config] [source/extensions/transport_sockets/tls/ssl_socket.cc:678] Secret is updated.
[2020-04-26 18:41:04.245][23137][debug][file] [source/common/filesystem/inotify/watcher_impl.cc:88] matched callback: directory: ..data
Be aware that debug logging is not on by default; you'll have to turn it on if
you want to watch this happen the first time. In general, the way that I check
that it worked is by looking at the /certs admin API endpoint, or at the
server.days_until_first_cert_expiring stat (which you should be feeding into
your monitoring).
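Both of those live on Envoy's admin interface, which the minimal config in this
post doesn't enable; assuming you add an admin block listening on
127.0.0.1:9901 (an arbitrary example port), the checks look like:
# Dump the loaded certificates, including their expiration information.
curl -s http://127.0.0.1:9901/certs
# Or watch the aggregate stat that should be going to your monitoring.
curl -s http://127.0.0.1:9901/stats | grep days_until_first_cert_expiring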
The details⌗
When I read the changelog for Envoy 1.14, I knew I wanted to try this feature, but I also assumed that it wouldn't be a simple cut-n-paste job to get it working. In retrospect, I was wrong; it was actually simple to get working, since it was designed to work with exactly the filesystem structure that Kubernetes uses, and I happen to deploy on Kubernetes. I wrote a little test program to try things out on my workstation before blindly forging ahead in production. That program ended up taking quite a bit of time and did not work initially.
The first version of my code assumed that the atomic updating would work like
the rest of Envoy (through its Runtime configuration) – i.e., put your
certificates in some directory, and symlink another directory (call it data) to
that. Your certs are then in /whatever/data/tls.key and /whatever/data/tls.crt,
and /whatever/data is just a symlink to /somewhere/20190401-certs. When you
want the certificates to rotate, you symlink the new directory to .tmp or
something, and then atomically replace data with .tmp: mv -Tf .tmp data.
However, Envoy does not recognize that sequence for certificates. It requires
you to do the exact dance that Kubernetes does, which involves two levels of
symlinks. If you have a volume mount /certs, then the current version of your
Kubernetes Secret is actually stored in /certs/..timestamp (where timestamp is
actually something like 2020_04_09_17_25_30.145602340). So you'll have
/certs/..timestamp/tls.key, etc., as normal files. A symlink called ..data then
points at this current ..timestamp directory. Finally, /certs/tls.key (and
friends) are symlinks to ..data/tls.key. When a data update arrives, the files
are written to a new ..timestamp directory, and the ..data symlink is
atomically replaced. This is close to what I did in the first version of my
program, but not exactly the same. As a result, Envoy did not notice any
changes my program made. I changed my program to do exactly what Kubernetes
does, and then it started working. Now that program exists so you can test
locally without having to understand the details ;)
Here is the ls output of that sort of directory structure:
# ls -laR jrock.us
jrock.us:
total 4
drwxrwxrwt 3 root root 140 Apr 9 17:25 .
drwxr-xr-x 1 root root 4096 Apr 9 17:25 ..
drwxr-xr-x 2 root root 100 Apr 9 17:25 ..2020_04_09_17_25_30.145602340
lrwxrwxrwx 1 root root 31 Apr 9 17:25 ..data -> ..2020_04_09_17_25_30.145602340
lrwxrwxrwx 1 root root 13 Apr 9 17:25 ca.crt -> ..data/ca.crt
lrwxrwxrwx 1 root root 14 Apr 9 17:25 tls.crt -> ..data/tls.crt
lrwxrwxrwx 1 root root 14 Apr 9 17:25 tls.key -> ..data/tls.key
jrock.us/..2020_04_09_17_25_30.145602340:
total 8
drwxr-xr-x 2 root root 100 Apr 9 17:25 .
drwxrwxrwt 3 root root 140 Apr 9 17:25 ..
-rw-r--r-- 1 root root 0 Apr 9 17:25 ca.crt
-rw-r--r-- 1 root root 3558 Apr 9 17:25 tls.crt
-rw-r--r-- 1 root root 1679 Apr 9 17:25 tls.key
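My test program just reproduces that swap. If you want to try it by hand,
something like this sketch works; the paths are illustrative, and newcert/ is
wherever the renewed files happen to be:
# Mimic the kubelet: write the new files into a fresh ..<timestamp> directory,
# then atomically repoint the ..data symlink at it via rename().
cd jrock.us
new="..$(date +%Y_%m_%d_%H_%M_%S.%N)"
mkdir "$new"
cp ../newcert/tls.crt ../newcert/tls.key ../newcert/ca.crt "$new"/
ln -s "$new" ..data.tmp
mv -Tf ..data.tmp ..data
# (Optionally remove the previous ..<timestamp> directory afterwards.)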
Looking at the tests in the PR where this feature was introduced was helpful,
as was the related ticket. (It didn't make sense to me until I just
kubectl exec'd into a container and ran ls -laR on a mounted secret, though. If
you are implementing something similar, I recommend doing that. I would also
greatly appreciate a link to the code in Kubernetes that manages this. I spent
about 20 minutes looking, but couldn't find it, which annoys me.)
Conclusion⌗
In the end, this was a very simple change. Here’s all I needed to do for my own personal site: jrock.us#3d986…
Anyway, I hope this is helpful to someone. I am glad I no longer have to care about certificates, and hopefully you don’t have to either!