Like many people, I use cert-manager
to automatically renew my website’s TLS certificates with
Let’s Encrypt. Unlike many people, I don’t use an
Ingress controller to get traffic into my cluster; I just have a few instances
of Envoy that terminate TLS and route traffic to the
appropriate backend. Cert-manager handles the mechanics of certificate renewal
very efficiently; it runs a controller loop that checks all my Certificate
objects for expiration, and when a certificate is close to expiring, it goes out
and renews it. It then updates a Kubernetes Secret with the new key material,
and Kubernetes then makes that new data available to pods that have mounted the
Secret as a volume. From there, it’s up to the application to notice that some
symlinks have moved around and reload the certificate. Up until very recently,
Envoy did not bother to check. So at some point, you had to do a rolling restart
of the Envoy deployment to pick up the new certificate. Because there are 30 days
between when cert-manager renews the certificate and when the old certificate
actually expires, this was rarely a problem in practice. If any of your machines
went down, or you edited the config file to add a new route, or you upgraded
Envoy itself, the pod containing Envoy would restart, pick up the new
certificates, and you’d never notice that it wasn’t automatically reloading the
certificate.
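
To make the "symlinks have moved around" part concrete, here is a minimal Python sketch of how an application might notice the change. It assumes the usual layout of a Secret volume, where the kubelet keeps the real files in a hidden, versioned directory and points a `..data` symlink at the current one, swapping that symlink when the Secret is updated. The `SecretWatcher` class and its method names are hypothetical, made up for this illustration; this is not how Envoy does it.

```python
import os


def current_generation(mount_dir: str) -> str:
    """Return the target of the ..data symlink in a mounted Secret volume.

    Assumption: the volume follows the usual kubelet layout, where ..data
    points at a hidden versioned directory holding the real files.
    """
    return os.readlink(os.path.join(mount_dir, "..data"))


class SecretWatcher:
    """Hypothetical helper that reports when the Secret volume was updated."""

    def __init__(self, mount_dir: str) -> None:
        self.mount_dir = mount_dir
        # Remember which versioned directory was current at startup.
        self.seen = current_generation(mount_dir)

    def changed(self) -> bool:
        """Return True once each time ..data points somewhere new."""
        now = current_generation(self.mount_dir)
        if now != self.seen:
            self.seen = now
            return True
        return False
```

An application would call `changed()` on a timer (or in response to a filesystem notification) and re-read `tls.crt` and `tls.key` whenever it returns `True`. Comparing the symlink target rather than file modification times is a design choice here: the files appear to change all at once when the symlink is swapped, so the target is a cheap single value to check.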