Possible bad propagation check with dns-01 challenge #1777

@oseiberts11

Welcome

  • Yes, I'm using a binary release within 2 latest releases.
  • Yes, I've searched similar issues on GitHub and didn't find any.
  • Yes, I've included all information below (version, config, etc).

What did you expect to see?

A successful certificate generation.

What did you see instead?

We had an error message from Let's Encrypt:
2022/11/30 10:24:59 error: one or more domains had a problem: [cloud.syseleven.de] acme: error: 400 :: urn:ietf:params:acme:error:dns :: DNS problem: SERVFAIL looking up TXT for _acme-challenge.cloud.syseleven.de - the domain's nameservers may be malfunctioning
but the propagation check with 2 name servers apparently passed:

2022/11/30 10:23:40 [INFO] [cloud.syseleven.de] acme: Could not find solver for: tls-alpn-01
2022/11/30 10:23:40 [INFO] [cloud.syseleven.de] acme: Could not find solver for: http-01
2022/11/30 10:23:40 [INFO] [cloud.syseleven.de] acme: use dns-01 solver
2022/11/30 10:23:40 [INFO] [cloud.syseleven.de] acme: Preparing to solve DNS-01
2022/11/30 10:23:50 [INFO] [cloud.syseleven.de] acme: Trying to solve DNS-01
2022/11/30 10:24:00 [INFO] [cloud.syseleven.de] acme: Checking DNS record propagation using [8.8.8.8:53 4.4.4.4:53]
2022/11/30 10:24:10 [INFO] Wait for propagation [timeout: 10m0s, interval: 10s]
2022/11/30 10:24:10 [INFO] [cloud.syseleven.de] acme: Waiting for DNS record propagation.
2022/11/30 10:24:20 [INFO] [cloud.syseleven.de] acme: Waiting for DNS record propagation.
2022/11/30 10:24:30 [INFO] [cloud.syseleven.de] acme: Waiting for DNS record propagation.
2022/11/30 10:24:47 [INFO] [cloud.syseleven.de] acme: Cleaning DNS-01 challenge

On a successful run, a message like this would have appeared before the last line:
[cloud.syseleven.de] The server validated our request.
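
Since our pipeline runs unattended, a small script could flag runs where that message is missing. This is only a sketch, assuming the log format shown above; the function name `validation_status` is made up for illustration and is not part of lego.

```python
import re
from typing import Dict, Iterable

# Matches lego log lines of the form "... [INFO] [<domain>] <message>".
LINE_RE = re.compile(r"\[INFO\] \[(?P<domain>[^\]]+)\] (?P<message>.*)")


def validation_status(log_lines: Iterable[str]) -> Dict[str, bool]:
    """Map each domain seen in the log to True if the server validated it."""
    status: Dict[str, bool] = {}
    for line in log_lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # e.g. "[INFO] Wait for propagation ..." has no domain tag
        domain = match.group("domain")
        status.setdefault(domain, False)
        if "The server validated our request" in match.group("message"):
            status[domain] = True
    return status
```

For the log excerpt above this would report `cloud.syseleven.de` as not validated, because the run goes straight from "Waiting for DNS record propagation." to "Cleaning DNS-01 challenge".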

Note that we used 4.4.4.4, which was a working public name server in the past but apparently no longer is. It must have stopped working fairly recently; we discovered this while debugging the failure. As of now, it does not respond to any query.

However, lego gave no indication that there was a problem. It appears to have accepted the broken server as working and carried on as if everything was fine.
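
To check a resolver independently of lego, one can send it a TXT query directly and look at the response code. The following is a minimal sketch using only the Python standard library; it is not how lego implements its propagation check. The function names are hypothetical, and it reads only the DNS header (response code and answer count), not the TXT payload itself.

```python
import secrets
import socket
import struct
from typing import Tuple

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}
TYPE_TXT = 16
CLASS_IN = 1


def build_txt_query(name: str, query_id: int) -> bytes:
    """Encode a minimal DNS query asking for the TXT records of `name`."""
    # Header: ID, flags (only "recursion desired" set), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", TYPE_TXT, CLASS_IN)


def parse_rcode(response: bytes) -> Tuple[str, int]:
    """Return (response code name, answer count) from a raw DNS response."""
    if len(response) < 12:
        raise ValueError("response shorter than a DNS header")
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    rcode = flags & 0x000F  # the low four bits of the flags field
    return RCODES.get(rcode, f"RCODE{rcode}"), ancount


def query_resolver(name: str, resolver: str, timeout: float = 3.0) -> Tuple[str, int]:
    """Send one TXT query over UDP; raises socket.timeout if nothing answers."""
    query = build_txt_query(name, secrets.randbits(16))
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (resolver, 53))
        response, _ = sock.recvfrom(4096)
    return parse_rcode(response)
```

Calling `query_resolver("_acme-challenge.cloud.syseleven.de", "4.4.4.4")` against a resolver that does not respond should end in a timeout exception rather than a result, which is the kind of signal I would have expected to see in lego's output.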

Even when we replaced 4.4.4.4 with another server, the next attempt failed in the same way.

This makes me think that the propagation check doesn't really work. How else could a random name server serve the correct TXT record (I surely hope that this is part of the check, right?) while the query fails when Let's Encrypt makes it? I noticed that you also get the SERVFAIL error if the TXT record is simply missing. It seems extremely unlikely that the name servers worked long enough for a query via 8.8.8.8 to succeed, and then suddenly broke when Let's Encrypt queried them.
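
For reference, this is what I mean by "the correct TXT record". Per RFC 8555 (section 8.4), the dns-01 record lives at `_acme-challenge.<domain>` and its value is the unpadded base64url encoding of the SHA-256 digest of the key authorization. A sketch of that computation follows; the helper names are my own, and I have not checked how lego compares the values internally.

```python
import base64
import hashlib


def dns01_record_name(domain: str) -> str:
    """Name of the TXT record that the CA looks up for `domain`."""
    # A wildcard such as "*.example.org" is validated at the base domain.
    if domain.startswith("*."):
        domain = domain[2:]
    return "_acme-challenge." + domain.rstrip(".")


def dns01_txt_value(key_authorization: str) -> str:
    """Expected TXT value: base64url(SHA-256(key authorization)), no padding."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

A meaningful propagation check would have to compare the TXT value returned by each resolver against this expected value, not merely confirm that some answer came back.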

How do you use lego?

Docker image

Reproduction steps

We use a GitLab CI pipeline to run this command periodically:
lego --accept-tos --dns designate --path /tmp/lego --dns.resolvers 8.8.8.8 --dns.resolvers 4.4.4.4 --server=https://acme-v02.api.letsencrypt.org/directory --email [email protected] --key-type rsa4096 -d "*.cloud.syseleven.net" -d "*.infra.sys11cloud.net" -d "*.infrabk.sys11cloud.net" -d "*.infrabl.sys11cloud.net" -d "*.infrafe.sys11cloud.net" -d "cloud.syseleven.de" renew --preferred-chain "ISRG Root X1"

Version of lego

Our docker image is based on

`FROM goacme/lego:v4.9.1`

Logs

See above

Go environment (if applicable)

No response
