Add DNS SRV records #483

Open
wants to merge 2 commits into
base: master

emersion
Contributor

I've seen a lot of users try to connect to "libera.chat" instead of "irc.libera.chat". This results in connection timeouts.

This is an attempt to improve the status quo.

Previous proposals:

@progval
Contributor

progval commented Dec 10, 2021

As much as I like SRV, I wonder if this is the right direction. These days, the internet seems to be moving to "well-known" HTTP URLs, because web apps can't use the DNS.

Obviously web apps wouldn't connect to irc.libera.chat/6697 anyway; but we could define a single well-known for both normal sockets and websockets so that it's less technical overhead for network admins.

@aaronmdjones
Contributor

You may want to clarify that the example record should be in the IN (Internet) namespace.

    _ircs._tcp IN SRV 0 1 6697 irc.example.org.

@emersion
Contributor Author

emersion commented Dec 10, 2021

These days, the internet seems to be moving to "well-known" HTTP URLs, because web apps can't use the DNS.

I'd prefer SRV records, because I don't want to depend on an HTTP library in my IRC clients.

@emersion
Contributor Author

You may want to clarify that the example record should be in the IN (Internet) namespace.

RFC 2782 doesn't use that syntax, nor does RFC 6186.

@grawity
Contributor

grawity commented Dec 10, 2021

I'm ok with including IN, but ok with omitting it as well, as the other two classes (Hesiod and chaosnet) are so incredibly unlikely. Hell, even Hesiod itself switched to IN later.

However, I'd like the spec to explicitly say which hostname – the original input or the SRV target – should be used for TLS host verification. In other words, does the certificate need to be for libera.chat (like with cnames) or for irc.libera.chat (like with MX records)? That's something that took me a while to figure out when setting up Matrix, and it tends to vary between SRV-using protocols in general.

@aaronmdjones
Contributor

aaronmdjones commented Dec 10, 2021

Verifying the certificate against the SRV target would be dangerous; it allows an active MITM to alter the SRV reply to get you to validate against a domain name they control.

@grawity
Contributor

grawity commented Dec 10, 2021

Verifying the certificate against the SRV target would be dangerous; it allows an active MITM to alter the SRV reply to get you to validate against a domain name they control.

Indeed, which is probably why Matrix chose to handle it like CNAME and use the original domain name.

But many other SRV consumers do use the target domain instead (for example, LDAPS), either for historical reasons or because DNSSEC.

In the end, I don't care which way you decide to do it, I care about whether it'll be documented in the spec.

@slingamn
Contributor

Indeed, which is probably why Matrix chose to handle it like CNAME and use the original domain name.

For many users (in particular, people using ACME with the http-01 challenge), if the root domain and the IRC domain point to different hosts, it doesn't seem practical to get a certificate covering both domains onto the IRC host. So it seems in order to use this, you'd have to transfer one of the certificates between servers (and then expose both certificates and rely on SNI).

IMO the concerns about MITM are sufficiently serious that this should not be used as a means of automatically redirecting users to the correct server --- it should be at most a mechanism to suggest to the user that they reconfigure, e.g. a dialog box saying "there's no IRC server on libera.chat; did you mean irc.libera.chat?"

I'd prefer SRV records, because I don't want to depend on an HTTP library in my IRC clients.

Just clarifying: AFAIK there's no way to look up a SRV record in JavaScript, so we're talking about desktop clients?

@emersion
Contributor Author

emersion commented Dec 11, 2021

For reference, the Matrix spec (https://spec.matrix.org/latest/server-server-api/#resolving-server-names):

[…] a server is found by resolving an SRV record for _matrix._tcp.<hostname>. This may result in a hostname (to be resolved using AAAA or A records) and port. Requests are made to the resolved IP address and port, using 8448 as a default port, with a Host header of <hostname>. The target server must present a valid certificate for <hostname>.

I agree we should require the certificate to be valid for the original hostname.
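A minimal Python sketch of the rule above. The function names are hypothetical and the SRV lookup is assumed to have happened already. The TCP connection goes to the SRV target, while SNI and certificate verification use the original hostname.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Return a client context that verifies the chain and the hostname."""
    # create_default_context() enables CERT_REQUIRED and check_hostname.
    return ssl.create_default_context()


def connect_via_srv(original_host: str, srv_target: str, srv_port: int) -> ssl.SSLSocket:
    """Dial the SRV target, but validate the certificate for original_host."""
    ctx = make_tls_context()
    # The SRV target only decides where the TCP connection goes.
    sock = socket.create_connection((srv_target.rstrip("."), srv_port), timeout=10)
    try:
        # server_hostname drives both SNI and hostname verification, so a
        # forged SRV reply cannot change the name the certificate must cover.
        return ctx.wrap_socket(sock, server_hostname=original_host)
    except Exception:
        sock.close()
        raise
```

Under these assumptions, `connect_via_srv("example.org", "irc.example.org.", 6697)` would only complete the handshake if the server presents a certificate valid for `example.org`.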

For many users (in particular, people using ACME with the http-01 challenge), if the root domain and the IRC domain point to different hosts, it doesn't seem practical to get a certificate covering both domains onto the IRC host. So it seems in order to use this, you'd have to transfer one of the certificates between servers (and then expose both certificates and rely on SNI).

Let's go through the possible setups for network operators here:

  1. A single server for example.org and irc.example.org: just share the cert on the filesystem
  2. One server for example.org, multiple IRC servers behind irc.example.org sharing the load (e.g. Libera Chat): then http-01 can't be used anyways, since the server which will have to complete the challenge is random. I bet these networks are already using dns-01, can someone from Libera confirm?
  3. A single server for example.org and a single server for irc.example.org: these are the problematic setups.

How often does (3) happen in practice? It sounds like sharing the certs would be a minor annoyance in this specific case.

IMO the concerns about MITM are sufficiently serious that this should not be used as a means of automatically redirecting users to the correct server --- it should be at most a mechanism to suggest to the user that they reconfigure, e.g. a dialog box saying "there's no IRC server on libera.chat; did you mean irc.libera.chat?"

This severely degrades the client's UX. I think with the TLS cert requirement the MITM concerns are resolved.

Another way to resolve them would be to add a subdomain requirement: the SRV target MUST be a subdomain of the original hostname. This wouldn't cover rarer cases like ubuntu.com → irc.libera.chat, and it doesn't seem as strong as the TLS cert requirement.
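For illustration only, the subdomain alternative could be checked along these lines in Python. The helper names are hypothetical, and treating "subdomain" as strict (the target may not equal the original hostname) is an assumption, since the thread does not settle that detail.

```python
def normalize(name: str) -> str:
    """Lower-case a DNS name and drop the trailing root dot, if any."""
    return name.rstrip(".").lower()


def is_subdomain(target: str, original: str) -> bool:
    """Return True if target is a strict subdomain of original."""
    t, o = normalize(target), normalize(original)
    # Compare on a label boundary so "evil-libera.chat" does not match
    # "libera.chat".
    return t.endswith("." + o)
```

With this check, `irc.libera.chat.` is accepted for `libera.chat`, while the same target is rejected for `ubuntu.com`, which is the rarer case mentioned above.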

Just clarifying: AFAIK there's no way to look up a SRV record in JavaScript, so we're talking about desktop clients?

Yes, only clients connecting via TCP are taken into account here. JavaScript clients don't connect via TCP as noted above, and are typically configured by the network operator themselves, so wouldn't benefit from a discovery mechanism regardless.

@aaronmdjones
Contributor

One server for example.org, multiple IRC servers behind irc.example.org sharing the load (e.g. Libera Chat): then http-01 can't be used anyways, since the server which will have to complete the challenge is random. I bet these networks are already using dns-01, can someone from Libera confirm?

We are. Adding an extra hostname to our certificates requires no further effort on our part.

@slingamn
Contributor

I think with the TLS cert requirement the MITM concerns are resolved.

ACK, this resolves my concerns.

@grawity
Contributor

grawity commented Dec 12, 2021

Latest change looks good to me.

emersion added a commit to emersion/soju that referenced this pull request Dec 21, 2021
@emersion
Contributor Author

@grawity
Contributor

grawity commented Dec 22, 2021

Hmm out of curiosity, are the networks in question planning to use SRV purely as a redirect (single record pointing to the existing irc round-robin), or would they use SRV directly for load-balancing (multiple records pointing to individual servers)?

That is, should clients expect to be able to just take srv[0] like in soju, or should they at least build a flat list from all records, or should they go all the way and handle priorities/weights?

@emersion
Contributor Author

The Go library already takes care of the shuffling, srv[0] is random from the list: https://cs.opensource.google/go/go/+/refs/tags/go1.17.5:src/net/dnsclient.go;drc=refs%2Ftags%2Fgo1.17.5;l=194
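For clients whose resolver library does not order the records, a rough Python sketch of the priority and weight handling described in RFC 2782 might look like this. The `SRV` tuple and `order_srv` are hypothetical names, and the records are assumed to come from a lookup done elsewhere.

```python
import random
from typing import List, NamedTuple, Optional


class SRV(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def order_srv(records: List[SRV], rng: Optional[random.Random] = None) -> List[SRV]:
    """Order records by ascending priority, weighted-random within a priority."""
    if rng is None:
        rng = random.Random()
    ordered: List[SRV] = []
    for prio in sorted({r.priority for r in records}):
        # Weight-0 records go first in the candidate list (stable sort).
        group = sorted((r for r in records if r.priority == prio),
                       key=lambda r: r.weight != 0)
        while group:
            total = sum(r.weight for r in group)
            pick = rng.randint(0, total)  # inclusive on both ends
            running = 0
            for i, rec in enumerate(group):
                running += rec.weight
                if running >= pick:
                    # First record whose running sum reaches the pick wins.
                    ordered.append(group.pop(i))
                    break
    return ordered
```

A client would then try the returned records in order, moving on to the next one when a connection attempt fails.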


Example:

    _ircs._tcp SRV 0 1 6697 irc.example.org.

@lanodan lanodan Dec 29, 2021

The record should explicitly be for example.org; I think that would work better with the TLS explanation below.

Suggested change:

    - _ircs._tcp SRV 0 1 6697 irc.example.org.
    + _ircs._tcp.example.org. SRV 0 1 6697 irc.example.org.

@kylef
Contributor

kylef commented Jan 1, 2022

IRC networks conforming to this specification MUST publish an SRV record with the "ircs" service label

The expected behaviour for a conforming network is very clear and explicit. However, we do not cover under what circumstances a conforming IRC client should query for an SRV record, or whether it must always check for one. The linked client implementation in soju uses heuristics, such as whether the end user specified a port to use.

Should we strengthen up the desired client behaviours in the spec?

@slingamn
Contributor

slingamn commented Jan 1, 2022

It seems appropriate to me to leave this implementation-defined. I had actually imagined this as a fallback that can be used when the initial connection attempt fails.

What would be gained by mandating a specific client behavior here?
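One possible shape for the fallback behaviour described above, sketched in Python. The `dial` and `lookup_srv` callbacks are hypothetical: `dial(connect_host, port, tls_name)` is assumed to open a TLS connection to `connect_host` while verifying the certificate for `tls_name`, and `lookup_srv(host)` is assumed to return `(target, port)` pairs for `_ircs._tcp.<host>`.

```python
from typing import Any, Callable, Iterable, Tuple

Dial = Callable[[str, int, str], Any]
LookupSRV = Callable[[str], Iterable[Tuple[str, int]]]


def connect_with_srv_fallback(host: str, port: int, dial: Dial,
                              lookup_srv: LookupSRV) -> Any:
    """Try the configured host first; consult SRV only if that fails."""
    try:
        # Fast path: no extra DNS round trip for correctly configured clients.
        return dial(host, port, host)
    except OSError as direct_error:
        last_error = direct_error
        for target, srv_port in lookup_srv(host):
            try:
                # The certificate is still checked against the original host.
                return dial(target, srv_port, host)
            except OSError as err:
                last_error = err
        # Nothing worked: surface the most recent connection error.
        raise last_error
```

This keeps the handshake for a correctly configured client unchanged and only pays for the SRV lookup after a failed attempt. It is one option among several, not something the proposal mandates.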

@kylef
Contributor

kylef commented Jan 2, 2022

What would be gained by mandating a specific client behavior here?

Clients could offer a consistent experience with regard to DNS SRV records. At the moment there is no suggestion, recommendation, or SHOULD clause.

For network or server operators deciding whether to add DNS SRV records, one consideration is how clients will use them and what the resulting user experience will be, which this specification does not determine. If each client does something different with the record, that may hamper the usability of the feature and make it less attractive to implement.

Comment on lines +41 to +43
After discovering an IRC server via SRV records, the client MUST check that the
TLS server certificate is valid for the original hostname (`example.org` in the
example above).

This is not very practical when example.org hosts other services than the IRC network. If the private certificate of IRC server is leaked, TLS communications of other services are compromised as well.

Contributor

I don't understand the concern here. Is the claim that IRC servers have a larger attack surface for compromises than, say, the HTTP servers or load balancers that would normally serve a certificate covering example.org?

Yes, the attack surface of an IRC server is much larger than that of a reverse-proxy HTTP server.

Contributor

I don't agree, but an operator who felt this way could put a TLS-terminating reverse proxy in front of their IRC server. (For example, nginx supports this with the stream directive.)

It could indeed be a good solution.

Contributor

This is not very practical when example.org hosts other services than the IRC network. If the private certificate of IRC server is leaked, TLS communications of other services are compromised as well.

As explained above; if you do not enforce that the certificate is valid for the original hostname (the one the client looked up), then a trivial DNS MITM is possible, to direct the client to a hostname of the attacker's choosing, defeating the entire point of using TLS.

Contributor

This is not very practical when example.org hosts other services than the IRC network.

It seems to be practical enough for XMPP server operators. XMPP also requires the certificate to validate against the domain from JID, not against the SRV target.

Contributor

@kylef kylef Jan 3, 2022

Just throwing an idea out there (I haven't given it much thought): could the TLS hostname validation use the SRV hostname instead (for example, _ircs._tcp.example.org)? That way we would restrict the certificate to use with the IRC service only, and not any other service running in the example.org domain (where compromise of the cert could allow it to be used for other services in the root domain).

For clarity, I am proposing that _ircs._tcp.example.org would be in the TLS SAN. (not the value of the SRV record).

While you can't have underscores in a dNSName, RFC 4985 exists as an alternative. Sadly, the CA/B Forum Baseline Requirements don't allow issuance of SRVName yet, but there is a discussion about allowing them.

@slingamn
Contributor

slingamn commented Jan 2, 2022

@kylef I take your point, but, given that there is no clear candidate for a recommendation, a SHOULD is toothless and a MUST incentivizes implementations that disagree with the MUST to ignore the specification altogether.

Maybe we should start collecting potential client behaviors so that either (a) one of them could be selected as a SHOULD (b) they could all be listed in a non-normative section?

@slingamn
Contributor

slingamn commented Jan 2, 2022

I guess I should state my own priorities: I care a lot about the handshake being fast, so I don't want to add a recommendation (even a SHOULD) that would mandate a SRV lookup (even with caching) in the case where the client is already "correctly" configured.

@edk0
Contributor

edk0 commented Jan 30, 2022

FWIW we've added a basic SRV record for libera.chat, which is now also in our certificate SANs. I'd love to know if people are finding this useful. We might consider using the load balancing features of SRV instead of just pointing to the round robin if a significant number of clients intend to respect them.

    $ dig +short srv _ircs._tcp.libera.chat
    0 1 6697 irc.libera.chat.
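As a small illustration of how the four fields in that answer map to priority, weight, port and target, here is a Python sketch that parses `dig +short` style SRV output. `parse_dig_srv` is a hypothetical helper, not part of any existing tool.

```python
from typing import List, Tuple


def parse_dig_srv(output: str) -> List[Tuple[int, int, int, str]]:
    """Parse 'priority weight port target' lines into tuples."""
    records = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 4:
            # Skip blank lines and anything that is not an SRV answer.
            continue
        # int() raises ValueError if a numeric field is malformed.
        priority, weight, port = (int(f) for f in fields[:3])
        # Drop the trailing root dot from the target name.
        records.append((priority, weight, port, fields[3].rstrip(".")))
    return records
```

For the answer shown above, this yields a single record with priority 0, weight 1, port 6697 and target `irc.libera.chat`.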

It's also duplicated on irc.libera.chat, which I'm less sure about, but it'd be nice to get away from well-known ports altogether.

As for the cert debate, MTAs have been relying on DNSSEC for this forever. Can we do the same? (But I also really don't mind just using the original hostname for validation if that's what everyone wants)


9 participants