This page aims to look at how DNS support in Asterisk has evolved over time, how modules use DNS, and potential ways to improve it.
You might be asking yourself why this page exists and what the catalyst for its creation was. During the development and subsequent testing of the chan_pjsip module it was discovered that, while the DNS support within PJSIP is an improvement over that in chan_sip, it does not implement all of the desired functionality. The missing pieces are NAPTR support, full IPv6 support, and unspecified SRV lookups.
NAPTR Support
The DNS resolver in PJSIP does not currently have any support for NAPTR. It is left as a TODO comment in the source.
Full IPv6 Support
When resolving hosts the PJSIP resolver will always look for A records and never AAAA records.
Unspecified SRV Lookups
The type of SRV record to look up is based on the transport parameter of the SIP URI. If no transport parameter is provided, UDP is always queried.
While DNS support within Asterisk has not changed substantially over time it has seen some additions:
DNS resolution within Asterisk started with using the gethostbyname function. This blocks the calling thread while performing an A record lookup on a provided hostname.
SRV support within the core was added using DNS primitives available from the system. Note that just like the above this operates in a blocking fashion. Multiple records can be returned but it is up to the calling function to use them.
DNS manager was added to Asterisk to solve the issue of stale DNS entries remaining in use after they may have expired. It works by periodically looking up hostnames it is aware of and calling a callback in the module using it to inform it of the update. Note that just like the above on an initial lookup it blocks the calling thread.
Support for IPv6 was added to Asterisk by wrapping system provided functions which include support for it, such as getaddrinfo and getnameinfo. Yet again this blocks the calling thread.
Many modules follow a common, simple approach for handling DNS resolution.
For configured hostnames, modules call the Asterisk provided DNS resolution wrappers in a blocking fashion when reading the configuration and store a single result. This result is usually never updated unless a reload is invoked after the configuration changes. As a result, DNS records can change without the module ever knowing.
Modules that provide SRV functionality use only the first result after applying weights and preferences.
Hostnames that need to be resolved at runtime as the system operates also use the DNS resolution wrappers but do not cache the result for any future lookups. If the DNS server the system is configured to use is unavailable this will cause "weirdness" as many things start to block for long periods of time.
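To illustrate the SRV point above, here is a minimal sketch of ordering a whole SRV record set so that a caller could walk every target instead of keeping only the first. The `srv_record` structure and `srv_sort` function are hypothetical names, not Asterisk APIs. The weight handling is a deterministic simplification: RFC 2782 describes a randomized, weight-proportional selection within each priority, which this sketch does not implement.

```c
#include <stdlib.h>

/* Hypothetical, simplified SRV record; not an Asterisk structure. */
struct srv_record {
    unsigned short priority; /* lower value is tried first */
    unsigned short weight;   /* higher value is preferred within a priority */
    unsigned short port;
    const char *target;
};

/* Orders by ascending priority, then by descending weight. */
static int srv_compare(const void *a, const void *b)
{
    const struct srv_record *left = a;
    const struct srv_record *right = b;

    if (left->priority != right->priority) {
        return left->priority < right->priority ? -1 : 1;
    }
    if (left->weight != right->weight) {
        return left->weight > right->weight ? -1 : 1;
    }
    return 0;
}

/* Sorts the records in place so index 0 is the first target to try. */
void srv_sort(struct srv_record *records, size_t count)
{
    qsort(records, count, sizeof(*records), srv_compare);
}
```

A module that keeps the sorted array, rather than only the first entry, still has the remaining targets available when the preferred one fails.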
Improving DNS Support in PJSIP
Since this page was created as a result of chan_pjsip I'm going to touch on improving DNS support within PJSIP itself. The PJLIB library, which PJSIP uses, has a built-in DNS client. In order to extend support, the additional features would need to be added to this DNS client. Although Teluu maintains PJSIP, this would require us to write the features and, in the future, to keep adding new ones as needed. Extending this DNS client would also not advance DNS support within Asterisk itself. Its use and scope would be limited to chan_pjsip.
NAPTR Support
Parsing support for NAPTR records would need to be added. Lookups would need to be changed to perform NAPTR, then SRV, and then A/AAAA.
Full IPv6 Support
Lookups would need to be changed to do an AAAA lookup and then an A lookup.
Unspecified SRV Lookups
SRV lookups would need to be changed to not require an explicit transport to determine which transport to use.
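One way to drop the requirement for an explicit transport is to query the SRV name of every supported transport when the URI carries none. The sketch below only builds the query names; it performs no lookups. The `sip_srv_names` function, the `SRV_NAME_MAX` constant, the transport strings, and the order in which the services are listed are all assumptions made for illustration. The service labels themselves (`_sip._udp`, `_sip._tcp`, `_sips._tcp`) come from RFC 3263.

```c
#include <stdio.h>
#include <string.h>

#define SRV_NAME_MAX 256

/*
 * Hypothetical helper: writes the SRV names to query for a SIP domain.
 * 'transport' is NULL when the URI has no transport parameter, otherwise
 * "udp", "tcp", or "tls". Returns the number of names written.
 */
size_t sip_srv_names(const char *domain, const char *transport,
    char names[][SRV_NAME_MAX], size_t max_names)
{
    static const char *const all[] = { "_sip._udp", "_sip._tcp", "_sips._tcp" };
    const char *single;
    size_t count = 0;
    size_t i;

    if (transport) {
        /* Explicit transport: exactly one SRV name applies. */
        if (!strcmp(transport, "udp")) {
            single = all[0];
        } else if (!strcmp(transport, "tcp")) {
            single = all[1];
        } else if (!strcmp(transport, "tls")) {
            single = all[2];
        } else {
            return 0;
        }
        if (max_names < 1) {
            return 0;
        }
        snprintf(names[0], SRV_NAME_MAX, "%s.%s", single, domain);
        return 1;
    }

    /* No transport given: every supported service is a candidate. */
    for (i = 0; i < sizeof(all) / sizeof(all[0]) && count < max_names; i++) {
        snprintf(names[count++], SRV_NAME_MAX, "%s.%s", all[i], domain);
    }
    return count;
}
```

With no transport, `sip_srv_names("example.com", NULL, names, 3)` fills in three candidate names, and whichever of them returns records determines the transport.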
Improving DNS Support Everywhere
Core DNS support can be improved in one of two ways:
Write our own DNS client
This option allows us to ship great DNS support within Asterisk with no outside dependencies. Any user of Asterisk would immediately benefit from it without having to go to extra lengths. This option places a huge burden on us, however, to become the maintainer of our own DNS client. With new features being added to DNS, this would require us to stay constantly aware of them and keep things up to date. Failure to do so would result in Asterisk returning to a state where the DNS support is sub-par.
Use a third-party DNS client
There exist many third-party DNS clients which are license compatible with Asterisk. This places less of a burden on us for maintenance but does mean we would be relying on others to further our DNS support. From a user experience perspective it also introduces another dependency to be installed if better DNS support is required. Depending on the core implementation this could, however, be optional.
Third Party DNS Clients
While this list is small, from my research many DNS client libraries present a very similar API. Synchronous and asynchronous API functions are made available which take in the record type and the host. Once resolved, the results are made available in a structure containing the DNS records. This makes presenting a core unified pluggable API extremely easy.
The c-ares library is extremely popular and is in use by many projects. A few include libcurl, Node.js, and Wireshark. It supports synchronous and asynchronous lookups of all record types. It also provides convenience functions which parse a few different record types (such as NAPTR) into easier to consume structures. Various options exist to tweak the behavior, such as forcing TCP, changing DNS server addresses, and retry behavior. Documentation is straightforward but does not include complete examples for different scenarios.
The documentation provided on the website is 2-3 years out of date. It also does not present an example or a suggested way of using the library. In order to come up with a working solution you have to experiment using the out of date documentation, using the source code as guidance, using other projects as guidance, and talking to other people who have used it.
Using the library is somewhat easy in the end though. It seems to want you to create a resolver "channel" for each query you want to do. While it is possible to use the same channel for multiple queries, this is problematic, as explained below. Once you have a channel you call functions telling it to resolve things, and it does. Once resolved, a callback is invoked with the result and a void pointer you passed in.
The potential trouble is that if you want asynchronous resolution with a long-lived thread servicing the resolver channel, you need to synchronize with that thread. This is because the c-ares library will not provide you a file descriptor to poll on unless a query is in progress. If TCP is enabled the file descriptor for that is always returned. This is why creating an individual channel for each query, and then spawning a thread (or using a threadpool) to service that channel, seems to be the way to go for asynchronous resolution: it removes the synchronization that a persistent long-lived thread requires. For synchronous resolution you poll and process in the calling thread, or otherwise wait in the calling thread and signal back from the persistent thread actually doing the processing.
It is also not possible to cancel individual queries on a channel. Cancelling will cancel ALL queries on it. The c-ares library does not cache results at all.
The library is pretty much available everywhere.
Releases are made sporadically. Some years have more, some years have less. The mailing list has between 5 and 20 emails on average per month. Some posts go unanswered. Issues and contributions are submitted to the mailing list (while GitHub pull requests can be done, they are not preferred). Contributions are merged swiftly upon acceptance and vouching/testing by others. Going back through mailing list archives it's hard to determine the response time for reported issues. Many just get submitted with a patch.
The libunbound library is part of the unbound DNS resolver and shares much of the same code. It supports synchronous and asynchronous lookups of all record types as well as DNSSEC. Various options exist to tweak the behavior, such as forcing TCP, changing DNS server addresses, and retry behavior. Documentation is straightforward and includes various examples for scenarios. What is not provided is a convenience mechanism to parse record types into an easier to consume structure. The DNS records are returned as-is.
The documentation provided on the website (which is basically a man page) is up to date with the current implementation. Examples are provided for different use cases (synchronous, asynchronous) and they work. They are a suitable starting basis for experimenting with the library and can get you going fast.
The libunbound library uses what it calls a "context" to refer to a resolver. A context can be used for multiple queries and each query is individually addressable. When you asynchronously do a query you get back an identifier which can be used to subsequently cancel that specific query. Each context also has a DNS cache which obeys the TTL (configuration exists to impose min/max if needed). The library also provides a single file descriptor on each context which can be polled on in a long-lived thread. When data is available to be processed it becomes readable. The library provides individual functions for both sync and async resolution. The sync function will block the calling thread until results are available. The async function returns immediately and, when resolution completes, invokes the callback you passed in with the results and a void pointer you supplied. The library also supports DNSSEC. This does require additional configuration but the functionality exists.
The library is available in more recent (well, semi-recent) distros (Ubuntu 10.04+, CentOS 5+).
Releases are made sporadically. Some years have more, some years have less. As the project consists of both a DNS server and a DNS client library, the mailing list is more active than that of c-ares. It sees between 20 and 30 emails on average per month. This also means that any contributions and improvements to the DNS server may also improve the DNS client. There is a publicly accessible issue tracker where contributions can be submitted and issues can be reported. Reported issues are also fixed quickly. Contributions are merged swiftly upon acceptance.
Core DNS API
Since DNS clients all share the same common fundamental interface it is easy to abstract them behind a core pluggable DNS API. This allows a different one to be plugged in if requirements or features change but also allows the presence of a third-party DNS client to be optional and not mandatory. If a third-party DNS client is not available a system level resolution module can be used instead. A numerical priority value will determine which module is used if multiple are present.
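The priority-based selection described above can be sketched as a small registry. This is only an illustration under stated assumptions: the `dns_resolver` structure and the `dns_resolver_register`, `dns_resolver_unregister`, and `dns_resolver_current` functions are hypothetical names, the sketch does no locking, and treating a lower number as higher priority is my assumption rather than a decided rule.

```c
#include <stddef.h>

/* Hypothetical resolver descriptor; names are illustrative only. */
struct dns_resolver {
    const char *name;
    unsigned int priority;     /* assumption: lower value wins */
    struct dns_resolver *next;
};

/* Registered resolvers, kept sorted by ascending priority value. */
static struct dns_resolver *resolvers;

void dns_resolver_register(struct dns_resolver *resolver)
{
    struct dns_resolver **pos = &resolvers;

    /* Walk to the first entry with a larger priority value. */
    while (*pos && (*pos)->priority <= resolver->priority) {
        pos = &(*pos)->next;
    }
    resolver->next = *pos;
    *pos = resolver;
}

void dns_resolver_unregister(struct dns_resolver *resolver)
{
    struct dns_resolver **pos = &resolvers;

    while (*pos && *pos != resolver) {
        pos = &(*pos)->next;
    }
    if (*pos) {
        *pos = resolver->next;
        resolver->next = NULL;
    }
}

/* The resolver that queries would be handed to, or NULL if none. */
const struct dns_resolver *dns_resolver_current(void)
{
    return resolvers;
}
```

With this shape a system level resolver can register with a large priority value and simply be shadowed whenever a third-party resolver module is loaded.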
A resolver is a module which performs the actual DNS resolution. This is the pluggable aspect. The interface for a resolver may look like this:
The interface with a resolver is always asynchronous to lower the requirements of the resolver and to also provide a consistent synchronous experience to users. Synchronous support can be implemented in a generic fashion using the asynchronous functionality.
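To show how synchronous support could be layered generically on top of an asynchronous resolver, here is a minimal sketch using a mutex and condition variable. The `dns_callback` and `dns_resolve_async_fn` types and the `dns_resolve_sync` function are hypothetical, the answer is reduced to a string for brevity, and `fake_resolve_async` is a stand-in that completes immediately so the sketch is self-contained. It is not a real resolver.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical callback and asynchronous entry point types. */
typedef void (*dns_callback)(void *data, int status, const char *answer);
typedef int (*dns_resolve_async_fn)(const char *name, dns_callback cb, void *data);

/* State shared between the waiting caller and the completion callback. */
struct sync_wait {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int done;
    int status;
    const char *answer; /* assumed to stay valid after the callback returns */
};

/* Completion callback: records the outcome and wakes the waiter. */
static void sync_complete(void *data, int status, const char *answer)
{
    struct sync_wait *wait = data;

    pthread_mutex_lock(&wait->lock);
    wait->status = status;
    wait->answer = answer;
    wait->done = 1;
    pthread_cond_signal(&wait->cond);
    pthread_mutex_unlock(&wait->lock);
}

/* Starts an asynchronous query and blocks until it completes. */
int dns_resolve_sync(dns_resolve_async_fn resolve, const char *name,
    const char **answer)
{
    struct sync_wait wait = { .done = 0 };
    int res;

    pthread_mutex_init(&wait.lock, NULL);
    pthread_cond_init(&wait.cond, NULL);

    res = resolve(name, sync_complete, &wait);
    if (res == 0) {
        pthread_mutex_lock(&wait.lock);
        while (!wait.done) {
            pthread_cond_wait(&wait.cond, &wait.lock);
        }
        pthread_mutex_unlock(&wait.lock);
        res = wait.status;
        *answer = wait.answer;
    }

    pthread_cond_destroy(&wait.cond);
    pthread_mutex_destroy(&wait.lock);
    return res;
}

/* Stand-in resolver for illustration: completes from the calling thread. */
int fake_resolve_async(const char *name, dns_callback cb, void *data)
{
    (void) name;
    cb(data, 0, "192.0.2.1");
    return 0;
}
```

Because the wrapper only depends on the callback contract, it works the same whether the resolver completes inline, as the stand-in does, or later from another thread.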
The resolution API is the API used by modules (and the core) to interface with a resolver and perform DNS resolution. The interface for resolution may look like this:
A result is the records and information accumulated after a resolver has completed the DNS resolution process. A result may look like this:
The result would contain the complete DNS record.
One area where I could see some issues is the fact that DNS resolution would return DNS records and not higher level structures. As other DNS clients, unlike c-ares, do not provide helper functions for this, it may be worthwhile to implement some of these as part of the Asterisk DNS API itself. Candidates are parsing helpers for A, AAAA, NAPTR, and SRV records. They would interpret the raw DNS record and construct a suitable structure with the information. This would make it easier for consumers to adopt this new API.
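As an example of what such a helper might do, here is a minimal sketch that parses the RDATA of an SRV record into a structure. The `srv_data` structure and `srv_parse` function are hypothetical names. The wire layout (three big-endian 16-bit fields followed by the target name) is from RFC 2782. The sketch assumes the target is not compressed and rejects compression pointers rather than following them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parsed form of an SRV record; not an Asterisk structure. */
struct srv_data {
    uint16_t priority;
    uint16_t weight;
    uint16_t port;
    char target[256];
};

/*
 * Parses SRV RDATA: priority, weight, and port as big-endian 16-bit
 * values, then the target as length-prefixed labels ending in a zero
 * length label. Returns 0 on success, -1 on malformed input.
 */
int srv_parse(const unsigned char *rdata, size_t len, struct srv_data *out)
{
    size_t pos = 6;
    size_t used = 0;

    if (len < 7) {
        return -1;
    }

    out->priority = (uint16_t) ((rdata[0] << 8) | rdata[1]);
    out->weight = (uint16_t) ((rdata[2] << 8) | rdata[3]);
    out->port = (uint16_t) ((rdata[4] << 8) | rdata[5]);

    for (;;) {
        size_t label_len;

        if (pos >= len) {
            return -1; /* ran out of data before the root label */
        }
        label_len = rdata[pos++];
        if (label_len == 0) {
            break;
        }
        /* Labels are at most 63 bytes; larger values are pointers. */
        if (label_len > 63 || pos + label_len > len) {
            return -1;
        }
        /* Leave room for a separating dot and the terminating NUL. */
        if (used + label_len + 2 > sizeof(out->target)) {
            return -1;
        }
        if (used) {
            out->target[used++] = '.';
        }
        memcpy(out->target + used, rdata + pos, label_len);
        used += label_len;
        pos += label_len;
    }
    out->target[used] = '\0';
    return 0;
}
```

A consumer would then work with `priority`, `weight`, `port`, and `target` directly instead of decoding the raw record itself.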
Low Level Adoption
With a new DNS API there should be users which take advantage of it. There are a few ways to go about this:
As the new DNS API would be completely standalone anything wishing to take advantage of the features it provides would need to be explicitly updated to use it. This would minimize any potential regressions as a result of its introduction.
Behind The Scenes
The existing core wrappers for system resolution could be updated to use the new DNS API. Each consumer of these would not need to be updated to use it but would be leveraging the new DNS API. If modules outgrow this or want additional features they can be updated to use it. This introduces some risk but it can be tested and made configurable.
Change The World
Each module would be updated to use the new DNS API. This would be a major effort and raises the potential for regressions to an unsafe level.
At a fundamental level, the modules using DNS must be changed to take full advantage of the new DNS API. They must use asynchronous resolution and callbacks which are invoked upon resolution completion. They must use all results available and not cache results themselves. Resolution must occur at use time, at which point a resolver may provide a cached result. When reading in configuration they should not resolve the provided hostname.
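The requirement to use all available results can be sketched as a simple failover loop. Everything here is hypothetical: the `dns_target` structure, the `attempt_fn` type, and `dns_try_all` are illustrative names, and `attempt_unless_down` is a stand-in for a real connection attempt that is included only so the sketch is self-contained.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical resolved target; a real result would carry full records. */
struct dns_target {
    const char *address;
    unsigned short port;
};

/* Returns 0 if the target could be used, non-zero otherwise. */
typedef int (*attempt_fn)(const struct dns_target *target, void *data);

/*
 * Tries each target in order and stops at the first that works.
 * Returns the index of that target, or -1 if every attempt failed.
 */
int dns_try_all(const struct dns_target *targets, size_t count,
    attempt_fn attempt, void *data)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (attempt(&targets[i], data) == 0) {
            return (int) i;
        }
    }
    return -1;
}

/* Stand-in attempt: treats the address passed in 'data' as unreachable. */
int attempt_unless_down(const struct dns_target *target, void *data)
{
    const char *down = data;

    return strcmp(target->address, down) == 0 ? -1 : 0;
}
```

In an asynchronous design the loop would be driven from the resolution callback rather than run inline, but the principle of moving on to the next result instead of giving up after the first is the same.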
The Path Forward
I think after looking at the investment that would have to be made to either roll our own DNS client or to advance PJLIB's, it becomes evident that the best solution is to let something else handle DNS and have us simply provide a consistent interface to it. This will allow us to have the best of both worlds without requiring us to become a full time maintainer of a DNS client. With the pluggable mechanism, if we dislike the direction that the DNS client we have chosen is taking, we can swap it out without any impact to users. To gain some of the benefits of the new DNS API I think it would also be worthwhile to aim for the "Behind The Scenes" adoption approach. While this does not make modules smarter, it would allow them to use the DNS cache that many DNS clients provide without requiring any modifications to them.