It's Always DNS: The February 2018 Issue

Posted 1519689600 seconds after the Unix epoch

There’s this “It’s Always DNS” meme that often goes round the Interwebs, somewhat bolstered by the Twitter outages that occurred (DNS, of course) in late 2016. I haven’t had that many DNS problems in my line of work, but the other day I saw something that made me jump right on the “It’s always DNS” bandwagon.

Cat trying (not very hard) to reach a nameserver

So I was setting up a test instance of an internal application at work, as one often does. In recent times, Docker and its associated tools have made this a breeze. (However, they can sometimes suddenly move beneath your feet in ways you don’t expect. More on this in a later post.) The application was set to authenticate over LDAP.

I found that the initial login and page load were taking something of the order of 5 to 7 seconds! This Netscapesque login time was absolutely not acceptable, so I went spelunking. After setting up logging to measure how long it took to render the page (helped along by this excellent post), it was clear that the problem was not with the page, but with how long the user was taking to authenticate to the LDAP server.

Bingo! So I tried authenticating directly with the LDAP server using ldapsearch, and saw a minute delay. Here, a coworker dropped in and reminded me that DNS was likely to be an issue, so I dug a bit deeper, and played around with DNS servers.
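
One way to separate the name lookup from the LDAP bind itself is to time the lookup on its own. The snippet below is only a rough sketch of that idea, not what I actually ran: the function name time_resolution is made up for illustration, and port 389 is simply the standard LDAP port.

```python
import socket
import time


def time_resolution(hostname, port=389, resolve=socket.getaddrinfo):
    """Return (elapsed_seconds, addresses) for a single name lookup.

    addresses is None when the lookup fails. The resolve parameter is
    injectable so that the timing logic can be exercised without a network.
    """
    start = time.perf_counter()
    try:
        addresses = resolve(hostname, port)
    except OSError:  # socket.gaierror is a subclass of OSError
        addresses = None
    elapsed = time.perf_counter() - start
    return elapsed, addresses
```

Calling time_resolution("ldap.example.internal") (a placeholder hostname) and seeing several seconds go by before anything comes back would point at the resolver rather than at the LDAP server.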

It turned out that my hastily set-up VM, which was hosting the Docker containers for the application, had its primary DNS server set incorrectly! So, the /etc/resolv.conf looked something like this:

nameserver a.b.c.d
nameserver x.y.z.w

The primary nameserver was set incorrectly, and hence, when resolving the LDAP server, the auth module I was using would try to resolve via the primary nameserver, fail, and then fall back to the secondary before finally authenticating. (man resolv.conf showed the default timeout as 5s, which tallied quite well with the observation.) Simply removing the offending a.b.c.d fixed the problem and made the authentication and subsequent page load much snappier.
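
To make the arithmetic concrete, here is a small sketch that reads a resolv.conf-style file and estimates how long a lookup stalls before the resolver falls back. It is a simplified model, assuming each unreachable nameserver costs one full timeout; real resolvers are more nuanced. The function names are made up for illustration, and the defaults (timeout 5, attempts 2) are the ones documented in man resolv.conf.

```python
DEFAULT_TIMEOUT = 5   # seconds, default per resolv.conf(5)
DEFAULT_ATTEMPTS = 2  # default per resolv.conf(5)


def parse_resolv_conf(text):
    """Extract nameservers, timeout and attempts from resolv.conf text."""
    config = {
        "nameservers": [],
        "timeout": DEFAULT_TIMEOUT,
        "attempts": DEFAULT_ATTEMPTS,
    }
    for raw in text.splitlines():
        fields = raw.split()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not fields or fields[0][0] in "#;":
            continue
        if fields[0] == "nameserver" and len(fields) > 1:
            config["nameservers"].append(fields[1])
        elif fields[0] == "options":
            for opt in fields[1:]:
                name, _, value = opt.partition(":")
                if name in ("timeout", "attempts") and value.isdigit():
                    config[name] = int(value)
    return config


def wait_before_fallback(config, dead_servers=1):
    """Simplified estimate: each unreachable server costs one full timeout."""
    return dead_servers * config["timeout"]
```

Fed the two-line file above, parse_resolv_conf finds both nameservers and the default timeout, so wait_before_fallback gives 5 seconds with one dead primary, which is in the same ballpark as the 5 to 7 seconds I was seeing.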

Lessons learnt:

❧ Please send me your suggestions, comments, etc. at