Bits and Chaos


Between bits and chaos, a sysadmin stands.

Google’s namebench and your name server

Google has recently announced its own public DNS service, responding at the IP addresses 8.8.8.8 and 8.8.4.4 (how nice). It also released Namebench, a Python tool to compare DNS performance.

Namebench basically detects your current DNS setup, picks some DNS servers you could use according to your ISP and geographic region, and benchmarks them along with, of course, Google Public DNS.

Each DNS server is tested on resolving the 10,000 most popular site names, according to the Alexa web survey. The tests against the different servers run in parallel, so network latency spikes are spread more evenly.
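To give a rough idea of what each probe does, here is a minimal sketch that times a single A-record lookup against a chosen server, using a hand-built DNS query over UDP (stdlib only; this is my own illustration, not namebench's actual code):

```python
import random
import socket
import struct
import time

def build_dns_query(name, qtype=1):
    """Build a minimal DNS query packet: random ID, recursion desired, one A question."""
    header = struct.pack(">HHHHHH",
                         random.randint(0, 0xFFFF),  # transaction ID
                         0x0100,                     # flags: recursion desired
                         1, 0, 0, 0)                 # 1 question, no other records
    qname = b"".join(bytes([len(label)]) + label.encode()
                     for label in name.split("."))
    return header + qname + b"\x00" + struct.pack(">HH", qtype, 1)  # QTYPE, QCLASS=IN

def time_resolution(server, name, timeout=2.0):
    """Return the resolution time in ms against a specific server, or None on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    start = time.monotonic()
    try:
        sock.sendto(build_dns_query(name), (server, 53))
        sock.recv(512)
    except socket.timeout:
        return None
    finally:
        sock.close()
    return (time.monotonic() - start) * 1000.0
```

For example, `time_resolution("8.8.8.8", "example.com")` measures one lookup against Google Public DNS; run the same name twice and the second call should show the cache effect discussed below.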

I gave it a try, to measure how fast Google's DNS server is, how well my ISP's performs, and how good the local DNS I'm using is.

Namebench produces a lot of data; for the sake of clarity I show here only the response-time graphs, trimmed to the first 200ms: any resolution taking more than 200ms falls outside the graph.

In all the graphs, you can see that almost every DNS server does a lot of caching: the cache reduces the response time to almost zero, and after a cache miss the response time grows almost linearly, as the DNS server must perform a recursive query to answer the client.
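To make the shape of those curves concrete, here is a small sketch of how such a cumulative response-time graph can be computed from a list of measured resolution times (the function name and cutoff are my own, not namebench's):

```python
def response_curve(times_ms, cutoff_ms=200):
    """Cumulative distribution of response times, trimmed at cutoff_ms.

    Returns (time_ms, percent_answered) points: a flat start near zero
    indicates cache hits, a linear rise indicates recursive queries.
    """
    within = sorted(t for t in times_ms if t <= cutoff_ms)
    total = len(times_ms)
    return [(t, 100.0 * (i + 1) / total) for i, t in enumerate(within)]
```

For instance, `response_curve([1, 2, 150, 300])` yields `[(1, 25.0), (2, 50.0), (150, 75.0)]`: the 300ms outlier is trimmed away, exactly as in the graphs shown here.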

I made three runs of Namebench, to see how much the cache matters for my local DNS server, which is the standard BIND shipped with Fedora 11, configured as a caching nameserver, without chrooting.

On the first graph, you can see that my local DNS resolves about 10% of the requests extremely fast: these requests are answered from the local cache or require little interaction with external (root) nameservers. All other requests require some network interaction, and the response time grows linearly. Keep in mind that all the graphs show only responses taking up to 200ms, so they omit the unlucky interactions where my local DNS takes 1800ms to give an answer: the local DNS has the worst performance in these (rare) cases.

The second graph is for a run made immediately after the first, to see the effect of the local DNS cache filling up: about 25% of the requests are now satisfied by the cache. In this run, Namebench replaced UltraDNS with the DNS of the University of Basilicata, Italy.

On the third graph, the cache for the local DNS behaves the same as in the second run, so a cache saturation effect has been reached. The local DNS is not suffering from memory pressure, so there is no point in increasing the local cache size via the max-cache-size directive.
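For reference, a caching-only setup like mine boils down to a few lines in named.conf; the max-cache-size value below is an arbitrary example, since, as just noted, there is no need to raise it here:

```
options {
        recursion yes;               // answer recursive queries for clients
        listen-on { 127.0.0.1; };    // serve the local machine only
        max-cache-size 32M;          // example value, not a recommendation
};
```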

There is something more in the graphs. Each response curve starts with a constant (near-zero) time for some fraction of the queries, which means the caches are doing massive work; then the response times grow linearly as the cached data expires and the queried nameserver must contact the authoritative nameservers through a recursive query. Also:

  1. Google Public DNS has a cache hit for almost 50% of the requests; on a cache miss the response time is dominated by network time (from the DNS server's point of view, i.e. the time it takes to perform the recursive query), but this time is almost constant;
  2. the OpenDNS response curves are initially linear, which could mean that the network path to the OpenDNS servers is not as optimized as Google's, but beyond that point the cache does its job;
  3. my ISP's DNS (labeled Wind2-IT) usually performs well, probably more because the network path is in its favor; it's entirely possible that its cache is not that big;
  4. the local DNS suffers when, to fulfill a request, it has to make recursive queries, as these are usually carried over UDP and the local router is not highly optimized for UDP NATting (educated guess).

It is important to stress that the tests are made over the list of the 10,000 most popular websites: it's probably the only way to benchmark general use, but if you visit just a bunch of sites (as is usually the case) you must consider how well these results apply to your environment. Also, all these websites are treated as equal, while clearly popularity plays a role whenever you deal with a cache.

These benchmarks have shown that my current setup (a local DNS) is the best, but when a cache miss occurs, and there are a lot of recursive queries to be made, the local router (and its UDP NATting function) is the bottleneck. Nothing to worry about, but an interesting insight to gain.

Generally speaking, it's fair to say that Google Public DNS is quite a good infrastructure, a fierce competitor both to an ISP-provided DNS (which has the big advantage of network latency) and to OpenDNS (which has been in place for several years now).

Filed under: network

HTTP can no longer be used for authenticated web sites

If you are a user of a web site that requires authentication (which means, basically, every site), you usually access it from a network you have no control over, i.e. you don't know, among many other things, which DNS server the infrastructure guy has chosen and which version it's running. This means you can be exposed to the well-known Dan Kaminsky DNS hijacking attack (you can actually check for this).

Leveraging this vulnerability (there are still plenty of DNS servers that haven't been fixed), it's possible to mount a man-in-the-middle attack at the application level, stealing the cookies of your authenticated HTTP session: ladies and gentlemen, please welcome CookieMonster. You are exposed even if your login page is protected via HTTPS, since the auth cookie will be passed in cleartext in every subsequent HTTP interaction.
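The standard server-side mitigation, once every authenticated page is served over HTTPS, is to mark the session cookie as Secure, so the browser never sends it over plain HTTP at all. A minimal sketch in Python (the cookie name and value are made up for illustration):

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_id"] = "deadbeef"       # hypothetical session token
cookie["session_id"]["secure"] = True   # sent over HTTPS only: defeats CookieMonster
cookie["session_id"]["httponly"] = True # bonus: not readable from JavaScript

header = cookie.output()  # the Set-Cookie header line to emit
print(header)
```

A cookie without the Secure flag is exactly what CookieMonster harvests: the attacker only needs to trick the browser into making any plain-HTTP request to the same domain.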

This worst-case scenario requires a flawed DNS implementation (better: a DNS implementation following the original, flawed DNS protocol), so you can be reasonably safe if you always control your DNS, or at least have some trust in the people operating it; but if you are a roaming user you are completely exposed.

So, being a competent Linux user, you could fix this in a very simple way: install a local caching DNS nameserver and use, as your primary DNS, something you can trust.

If you cannot do this, you must ask your web application provider to fix the issue (some have already done so; for example, you can force all WordPress administration pages to be accessed only via HTTPS, and since I'm writing this blog entry via HTTPS, it works).
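For WordPress specifically, forcing HTTPS on the administration pages is, as far as I know, a one-line change in wp-config.php (assuming your server already has a working SSL setup):

```
/* wp-config.php: serve the WordPress dashboard over HTTPS only */
define('FORCE_SSL_ADMIN', true);
```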

If you are a system administrator, you must check and, if needed, fix your DNS implementation, and you should probably take a look at an SSL accelerator, because your connection peers (i.e. users accessing web sites under your control) could come from any possible insecure network; my 2 cents is that this man-in-the-middle attack will be only the first of a new kind, based on interactions between different levels of the TCP/IP stack.

Filed under: network, security