Bits and Chaos


Between bits and chaos, a sysadmin stands.

Internet Traffic Consolidation

We have learned, and taught, the Internet as a hierarchy of ASNs, from local ISPs to regional providers up to Tier 1 carriers, with traffic gracefully moving between the levels.

This is no longer true, according to this presentation from NANOG47:

  • 150 ASNs account for 50% of all Internet traffic;
  • Revenues from Internet transit are declining, while revenues from Internet advertising compensate;
  • The new rule is 30/30: the top 30 destinations (Google, Yahoo, Facebook, …) account for 30% of all traffic, so if you are a provider you’d better make a deal with them: your customers get a better Internet experience, which is a commercial advantage. As a result, YouTube’s bandwidth bill is a lot smaller than you might imagine.

It’s time to rewrite some course material.

Filed under: network

LinkedIn and MTU settings for Linux systems

For reasons quite above my understanding, some Linux systems (including mine) are unable to access LinkedIn. Symptoms include hanging forever on the login page: you can reach the authentication page, read some profiles and your own, but cannot do anything more.

This can be fixed by issuing, as root:

ifconfig eth0 mtu 1360

(assuming that you reach the Internet via eth0).
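Rather than guessing a value, you can probe the largest MTU that survives the path by sending pings with the don’t-fragment bit set. A minimal sketch, where the target host and the candidate MTU list are just examples:

```shell
# Probe the path MTU with non-fragmentable pings of decreasing size.
# 28 bytes = 20 (IP header) + 8 (ICMP header), so payload = MTU - 28.
host=www.linkedin.com
for mtu in 1500 1492 1400 1360 1300; do
    size=$((mtu - 28))
    if ping -c 1 -M do -s "$size" "$host" >/dev/null 2>&1; then
        echo "MTU $mtu works (ICMP payload $size bytes)"
        break
    fi
done
```

`-M do` is the Linux ping flag that forbids fragmentation; if a ping of a given size gets through, that MTU fits the whole path.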

It’s quite a strange setup, indeed. The only other time I had to do something like this was with a Moodle server we had put on a LAN connected to the Internet via a consumer ADSL line; the server was reachable by every customer of the same ISP but stalled for everyone else, whenever the unlucky visitor requested a page whose size approached or exceeded the TCP/IP maximum payload (I guess this is some kind of NAT magic/MPLS black magic/peering sorcery that happens only for customers outside the AS).

I’m pretty sure that LinkedIn is not using a consumer ADSL line to connect itself to the Internet, and I suspect they are seeing a constant loss of Linux users due to this issue, which is very difficult to spot.

Filed under: network

Google’s namebench and your name server

Google has recently announced its own public DNS server, responding at the IP addresses 8.8.8.8 and 8.8.4.4 (how nice). They have also released Namebench, a Python tool to compare DNS performance.

Namebench basically detects your current DNS setup, picks some DNS servers you could use according to your ISP and geographic region, and tests them all, along with, of course, Google Public DNS.

Each DNS server is tested on resolving the 10,000 most popular site names, according to the Alexa web survey. Each DNS test runs in parallel with the others, so network latency spikes are more evenly distributed.

I gave it a try, to measure how fast Google’s DNS server is, how well my ISP’s performs, and how good the local DNS I’m using is.

Namebench produces a lot of data; for the sake of clarity I show here only the graph of response times, trimmed to the first 200 ms: any resolution taking more than 200 ms is off the graph.

In all the graphs, you can see that almost every DNS server does a lot of caching: the cache reduces response time to almost zero, and after a cache miss the response time increases almost linearly, as the DNS server must perform a recursive query to answer the client.
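The cache effect is easy to observe by hand with dig: the first query for a name is a miss and pays the full recursion cost, while a repeat should be answered from cache in roughly zero milliseconds. A minimal sketch (the server address 127.0.0.1 assumes a local caching nameserver, and the hostname is an example):

```shell
# Ask the local resolver twice for the same name and compare the
# "Query time" that dig reports: the second answer comes from cache.
dig @127.0.0.1 www.example.org | grep 'Query time'
dig @127.0.0.1 www.example.org | grep 'Query time'
```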

I made three runs of Namebench, to see how much the cache matters for my local DNS server, which is the standard BIND shipped with Fedora 11, configured as a caching nameserver, without chrooting.

On the first graph, you can see that my local DNS resolves about 10% of the requests extremely fast: these requests are answered from the local cache or require little interaction with external (root) nameservers. All other requests require some network interaction, and the response time increases linearly. Bear in mind that all the graphs show only responses up to 200 ms, so they omit the unlucky interactions where my local DNS takes 1800 ms to answer: the local DNS has the worst performance in those (rare) cases.

The second graph is from a run made immediately after the first, to see the effect of the local DNS cache filling up: about 25% of the requests are now satisfied by the cache. In this run Namebench replaced UltraDNS with the DNS of the University of Basilicata, Italy.

On the third graph, the local DNS cache performs the same as in the second run, so there is a cache saturation effect. The local DNS is not suffering from memory pressure, so there is no point in increasing the local cache size via the max-cache-size directive.
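For reference, the max-cache-size knob lives in the options block of named.conf; a sketch with an example value:

```
options {
    // Cap BIND's cache at 32 MB; "unlimited" is also accepted.
    max-cache-size 32M;
};
```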

There is something more in the graphs. The response curves share a shape: a constant (near-zero) time for some of the queries, which means the caches are doing heavy work, then the response times grow linearly as the cached data expires and the queried nameserver must contact the authoritative nameservers through a recursive query. Also:

  1. Google Public DNS has a cache hit for almost 50% of the requests; on a cache miss the response time is dominated by the network time (from the DNS server’s point of view, i.e. the time it takes to perform the recursive query), but this time is almost constant;
  2. OpenDNS response curves are initially linear, which could mean that the network path to the OpenDNS servers is not as optimized as Google’s, but beyond that the cache does its job;
  3. My ISP’s DNS (labeled Wind2-IT) usually performs well, probably more because the network path is on its side; it’s entirely possible that its cache is not that big;
  4. My local DNS suffers when, to fulfill a request, it has to make several recursive queries, as these are usually carried over UDP and the local router is not highly optimized for UDP NATting (an educated guess).

It is important to stress that the tests are made over the list of the 10,000 most popular websites: it’s probably the only way to benchmark general use, but if you only visit a bunch of sites (as is usually the case) you must consider how far these results apply to your environment. Also, these websites are all treated as equals, while clearly popularity plays a role every time you deal with a cache.

These benchmarks have shown that my current setup (a local DNS) is the best, but when a cache miss occurs, and there are a lot of recursive queries to be made, the local router (and its UDP NATting function) is the bottleneck. Nothing to worry about, but an interesting insight to gain.

Generally speaking, it’s fair to say that Google Public DNS is quite a good infrastructure, a fierce competitor both to an ISP-provided DNS (which has the big advantage of network latency) and to OpenDNS (which has been in place for several years now).

Filed under: network

HTTP can no longer be used for authenticated web sites

If you are a user of a web site that requires authentication (which means, basically, every site), you usually access it from a network you don’t control: you don’t know, among many other things, which DNS server the infrastructure guy has chosen and which version it’s running. This means that you can be exposed to the well-known Dan Kaminsky DNS hijack attack (you can actually check for this).

Leveraging this vulnerability (there are still plenty of DNS servers that haven’t been fixed), it’s possible to mount a man-in-the-middle attack at the application level, stealing the cookies of your authenticated HTTP session: ladies and gentlemen, please welcome CookieMonster. You are exposed even if your login page is protected via HTTPS, as the auth cookie will be passed in cleartext in every subsequent HTTP interaction.

This worst-case scenario requires a flawed DNS implementation (or rather, a DNS implementation following the original, flawed DNS protocol), so you can be reasonably safe if you always control your DNS, or at least can place some trust in the people operating it; but if you are a roaming user you are completely exposed.
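You can check for yourself whether a site hands out its session cookies without the Secure flag, which is what makes them stealable over plain HTTP. A sketch with curl (the URL is an example):

```shell
# List the Set-Cookie headers that are NOT marked Secure: any cookie
# shown here will also travel in cleartext over plain HTTP.
curl -sI https://www.example.com/ | grep -i '^set-cookie' | \
    grep -iv secure && echo "some cookies are NOT marked Secure"
```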

So, as you are a competent Linux user, you could fix this in a very simple way: install a DNS caching nameserver and use, as your primary DNS, something you can trust.
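Setting up such a local caching resolver on a Fedora box is quick; a sketch, where the package, service, and file names are the Fedora 11 ones and may differ on other distributions:

```shell
# Install BIND (its default configuration already works as a
# caching-only resolver), start it, and point the system at it.
yum install bind
service named start
chkconfig named on
echo "nameserver 127.0.0.1" > /etc/resolv.conf
```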

If you cannot do this, you should ask your web application provider to fix the issue (some have already done so: for example, you can force all WordPress administration pages to be accessed only via HTTPS, and I’m writing this blog entry via HTTPS, so it works).
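For WordPress specifically, forcing HTTPS on the administration pages is a one-line setting in wp-config.php; a sketch (the path is an example, and the define must land before the “stop editing” marker in that file):

```shell
# FORCE_SSL_ADMIN makes WordPress serve /wp-admin and the login
# form over HTTPS only, so the auth cookie never travels in clear.
echo "define('FORCE_SSL_ADMIN', true);" >> /var/www/wordpress/wp-config.php
```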

If you are a system administrator, you must check and, if needed, fix your DNS implementation, and you should probably take a look at an SSL accelerator, because your connection peers (i.e. users accessing web sites under your control) may come from every possible insecure network. My 2 cents: this man-in-the-middle attack will be only the first of a new kind, based on the interaction of different levels of the TCP/IP stack.

Filed under: network, security

Why we need to deploy IPv6 inter-networks, and why we need it now

As I have a research background, I have always regarded IPv6 as an excellent protocol, because it solves so many problems that IPv4 has, and allows us to build new infrastructures and services on top of it.

IPv6 deployment is minimal at best, and even in the research community there is resistance. I submitted to a workshop an article showing an architecture that benefits greatly from using IPv6 as the transport protocol: using IPv4 as the transport is affordable only for customers who can pay a lot of money for dedicated circuits, while the adoption of IPv6 solves the specific problem (about which I can’t say more, sorry) for everyone, by doing the correct resource allocation at the proper level. The article was rejected because, among other things, it uses IPv6, which apparently seems exotic rather than an urgent necessity.

I’m working on an expanded version of the article, which I’ll submit to a more network-aware conference, but I see this rejection as a clear indication that some people, even the supposedly most tech-savvy, do not realize that we are going to hit a wall, and hit it badly.

Articles are beginning to appear in the specialized media suggesting that people start thinking about the switch; this one reports the opinion of John Curran, on the ARIN Board for the last decade. He rates the problem as complex as the Y2K problem, and he shifts (part of) the burden of the solution from the service providers (who, I guess, could temporarily accept a balkanized Internet) to the content providers: social sites cannot accept being segregated to serve only a part of the Internet’s users, and since they are valued in the billions, they could lend some help or, at least, push for the adoption of the IPv6 model. Suggest this article to your web content manager, the next time you ask for some IPv6-related budget.

People more interested in the mathematical model can worry themselves by reading the details and starting the countdown: we have between 2 and 4 years.

Filed under: network