Mac OS X 10.6 - Clear DNS cache

Normally I wouldn't post something this trivial. However, it came to my attention today that Mac OS X may cache out-of-date DNS entries for as long as the record is still valid according to its TTL.  I did not expect this behavior, because the Mac is not running a DNS server and its resolv.conf points to an authoritative DNS server that is serving up-to-date data.

I would expect a copy of BIND or another caching DNS server to cache old records until the TTL expires, but not a normal network OS running a resolver client with a resolv.conf file.  I'm running Mac OS X 10.6.8.  Anyway, to clear the DNS cache on my Mac, I typed this:

$ dscacheutil -flushcache

Curiously, a Google search for "mac os x flush dns" did not turn up that simple command, so I'm putting it on my blog for posterity.
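As a rough sketch, here is one way to confirm that the local cache, and not the DNS server, is handing out the stale record before flushing it. It assumes dig is installed and that "dscacheutil -q host" prints lines beginning with "ip_address:". The host name www.example.com and the helper names first_ip_address and check_and_flush are made up for illustration. If a name has several A records, the two answers can differ for harmless reasons.

```shell
#!/bin/sh
# Sketch: compare the cached answer with the DNS server's answer,
# and flush the cache only when they differ.

# Pull the first address out of `dscacheutil -q host` output,
# assumed to contain lines such as "ip_address: 192.0.2.10".
first_ip_address() {
    awk '$1 == "ip_address:" { print $2; exit }'
}

check_and_flush() {
    host_name="$1"
    # Answer from the system resolver, which may come from its cache.
    cached=$(dscacheutil -q host -a name "$host_name" | first_ip_address)
    # dig sends its own query to the server listed in resolv.conf.
    fresh=$(dig +short "$host_name" A | tail -n 1)
    echo "cached: $cached  server: $fresh"
    if [ "$cached" != "$fresh" ]; then
        dscacheutil -flushcache
    fi
}
```

Calling "check_and_flush www.example.com" prints both answers and only flushes when they disagree.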

Recent server outage and VirtualBox

This server is a Linux x86_64 virtual machine hosted on a Linux x86_64 system running Oracle VirtualBox 4.1.8, which was the most current release the last time the server and all the VMs were rebooted. Another virtual machine on the same host, also running under VirtualBox 4.1.8, apparently caused a resource exhaustion issue that took down all of the virtual machines on the physical server, and even caused some kernel application-crash tracebacks in the system log on the physical host server.  This took my web site down.

You don't usually see VIRTUAL MACHINES get SCSI, SATA, or IDE disk errors when the host hard drives are fine. In this case the resource-hogging VM apparently caused enough memory issues with the host that the other VMs started having disk, bus, or other issues, like these:

ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
ata3.00: failed command: WRITE FPDMA QUEUED
ata3.00: cmd 61/08:00:58:51:e6/00:00:01:00:00/40 tag 0 ncq 4096 out
         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
ata3: hard resetting link
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: qc timeout (cmd 0xec)
ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata3.00: revalidation failed (errno=-5)
ata3: hard resetting link
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: qc timeout (cmd 0xec)
ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata3.00: revalidation failed (errno=-5)
ata3: limiting SATA link speed to 1.5 Gbps
ata3: hard resetting link
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata3.00: qc timeout (cmd 0xec)
ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata3.00: revalidation failed (errno=-5)
ata3.00: disabled
ata3.00: device reported invalid CHS sector 0
ata3: hard resetting link
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata3: EH complete
sd 2:0:0:0: [sda] Unhandled error code
sd 2:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 2:0:0:0: [sda] CDB: Write(10): 2a 00 01 e6 51 58 00 00 08 00
end_request: I/O error, dev sda, sector 31871320
Buffer I/O error on device sda3, logical block 3670315
lost page write due to I/O error on sda3

I tried to restart the virtual machines (which were all running under VBoxHeadless, not the GUI management tool) using "VBoxHeadless controlvm vm-name poweroff" and I got this error message (it was something like this, anyway):

VBoxHeadless: error: Invalid parameter: controlvm

A Google search turned up no results for anything like that error message. I guess I'm the only person in the world who's had their VirtualBox installation get into such a bad state that it couldn't even control the VMs to reset/poweroff; hence this blog post. If you get this error yourself, you should probably check your VM host right away, because it's probably in a bad state and might need to be rebooted.
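One thing worth ruling out: in the VirtualBox documentation, controlvm is a subcommand of VBoxManage, while VBoxHeadless is the front end that starts a VM. That alone may account for the "Invalid parameter" message above. A minimal sketch of powering off every running VM through VBoxManage follows. The helper names running_vm_names and poweroff_all are made up for illustration, and on a host that is already in a bad state these commands may still hang or fail.

```shell
#!/bin/sh
# Sketch: hard power-off of all running VMs via VBoxManage.

# Extract VM names from `VBoxManage list runningvms` output,
# whose lines look like: "vm-name" {uuid}
running_vm_names() {
    sed -n 's/^"\(.*\)" {.*}$/\1/p'
}

poweroff_all() {
    VBoxManage list runningvms | running_vm_names | while IFS= read -r vm; do
        echo "powering off: $vm"
        # Equivalent to pulling the plug; the guest gets no chance to shut down.
        VBoxManage controlvm "$vm" poweroff
    done
}
```

Run it as the user that owns the VMs, since VBoxManage only lists that user's machines.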

I updated VirtualBox to version 4.1.14 because at the moment that is the most current release. I hope it turns out to be able to handle this very heavily loaded resource-intensive virtual machine without causing problems for the other VMs and the host system.

Linux kernel/netfilter incompatibility with ip_queue and IMQ

I use an IMQ device in Linux to shape received traffic. Normally you can only bandwidth-limit or shape traffic you transmit. Looping all traffic through an IMQ device is a great workaround to let you shape traffic you receive as well as transmit. In this case we are limiting bandwidth on a 3G cellular modem connection. The iptables rule to loop received 3G traffic through the imq0 network interface looks like this:

iptables -t mangle -A PREROUTING -i 3g -j IMQ --todev 0
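The iptables rule only redirects packets; the shaping itself comes from a queueing discipline attached to imq0. A minimal sketch of the whole setup, assuming IMQ is built as a module named imq and using a simple token bucket filter (tbf). The 1mbit rate, the burst and latency values, and the helper names run and setup_imq_shaping are illustrative assumptions, not settings from this system.

```shell
#!/bin/sh
# Sketch: bring up imq0, attach a rate limit, and loop 3G traffic through it.
# Set DRY_RUN=1 to print the commands instead of running them (they need root).

run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_imq_shaping() {
    rate="${1:-1mbit}"              # illustrative download limit
    run modprobe imq                # assumes IMQ was built as a module
    run ip link set imq0 up
    # Token bucket filter: anything above $rate is delayed, then dropped.
    run tc qdisc add dev imq0 root tbf rate "$rate" burst 32kbit latency 400ms
    # The rule from this post: send traffic received on 3g through imq0.
    run iptables -t mangle -A PREROUTING -i 3g -j IMQ --todev 0
}
```

For example, "DRY_RUN=1; setup_imq_shaping 512kbit" prints the four commands so they can be reviewed before running them for real.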

I found today that if the Linux kernel module ip_queue was loaded, then the IMQ device acted like a roach motel for packets. They check in, but they don't check out. Because tcpdump won't capture packets on an imq interface (at least, it never works for me), I was at quite a loss to troubleshoot this or to figure out where the missing packets were going.

Outbound packets were getting transmitted out the 3G interface just fine, and the replies were generated by the remote system, and they would even show up on the PPP interface (I used ppp's record option to write all PPP traffic to a file and then could open it in Wireshark). For example the TCP 3-way handshake looked like this:

  • SYN ->
  • SYN,ACK <-
  • SYN,ACK <- (retransmit)

Obviously my application was never receiving the SYN,ACK so it didn't transmit an ACK to complete the TCP 3-way handshake. Instead the application would restart the TCP 3-way handshake, and basically all communications were broken. This behavior was true of any network interface for which received traffic was looped through imq0 by the mangle/PREROUTING iptables rule.

The imq0 txqueuelen was long enough. I already know to check that, because a too-short txqueuelen can cause dropped packets under heavy system load. You should watch for this especially on PPP interfaces such as PPPoE (PPP over Ethernet).  If the default txqueuelen (transmit queue length) is 3 and you have megabit download speeds, packets will get dropped inside the kernel as it tries to queue them up for transmission. I also checked the MTU, and it was not a problem.
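The queue length check can be sketched with sysfs, which exposes each interface's value in a tx_queue_len file. The threshold of 100 and the helper names txqueuelen_of and check_txqueuelen are arbitrary choices for illustration; the right minimum depends on link speed and load.

```shell
#!/bin/sh
# Sketch: report an interface's transmit queue length and warn if it is short.
# SYSFS_NET can be overridden; by default it is the real sysfs location.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

txqueuelen_of() {
    cat "$SYSFS_NET/$1/tx_queue_len"
}

check_txqueuelen() {
    iface="$1"
    min="${2:-100}"                 # illustrative threshold
    len=$(txqueuelen_of "$iface") || return 2
    if [ "$len" -lt "$min" ]; then
        echo "$iface: txqueuelen $len is below $min"
        return 1
    fi
    echo "$iface: txqueuelen $len ok"
}
```

If an interface turns out to be too short, something like "ip link set dev ppp0 txqueuelen 100" (as root) raises it until the interface is next re-created.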

The solution to my dropped packets on the imq0 interface was to unload the ip_queue kernel module, and also to blacklist it so that it would never load again.

I'm not sure I understand exactly why IMQ and ip_queue conflict with each other, but they both provide a means of queueing and traffic management/shaping, so I conclude that you can use one or the other but not both.
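A minimal sketch of that fix, assuming a distribution that reads /etc/modprobe.d. The file name ip_queue-blacklist.conf and the helper names blacklist_module and unload_and_blacklist are hypothetical. Note that a plain "blacklist" line only stops automatic loading by alias, so the sketch also adds an "install" line that makes an explicit modprobe of the module fail.

```shell
#!/bin/sh
# Sketch: unload ip_queue and keep it from being loaded again.
BLACKLIST_FILE="${BLACKLIST_FILE:-/etc/modprobe.d/ip_queue-blacklist.conf}"

# Append the entries only once, so running this twice is harmless.
blacklist_module() {
    mod="$1"
    file="$2"
    if ! grep -qx "blacklist $mod" "$file" 2>/dev/null; then
        {
            echo "blacklist $mod"
            # Makes "modprobe $mod" run /bin/false instead of loading it.
            echo "install $mod /bin/false"
        } >> "$file"
    fi
}

unload_and_blacklist() {
    modprobe -r ip_queue            # needs root; fails if the module is in use
    blacklist_module ip_queue "$BLACKLIST_FILE"
}
```

Afterwards, "lsmod | grep ip_queue" should print nothing.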

Happy Pi Day 2012!

It's March 14, so it must be Pi Day (3.14)! From the piday.org web site: Pi, Greek letter (π), is the symbol for the ratio of the circumference of a circle to its diameter. Pi Day is celebrated by math enthusiasts around the world on March 14th. Pi = 3.1415926535…

Since I'm a math enthusiast, we'll be celebrating Pi Day at the office with circular or spherical food. Since we're programmers, we're having pizza of course.

www.ockers.ca and www.ockers.us

I've owned my ockers.net and ockers.org domain names for decades but haven't really done anything useful with them until recently.  Part of the problem was that until the advent of really clever content management systems like Joomla, it was frankly too much work to live a busy life AND keep any sort of web content up-to-date.  Maintaining web sites is quite a bit easier these days.

I took advantage of the opportunity to register a couple of country-specific domain names some years ago.  As a dual citizen I felt it behooved me to own them both, but for the moment http://www.ockers.ca/ and http://www.ockers.us/ both point here.  Until now neither of them was hosting any sort of useful content, but at some point in the future I might have a specific purpose for them...