Bug #168

after 4+ days of uptime, dhcp server stops

Added by Dave Täht on May 19, 2011. Updated on Jun 24, 2014.
Closed Normal Dave Täht


In the only box I’ve managed to keep running this long, dhcp has stopped serving new addresses.

root@labgw:~# uptime
19:39:04 up 5 days, 15:20, load average: 0.16, 0.06, 0.05

Restarting dnsmasq fixed it. There have been multiple builds since then, of course, and I’ve beaten this machine up with many, many tests. It also used to have bug [#136], which I killed off by hand.


Updated by Dave Täht on May 19, 2011.
Updated by Nick Feamster on May 19, 2011.
Strange. In the old build, I’ve got an uptime of around 20 days, and DHCP is working fine.

Let’s try a new build out in the house I’m staying in tomorrow.

I’d like to figure out how to daisy chain these switches, as well.

Updated by Dave Täht on May 22, 2011.
Assuming the bismark-testbed:Testlab remains operational, I should be able to slowly debug this.

Regrettably, there seems to be no way to reproduce the problem other than time and multiple deployed routers. It may be related to issues #147 and #145.

As for “daisy chaining” or “using the mesh”, please open separate bugs for those.

Updated by Dave Täht on Jun 13, 2011.
I now have multiple reports of the DHCP problem. I am hoping that the switch patch fixes it, but to test it fully I will have to hammer a router with DHCP queries for several days from another router (or two or three).
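
For reference, the kind of hammering described above could be sketched roughly as follows. This is only an illustration, not the test that was actually run: the function names (build_discover, wait_for_reply, hammer), the random locally administered MAC addresses, the 300-byte padding, and the timing values are all assumptions. It sends DHCPDISCOVER broadcasts and counts how many go unanswered, so it exercises only the offer path and never completes a lease. Binding to UDP port 68 normally requires root and may conflict with a DHCP client already running on the test machine.

```python
import os
import socket
import struct
import time

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"


def build_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER packet using the RFC 2131 fixed layout."""
    if len(mac) != 6:
        raise ValueError("mac must be exactly 6 bytes")
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,            # op: BOOTREQUEST
        1,            # htype: Ethernet
        6,            # hlen: MAC address length
        0,            # hops
        xid,          # transaction id, echoed back by the server
        0,            # secs
        0x8000,       # flags: ask the server to broadcast its reply
        b"", b"", b"", b"",  # ciaddr, yiaddr, siaddr, giaddr (zero-filled)
        mac,          # chaddr, zero-padded to 16 bytes by struct
        b"",          # sname
        b"",          # file
    )
    # Option 53 (message type) = 1 (DISCOVER), then the end option.
    options = DHCP_MAGIC_COOKIE + b"\x35\x01\x01" + b"\xff"
    # Pad to 300 bytes, the traditional BOOTP minimum (an assumption here).
    return (header + options).ljust(300, b"\x00")


def wait_for_reply(sock: socket.socket, xid: int, timeout: float) -> bool:
    """Return True if a BOOTREPLY with a matching xid arrives in time."""
    deadline = time.monotonic() + timeout
    want = struct.pack("!I", xid)
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        sock.settimeout(remaining)
        try:
            data, _ = sock.recvfrom(1024)
        except socket.timeout:
            return False
        if len(data) >= 8 and data[0] == 2 and data[4:8] == want:
            return True


def hammer(count: int, interval: float = 0.1, timeout: float = 2.0) -> int:
    """Send `count` discovers from random MACs; return how many got no reply."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.bind(("", 68))  # DHCP client port; usually needs root
    unanswered = 0
    try:
        for _ in range(count):
            mac = bytes([0x02]) + os.urandom(5)  # locally administered MAC
            xid = struct.unpack("!I", os.urandom(4))[0]
            sock.sendto(build_discover(mac, xid), ("255.255.255.255", 67))
            if not wait_for_reply(sock, xid, timeout):
                unanswered += 1
            time.sleep(interval)
    finally:
        sock.close()
    return unanswered
```

Calling hammer(1000) in a loop over several days and logging the returned count alongside the router's uptime would show when replies stop arriving. Because every discover uses a fresh MAC address, a long run may also exhaust the server's address pool, which would need to be told apart from the hang reported here.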

Thankfully, with the PDU coming online, I can reset the router if it hangs.

I have several other related patches queued up, if that alone doesn’t address the issue.

Updated by Dave Täht on Jul 27, 2011.
Updated by Dave Täht on Jul 28, 2011.
Updated by Dave Täht on Apr 21, 2012.
Updated by Dave Täht on Jun 24, 2014.

This is a static export of the original bufferbloat.net issue database. As such, no further commenting is possible; the information is solely here for archival purposes.
