Bug #216

Infinite retry for ping

Added by Dave Täht about 3 years ago. Updated about 3 years ago.

Status: Closed
Start date: 07/28/2011
Priority: Immediate
Due date:
Assignee: Andrew McGregor
% Done: 50%
Category: -
Spent time: 64.60 hours
Target version: 1st Usable CeroWrt release

Description

ping takes 1.6 seconds to go 40 meters.

xmit.c.diff (1.8 kB) Dave Täht, 07/28/2011 08:01 pm

ath9k-latency.patch (4.8 kB) Andrew McGregor, 08/03/2011 03:44 pm

580-ath9k_lowlatency.patch (1.7 kB) David Taht, 08/06/2011 08:46 pm


Related issues

related to Cerowrt - Bug #195: ag71xx ethernet appears to be doing many unaligned transfers Closed 06/04/2011
duplicates Cerowrt - Bug #162: TX_RETRY still too high Closed 05/18/2011
blocked by Cerowrt - Bug #202: STA mode is broken Closed 07/10/2011

History

Updated by Dave Täht about 3 years ago

Andrew McGregor is my new god.

Updated by Dave Täht about 3 years ago

  • Project changed from ISCWRT to Cerowrt

Updated by Dave Täht about 3 years ago

Some notes from Andrew:

net/mac80211/rc80211_minstrel.c

This function:

static void
minstrel_rate_init(void *priv, struct ieee80211_supported_band *sband,
                   struct ieee80211_sta *sta, void *priv_sta)

Initialises the rate table, based on:

static void *
minstrel_alloc(struct ieee80211_hw *hw, struct dentry *debugfsdir)
...
        /* maximum time that the hw is allowed to stay in one MRR segment */
        mp->segment_size = 6000;

The unit is 'TU', an 802.11 specific unit that is a little more than a microsecond.

For 802.11n

net/mac80211/rc80211_minstrel_ht.c

static void
minstrel_calc_retransmit(struct minstrel_priv *mp, struct minstrel_ht_sta *mi,
                         int index)

Uses the same segment_size. Arguably, that's wrong. I expect that you want to add an ht_segment_size to the structure, so 11n can be tuned separately. A sensible default might be around 500 TU.

Structure is defined in net/mac80211/rc80211_minstrel.h:

struct minstrel_priv {
        struct ieee80211_hw *hw;
        bool has_mrr;
        unsigned int cw_min;
        unsigned int cw_max;
        unsigned int max_retry;
        unsigned int ewma_level;
        unsigned int segment_size;
        unsigned int update_interval;
        unsigned int lookaround_rate;
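
Andrew's suggestion of a separate 11n knob could be sketched roughly as below. This is a simplified stand-in, not the kernel structure: `minstrel_priv_sketch`, `ht_segment_size`, `sketch_init()` and `pick_segment_size()` are hypothetical names, and 500 is only the default floated in the notes above.

```c
#include <stdbool.h>

/* Simplified stand-in for struct minstrel_priv; only the fields that
 * matter for this sketch are shown. */
struct minstrel_priv_sketch {
	unsigned int segment_size;    /* existing limit, set to 6000 in minstrel_alloc() */
	unsigned int ht_segment_size; /* hypothetical: separate limit for 802.11n */
};

/* Fill in the defaults discussed in the notes above. */
static void sketch_init(struct minstrel_priv_sketch *mp)
{
	mp->segment_size = 6000;
	mp->ht_segment_size = 500; /* assumed default, per the suggestion above */
}

/* Choose which limit a retransmit calculation would use. */
static unsigned int pick_segment_size(const struct minstrel_priv_sketch *mp,
				      bool is_ht)
{
	return is_ht ? mp->ht_segment_size : mp->segment_size;
}
```

With something like this, minstrel_calc_retransmit() in rc80211_minstrel_ht.c would read the 11n value and the legacy path would keep the existing one, so the two could be tuned independently.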

Updated by Dave Täht about 3 years ago

Felix says the patch is no good as is:

"No retrans is not good. No retrans makes gaps in the client side reorder window. Gaps in client buffers is bad, makes traffic stall and latencies."

I asked if 1 retransmit was enough, as an infinite loop here is very bad... 1.6 sec ping time going 40 meters (obviously very bad conditions, but that was intentional)

Felix wrote:

"If it's doing infinite retry, then that's a bug in the per-subframe retry count tracking and should be fixed properly.

About the 1.6 seconds: Yes, it would be nice if ath9k was able to drop packets more easily, but that requires some more surgery in the xmit path, which I'm going to take care of eventually.

In the mean time I'm not going to apply any kludges that simply trade one problem for another (and have the potential to create nasty side effects)."

And Andrew replied:

"I may be able to put a few cycles in to this shortly… there's a tradeoff here, and continually recycling is definitely the wrong thing. I realise that dropping here is the wrong thing, but it's less wrong than retrying forever. With a fairly small receive window, it won't be that bad."

Updated by Dave Täht about 3 years ago

Then Felix replied:

The receive window typically isn't small, it's often as big as 64 frames. The only way to prevent that from messing up the receiver state is to send a BAR for failed frames, and that means pushing even more frames to the tx queue, thus potentially making the problem even worse.

I have the following ideas for attacking the root causes of bloated queues with aggregation:

1. Delay the sequence number assignment to the point where it actually transmits a frame for the first time. That allows the driver to pick any frame that hasn't been transmitted yet for dropping when the queue gets full.
2. Keep track of the size of the sequence number gap of pending frames. That allows the driver to make a better decision of when to drop a frame and send a BAR or when to retry some more.
3. Allow the driver to either try different rates for retransmissions, or to request another rate control lookup. The code for that should probably be pushed as much as possible into mac80211 and the rate control module.
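
The second idea (tracking the sequence number gap) might look something like the sketch below. It relies on 802.11 sequence numbers being 12 bits wide, so they wrap at 4096. The function names and the drop policy (give up once the span of pending frames reaches the 64-frame window) are assumptions for illustration, not anything taken from ath9k.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ_MODULO 4096u /* 802.11 sequence numbers are 12 bits wide */
#define BA_WINDOW  64u   /* receive window size mentioned above */

/* Distance from the oldest pending sequence number to the newest one,
 * taking wrap-around at 4096 into account. */
static unsigned int seq_gap(uint16_t oldest, uint16_t newest)
{
	return (unsigned int)(newest - oldest) & (SEQ_MODULO - 1u);
}

/* Hypothetical policy: once the span of pending frames reaches the
 * window size, drop the oldest frame and send a BAR instead of
 * retrying it again. */
static bool should_drop_and_bar(uint16_t oldest, uint16_t newest)
{
	return seq_gap(oldest, newest) >= BA_WINDOW;
}
```

A real policy would presumably also weigh queue depth and frame age; the point here is only that the gap is cheap to compute and wrap-safe.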

Updated by Dave Täht about 3 years ago

To which Andrew replied:

1 seems like the right thing.
3 is pretty much trivial, and in fact probably already happens.

Updated by Dave Täht about 3 years ago

So I then wrote a big motivational message, as I've been chasing this problem for 16 months... and everybody went off on plane flights around the globe (and me, I missed my plane entirely, but that's another story).

And then there was silence.

I wrote:

I'm hoping that Andrew can play Freeman Dyson, to Felix's Murray Gell-Mann, to my Richard Feynman, to come up with solutions.

To use another analogy, taken from a wonderful book that I highly recommend y'all read, as we're encountering all the same problems all over again...

http://www.amazon.com/Where-Wizards-Stay-Up-Late/dp/0684832674

Two of the greatest engineers that have ever existed, Dave Walden & Will Crowther, at BBN, not only figured out how to create the first routing code on the first IMPs, but they figured out how to write the code with flow charts and a literal ton of green bar paper, and wrote all the assembly for it - before the first hardware was even more than a blueprint.

AND: They figured out how to speed it up by 10x over the specification, and were fiercely proud of their code.

Kleinrock at UCLA took one look at it and told them it was wrong. They didn't believe him. He said "gimme an imp". They gave him one, and he managed to crash the first one inside of sixteen packets.

He spent the next 3 years gleefully crashing the entire early ARPAnet, over and over again. Drove those guys nuts.

The ARPAnet was all the better for it.

This coming week I'm helping move my parents to a new house, and all my and their stuff is in boxes.

The week after, I want to ship cerowrt 1.0. I can live with it as it is, but as it is possible to do better in the time available, I'd love to do so.

In 2 hours I'd done a vulcan mind meld with Andrew (and vice versa!), and explained the issues with TCP's behavior - how dropping 3 packets in a row is basically fatal, and you should give up anyway, improving conntrack to move a shaper between mice and elephants, and how to address the ANT issue using the video queue in 802.11e... and he fully explained the interrelationships of the various portions of stack and hardware to me, and found that infinite retry problem 20 minutes later...

So Drs Dyson, Gell-Mann, Feynman... can we get on the same page somehow? If not for this release, then the next?

I have a testlab going up next week at gatech, and one in mid-october going up in Paris.

I'll gladly test anything, especially even partial stuff, and break it, in all sorts of ways. It's what cerowrt is for.

The battle between Kleinrock and Walden/Crowther at BBN is the stuff of legend, and what we are up to now could also become so.

Updated by Dave Täht about 3 years ago

This is the patch I've been using for 8 months. With it, voip becomes feasible, without it, not. With it, I still see 1.6 second pings at 40 meters. Without it, I saw 36+ seconds in similar conditions.

~/src/cerowrt/target/linux/902-ath9k-bufferbloat.patch

I don't understand how you get past the infinite loop, even with this patch in place.


--- a/drivers/net/wireless/ath/ath9k/ath9k.h    2011-05-13 20:23:00.066497846 -0600
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h    2011-05-13 20:27:11.560374910 -0600
@@ -135,11 +135,11 @@
 /***********/

 #define ATH_MAX_ANTENNA         3
-#define ATH_RXBUF               512
-#define ATH_TXBUF               512
+#define ATH_RXBUF               128
+#define ATH_TXBUF               32
 #define ATH_TXBUF_RESERVE       5
 #define ATH_MAX_QDEPTH          (ATH_TXBUF / 4 - ATH_TXBUF_RESERVE)
-#define ATH_TXMAXTRY            13
+#define ATH_TXMAXTRY            2
 #define ATH_MGT_TXMAXTRY        4

 #define TID_TO_WME_AC(_tid)                            \
@@ -542,7 +542,7 @@

 #define DEFAULT_CACHELINE       32
 #define ATH_REGCLASSIDS_MAX     10
 #define ATH_CABQ_READY_TIME     80      /* % of beacon interval */
-#define ATH_MAX_SW_RETRIES      10
+#define ATH_MAX_SW_RETRIES      2
 #define ATH_CHAN_MAX            255
 #define IEEE80211_WEP_NKID      4       /* number of key ids */

--- a/drivers/net/wireless/ath/ath9k/init.c     2011-05-13 20:23:00.066497846 -0600
+++ b/drivers/net/wireless/ath/ath9k/init.c     2011-05-13 20:23:00.066497846 -0600
@@ -676,7 +676,7 @@
        hw->max_rates = 4;
        hw->channel_change_time = 5000;
        hw->max_listen_interval = 10;
-       hw->max_rate_tries = 10;
+       hw->max_rate_tries = 2;
        hw->sta_data_size = sizeof(struct ath_node);
        hw->vif_data_size = sizeof(struct ath_vif);
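
One side effect of the patch above is worth spelling out: ATH_MAX_QDEPTH is derived from ATH_TXBUF, so shrinking the buffer count also shrinks the queue depth limit. A small sketch of that arithmetic follows (`max_qdepth()` is a hypothetical helper that mirrors the macro, not driver code).

```c
/* Mirrors ATH_MAX_QDEPTH from the header shown above:
 * (ATH_TXBUF / 4 - ATH_TXBUF_RESERVE), using integer division. */
#define TXBUF_RESERVE 5

static int max_qdepth(int txbuf)
{
	return txbuf / 4 - TXBUF_RESERVE;
}
```

With the stock ATH_TXBUF of 512 this gives 123; with the patched value of 32 it gives 3.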

Updated by David Taht about 3 years ago

I am adding this discussion to the bug reporting system via cc'ing the bot
and changing the subject line to include the appropriate number.

My principal concern with exorbitant buffering and retries is that no
signalling is sent back to the tcp sender to indicate that it should maybe
slow down.

My reasoning for choosing 3 packets as a decent outer limit for an AMPDU is
that if one packet gets through, the rest can be dropped safely, which will
- yes, I totally understand - result in less than maximum wireless-n
performance - but give reasonably bounded latencies for all packets AND
provide end-to-end signaling that made sense.

I don't care about maximizing wireless-n performance in laboratory
conditions, what I care about is having decent mixed g and n performance in
fair to bad conditions, with contention from multiple APs and clients.

Felix had produced one patch early on that I thought was promising under
that scenario, treating g and n differently, that I'm trying to find.

I would like to move this discussion to the bug report if at all possible.

I certainly have some ideas towards increasing the maximum AMPDU to larger
sizes while retaining sanity on the TCP/IP side, notably by having it be
tuple sensitive, but that's something longer term than merely what we are
discussing now.

On Mon, Aug 1, 2011 at 2:42 AM, Felix Fietkau wrote:

I have a better idea: Instead of limiting this value to 2, why not leave it at 10, but add the number of hardware retries to the per-subframe retry counter. That way subframes that get lost during the transmission of A-MPDUs with few or no hardware retransmissions are not unnecessarily penalized, but if A-MPDUs get retransmitted often, subframes will get kicked out quickly as well.

- Felix
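
Felix's proposal can be read as the following accounting rule. This is a sketch under assumptions: `subframe_sketch` and `charge_and_check()` are hypothetical names, and the exact comparison used in the driver may differ.

```c
#include <stdbool.h>

#define MAX_SW_RETRIES 10 /* stock ATH_MAX_SW_RETRIES value */

/* Hypothetical per-subframe state; the real driver keeps this elsewhere. */
struct subframe_sketch {
	unsigned int retries;
};

/* Charge a failed subframe for one software retry plus the hardware
 * retries the whole A-MPDU needed. Returns true if the subframe may be
 * retried again, false if it should be dropped. */
static bool charge_and_check(struct subframe_sketch *sf,
			     unsigned int hw_retries)
{
	sf->retries += 1 + hw_retries;
	return sf->retries < MAX_SW_RETRIES;
}
```

Under this rule a subframe lost on an otherwise clean link (no hardware retries) still gets most of its software retries, while one caught in repeated A-MPDU retransmissions hits the limit after only a pass or two.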

On 2011-08-01 3:02 AM, Andrew McGregor wrote:

So is this a more appropriate way to clear it out:

diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index 2a40fa2..47e0b99 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -396,7 +396,7 @@ struct ath_led {
 #define DEFAULT_CACHELINE       32
 #define ATH_REGCLASSIDS_MAX     10
 #define ATH_CABQ_READY_TIME     80      /* % of beacon interval */
-#define ATH_MAX_SW_RETRIES      10
+#define ATH_MAX_SW_RETRIES      2
 #define ATH_CHAN_MAX            255
 #define IEEE80211_WEP_NKID      4       /* number of key ids */

IOW, simply don't try so many SW retries? This should end up BARing it forward pretty quickly.

I think this is a pretty important tuning variable… and 10 looks like a pretty silly value, 'cause the frame already had a reasonable number of TX opportunities.

Andrew

On 29/07/2011, at 8:22 AM, Felix Fietkau wrote:

The receive window typically isn't small, it's often as big as 64 frames. The only way to prevent that from messing up the receiver state is to send a BAR for failed frames, and that means pushing even more frames to the tx queue, thus potentially making the problem even worse.

I have the following ideas for attacking the root causes of bloated queues with aggregation:

1. Delay the sequence number assignment to the point where it actually transmits a frame for the first time. That allows the driver to pick any frame that hasn't been transmitted yet for dropping when the queue gets full.
2. Keep track of the size of the sequence number gap of pending frames. That allows the driver to make a better decision of when to drop a frame and send a BAR or when to retry some more.
3. Allow the driver to either try different rates for retransmissions, or to request another rate control lookup. The code for that should probably be pushed as much as possible into mac80211 and the rate control module.

- Felix

On 2011-07-29 2:10 PM, Andrew McGregor wrote:

I may be able to put a few cycles in to this shortly… there's a tradeoff here, and continually recycling is definitely the wrong thing. I realise that dropping here is the wrong thing, but it's less wrong than retrying forever. With a fairly small receive window, it won't be that bad.

Andrew

On 29/07/2011, at 7:55 AM, Felix Fietkau wrote:

If it's doing infinite retry, then that's a bug in the per-subframe retry count tracking and should be fixed properly.

About the 1.6 seconds: Yes, it would be nice if ath9k was able to drop packets more easily, but that requires some more surgery in the xmit path, which I'm going to take care of eventually.

In the mean time I'm not going to apply any kludges that simply trade one problem for another (and have the potential to create nasty side effects).

- Felix

On 2011-07-29 1:39 PM, Dave Taht wrote:

1 retrans good?, infinite very bad.

I was seeing 1.6 SECOND pings, man... going 40 meters....

Got a better suggestion?

http://www.bufferbloat.net/issues/216

Andrew also sent along another patch...

On Fri, Jul 29, 2011 at 2:03 AM, Felix Fietkau wrote:

No retrans not good. No retrans make gaps in client side reorder window. Gaps in client buffers bad, make traffic stall and latencies.

On 2011-07-29 2:21 AM, Dave Taht wrote:

infinite retrans bad. No retrans good. I LOVE IETF.

---------- Forwarded message ----------
From: Andrew McGregor
Date: Thu, Jul 28, 2011 at 5:37 PM
Subject: Patch for stuck retransmits

Here's the patch, against linux-2.6.39.3/drivers/net/wireless/ath/ath9k/xmit.c

Updated by Dave Täht about 3 years ago

The approach Andrew describes here sounds very promising. What I plan to do is a bakeoff late this week, in the lab, with Gell-Mann vanilla, Feynman bletcherous, and Dyson proposed patches in place, in a mixed g/n network...

Andrew wrote, commenting on my patch denoted earlier in this bug report:

"The use is at line 340 of ath9k/xmit.c, in ath_tx_complete_aggr, at least in a vanilla kernel tree, where it sets the limit on the number of passes through the SW retransmit code.

Your change that sets hw->max_rate_tries and ATH_TXMAXTRY to 2 is going to cause terrible packet loss, and probably make the latency worse; it will certainly slow TCP right down with non-congestive loss. You won't see this at short range, but in fringe coverage it will certainly suck, so effectively you're throwing away about half the range.

Similarly, to make the AMPDU code work right, you actually need a minimum of 128 TX and RX buffers, and probably 160 is more advisable to make room for two max size aggregates plus some ACK, BAR and other frames (not every buffer at this level is used for an IP packet). Any less than that is going to make the MAC do two things; one, drop packets, and two, start exercising the 'I've run out of memory' code, which is not going to be pretty. Also, that leaves room for a max size AMPDU window for only two clients; I'd actually be inclined to increase these numbers to twice or more the default, so there's plenty of room for a reordering window for a few clients. Again, these numbers DO NOT represent IP packets buffered, the maximum number of IP packets in this buffer is 64 per client, which is set by the 802.11 spec and can't easily be changed in this way (although I will look at how it could be changed).

If you want to reduce the latency, change (in net/mac80211/rc80211_minstrel.c) mp->segment_size = 6000; to something smaller (minimum reasonable value is about 2000), and change a similarly named variable in minstrel_ht from 6000 down to maybe 400. These values are approximately microseconds. That will, instead of crowbarring down the retransmits, cleanly reduce the maximum retransmit time window (which is this value rounded up to an integral number of transmitter shots, times four). That will gain you a lot of performance, since then the higher rates will have a loss advantage (due to having more retransmits)."
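
To make the "rounded up to an integral number of transmitter shots, times four" description concrete, here is a rough sketch of that arithmetic. It follows Andrew's wording only, not the actual minstrel code; the function names and the per-shot airtimes used in the note below are made-up values for illustration, and `shot_us` must be nonzero.

```c
#define MRR_SEGMENTS 4 /* the "times four" in the description above */

/* Number of transmit attempts ("shots") needed to cover one segment,
 * rounding up to a whole number of attempts. */
static unsigned int shots_per_segment(unsigned int segment_us,
				      unsigned int shot_us)
{
	return (segment_us + shot_us - 1) / shot_us;
}

/* Approximate upper bound on the retransmit time window, in the same
 * (roughly microsecond) units as segment_size. */
static unsigned int max_retransmit_window_us(unsigned int segment_us,
					     unsigned int shot_us)
{
	return shots_per_segment(segment_us, shot_us) * shot_us * MRR_SEGMENTS;
}
```

For a hypothetical 1500-unit shot, segment_size = 6000 gives a window of 24000, while 2000 gives 12000. Faster rates have shorter shots, so they fit more attempts into the same segment, which is the "loss advantage" mentioned above.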

Updated by Felix Fietkau about 3 years ago

Last time I tried limiting the aggregation size that much, it ended up
reducing the throughput from 80 mbit/s down to just over 20 with strong
fluctuations.

I realize that latency is pretty much all you care about, but I'd prefer
if the focus was directed more towards reducing bufferbloat without
causing nasty throughput regressions on links with good conditions. By
the way, those good conditions that I'm talking about are not just
'laboratory conditions'. We've set up quite a few medium to distance
links that perform well with little retransmission and good aggregation
levels.

You can find a patch that implements my suggestion for handling swretry
here: http://nbd.name/561-ath9k_sw_retry_reduce.patch
In my tests it seems to work well without hurting throughput under
reasonably good conditions, but it needs some more testing.

- Felix

Updated by David Taht about 3 years ago

Excellent. So it sounds like your patch, along with Andrew's latency
reducing suggestions, and increasing ATH_RXBUF and ATH_TXBUF to 160 or so,
would be the strongest candidate in the bakeoff?

I am still quite allergic to having more than 32 packets
living inside a driver buffer... but am willing to do extensive
testing of all approaches that make sense at this point.

Unfortunately I now find myself blocked by bug #202. I can
certainly bring debloat-testing back in, and many other types of clients,
as well as maybe (?) patch up the SR-71 driver I have for cardbus,
but really want to just do cero to cero testing to eliminate all other
variables.
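
For reference, the buffer figures under discussion (128 minimum, 160 advisable) fall out of simple arithmetic on the 64-frame aggregate limit. In the sketch below, `bufs_needed()` is a hypothetical helper, and the headroom of 32 is just the difference between the two quoted figures, not a number anyone stated directly.

```c
#define MAX_AMPDU_SUBFRAMES 64 /* per-client limit cited earlier in this report */

/* Buffers needed to hold a number of maximum-size aggregates plus some
 * headroom for ACK, BAR and other non-data frames. */
static unsigned int bufs_needed(unsigned int aggregates,
				unsigned int headroom)
{
	return aggregates * MAX_AMPDU_SUBFRAMES + headroom;
}
```

Two full aggregates with no headroom come to 128; adding 32 for control and management frames gives 160.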

Updated by Felix Fietkau about 3 years ago

I've also been thinking about changing the rate control parameters but I
think we need to take a different approach here. The main issue is that
A-MPDU rate control needs to be handled completely differently from
single-frame rate control. Unfortunately at the point in time where rate
control runs, the decision about A-MPDU transmission vs single
transmission has not been made yet, so I will probably have to make some
rate control API changes to give the driver a chance to interact with
the RC directly.

Another issue is that test results for g/n interop from all our previous
attempts at limiting queue size are pretty much meaningless.

Dropping packets based on internal per-tid queue counters currently is
too bursty for TCP to adapt properly. The debloat-testing eBDP code
completely ignores the inner workings of ath9k's queueing, so it cannot
properly distinguish between aggregated and unaggregated traffic, which
need completely different queueing characteristics.

What we need to be able to produce meaningful test results is proper
queue management on the internal ath9k per-TID queues (plus the
non-aggregated tx queue for legacy or VI traffic).

This will definitely take some more time to develop, but I think without
that we should not jump to any conclusions based on results from random
header file hack jobs.

- Felix

On 2011-08-01 2:47 PM, Dave Taht wrote:

Excellent. So it sounds like your patch, along with andrew's latency reducing suggestions, and increasing ATH_RXBUF and ATH_TXBUF to 160 or so, would be the strongest candidate in the bakeoff?

I am still quite allergic to having more than 32 packets living inside a driver buffer... but am willing to do extensive testing of all approaches that make sense at this point.

Unfortunately I now find myself blocked by bug #202. I can certainly bring debloat-testing back in, and many other types of clients, as well as maybe (?) patch up the SR-71 driver I have for cardbus, but really want to just do cero to cero testing to eliminate all other variables.

On Mon, Aug 1, 2011 at 6:29 AM, Felix Fietkau wrote:

Last time I tried limiting the aggregation size that much, it ended up reducing the throughput from 80 mbit/s down to just over 20 with strong fluctuations.

I realize that latency is pretty much all you care about, but I'd prefer if the focus was directed more towards reducing bufferbloat without causing nasty throughput regressions on links with good conditions. By the way, those good conditions that I'm talking about are not just 'laboratory conditions'. We've set up quite a few medium to distance links that perform well with little retransmission and good aggregation levels.

You can find a patch that implements my suggestion for handling swretry here: http://nbd.name/561-ath9k_sw_retry_reduce.patch In my tests it seems to work well without hurting throughput under reasonably good conditions, but it needs some more testing.

- Felix

On 2011-08-01 1:41 PM, Dave Taht wrote:

I am adding this discussion to the bug reporting system via cc'ing the bot and changing the subject line to include the appropriate number.

My principal concern with exorbitant buffering and retries is that no signalling is sent back to the tcp sender to indicate that it should maybe slow down.

My reasoning for choosing 3 packets as a decent outer limit for an AMPDU is that if one packet gets through, the rest can be dropped safely, which will - yes, I totally understand - result in less than maximum wireless-n performance - but give reasonably bounded latencies for all packets AND provide end-to-end signaling that makes sense.

I don't care about maximizing wireless-n performance in laboratory conditions, what I care about is having decent mixed g and n performance in fair to bad conditions, with contention from multiple APs and clients.

Felix had produced one patch early on that I thought was promising under that scenario, treating g and n differently, that I'm trying to find.

I would like to move this discussion to the bug report if at all possible.

I certainly have some ideas towards increasing the maximum AMPDU to larger sizes while retaining sanity on the TCP/IP side, notably by having it be tuple sensitive, but that's something longer term than merely what we are discussing now.

On Mon, Aug 1, 2011 at 2:42 AM, Felix Fietkau wrote:

I have a better idea: Instead of limiting this value to 2, why not leave it at 10, but add the number of hardware retries to the per-subframe retry counter. That way subframes that get lost during the transmission of A-MPDUs with few or no hardware retransmissions are not unnecessarily penalized, but if A-MPDUs get retransmitted often, subframes will get kicked out quickly as well.

- Felix
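For readers following along, the suggestion above can be sketched in a few lines of C. This is only an illustration of the counting rule, not the ath9k code: `struct subframe` and `subframe_should_retry()` are hypothetical names, and only `ATH_MAX_SW_RETRIES` is taken from the thread.

```c
#include <stdbool.h>

/* Kept at the stock value of 10, as proposed above. */
#define ATH_MAX_SW_RETRIES 10

/* Hypothetical per-subframe state; the real driver keeps this elsewhere. */
struct subframe {
    unsigned int retries;   /* accumulated retry cost for this subframe */
};

/*
 * Called for a subframe that was not acknowledged.
 * hw_retries is the number of times the hardware re-sent the A-MPDU
 * that carried it. Returns true if the subframe should be retried in
 * software, false if it should be dropped.
 */
static bool subframe_should_retry(struct subframe *sf, unsigned int hw_retries)
{
    /* One software attempt plus whatever the hardware already spent. */
    sf->retries += 1 + hw_retries;
    return sf->retries < ATH_MAX_SW_RETRIES;
}
```

With no hardware retries a subframe still gets nine software retries before it is dropped; with four hardware retries per attempt it gets only one.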

On 2011-08-01 3:02 AM, Andrew McGregor wrote:

So is this a more appropriate way to clear it out:

diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index 2a40fa2..47e0b99 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -396,7 +396,7 @@ struct ath_led {
 #define DEFAULT_CACHELINE 32
 #define ATH_REGCLASSIDS_MAX 10
 #define ATH_CABQ_READY_TIME 80 /* % of beacon interval */
-#define ATH_MAX_SW_RETRIES 10
+#define ATH_MAX_SW_RETRIES 2
 #define ATH_CHAN_MAX 255
 #define IEEE80211_WEP_NKID 4 /* number of key ids */

IOW, simply don't try so many SW retries? This should end up BARing it forward pretty quickly.

I think this is a pretty important tuning variable… and 10 looks like a pretty silly value, 'cause the frame already had a reasonable number of TX opportunities.

Andrew

On 29/07/2011, at 8:22 AM, Felix Fietkau wrote:

The receive window typically isn't small, it's often as big as 64 frames. The only way to prevent that from messing up the receiver state is to send a BAR for failed frames, and that means pushing even more frames to the tx queue, thus potentially making the problem even worse.

I have the following ideas for attacking the root causes of bloated queues with aggregation:

1. Delay the sequence number assignment to the point where it actually transmits a frame for the first time. That allows the driver to pick any frame that hasn't been transmitted yet for dropping when the queue gets full.
2. Keep track of the size of the sequence number gap of pending frames. That allows the driver to make a better decision of when to drop a frame and send a BAR or when to retry some more.
3. Allow the driver to either try different rates for retransmissions, or to request another rate control lookup. The code for that should probably be pushed as much as possible into mac80211 and the rate control module.

- Felix
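Idea 2 in the list above depends on measuring the sequence number gap correctly across wraparound. A minimal sketch, assuming the standard 12-bit 802.11 sequence space; `seq_gap()`, `should_drop_and_bar()` and the `gap_limit` parameter are hypothetical names for illustration, not existing driver functions:

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ_MODULO 4096u   /* 802.11 sequence numbers are 12 bits wide */
#define BAW_SIZE   64u     /* receive window size mentioned in this thread */

/* Distance from the oldest pending frame to the newest one sent. */
static unsigned int seq_gap(uint16_t oldest_pending, uint16_t newest_sent)
{
    /* Masking keeps the result correct when the counter wraps at 4096. */
    return (newest_sent - oldest_pending) & (SEQ_MODULO - 1);
}

/*
 * Hypothetical policy: once the gap reaches gap_limit, stop retrying
 * the oldest frame, drop it, and send a BAR so the receiver can move on.
 */
static bool should_drop_and_bar(uint16_t oldest_pending, uint16_t newest_sent,
                                unsigned int gap_limit)
{
    return seq_gap(oldest_pending, newest_sent) >= gap_limit;
}
```

A caller might pass something like `BAW_SIZE / 2` as `gap_limit`; the right value is a tuning question this sketch does not answer.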

On 2011-07-29 2:10 PM, Andrew McGregor wrote:

I may be able to put a few cycles in to this shortly… there's a tradeoff here, and continually recycling is definitely the wrong thing. I realise that dropping here is the wrong thing, but it's less wrong than retrying forever. With a fairly small receive window, it won't be that bad.

Andrew

On 29/07/2011, at 7:55 AM, Felix Fietkau wrote:

If it's doing infinite retry, then that's a bug in the per-subframe retry count tracking and should be fixed properly.

About the 1.6 seconds: Yes, it would be nice if ath9k was able to drop packets more easily, but that requires some more surgery in the xmit path, which I'm going to take care of eventually.

In the mean time I'm not going to apply any kludges that simply trade one problem for another (and have the potential to create nasty side effects).

- Felix

On 2011-07-29 1:39 PM, Dave Taht wrote:

1 retrans good?, infinite very bad.

I was seeing 1.6 SECOND pings, man... going 40 meters....

Got a better suggestion?

http://www.bufferbloat.net/issues/216

Andrew also sent along another patch...

On Fri, Jul 29, 2011 at 2:03 AM, Felix Fietkau wrote:

No retrans not good. No retrans make gaps in client side reorder window. Gaps in client buffers bad, make traffic stall and latencies.

On 2011-07-29 2:21 AM, Dave Taht wrote:

infinite retrans bad. No retrans good. I LOVE IETF.

---------- Forwarded message ----------
From: Andrew McGregor
Date: Thu, Jul 28, 2011 at 5:37 PM
Subject: Patch for stuck retransmits

Here's the patch, against linux-2.6.39.3/drivers/net/wireless/ath/ath9k/xmit.c

Updated by David Taht about 3 years ago

OK, so you are suggesting that the "gell-man version" for testing contains
no header file hack-jobs?

My tests basically include 3-9 routers, routing wirelessly and wired through
each other in various combinations, using netperf as the principal test
driver, also simultaneously, while simultaneously fpinging the routers.
There are numerous other tests in the suite, and I can develop more, given
some suggestions?

http://www.bufferbloat.net/projects/cerowrt/wiki/Testlab

Some of the tests I ran last time (while fighting the wired driver) are
documented here.

http://www.bufferbloat.net/projects/cerowrt/wiki/Experiment_-_QoS

Updated by Dave Täht about 3 years ago

David Taht wrote:

I am adding this discussion to the bug reporting system via cc'ing the bot and changing the subject line to include the appropriate number.

My principal concern with exorbitant buffering and retries is that no signalling is sent back to the tcp sender to indicate that it should maybe slow down.

My reasoning for choosing 3 packets as a decent outer limit for an AMPDU is that if one packet gets through, the rest can be dropped safely, which will - yes, I totally understand - result in less than maximum wireless-n performance - but give reasonably bounded latencies for all packets AND provide end-to-end signaling that makes sense.

I don't care about maximizing wireless-n performance in laboratory conditions, what I care about is having decent mixed g and n performance in fair to bad conditions, with contention from multiple APs and clients.

Felix had produced one patch early on that I thought was promising under that scenario, treating g and n differently, that I'm trying to find.

I would like to move this discussion to the bug report if at all possible.

I certainly have some ideas towards increasing the maximum AMPDU to larger sizes while retaining sanity on the TCP/IP side, notably by having it be IP/port IP/port tuple sensitive, but that's something longer term than merely what we are discussing now.

On Mon, Aug 1, 2011 at 2:42 AM, Felix Fietkau wrote:

I have a better idea: Instead of limiting this value to 2, why not leave it at 10, but add the number of hardware retries to the per-subframe retry counter. That way subframes that get lost during the transmission of A-MPDUs with few or no hardware retransmissions are not unnecessarily penalized, but if A-MPDUs get retransmitted often, subframes will get kicked out quickly as well.

- Felix

On 2011-08-01 3:02 AM, Andrew McGregor wrote:

So is this a more appropriate way to clear it out:

diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index 2a40fa2..47e0b99 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -396,7 +396,7 @@ struct ath_led {
 #define DEFAULT_CACHELINE 32
 #define ATH_REGCLASSIDS_MAX 10
 #define ATH_CABQ_READY_TIME 80 /* % of beacon interval */
-#define ATH_MAX_SW_RETRIES 10
+#define ATH_MAX_SW_RETRIES 2
 #define ATH_CHAN_MAX 255
 #define IEEE80211_WEP_NKID 4 /* number of key ids */

IOW, simply don't try so many SW retries? This should end up BARing it forward pretty quickly.

I think this is a pretty important tuning variable… and 10 looks like a pretty silly value, 'cause the frame already had a reasonable number of TX opportunities.

Andrew

On 29/07/2011, at 8:22 AM, Felix Fietkau wrote:

The receive window typically isn't small, it's often as big as 64 frames. The only way to prevent that from messing up the receiver state is to send a BAR for failed frames, and that means pushing even more frames to the tx queue, thus potentially making the problem even worse.

I have the following ideas for attacking the root causes of bloated queues with aggregation:

1. Delay the sequence number assignment to the point where it actually transmits a frame for the first time. That allows the driver to pick any frame that hasn't been transmitted yet for dropping when the queue gets full.
2. Keep track of the size of the sequence number gap of pending frames. That allows the driver to make a better decision of when to drop a frame and send a BAR or when to retry some more.
3. Allow the driver to either try different rates for retransmissions, or to request another rate control lookup. The code for that should probably be pushed as much as possible into mac80211 and the rate control module.

- Felix

On 2011-07-29 2:10 PM, Andrew McGregor wrote:

I may be able to put a few cycles in to this shortly… there's a tradeoff here, and continually recycling is definitely the wrong thing. I realise that dropping here is the wrong thing, but it's less wrong than retrying forever. With a fairly small receive window, it won't be that bad.

Andrew

On 29/07/2011, at 7:55 AM, Felix Fietkau wrote:

If it's doing infinite retry, then that's a bug in the per-subframe retry count tracking and should be fixed properly.

About the 1.6 seconds: Yes, it would be nice if ath9k was able to drop packets more easily, but that requires some more surgery in the xmit path, which I'm going to take care of eventually.

In the mean time I'm not going to apply any kludges that simply trade one problem for another (and have the potential to create nasty side effects).

- Felix

On 2011-07-29 1:39 PM, Dave Taht wrote:

1 retrans good?, infinite very bad.

I was seeing 1.6 SECOND pings, man... going 40 meters....

Got a better suggestion?

http://www.bufferbloat.net/issues/216

Andrew also sent along another patch...

On Fri, Jul 29, 2011 at 2:03 AM, Felix Fietkau wrote:

No retrans not good. No retrans make gaps in client side reorder window. Gaps in client buffers bad, make traffic stall and latencies.

On 2011-07-29 2:21 AM, Dave Taht wrote:

infinite retrans bad. No retrans good. I LOVE IETF.

---------- Forwarded message ----------
From: Andrew McGregor
Date: Thu, Jul 28, 2011 at 5:37 PM
Subject: Patch for stuck retransmits

Here's the patch, against linux-2.6.39.3/drivers/net/wireless/ath/ath9k/xmit.c

Updated by Dave Täht about 3 years ago

Sorry for the noise on the previous comment, all I'd meant to do was change the line to be more clear about what I meant about tuples. This is even more clear:

"I certainly have some ideas towards increasing the maximum AMPDU to larger amounts (than 3) while retaining sanity on the TCP/IP side, notably by having it be IP/port IP/port tuple sensitive. That would get single-station performance back up to levels where up to 64 packets could be aggregated, but that's something longer term than merely what we are discussing now, and requires far more knowledge of the raw packets than the driver should know about."

A smarter qdisc could do a somewhat saner form of fair queueing and get more packets into aggregation than we currently see with the existing wireless qdiscs, but it needs to be more of a bi-directional relationship between that layer and this.
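To make "tuple sensitive" concrete, here is a rough sketch of mapping a flow's IP/port tuple to a queue index, which is the first step of any per-flow fair queueing. Everything in it is an assumption for illustration: `struct flow_tuple`, `flow_queue()` and the choice of an FNV-1a hash are not taken from any existing qdisc or from the driver.

```c
#include <stdint.h>

/* Hypothetical flow key: the IP/port IP/port tuple described above. */
struct flow_tuple {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/* Fold the low nbytes of v into a running 32-bit FNV-1a hash. */
static uint32_t fnv1a_step(uint32_t h, uint32_t v, int nbytes)
{
    for (int i = 0; i < nbytes; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;             /* FNV-1a 32-bit prime */
    }
    return h;
}

/* Pick one of nqueues queues; packets of the same flow share a queue. */
static unsigned int flow_queue(const struct flow_tuple *t, unsigned int nqueues)
{
    uint32_t h = 2166136261u;       /* FNV-1a 32-bit offset basis */

    h = fnv1a_step(h, t->src_ip, 4);
    h = fnv1a_step(h, t->dst_ip, 4);
    h = fnv1a_step(h, t->src_port, 2);
    h = fnv1a_step(h, t->dst_port, 2);
    return h % nqueues;
}
```

Fields are hashed one at a time rather than hashing the struct's bytes, so padding never affects the result. Different flows can still collide on a queue; a real scheduler would have to tolerate that.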

I've also pretty much come to the conclusion that what we've been calling "Ants" - system control packets that aren't TCP mice or elephants - need to go in the wireless VI queue and out of BE, entirely.

All that said, this is somewhat separate from the ongoing thread.

Updated by John Linville about 3 years ago

This bug report is a bit hard to read...

Is there an infinite retry? Or just a long one? Under which circumstances does it occur?

What does the patch at the top do? What problem is it addressing?

The patch is for ath9k, but the discussion seems to be about Minstrel. Where does the problem reside?

Updated by Dave Täht about 3 years ago

On Mon, Aug 1, 2011 at 8:56 AM, Andrew McGregor wrote:

Or, even better again… do add the number of hardware retries
to the per-subframe counter, but make the limit be the same
as the Minstrel work limit for the rate, so as to be completely
fair to the frame in terms of TU of transmitter effort?
That seems most like the right thing.

On 1/08/2011, at 4:42 AM, Felix Fietkau wrote:

I have a better idea: Instead of limiting this value to 2,
why not leave it at 10, but add the number of hardware retries
to the per-subframe retry counter. That way subframes that get
lost during the transmission of A-MPDUs with few or no hardware
retransmissions are not unnecessarily penalized, but if A-MPDUs
get retransmitted often, subframes will get kicked out quickly as well.
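Andrew's variant replaces the fixed retry count with a budget of transmitter effort. A minimal sketch of that accounting, assuming the caller can supply the time one attempt takes at the chosen rate and a budget in the same unit (for example the Minstrel segment size); `struct frame_effort` and `effort_allows_retry()` are hypothetical names:

```c
#include <stdbool.h>

/* Hypothetical per-frame accounting of transmitter effort. */
struct frame_effort {
    unsigned int spent;     /* time used so far, same unit as the budget */
};

/*
 * Charge one software attempt plus hw_retries hardware attempts, each
 * costing attempt_time, and report whether the frame may be retried.
 * budget would come from rate control; here it is simply a parameter.
 */
static bool effort_allows_retry(struct frame_effort *fe,
                                unsigned int attempt_time,
                                unsigned int hw_retries,
                                unsigned int budget)
{
    fe->spent += attempt_time * (1 + hw_retries);
    return fe->spent < budget;
}
```

With a budget of 6000 and 250 per attempt, a frame gets 23 retries; at 2000 per attempt (a slow rate) it gets 2. Slow rates use up the budget in fewer attempts, which is the fairness property being asked for.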

--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://the-edge.blogspot.com

Updated by Dave Täht about 3 years ago

@ John

John Linville wrote:

This bug report is a bit hard to read...

Is there an infinite retry? Or just a long one? Under which circumstances does it occur?

If I move to purposely bad conditions - 40 meters or so away from the AP, I can get some pings through, but they take 1.6 seconds. This is from debloat-testing to cerowrt rc1.

What does the patch at the top do? What problem is it addressing?

An infinite loop in the minstrel-ht code.

The patch is for ath9k, but the discussion seems to be about Minstrel. Where does the problem reside?

Everywhere.

Updated by Dave Täht about 3 years ago

Andrew - still not registered on the wiki, writes in:

"Unfortunately, you can't have less than 64 plus a few packets living inside a driver buffer in the worst-case situation with aggregation, not without somehow clamping the window, and I can't see a way to do that within the MAC protocol right now. Remember that there can be quite a few 802.11 control frames living in those buffers too, not just data frames, and running out of buffers for control frames is very, very bad news.

I think it is possible to get pretty much optimal range, throughput and latency in the same driver at the same time, but it will take some tuning to get there.

Lab conditions, that being two devices just outside front-end saturation range on a workbench in an all-wooden building, are irrelevant. We're all talking about things that happen in the real world, it's just that for repeatability's sake you need to have a fairly stable test environment. Bad links are the rule, not the exception, in the wifi world, and robustness is a very important goal; I think it's worth sacrificing just a little latency for a degree of robustness, because broken links and TCP stalls are just as annoying as the latency getting out of hand."

Updated by David Taht about 3 years ago

This is why in 1998, based on the mosquitonet research, I decided that split
tcp approaches were best for wireless designs, and have used a web proxy
ever since. I am very mad at myself for not publishing and fully describing
the need for such back then, and it wasn't until I worked on this patent
case, and read over a huge amount of old emails, that I realized how
important it really was.

http://the-edge.blogspot.com/2010/10/who-invented-embedded-linux-based.html

I didn't realize nobody else was using web proxies anymore until I got out
into the 'real world' a few years ago.

All that said, it's water under the bridge now: but I hope not to have to
bear the same pain Vint Cerf does about choosing only 32 bits for ipv4, for
anywhere near as long.

And all that said, yes, some compromise between all these variables is
feasible using the sort of monte carlo techniques in minstrel, that I
admire.

but I'd sure like it tunable, so that other approaches (using the polipo web
proxy in cerowrt's case with westwood+ on the inside, AND ECN) can be
effectively tried.

If you want to read some good, yet old papers on mosquitonet and some of the
work on aggregation that was done at the university of colorado, and some
more recent work by comcast, I'll try and dig those up.

On Mon, Aug 1, 2011 at 9:35 AM, Andrew McGregor wrote:

Unfortunately, you can't have less than 64 plus a few packets living inside a driver buffer in the worst-case situation with aggregation, not without somehow clamping the window, and I can't see a way to do that within the MAC protocol right now. Remember that there can be quite a few 802.11 control frames living in those buffers too, not just data frames, and running out of buffers for control frames is very, very bad news.

I think it is possible to get pretty much optimal range, throughput and latency in the same driver at the same time, but it will take some tuning to get there.

Lab conditions, that being two devices just outside front-end saturation range on a workbench in an all-wooden building, are irrelevant. We're all talking about things that happen in the real world, it's just that for repeatability's sake you need to have a fairly stable test environment. Bad links are the rule, not the exception, in the wifi world, and robustness is a very important goal; I think it's worth sacrificing just a little latency for a degree of robustness, because broken links and TCP stalls are just as annoying as the latency getting out of hand.

On 1/08/2011, at 8:47 AM, Dave Taht wrote:

Excellent. So it sounds like your patch, along with andrew's latency reducing suggestions, and increasing ATH_RXBUF and ATH_TXBUF to 160 or so, would be the strongest candidate in the bakeoff?

I am still quite allergic to having more than 32 packets living inside a driver buffer... but am willing to do extensive testing of all approaches that make sense at this point.

Unfortunately I now find myself blocked by bug #202. I can certainly bring debloat-testing back in, and many other types of clients, as well as maybe (?) patch up the SR-71 driver I have for cardbus, but really want to just do cero to cero testing to eliminate all other variables.

On Mon, Aug 1, 2011 at 6:29 AM, Felix Fietkau wrote:

Last time I tried limiting the aggregation size that much, it ended up reducing the throughput from 80 mbit/s down to just over 20 with strong fluctuations.

I realize that latency is pretty much all you care about, but I'd prefer if the focus was directed more towards reducing bufferbloat without causing nasty throughput regressions on links with good conditions. By the way, those good conditions that I'm talking about are not just 'laboratory conditions'. We've set up quite a few medium to distance links that perform well with little retransmission and good aggregation levels.

You can find a patch that implements my suggestion for handling swretry here: http://nbd.name/561-ath9k_sw_retry_reduce.patch In my tests it seems to work well without hurting throughput under reasonably good conditions, but it needs some more testing.

- Felix

On 2011-08-01 1:41 PM, Dave Taht wrote:

I am adding this discussion to the bug reporting system via cc'ing the bot and changing the subject line to include the appropriate number.

My principal concern with exorbitant buffering and retries is that no signalling is sent back to the tcp sender to indicate that it should maybe slow down.

My reasoning for choosing 3 packets as a decent outer limit for an AMPDU is that if one packet gets through, the rest can be dropped safely, which will - yes, I totally understand - result in less than maximum wireless-n performance - but give reasonably bounded latencies for all packets AND provide end-to-end signaling that makes sense.

I don't care about maximizing wireless-n performance in laboratory conditions, what I care about is having decent mixed g and n performance in fair to bad conditions, with contention from multiple APs and clients.

Felix had produced one patch early on that I thought was promising under that scenario, treating g and n differently, that I'm trying to find.

I would like to move this discussion to the bug report if at all possible.

I certainly have some ideas towards increasing the maximum AMPDU to larger sizes while retaining sanity on the TCP/IP side, notably by having it be tuple sensitive, but that's something longer term than merely what we are discussing now.

On Mon, Aug 1, 2011 at 2:42 AM, Felix Fietkau wrote:

I have a better idea: Instead of limiting this value to 2, why not leave it at 10, but add the number of hardware retries to the per-subframe retry counter. That way subframes that get lost during the transmission of A-MPDUs with few or no hardware retransmissions are not unnecessarily penalized, but if A-MPDUs get retransmitted often, subframes will get kicked out quickly as well.

- Felix

On 2011-08-01 3:02 AM, Andrew McGregor wrote:

So is this a more appropriate way to clear it out:

diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index 2a40fa2..47e0b99 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -396,7 +396,7 @@ struct ath_led {
 #define DEFAULT_CACHELINE 32
 #define ATH_REGCLASSIDS_MAX 10
 #define ATH_CABQ_READY_TIME 80 /* % of beacon interval */
-#define ATH_MAX_SW_RETRIES 10
+#define ATH_MAX_SW_RETRIES 2
 #define ATH_CHAN_MAX 255
 #define IEEE80211_WEP_NKID 4 /* number of key ids */

IOW, simply don't try so many SW retries? This should end up BARing it forward pretty quickly.

I think this is a pretty important tuning variable… and 10 looks like a pretty silly value, 'cause the frame already had a reasonable number of TX opportunities.

Andrew

On 29/07/2011, at 8:22 AM, Felix Fietkau wrote:

The receive window typically isn't small, it's often as big as 64 frames. The only way to prevent that from messing up the receiver state is to send a BAR for failed frames, and that means pushing even more frames to the tx queue, thus potentially making the problem even worse.

I have the following ideas for attacking the root causes of bloated queues with aggregation:

1. Delay the sequence number assignment to the point where it actually transmits a frame for the first time. That allows the driver to pick any frame that hasn't been transmitted yet for dropping when the queue gets full.
2. Keep track of the size of the sequence number gap of pending frames. That allows the driver to make a better decision of when to drop a frame and send a BAR or when to retry some more.
3. Allow the driver to either try different rates for retransmissions, or to request another rate control lookup. The code for that should probably be pushed as much as possible into mac80211 and the rate control module.

- Felix

On 2011-07-29 2:10 PM, Andrew McGregor wrote:

I may be able to put a few cycles in to this shortly… there's a tradeoff here, and continually recycling is definitely the wrong thing. I realise that dropping here is the wrong thing, but it's less wrong than retrying forever. With a fairly small receive window, it won't be that bad.

Andrew

On 29/07/2011, at 7:55 AM, Felix Fietkau wrote:

If it's doing infinite retry, then that's a bug in the per-subframe retry count tracking and should be fixed properly.

About the 1.6 seconds: Yes, it would be nice if ath9k was able to drop packets more easily, but that requires some more surgery in the xmit path, which I'm going to take care of eventually.

In the mean time I'm not going to apply any kludges that simply trade one problem for another (and have the potential to create nasty side effects).

- Felix

On 2011-07-29 1:39 PM, Dave Taht wrote:

1 retrans good?, infinite very bad.

I was seeing 1.6 SECOND pings, man... going 40 meters....

Got a better suggestion?

http://www.bufferbloat.net/issues/216

Andrew also sent along another patch...

On Fri, Jul 29, 2011 at 2:03 AM, Felix Fietkau wrote:

No retrans not good. No retrans make gaps in client side reorder window. Gaps in client buffers bad, make traffic stall and latencies.

On 2011-07-29 2:21 AM, Dave Taht wrote:

infinite retrans bad. No retrans good. I LOVE IETF.

---------- Forwarded message ----------
From: Andrew McGregor
Date: Thu, Jul 28, 2011 at 5:37 PM
Subject: Patch for stuck retransmits

Here's the patch, against linux-2.6.39.3/drivers/net/wireless/ath/ath9k/xmit.c

Updated by Felix Fietkau about 3 years ago

I want the algorithm to know whether it's making a decision for an
aggregate or a single, and possibly also how big the aggregate is.

Also, the rate control could tune for more reliable rates and smaller
aggregates if the sequence number gap is bigger, to prevent more
unnecessary receiver side latency.

- Felix
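One way to picture the information being asked for above is a query structure that tells rate control whether the decision is for an aggregate, how big it is, and how large the sequence number gap is. The sketch below is purely hypothetical: `struct rate_query`, `struct rate_hint`, `rate_hint_for()` and both constants are invented for illustration and are not part of the mac80211 rate control API.

```c
#include <stdbool.h>

#define FULL_AGGR_LEN 64u   /* matches the 64-frame window discussed here */
#define GAP_THRESHOLD 32u   /* hypothetical point at which to back off */

/* What the driver would tell rate control (hypothetical). */
struct rate_query {
    bool is_aggregate;      /* A-MPDU or single frame */
    unsigned int aggr_len;  /* number of subframes requested */
    unsigned int seq_gap;   /* pending sequence number gap */
};

/* What rate control would answer (hypothetical). */
struct rate_hint {
    bool prefer_reliable;       /* favour robust rates over fast ones */
    unsigned int max_aggr_len;  /* cap on subframes per aggregate */
};

static struct rate_hint rate_hint_for(const struct rate_query *q)
{
    struct rate_hint h = { false, 1 };

    if (!q->is_aggregate)
        return h;               /* single frame: nothing to shrink */

    h.max_aggr_len = q->aggr_len < FULL_AGGR_LEN ? q->aggr_len : FULL_AGGR_LEN;
    if (q->seq_gap >= GAP_THRESHOLD) {
        /* Large gap: ask for reliable rates and halve the aggregate. */
        h.prefer_reliable = true;
        h.max_aggr_len = h.max_aggr_len > 1 ? h.max_aggr_len / 2 : 1;
    }
    return h;
}
```

The halving rule and the threshold of 32 are placeholders; the point is only that the decision needs these three inputs, which the rate control lookup does not receive today according to the discussion above.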

On 2011-08-01 5:38 PM, Andrew McGregor wrote:

Why not just ask rate control for a new decision when you make the aggregate? None of the rate algorithms are stateful based on tx rate queries from the driver, so that's a perfectly safe and sensible thing to do.

Or, are you thinking that the rate decision API needs to know that it is being asked for a rate set for an aggregate?

On 1/08/2011, at 9:06 AM, Felix Fietkau wrote:

I've also been thinking about changing the rate control parameters but I think we need to take a different approach here. The main issue is that A-MPDU rate control needs to be handled completely differently from single-frame rate control. Unfortunately at the point in time where rate control runs, the decision about A-MPDU transmission vs single transmission has not been made yet, so I will probably have to make some rate control API changes to give the driver a chance to interact with the RC directly.

Another issue is that test results for g/n interop from all our previous attempts at limiting queue size are pretty much meaningless.

Dropping packets based on internal per-tid queue counters currently is too bursty for TCP to adapt properly. The debloat-testing eBDP code completely ignores the inner workings of ath9k's queueing, so it cannot properly distinguish between aggregated and unaggregated traffic, which need completely different queueing characteristics.

What we need to be able to produce meaningful test results is proper queue management on the internal ath9k per-TID queues (plus the non-aggregated tx queue for legacy or VI traffic).

This will definitely take some more time to develop, but I think without that we should not jump to any conclusions based on results from random header file hack jobs.

- Felix

On 2011-08-01 2:47 PM, Dave Taht wrote:

Excellent. So it sounds like your patch, along with andrew's latency reducing suggestions, and increasing ATH_RXBUF and ATH_TXBUF to 160 or so, would be the strongest candidate in the bakeoff?

I am still quite allergic to having more than 32 packets living inside a driver buffer... but am willing to do extensive testing of all approaches that make sense at this point.

Unfortunately I now find myself blocked by bug #202. I can certainly bring debloat-testing back in, and many other types of clients, as well as maybe (?) patch up the SR-71 driver I have for cardbus, but really want to just do cero to cero testing to eliminate all other variables.

On Mon, Aug 1, 2011 at 6:29 AM, Felix Fietkau wrote:

Last time I tried limiting the aggregation size that much, it ended up reducing the throughput from 80 mbit/s down to just over 20 with strong fluctuations.

I realize that latency is pretty much all you care about, but I'd prefer if the focus was directed more towards reducing bufferbloat without causing nasty throughput regressions on links with good conditions. By the way, those good conditions that I'm talking about are not just 'laboratory conditions'. We've set up quite a few medium to distance links that perform well with little retransmission and good aggregation levels.

You can find a patch that implements my suggestion for handling swretry here: http://nbd.name/561-ath9k_sw_retry_reduce.patch In my tests it seems to work well without hurting throughput under reasonably good conditions, but it needs some more testing.

- Felix

On 2011-08-01 1:41 PM, Dave Taht wrote:

I am adding this discussion to the bug reporting system via cc'ing the bot and changing the subject line to include the appropriate number.

My principal concern with exorbitant buffering and retries is that no signalling is sent back to the tcp sender to indicate that it should maybe slow down.

My reasoning for choosing 3 packets as a decent outer limit for an AMPDU is that if one packet gets through, the rest can be dropped safely, which will - yes, I totally understand - result in less than maximum wireless-n performance - but give reasonably bounded latencies for all packets AND provide end-to-end signaling that makes sense.

I don't care about maximizing wireless-n performance in laboratory conditions, what I care about is having decent mixed g and n performance in fair to bad conditions, with contention from multiple APs and clients.

Felix had produced one patch early on that I thought was promising under that scenario, treating g and n differently, that I'm trying to find.

I would like to move this discussion to the bug report if at all possible.

I certainly have some ideas towards increasing the maximum AMPDU to larger sizes while retaining sanity on the TCP/IP side, notably by having it be tuple sensitive, but that's something longer term than merely what we are discussing now.

On Mon, Aug 1, 2011 at 2:42 AM, Felix Fietkau wrote:

I have a better idea: Instead of limiting this value to 2, why not leave it at 10, but add the number of hardware retries to the per-subframe retry counter. That way subframes that get lost during the transmission of A-MPDUs with few or no hardware retransmissions are not unnecessarily penalized, but if A-MPDUs get retransmitted often, subframes will get kicked out quickly as well.

- Felix
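
A rough sketch of the accounting Felix describes above, under stated assumptions: each failed transmission of a subframe costs one software retry plus however many hardware retries the aggregate needed, and the subframe is dropped once its counter exceeds the limit. The function name and the exact "1 + hardware retries" formula are illustrative only and are not taken from the ath9k patch.

```python
ATH_MAX_SW_RETRIES = 10  # the limit stays at 10, as suggested in the thread


def failures_until_drop(hw_retries_per_attempt, max_retries=ATH_MAX_SW_RETRIES):
    """Count how many failed transmissions a subframe survives before it is dropped.

    Assumption: each failure adds 1 (the software retry) plus the number of
    hardware retries that the carrying A-MPDU needed.
    """
    counter = 0
    failures = 0
    while counter <= max_retries:
        failures += 1
        counter += 1 + hw_retries_per_attempt
    return failures


# Clean channel, no hardware retries: the subframe gets the full budget.
print(failures_until_drop(0))   # 11
# Lossy channel, 4 hardware retries per attempt: kicked out quickly.
print(failures_until_drop(4))   # 3
```

With the limit left at 10, a subframe on a clean link keeps its full retry budget, while one that rides on repeatedly retransmitted aggregates is dropped after a few attempts.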

On 2011-08-01 3:02 AM, Andrew McGregor wrote:

So is this a more appropriate way to clear it out:

diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index 2a40fa2..47e0b99 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -396,7 +396,7 @@ struct ath_led {
 #define DEFAULT_CACHELINE 32
 #define ATH_REGCLASSIDS_MAX 10
 #define ATH_CABQ_READY_TIME 80 /* % of beacon interval */
-#define ATH_MAX_SW_RETRIES 10
+#define ATH_MAX_SW_RETRIES 2
 #define ATH_CHAN_MAX 255
 #define IEEE80211_WEP_NKID 4 /* number of key ids */

IOW, simply don't try so many SW retries? This should end up BARing it forward pretty quickly.

I think this is a pretty important tuning variable… and 10 looks like a pretty silly value, 'cause the frame already had a reasonable number of TX opportunities.

Andrew

On 29/07/2011, at 8:22 AM, Felix Fietkau wrote:

The receive window typically isn't small, it's often as big as 64 frames. The only way to prevent that from messing up the receiver state is to send a BAR for failed frames, and that means pushing even more frames to the tx queue, thus potentially making the problem even worse.

I have the following ideas for attacking the root causes of bloated queues with aggregation:

1. Delay the sequence number assignment to the point where it actually transmits a frame for the first time. That allows the driver to pick any frame that hasn't been transmitted yet for dropping when the queue gets full.
2. Keep track of the size of the sequence number gap of pending frames. That allows the driver to make a better decision of when to drop a frame and send a BAR or when to retry some more.
3. Allow the driver to either try different rates for retransmissions, or to request another rate control lookup. The code for that should probably be pushed as much as possible into mac80211 and the rate control module.

- Felix
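
The first idea in the list above can be illustrated with a toy queue. This is only a sketch: the `TxQueue` class, its method names, and the choice to drop the oldest untransmitted frame are assumptions made for illustration, not how ath9k or mac80211 implement it. The point it shows is that a frame dropped before it has a sequence number leaves no gap in the receiver's reorder window.

```python
from collections import deque


class TxQueue:
    """Toy transmit queue that assigns sequence numbers at first transmission."""

    SEQ_MODULO = 4096  # 802.11 sequence numbers are 12 bits wide

    def __init__(self, limit):
        self.limit = limit
        self.pending = deque()  # frames never transmitted, so no seqno yet
        self.next_seq = 0

    def enqueue(self, frame):
        """Queue a frame; if full, drop the oldest untransmitted one (assumed policy)."""
        dropped = None
        if len(self.pending) >= self.limit:
            dropped = self.pending.popleft()
        self.pending.append(frame)
        return dropped

    def transmit(self):
        """Pop the next frame and only now give it a sequence number."""
        frame = self.pending.popleft()
        seq = self.next_seq
        self.next_seq = (self.next_seq + 1) % self.SEQ_MODULO
        return seq, frame


q = TxQueue(limit=2)
q.enqueue("a")
q.enqueue("b")
print(q.enqueue("c"))  # a  (dropped while still unnumbered)
print(q.transmit())    # (0, 'b')
print(q.transmit())    # (1, 'c')
```

Because "a" never received a sequence number, the frames that do go out are numbered 0 and 1 with no hole, so no BAR is needed for the drop.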

On 2011-07-29 2:10 PM, Andrew McGregor wrote:

I may be able to put a few cycles in to this shortly… there's a tradeoff here, and continually recycling is definitely the wrong thing. I realise that dropping here is the wrong thing, but it's less wrong than retrying forever. With a fairly small receive window, it won't be that bad.

Andrew

On 29/07/2011, at 7:55 AM, Felix Fietkau wrote:

If it's doing infinite retry, then that's a bug in the per-subframe retry count tracking and should be fixed properly.

About the 1.6 seconds: Yes, it would be nice if ath9k was able to drop packets more easily, but that requires some more surgery in the xmit path, which I'm going to take care of eventually.

In the mean time I'm not going to apply any kludges that simply trade one problem for another (and have the potential to create nasty side effects).

- Felix

On 2011-07-29 1:39 PM, Dave Taht wrote:

1 retrans good?, infinite very bad.

I was seeing 1.6 SECOND pings, man... going 40 meters....

Got a better suggestion?

http://www.bufferbloat.net/issues/216

Andrew also sent along another patch...

On Fri, Jul 29, 2011 at 2:03 AM, Felix Fietkau wrote:

No retrans not good. No retrans make gaps in client side reorder window. Gaps in client buffers bad, make traffic stall and latencies.

On 2011-07-29 2:21 AM, Dave Taht wrote:

infinite retrans bad. No retrans good. I LOVE IETF.

---------- Forwarded message ----------
From: Andrew McGregor
Date: Thu, Jul 28, 2011 at 5:37 PM
Subject: Patch for stuck retransmits

Here's the patch, against

linux-2.6.39.3/drivers/net/wireless/ath/ath9k/xmit.c

Updated by Dave Täht about 3 years ago

Continually repasting andrew's threads until he bothers to register for the bufferbloat site, so it stops rejecting his emails:

He wrote, re http://www.bufferbloat.net/issues/216#note-21

"Which is also why at IndraNet, we deliberately used OpenVPN over the wireless, with Westwood on the outside, to protect the end-user's TCP from loss over multiple wireless hops. Combine that with Jana Iyengar's TCP-minion and you'd get QoS working right as well."

Updated by Dave Täht about 3 years ago

And re felix's last

Andrew wrote:

"Ah, yes, that makes a lot of sense. It is certainly possible to get Minstrel (at least) to go for highest delivery probability ahead of latency, and that probably should happen if doing software retransmit."

Updated by Jim Gettys about 3 years ago

The point of aggregation, as I understand it, is to amortize the overhead of the "on the air" framing over multiple packets.

So just because the hardware can aggregate 64 packets, doesn't mean it should do so; the biggest gain is at an aggregation of 2, and drops from there. Most of the potential benefit is done by the time 4 or 8 packets have been aggregated, and the percentage gain continues to drop.

But can we have our cake and eat it too?

It may be that the amount of aggregation should be varied as the channel itself varies: when you have a low loss channel, that's when aggregation is most possible and most desirable to get the maximum bandwidth with somewhat decently low delay. But when the air is lossy and we're having severe trouble getting wireless frames through, that's when we're likely to find shorter frames do better (as I understand it), and would be much better off controlling latency by shorter frames anyway.

Or am I all wet about this?
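
The diminishing-returns argument above can be put into numbers with a very simple model. This is a sketch under loud assumptions: a fixed per-transmission overhead and a fixed per-packet airtime, both set to a made-up 100 microseconds; real 802.11n overheads depend on rate, preamble type, and contention, and none of these figures come from the thread.

```python
def airtime_efficiency(n, payload_us=100.0, overhead_us=100.0):
    """Fraction of airtime spent on payload when n packets share one overhead.

    payload_us and overhead_us are hypothetical values chosen for illustration.
    """
    return (n * payload_us) / (overhead_us + n * payload_us)


for n in (1, 2, 4, 8, 64):
    print(n, round(airtime_efficiency(n), 3))
# 1 0.5
# 2 0.667
# 4 0.8
# 8 0.889
# 64 0.985
```

Under these assumptions, going from 1 to 2 packets is the largest single step, and going from 8 all the way to 64 adds less than ten percentage points, while the aggregate occupies the air far longer.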

Updated by Jonathan Morton about 3 years ago

Stupid question: once several IP packets have been aggregated into one frame, can individual packets still be separated in the reception success/acknowledgement/retransmit stuff? Or is the entire frame sink or swim together?

Updated by David Taht about 3 years ago

What an interesting bit of history behind all this! Did your original code
also compensate for the multicast rate on wireless being so much slower than
the peak rate?

The monstrousness of multicast packets on wireless completely messes up all
existing qdiscs' estimators...

On Mon, Aug 1, 2011 at 1:44 PM, Simon Barber wrote:

I did the original implementation of wireless queuing in mac80211. The original implementation installed a special wireless root qdisc on the interface in order to model the hardware queues correctly, and allow leaf qdiscs to be installed. Ideally the MAC queues would also have been moved into this qdisc. The wireless root qdisc sat on an interface that represented the physical radio and had an 802.11 frame type. There were one or more 802.3 format virtual interfaces that represented wireless client or APs. The wireless qdisc on the native 802.11 interface saw frames in 802.11 format, and hence could calculate how much airtime would be taken transmitting the frame - this would allow development of qdiscs to do things like share airtime rather than bandwidth. The wireless root qdisc exposed the native hardware queues that exist in 802.11 chip sets and bypassed the normal single queue netdev interface. This allowed the hardware queues to be fed independently.

Unfortunately the native 802.11 interface was deemed confusing to users, and this code got pulled out of the kernel, thus preventing the qdisc system from ever properly working when more than one virtual Ethernet interface is used with a wireless card - such as when you run multiple virtual APs on a wireless card, or simultaneous AP and client modes. This concept of a special wireless root qdisc is necessary to model the hardware queues and the MAC queues in 802.11 correctly - you're not queuing lots of frames in the driver, rather extending the queuing system to implement parts of the MAC.

Simon
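
To make the "share airtime rather than bandwidth" point concrete, here is a minimal sketch of the kind of calculation a qdisc could do once it sees the frame and the rate. The formula ignores ACKs, interframe spacing, and backoff; the 20 microsecond preamble is an assumed OFDM value; and `pick_station` is a hypothetical helper, not part of the original mac80211 code.

```python
def frame_airtime_us(frame_bytes, rate_mbps, preamble_us=20.0):
    """Rough airtime of one frame in microseconds (bits / Mbit/s = microseconds)."""
    return preamble_us + (frame_bytes * 8) / rate_mbps


def pick_station(airtime_used_us):
    """Airtime fairness: serve the station that has used the least airtime so far."""
    return min(airtime_used_us, key=airtime_used_us.get)


fast = frame_airtime_us(1500, 54)  # about 242 us
slow = frame_airtime_us(1500, 6)   # 2020 us
print(round(fast, 1), slow)
print(pick_station({"fast": fast, "slow": slow}))  # fast
```

One 1500-byte frame at 6 Mbit/s costs roughly as much air as eight at 54 Mbit/s, which is why counting packets or bytes alone lets a slow station dominate the channel.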

On 08/01/2011 06:30 AM, John W. Linville wrote:

On Mon, Aug 01, 2011 at 06:06:43AM -0600, Dave Taht wrote:

If anyone can spare some eyeballs for the discussion of wireless-n AMPDUs and their interaction with bufferbloat, and groks TCP's behavior and/or wireless's behavior, I'd appreciate some comment on:

http://www.bufferbloat.net/issues/216

Losing SOME appropriate packets, somewhere in the wireless stack, is necessary.

It seems appropriate to Cc: for this...


Updated by Dave Täht about 3 years ago

Some exchanges on what to patch in and test against next:

<nbd> dtaht: i think what might be useful is a combination of current trunk
mac80211 + my sw retries patch + capping the hw max rate tries to 3
[09:45]
<nbd> dtaht: btw. i have a simple little tcp splitter here:
http://nbd.name/gitweb.cgi?p=tcptrans.git;a=summary [09:50]
<nbd> transparent tcp proxy, uses netfilter to grab the connections and
creates new outgoing connections to the original destinations
<dtaht> excellent splitter [09:53]
<dtaht> so use rc4 with my bletcherous patches as a baseline and try the
above?
<dtaht> to compare?
<dtaht> (not the tcp split stuff, again I expect to be in pain from that
mistake for decades) - the patches you suggested? [09:54]
<nbd> i don't know what's in rc4 [09:55]
<nbd> haven't looked at it
<nbd> just copy mac80211 from latest trunk
<dtaht> nbd: heh - neat, can hack a programmable ecn into this splitter maybe
<nbd> including the commit i made today
<dtaht> rc4 has my bletcherous patch in it
<nbd> i don't know what patch that is
<dtaht> http://www.bufferbloat.net/issues/216#note-8 [09:58]
<nbd> ah, ok. that one needs to be kicked out [09:59]
<nbd> except for the hw->max_rate_tries change [10:00]
<nbd> you can keep that
<nbd> but combined with my other patch, ATH_MAX_SW_RETRIES needs to stay at 10
<nbd> because my patch changes the meaning of that limit
<dtaht> Why not just pull from head
<nbd> go for it

Updated by Dave Täht about 3 years ago

Simon wrote in on the original, much saner sounding, 802.11 stack:

Hi Dave,

You can see the wireless qdisc in the initial import of mac80211 to the kernel - checkout f0706e828e96d0fa4e80c0d25aa98523f6d589a0, and look at net/mac80211/wme.c

A quick review of 'git log net/mac80211/wme.c' reveals:

It looks like select_queue() got added at 51cb6db0f5654f08a4a6bfa3888dc36a51c2df3e, and that is where the qdisc disappeared. The (required) native Wi-Fi interface disappeared at 3b8d81e020f77c9da8b85b0685c8cd2ca7c7b150.

Updated by David Taht about 3 years ago

Any extra mojo would be good.

---------- Forwarded message ----------
From: Rick Jones
Date: Mon, Aug 1, 2011 at 12:07 PM
Subject: Re: what sort of netperf parameters can help lay bare the problems
in 216?
To: Dave Taht, Richard Jones

I'm not sure what specific netperf options would be helpful there. I suppose
the histogram of round-trip-times might be interesting along with the
distribution statistics, but netperf won't have direct visibility into what
is happening at the wireless.

rick

Updated by Simon Barber about 3 years ago

Jonathan Morton wrote:

Stupid question: once several IP packets have been aggregated into one frame, can individual packets still be separated in the reception success/acknowledgement/retransmit stuff? Or is the entire frame sink or swim together?

There are 2 types of aggregation in 11n - MSDU and MPDU. With MSDU aggregation the aggregate is acknowledged as a whole, it all sinks or swims together. Multiple MSDUs (ethernet frames from above the mac) get turned into a single aggregate MSDU, and then sent down. This does not need any low level MAC control frame changes, and something very similar was part of Atheros's 'Super G'. With MPDU aggregation each individual aggregated frame is separately acknowledged and retried (although the acknowledgements are combined into a block ack). They can both be used together - although that would be a little silly.
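
The difference in acknowledgement behaviour described above can be modelled in a few lines. This is a toy model only: the function and the per-frame `delivered` flags are invented for illustration, and a real A-MSDU has a single checksum rather than per-frame outcomes.

```python
def frames_to_resend(delivered, mode):
    """Return indexes of frames that must be sent again.

    delivered: list of booleans, one per aggregated frame (True = received intact).
    mode: "amsdu" (acknowledged as a whole) or "ampdu" (block ack per frame).
    """
    if mode == "amsdu":
        # Sink or swim together: any damage means the whole aggregate is resent.
        return [] if all(delivered) else list(range(len(delivered)))
    if mode == "ampdu":
        # The block ack reports each subframe, so only the losses are retried.
        return [i for i, ok in enumerate(delivered) if not ok]
    raise ValueError("mode must be 'amsdu' or 'ampdu'")


outcome = [True, False, True, True]
print(frames_to_resend(outcome, "amsdu"))  # [0, 1, 2, 3]
print(frames_to_resend(outcome, "ampdu"))  # [1]
```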

Updated by Andrew McGregor about 3 years ago

Jim Gettys wrote:

The point of aggregation, as I understand it, is to amortize the overhead of the "on the air" framing over multiple packets.

So just because the hardware can aggregate 64 packets, doesn't mean it should do so; the biggest gain is at an aggregation of 2, and drops from there. Most of the potential benefit is done by the time 4 or 8 packets have been aggregated, and the percentage gain continues to drop.

But can we have our cake and eat it too?

It may be that the amount of aggregation should be varied as the channel itself varies: when you have a low loss channel, that's when aggregation is most possible and most desirable to get the maximum bandwidth with somewhat decently low delay. But when the air is lossy and we're having severe trouble getting wireless frames through, that's when we're likely to find shorter frames do better (as I understand it), and would be much better off controlling latency by shorter frames anyway.

Or am I all wet about this?

I'm inclined to agree with you... aggregation has a cost as well as a benefit, and it gets less valuable as the channel gets lossy. And I think we can get a hint from Minstrel as to how much aggregation is a win. I'll have a think about the math.

Updated by Felix Fietkau about 3 years ago

Simon Barber wrote:

There are 2 types of aggregation in 11n - MSDU and MPDU. With MSDU aggregation the aggregate is acknowledged as a whole, it all sinks or swims together. Multiple MSDUs (ethernet frames from above the mac) get turned into a single aggregate MSDU, and then sent down. This does not need any low level MAC control frame changes, and something very similar was part of Atheros's 'Super G'. With MPDU aggregation each individual aggregated frame is separately acknowledged and retried (although the acknowledgements are combined into a block ack). They can both be used together - although that would be a little silly.

Actually, I don't think combining the two is silly. AMSDU can be used to combine multiple small packets (e.g. TCP ACKs) into one normal sized packet. AMPDU then aggregates that with the other packets in the queue.

Updated by Andrew McGregor about 3 years ago

So, I tried adding this patch to a stock openwrt trunk, turned UP the retransmit limit from 10 to 50 to account for the effects of the semantic change, and also a) reduced the Minstrel segment size from 6ms to 1ms, and b) reduced the AMPDU aggregation limit from 4ms to 1ms. I did this in steps to observe the effect of each change (and backed out a few other changes because they made things worse).

Net result, I see an improvement in throughput from about 70 Mbps to about 130 Mbps, and simultaneously an improvement in latency under load.

Workload is this:

netperf -t OMNI -H 192.168.1.1 -- -d send\|recv -r 256,256K &
netperf -t OMNI -H 192.168.1.1 -- -d send\|recv -r 256K,256 &
netperf -H 192.168.1.1 &
netperf -H 192.168.1.1 &
netperf -t OMNI -H 192.168.1.1 -- -d send\|recv -r 64,64 &
netperf -H 192.168.1.1 &
netperf -H 192.168.1.1 &

Test setup is a WNDR3700v2 at around one meter range talking to a 2011 MacBook Pro (running OS X Lion).

This particular OpenWRT is set up like this:
root@OpenWrt:~# tc qdisc show
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc mq 0: dev wlan0 root
qdisc mq 0: dev mon.wlan0 root
qdisc mq 0: dev wlan1 root
qdisc mq 0: dev mon.wlan1 root

I'm looking at the transactions per second returned by the 64 byte send|recv netperf. It goes up from around 45 tps to around 170 tps with those simple changes. Throughput I'm getting from the GUI on the router.
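
For readers converting between the two ways of stating the result: if the 64-byte test keeps a single request/response transaction outstanding at a time (an assumption about how the test was run), the mean transaction latency is simply the inverse of the transaction rate. A small sketch:

```python
def transaction_latency_ms(tps):
    """Mean latency per transaction, assuming one transaction in flight at a time."""
    return 1000.0 / tps


print(round(transaction_latency_ms(45), 1))   # 22.2
print(round(transaction_latency_ms(170), 1))  # 5.9
```

So the move from about 45 tps to about 170 tps corresponds to roughly 22 ms falling to just under 6 ms per transaction, in line with the 5.8 ms average quoted later in the thread.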

Then I tried reducing the txqueuelen with ip link set dev wlan1 txqueuelen x.

You know what? It NEEDS that thousand packet txqueue and the 512 txbufs in the driver… if I reduce either, I get both worse throughput and a latency INCREASE. Who knew?

On 1/08/2011, at 5:29 AM, Felix Fietkau wrote:

Last time I tried limiting the aggregation size that much, it ended up reducing the throughput from 80 mbit/s down to just over 20 with strong fluctuations.

I realize that latency is pretty much all you care about, but I'd prefer if the focus was directed more towards reducing bufferbloat without causing nasty throughput regressions on links with good conditions. By the way, those good conditions that I'm talking about are not just 'laboratory conditions'. We've set up quite a few medium to distance links that perform well with little retransmission and good aggregation levels.

You can find a patch that implements my suggestion for handling swretry here: http://nbd.name/561-ath9k_sw_retry_reduce.patch In my tests it seems to work well without hurting throughput under reasonably good conditions, but it needs some more testing.

- Felix

Updated by David Taht about 3 years ago

move to bad conditions - say 30 meters from the access point, and repeat.

Updated by Andrew McGregor about 3 years ago

David Taht wrote:

move to bad conditions - say 30 meters from the access point, and repeat.

Makes no difference to the relative merits of each change, I've been doing that all along (at about 15 meters with three walls in the way). Of course, it goes slower out in fringe coverage.

Later I might try from the other side of the street, just to see.

Updated by David Taht about 3 years ago

I also note that I've been setting up a new build of the latest and greatest
in andrew's dir on huchra.[1]

I have ripped out 98% of the 902-bufferbloat patch, pulled felixes latest
for bug #195 - and all of openwrt head actually, current packages, and
reintegrated various other cerowrt bits... and am doing a smoke test build
now.

Which has failed twice now, which I'm fixing. uclibc changes... aggh...

I find myself totally confused now about what, exactly is to be patched into
the ath9k driver.

Andrew, could you jump on huchra and patch in whatever you feel is
appropriate from yours and felix's patch serieses for the ath9k?

I do think you need to be testing the worst case scenarios - along with
multiple clients and APs - before being overly overjoyed.

I am loving what felix just pulled off on #195 tho.

Updated by Dave Täht about 3 years ago

Also, test against g. Which I can do, as soon as I get a build

Updated by David Taht about 3 years ago

And I'm building against 2.6.39.2 again rather than 3, although I swear that
patch is in cerowrt head.... it's going to be a while before I get a build
I'm happy with, so... clean patches for ath9k are highly desired to add to the
stuff I have flying in loose formation.

On Tue, Aug 2, 2011 at 6:12 PM, Dave Taht wrote:

I also note that I've been setting up a new build of the latest and greatest in andrew's dir on huchra.[1]

I have ripped out 98% of the 902-bufferbloat patch, pulled felixes latest for bug #195 - and all of openwrt head actually, current packages, and reintegrated various other cerowrt bits... and am doing a smoke test build now.

Which has failed twice now, which I'm fixing. uclibc changes... aggh...

I find myself totally confused now about what, exactly is to be patched into the ath9k driver.

Andrew, could you jump on huchra and patch in whatever you feel is appropriate from yours and felix's patch serieses for the ath9k?

I do think you need to be testing the worst case scenarios - along with multiple clients and APs - before being overly overjoyed.

I am loving what felix just pulled off on #195 tho.

Updated by David Taht about 3 years ago

---------- Forwarded message ----------
From: Andrew McGregor
Date: Tue, Aug 2, 2011 at 6:23 PM
Subject: Re: Patch for stuck retransmits [#216]
To: Dave Taht

David Taht wrote:

I also note that I've been setting up a new build of the latest and
greatest in andrew's dir on huchra.[1]

I find myself totally confused now about what, exactly is to be patched
into the ath9k driver.

Andrew, could you jump on huchra and patch in whatever you feel is appropriate from yours and felix's patch serieses for the ath9k?

I do think you need to be testing the worst case scenarios - along with multiple clients and APs - before being overly overjoyed.

I am loving what felix just pulled off on #195 tho.

I'll generate some patches on monster (big 24 core box I've been building
on) and ship them across, might be tomorrow morning before I get to that.

I note that I'm testing in a residential area in Palo Alto, there's not
exactly a shortage of other APs around here, but I only have one client to
be playing with; multi-client load testing will have to wait till I get
home, where I can beat it up with 3 computers, two iPhones, and the Airport
Express on the TV network that has three gadgets behind it... HD streaming
is a really good test for TCP stalls.

On 2/08/2011, at 5:20 PM, Dave Taht wrote:

And what is your latency under load, exactly, under what conditions,
exactly?

Averages 5.8 ms transaction latency over TCP (that netperf 64 bytes each way
measurement) sitting right on top of the AP, degrades to about 13 ms at 15 m
behind 3 walls.

Andrew

Updated by David Taht about 3 years ago

---------- Forwarded message ----------
From: Dave Taht
Date: Tue, Aug 2, 2011 at 6:28 PM
Subject: Re: Patch for stuck retransmits [#216]
To: Andrew McGregor

Why will 1000 txqueues and 512 driver buffers not suck rocks on wireless-g?
I sort of understand that infinite retry is now gone, but lacking patches,
it's hard to keep up.

On Tue, Aug 2, 2011 at 6:23 PM, Andrew McGregor wrote:

I'll generate some patches on monster (big 24 core box I've been building on) and ship them across, might be tomorrow morning before I get to that.


OK, well, I can test 3 APs all night here, with patches...

I note that I'm testing in a residential area in Palo Alto, there's not exactly a shortage of other APs around here, but I only have one client to be playing with; multi-client load testing will have to wait till I get home, where I can beat it up with 3 computers, two iPhones, and the Airport Express on the TV network that has three gadgets behind it... HD streaming is a really good test for TCP stalls.

I hope to have a 2.6.39.3 build done in about an hour. I then have to look
into whatever's wrong with #202

Updated by David Taht about 3 years ago

In looking over your tests I suspect the reason your latency under load
remains good is that running netperf on the router itself is eating so much
cpu that you can no longer hit your new outer limits.

I suggest running netperf THROUGH the router, to another box entirely.

Heisenbugs suck. See what sort of cpu usage you have during the tests as run
previously.

I note #195 is amazing. are you running with commit
8fb4b275c205bd5f9cedc052f846d66245c63df1

already?

Updated by David Taht about 3 years ago

OK: I am tossing out the rc4 tree entirely, going back to openwrt head,
reintegrating the patches that I'd meant to push back to openwrt 3 weeks ago
before being interrupted (notably iproute 2.6.39 and iptables 1.4.12), and
doing all that work in andrew's build dir.

This will likely take a while. I look forward to getting something mildly
testable tonight, and look forward to the ath9k and minstrel patches landing
tomorrow.

Amazing couple of days, folks!

Updated by Andrew McGregor about 3 years ago

I just checked, the CPU usage is about 35%, so I think that's fine.

On 2/08/2011, at 5:39 PM, Dave Taht wrote:

In looking over your tests I suspect the reason your latency under load remains good is that running netperf on the router itself is eating so much cpu that you can no longer hit your new outer limits.

I suggest running netperf THROUGH the router, to another box entirely.

Heisenbugs suck. See what sort of cpu usage you have during the tests as run previously.

I note #195 is amazing. are you running with commit 8fb4b275c205bd5f9cedc052f846d66245c63df1

already?

Updated by David Taht about 3 years ago

A build of cerowrt on top of openwrt head, with the fixes to #195 in there,
and the old bufferbloat patch ripped out, should be available in a half hour
from:

http://huchra.bufferbloat.net/~andrewm/cerowrt/

It lacks the new ath9k/minstrel stuff andrew is talking about, and the
current 'debloat' script would need to be changed to not muck with
txqueuelen on wireless... but seriously folks I think g will still be very
bad in bad conditions at txqueuelen 1000 + driver buffering 512. I retain an
open mind, however.

I also forgot, once again, to add in oprofile unaligned trap counters.

So hopefully andrew will land those patches tomorrow morning, and I'll do a
new build and test.

I'm going to stay another day in toronto to do so. I'd prefer to just get on
towards home...

I would love to know if the diffserv stuff I did here:

https://github.com/dtaht/Diffserv

still ran as slowly as it did prior to #195. The scripts are ame_dbg and
diffserv_dbg - and the easy test is to move the iperf classifier to the end
of the string of iptables rules and run iperf.

script runs on both openwrt and regular linux. I think ame is mildly more
modern than diffserv. I gave up, because the performance was so bad, and
wrote up how to make it fast, here:

http://www.bufferbloat.net/projects/bloat/wiki/Diffserv_RFC

Updated by David Taht about 3 years ago

Actually, that build just popped out. Anybody feel ambitious enough to try
it?

Or Andrew, feel ambitious enough to fold in your patches?

Updated by Dave Täht about 3 years ago

OK trying it...

Updated by Dave Täht about 3 years ago

  • Assignee changed from Dave Täht to Andrew McGregor
  • % Done changed from 0 to 50

The wired side is working out really well, and I'm connected wirelessly, writing this, doing massive amounts of tests, with the old wireless bufferbloat patch backed out completely.

Still have a low txqueuelen, will test with the vastly larger ones thus far recommended, against this base, on wireless-g.

I have found a dscp regression with iptables 1.4.12. The --dscp match cannot be inverted

Updated by Dave Täht about 3 years ago

  • Status changed from New to Closed

basically #216 covers this issue.

Updated by Dave Täht about 3 years ago

  • Status changed from Closed to In Progress

oops, closed the wrong bug.

Updated by Andrew McGregor about 3 years ago

David Taht wrote:

In looking over your tests I suspect the reason your latency under load remains good is that running netperf on the router itself is eating so much cpu that you can no longer hit your new outer limits.

I suggest running netperf THROUGH the router, to another box entirely.

Heisenbugs suck. See what sort of cpu usage you have during the tests as run previously.

CPU is around 50%, so I'm fairly sure I'm fine.

I also tried running a tickless kernel with HZ=1000, and that made a fairly noticeable difference too.

Updated by Dave Täht about 3 years ago

I've often thought that embedded MIPS was ready for tickless. There used to be a problem with HTB for example, that lacking support for highres timers, had 'interesting' granularities on the 10ms tick.

While I'm told that's fixed, and I'm very leery of switching to tickless or a more rapid clock interval at this late date, I feel it is the right thing to do long term. As for the shorter term...

what sort of 'better' results are you getting? :)

Updated by Dave Täht about 3 years ago

oh, and I ran both routers, all night, with netperf, both on udp and tcp, with a 30 second cycle across the wired interfaces.

No kernel oopses, and with tcp_low_latency and tcp cubic I saw the fastest performance I've ever got out of these things, by far, and I'm tickled pink...

Ping times stayed flat during tcp transfers, but during the udp test (through two routers) spiked as high as 33ms. I note that to complicate matters, my laptop runs an older version of debloat-testing, so I am leery of calling that a real problem - but I did also see netperf blow up at a 2 minute duration for udp_stream, but haven't analyzed it. Perhaps you can duplicate my results on your build and hardware?

Updated by Andrew McGregor about 3 years ago

Tickless was invented by SGI, and the MIPS CPU architecture was designed to run tickless, so it's nearly optimal for the architecture.

Anyway, it seems to give me about 15% on 11n transactions-per-second.

11g is sucking horribly for me at the moment, so I have a little more to do yet.

On 3/08/2011, at 8:47 AM, wrote:

Issue #216 has been updated by Dave Täht.

I've often thought that embedded was ready for tickless. There used to be a problem with HTB for example, that lacking support for highres timers, had 'interesting' granularities on the 10ms tick.

While I'm told that's fixed, and I'm very leery of switching to tickless or a more rapid clock interval at this late date, I feel it is the right thing to do long term. As for the shorter term...

what sort of 'better' results are you getting? :)

Updated by Dave Täht about 3 years ago

Andrew McGregor wrote:

Tickless was invented by SGI, and the MIPS CPU architecture was designed to run tickless, so it's nearly optimal for the architecture.

Anyway, it seems to give me about 15% on 11n transactions-per-second.

11g is sucking horribly for me at the moment, so I have a little more to do yet.

You have a penchant for understatement. g+n is sucking horribly, worldwide, on 10s of millions of devices.

Updated by Dave Täht about 3 years ago

Clients have an obligation to send as much data at a time as they possibly can.

But:

An Access Point has an obligation to be fair, both to newfangled 'n' clients and to old-fashioned 'g' ones. If it seriously compromises n performance to be fair to a g client, then so be it. It creates a compelling reason for a user to migrate to pure n. In fact, I have no problem making the 5GHz radio be pure n, and 2.4GHz being mixed g+n...

Until that Frabjous day when Wireless-g can be retired, we have to deal with it, and it sounds like the instant a g node joins an n network, the buffer sizes need to drop to something like a txqueuelen of 16 and 8 driver buffers, or some other solution has to be found.

"For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled."

You have no idea how much it pains me to recall the circumstances under which Feynman wrote that.

Updated by David Taht about 3 years ago

Felix just finished up some new BAR code, and is working on queue limiting.
He says he understands what you've done so far, but I don't.

I'd love to be sitting here, in this hotel room, in a foreign land, testing
stuff out...

... I worry especially that even if tickless works, it will
break something else. For which I can test.

Can you toss together your patches thus far? (I don't care that g sucks
right now, it sounds like Felix is on top of that?)...
Felix is in Germany and although he keeps late hours....

Updated by Andrew McGregor about 3 years ago

Just starting to roll them up.

I got g to suck much less.

Note that although I have not reduced ATH_TXBUF in these (it's still at 512), the driver will only allow 34 packets (plus 5 management frames) to be untransmitted at any one time, which is just enough to generate a couple of aggregates and keep the radio busy, and no more. That limit is ATH_MAX_QDEPTH. The other 512-34-5 buffers are there to keep in-window frames around while waiting for link-layer acknowledgements, and do not contribute to queueing delay. That's enough for 14 full-size aggregates, which might not be enough if there are a lot of clients, so I might suggest setting ATH_MAX_QDEPTH explicitly to a few more than 32 and being done with it, rather than the current (ATH_TXBUF/13 - 5), which I found empirically.
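
The buffer accounting above can be checked with a few lines of arithmetic. This is only a sketch in Python, not driver code: the constant names mirror the ones quoted in the comment, and the figure of 32 frames per full-size aggregate is an assumption used to reproduce the "14 aggregates" estimate.

```python
# Illustrative arithmetic only; not taken from the ath9k source.
ATH_TXBUF = 512                  # total transmit buffers, as stated above
MGMT_FRAMES = 5                  # management frames allowed on top of the data limit
SUBFRAMES_PER_AGGREGATE = 32     # assumption: frames in one full-size aggregate

# The empirical formula from the comment, with integer division as in C.
ATH_MAX_QDEPTH = ATH_TXBUF // 13 - 5

# Buffers left over to hold in-window frames awaiting link-layer acks.
held_for_acks = ATH_TXBUF - ATH_MAX_QDEPTH - MGMT_FRAMES
full_aggregates = held_for_acks // SUBFRAMES_PER_AGGREGATE

print("queue depth limit:", ATH_MAX_QDEPTH)
print("buffers held for acknowledgements:", held_for_acks)
print("full-size aggregates that fit:", full_aggregates)
```

Under these assumptions the formula yields the 34-packet limit, and the remaining buffers cover 14 aggregates of 32 frames each.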

Updated by Andrew McGregor about 3 years ago

Here's what I've been working on. Note that this includes a version of Felix' patch that changes the interpretation of ATH_MAX_SW_RETRIES, which explains the much higher value; in the old interpretation, this actually would turn it down to around 4. Seems to give me decent behaviour in BG, BGN and AN modes. Also note that despite ATH_TXBUF being 512, the actual number of transmit buffers allowed to be outstanding without a transmission attempt is 34, which is the lowest you can go without starting to lose performance due to packet loss (this might be sensitive to experimental setup). The remainder of the buffers are held post-transmission to allow for retransmits.

General theory: 4ms is too long an aggregate frame, so I've turned it down to 1ms. 6ms is too long to hold up the transmitter for MCS rates, so given that .11n is around 6x as fast as .11g, I've divided all the TX time allowances by 6 if we're working on N transmissions. There were about 4x as many outstanding transmit buffers as needed, so I found the minimum.
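
The scaling rule in the paragraph above can be written out as a small sketch. The names here are hypothetical and this is Python rather than the patch itself; only the numbers (4ms, 1ms, 6ms and the factor of 6) come from the comment.

```python
# Hypothetical names; only the numbers come from the comment above.
OLD_AGGREGATE_LIMIT_US = 4000   # "4ms is too long an aggregate frame"
NEW_AGGREGATE_LIMIT_US = 1000   # "...so I've turned it down to 1ms"
LEGACY_TX_TIME_US = 6000        # 6ms allowance, kept for non-11n rates
HT_SPEEDUP = 6                  # ".11n is around 6x as fast as .11g"


def tx_time_allowance_us(is_ht: bool) -> int:
    """Return the transmit time allowance in microseconds.

    11n (HT) transmissions get the legacy allowance divided by the
    assumed speed ratio; everything else keeps the legacy allowance.
    """
    if is_ht:
        return LEGACY_TX_TIME_US // HT_SPEEDUP
    return LEGACY_TX_TIME_US


print("aggregate limit:", OLD_AGGREGATE_LIMIT_US, "->", NEW_AGGREGATE_LIMIT_US, "us")
print("11g allowance:", tx_time_allowance_us(is_ht=False), "us")
print("11n allowance:", tx_time_allowance_us(is_ht=True), "us")
```

With these numbers the 11n allowance comes out at 1ms, the same as the reduced aggregate length.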

This version also seems not to be sensitive to txqueuelen on a WNDR3700, so I think I've reached the limit of what I can see without routing through it. Although, I can show that txqueuelen of less than around 40 does hurt performance (even latency, due to causing more TCP retransmissions and therefore having to send more data, as well as causing the radio to go idle as the queue empties from time to time).
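
For a rough sense of why txqueuelen matters for latency, the time to drain a full queue can be estimated from first principles. This is a back-of-the-envelope sketch: the 1500-byte packet size and the link rates are assumptions chosen for illustration, and it ignores aggregation, retries and contention.

```python
def queue_drain_ms(packets: int, rate_mbit: float, packet_bytes: int = 1500) -> float:
    """Time in milliseconds to transmit a full queue at a fixed rate.

    packets      -- number of queued packets (for example, txqueuelen)
    rate_mbit    -- effective link rate in Mbit/s (assumed constant)
    packet_bytes -- assumed size of every packet
    """
    bits = packets * packet_bytes * 8
    bits_per_ms = rate_mbit * 1000.0
    return bits / bits_per_ms


# Hypothetical effective rates, from a struggling link to a good 11n one.
for rate_mbit in (5.0, 20.0, 100.0):
    for txqueuelen in (40, 1000):
        delay = queue_drain_ms(txqueuelen, rate_mbit)
        print(f"txqueuelen {txqueuelen:4d} at {rate_mbit:5.1f} Mbit/s: {delay:7.1f} ms")
```

The delay grows linearly with queue length and inversely with rate, so a queue that is harmless on a fast link adds far more latency on a slow one.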

Updated by David Taht about 3 years ago

Andrew:

Felix pushed a whole bunch of stuff into openwrt today that conflicts with
yours. I LIKE very much reducing the 4 to 1 and 6 to 1 ratios overall, but
haven't sorted through your patch yet to figure out the right approach, and
probably am too dumb to do so.

He's just gone to bed.

He didn't like reducing the ratios as it wasn't generic enough for openwrt,
but for me, going with the best possible options for lowered latency on
cerowrt is best...

so I sat on merging yours and his stuff for the next build. Ideas?

Lastly, I enabled tickless and 1000HZ, and didn't know if you went with
server, low latency or desktop pre-emption. I went with server.

Building now. Going to dinner.

Updated by David Taht about 3 years ago

Andrew:

I will review your patch over dinner. I see stuff I like.

Updated by David Taht about 3 years ago

and the latest openwrt head failed to build...

aggh

/home/andrewm/src/cerowrt/build_dir/linux-ar71xx_generic/compat-wireless-2011-06-22/drivers/net/wireless/ath/ath9k/xmit.c: In function 'ath_tx_complete_aggr':
/home/andrewm/src/cerowrt/build_dir/linux-ar71xx_generic/compat-wireless-2011-06-22/drivers/net/wireless/ath/ath9k/xmit.c:587:24: error: 'struct ath_node' has no member named 'vif'
/home/andrewm/src/cerowrt/build_dir/linux-ar71xx_generic/compat-wireless-2011-06-22/drivers/net/wireless/ath/ath9k/xmit.c: In function 'ath_tx_aggr_stop':

Updated by Dave Täht about 3 years ago

I'm too wiped out to get openwrt head fixed, the tickless changes working, and Andrew's patch all playing together tonight. Sorry guys.

Updated by Dave Täht about 3 years ago

I'm on my way to california, arriving SFO 10AM. Andrew, if you are available for another mind meld today or tomorrow, I'd appreciate it.

Updated by Felix Fietkau about 3 years ago

Yeah, it was missing a small patch chunk. Fixed in r27891

- Felix

Updated by David Taht about 3 years ago

on my way to cali anyway

Updated by Dave Täht about 3 years ago

Alright, that build popped out with Felix's last commit. Sitting in Peet's coffee shop in Palo Alto now; think I want some real food and a saner place to set one up than here.

That said, I DO have the brain cells now (I hope) to look over Andrew's patch in relation to nbd's....

Updated by David Taht about 3 years ago

darn it. the huchra.bufferbloat.net/~andrewm build does not have a working
ethernet interface.

sigh... ripping out some stuff, trying again

Updated by David Taht about 3 years ago

it's the ethernet that is toast, the wireless is working...

Updated by Andrew McGregor about 3 years ago

Ok, this patch applies on top of Felix' last set of changes, and produces really spectacular results: about 2x better latency on 11n than before we both started on it at basically no performance cost, and fairly decent and consistent latency on 11g at a performance cost so small I can't measure it.

It may be worth fiddling a little more with the tunables, but this is pretty good for now.

Updated by Yuri Bene about 3 years ago

Andrew McGregor wrote:

Ok, this patch applies on top of Felix' last set of changes, and produces really spectacular results: about 2x better latency on 11n than before we both started on it at basically no performance cost, and fairly decent and consistent latency on 11g at a performance cost so small I can't measure it.

It may be worth fiddling a little more with the tunables, but this is pretty good for now.

I experience sluggish and lost pings on my router with ag71xx. Any way to update with your patch to try?

Updated by David Taht about 3 years ago

Just to keep bug tracking..

I pulled from 27917.

I don't see any commits that could have caused this breakage, except maybe
that I'm building against 2.6.39.3, and may have differences in my config
file from yours that are important.

could I encourage you to hop on huchra?

Also I'm curious if YOUR ethernet build showed this:

root@OpenWrt:~# dmesg | grep eth0
eth0: Atheros AG71xx at 0xb9000000, irq 4
eth0: unable to find MII bus on device 'rtl8366s'
eth0: Atheros AG71xx at 0xba000000, irq 5
eth0: unable to find MII bus on device 'rtl8366s'

I thought I was building from the same .config as you with support for the
1000hz and no_hz options - the above looks like a faster clock tick problem
to me... but you should have seen that too.

So I can revert to 2.6.39.2, revert all that

and/or restart from scratch from my patch set (which only touches one tiny
bit of the eth driver and NOTHING of the wireless driver). But with a little
data perhaps that won't be necessary.

I'll stick around here a while longer, cleaning up, and getting a router
from thursday online...

After I get some sleep.

On Fri, Aug 5, 2011 at 7:00 PM, Andrew McGregor <> wrote:

Huh? I generated it based on openwrt head r27912

On 5/08/2011, at 5:01 PM, Dave Taht wrote:

so this patch duplicates some, but not all, of what is now in openwrt head? It only partially applies.

Ripping it out again, and going off to hopefully find the problem in the ethernet driver instead....

On Fri, Aug 5, 2011 at 4:00 PM, Dave Taht <> wrote:

---------- Forwarded message ---------- From: Andrew McGregor <> Date: Fri, Aug 5, 2011 at 3:57 PM Subject: Resend of patch To: Dave Taht <>

Ok, this patch applies on top of Felix' last set of changes, and produces really spectacular results: about 2x better latency on 11n than before we both started on it at basically no performance cost, and fairly decent and consistent latency on 11g at a performance cost so small I can't measure it.

It may be worth fiddling a little more with the tunables, but this is pretty good for now.

(btw: drop this file in the packages/mac80211 directory)

Updated by David Taht about 3 years ago

---------- Forwarded message ----------
From: Dave Taht <>
Date: Sat, Aug 6, 2011 at 9:45 PM
Subject: Fwd: Resend of patch [#1216]
To:

---------- Forwarded message ----------
From: Andrew McGregor <>
Date: Sat, Aug 6, 2011 at 8:24 PM
Subject: Re: Resend of patch [#195]
To: Dave Taht <>

Oooops... Ok, I was patching the wrong thing. I'll need to retest this, but
this patch should at least apply:

On 7/08/2011, at 2:20 PM, Dave Taht wrote:

incidentally, I work primarily out of nbd's git trees, I find that's a lot
easier than svn.

The attached file is OUT OF DATE, but does point to git trees for everything
important...

On Sat, Aug 6, 2011 at 8:06 PM, Dave Taht <> wrote:

trick:

make package/mac80211/{clean,compile,install} V=99

On Sat, Aug 6, 2011 at 8:05 PM, Andrew McGregor <> wrote:

So, I'm going to take a quick look at that

On 7/08/2011, at 2:00 PM, Dave Taht wrote:

Please note, normally I have enough brain cells to cope with stuff like this, but was terribly burned out, and am only now beginning to feel halfway normal - with at least 2 more hot tub sessions left in the day.

and have 22 other bugs on my plate...

that said, I shoulda locked you in the loft until after this compiled... :)

Applying ./patches/580-ath9k_lowlatency.patch using plaintext:
patching file net/mac80211/rc80211_minstrel_ht.c
Hunk #1 FAILED at 487.
1 out of 1 hunk FAILED -- saving rejects to file net/mac80211/rc80211_minstrel_ht.c.rej
patching file drivers/net/wireless/ath/ath9k/ath9k.h
Hunk #1 succeeded at 120 with fuzz 2 (offset -4 lines).
Hunk #2 FAILED at 529.
1 out of 2 hunks FAILED -- saving rejects to file drivers/net/wireless/ath/ath9k/ath9k.h.rej
patching file drivers/net/wireless/ath/ath9k/xmit.c
Hunk #1 FAILED at 247.
Hunk #2 FAILED at 357.
Hunk #3 succeeded at 384 with fuzz 2 (offset 19 lines).
Hunk #4 FAILED at 447.
Hunk #5 succeeded at 650 (offset 22 lines).
3 out of 5 hunks FAILED -- saving rejects to file drivers/net/wireless/ath/ath9k/xmit.c.rej
Patch failed! Please fix ./patches/580-ath9k_lowlatency.patch!
make[2]: *** [/home/andrewm/src/cerowrt/build_dir/linux-ar71xx_generic/compat-wireless-2011-06-22/.prepared_cc7790f9a6fba8b991c06bfd7f50721f] Error 1
make[2]: Leaving directory `/home/andrewm/src/cerowrt/package/mac80211'
make[1]: *** [package/mac80211/compile] Error 2
make[1]: Leaving directory `/home/andrewm/src/cerowrt'
make: *** [package/mac80211/compile] Error 2

On Sat, Aug 6, 2011 at 7:54 PM, Dave Taht <> wrote:

Hey, welcome to the ground! Or are you still in the air?

I wish I could take a picture of where I am - listening to live music, in a campground, with my super duper SR71 antennas up, eating BBQ, watching two girls from santa cruz dance.

Sometimes I REALLY like wireless technology.

No, it was this patch that broke, I can send you the stuff on it in a minute.

More importantly figuring out what broke ethernet is on my mind...

On Sat, Aug 6, 2011 at 7:50 PM, Andrew McGregor <> wrote:

Which one? The newest one, or the previous? That was true of the previous one, in the strange diff format. The one I just attached again is not, so far as I know.

On 7/08/2011, at 1:40 PM, Dave Taht wrote:

Felix told me that your patch had 1/3 stuff he'd done already.

---------- Forwarded message ---------- From: Andrew McGregor <> Date: Fri, Aug 5, 2011 at 8:59 PM Subject: Re: Resend of patch To: Dave Taht <>

I don't know if ethernet was working, I never checked.

I was using 2.6.39.2 however, so I suspect an upstream change is causing the patch to not apply. It's not big; you could put it in by hand easily enough. In the last chunk you could also change the other case in that if statement to something smaller, although that case won't be hit on a WNDR3700.

About to get on a plane to NZ, so I'll be out of circulation for around 36 hours.

Updated by Dave Täht about 3 years ago

  • File deleted (580-ath9k_lowlatency.patch)

Updated by Dave Täht about 3 years ago

I finally have enough of the lab setup at ISC to start bisecting the ethernet issue introduced by increasing the clock, changing the kernel version, and going tickless.

More news will be on #195

Updated by Dave Täht about 3 years ago

  • Status changed from In Progress to Closed

To paraphrase ESR:

"With the right eyeballs, all bugs are shallow."

Updated by John Linville about 3 years ago

Is this going to result in a patch being posted upstream?

Updated by Dave Täht about 3 years ago

@John:

After I get it fully tested in the lab, absolutely, it's going upstream. So far as I know there are other problems with merging this upstream in that some of this code is merged under a different name?

Assembling the lab now. Ping me on irc if you want to chat further.

Updated by David Taht about 3 years ago

I note that txqueuelen of X is ok, when the station is in N mode.

It's when it's in G mode that this introduces tons more latency.

Perhaps a closer look at the mq qdisc is in order.

And I note that testing chrome appears to be in order.

---------- Forwarded message ----------
From: Andrew McGregor <>
Date: Tue, Aug 9, 2011 at 11:48 PM
Subject: Re: Resend of patch [#195]
To: Dave Taht <>

So, I kicked my txqueuelen up on the wireless interfaces to 100; I was
dropping too many packets coming back at me (mostly from Google sites
browsing with Chrome, so I presume the problem has to do with IW 10 on the
servers and SPDY in Chrome...)

On 10/08/2011, at 2:50 AM, Dave Taht wrote:

Excellent.

Can't convince ya to try cerowrt? :/

I am still not certain if my build ended up the same as yours.

On Tue, Aug 9, 2011 at 3:28 AM, Andrew McGregor <> wrote:

So, yes, I did read the book... interesting read.

I've now put one of the new WNDR3700v2s into my home network, and put my openwrt build on it... and it rocks. minstrel-ht with the recent tweaks really performs well.

On 7/08/2011, at 2:31 PM, Dave Taht wrote:

correction: did the tests with friday's build, with just nbd's patches, tickless, 1000hz, wirelessly, and nothing went boom. But losing ethernet is something of a problem. :)

did you have a chance to read the book? I read scalzi... thx... I needed that.

On Sat, Aug 6, 2011 at 8:29 PM, Dave Taht <> wrote:

I also really wanted to know what your kernel config was, exactly, so I can make sure the crazy-ass change to 1000HZ ticks and tickless was the same. I set it up for server mode, rather than low latency desktop, and I do think that proxies and other stuff in user space might benefit more from even lower latency....

and I'm pretty sure at this point, that change is what is breaking ag71xx's mdi serial probe for both of us, but, lacking data to confirm that, it could be the kernel upgrade, or something else entirely.

Aside from that problem, I ran the router through a boatload of tests with thursday's build, and it held up.

On Sat, Aug 6, 2011 at 8:27 PM, Dave Taht <> wrote:

Or, I'll do it... but... jeeze... reading over your patch now, by the time I'm done I could be downloading a build... :)

root@huchra:~# write andrewm write: warning: write will appear from d yer on huchra slam it in the right dir and do a build batch make -j 8 make CNTRL-D

Updated by Jim Gettys about 3 years ago

David Taht wrote:

I note that txqueuelen of X is ok, when the station is in N mode.

It's when it's in G mode that this introduces tons more latency.

Perhaps a closer look at the mq qdisc is in order.

And I note that testing chrome appears to be in order.

---------- Forwarded message ---------- From: Andrew McGregor <> Date: Tue, Aug 9, 2011 at 11:48 PM Subject: Re: Resend of patch [#195] To: Dave Taht <>

So, I kicked my txqueuelen up on the wireless interfaces to 100; I was dropping too many packets coming back at me (mostly from Google sites browsing with Chrome, so I presume the problem has to do with IW 10 on the servers and SPDY in Chrome...)

Yes, this sounds like the insane IW 10 change.

I've just written that part of the bufferbloat paper, which writes the consequences of this change up into a decent form. I should finally be posting it this week; will probably post to both the tcp changes wg and the HTTP wg. Scott Bradner was kind enough to give me some feedback on that draft.

Would be nice to confirm this, of course, but it's about the amount of buffering use you get from first principles, so it seems likely. The real disaster is when your output link happens to be going really slowly, of course...
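
The first-principles estimate mentioned above can be sketched as follows. This is an illustration under stated assumptions, not a measurement: the number of parallel connections is hypothetical, every connection is assumed to send its whole initial window at once, and the queue is assumed to be empty when the burst arrives.

```python
def burst_packets(connections: int, initial_window: int = 10) -> int:
    """Packets arriving back to back when every connection sends a full
    initial window (IW 10 means ten segments per connection)."""
    return connections * initial_window


def tail_dropped(burst: int, queue_limit: int) -> int:
    """Packets that do not fit into an initially empty queue of the given length."""
    return max(0, burst - queue_limit)


# Hypothetical page load that opens six connections at once.
burst = burst_packets(connections=6)
for txqueuelen in (40, 100):
    dropped = tail_dropped(burst, txqueuelen)
    print(f"txqueuelen {txqueuelen:3d}: burst of {burst} packets, {dropped} dropped")
```

Under these assumptions a queue of 40 overflows while a queue of 100 absorbs the burst, which is consistent with the txqueuelen change described in the quoted message.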
