Bug #401
major retries in the VI AMPDU queue on ath9k
| Status: | Closed | Start date: | 07/08/2012 | |
|---|---|---|---|---|
| Priority: | Urgent | Due date: | ||
| Assignee: | % Done: | 0% |
||
| Category: | Networking | Spent time: | 20.00 hours | |
| Target version: | 1st Public Cerowrt release |
Description
I find the AMPDU retry statistics to be massively high in the case of VI and still quite high in the case of BE.
(The radios happen to be in MCS4 most of the time, if that matters.
There are only three radios on the network, no competing traffic at all)
root@davesroof:/sys/kernel/debug/ieee80211/phy0/ath9k# cat xmit
Num-Tx-Queues: 10 tx-queues-setup: 0x10f poll-work-seen: 36206
BE BK VI VO
MPDUs Queued: 27436 174 15066 12730
MPDUs Completed: 27322 174 15062 12708
MPDUs XRetried: 114 0 4 22
Aggregates: 94194 5 0 0
AMPDUs Queued HW: 620975 49 21536 0
AMPDUs Queued SW: 438841 7 0 0
AMPDUs Completed: 1045702 34 6476 0
AMPDUs Retried: 438217 728 413300 0
AMPDUs XRetried: 14114 22 15059 0
FIFO Underrun: 0 0 0 0
TXOP Exceeded: 0 0 0 0
So I don't see TXOP exceeded, but I do see exorbitant numbers
of retries in the AMPDU VI queue for some reason. This disturbs me.
In xmit.c I see:
static u32 ath_lookup_rate(struct ath_softc *sc, struct ath_buf *bf,
struct ath_atx_tid *tid)
{
...
/*
* Find the lowest frame length among the rate series that will have a
* 4ms transmit duration.
* TODO - TXOP limit needs to be considered.
*/
max_4ms_framelen = ATH_AMPDU_LIMIT_MAX;
History
Updated by Dave Täht 11 months ago
Andrew McGregor: You'd be right to call that dubious
12:24 AM Either the limit has to be the minimum amongst all the queues, or it has to be per queue
me: Perhaps I'm still misunderstanding - at MCS4 what is the max aggregate?
1:57 AM Andrew McGregor: I don't know, it's expressed in units of time
1:58 AM It is prescribed by the queue's parameters...
1:59 AM There's a table here: http://wifi-insider.com/wlan/wmm.htm
2:01 AM So, the way you'd implement that is to query the driver for the transmit time for the AMPDU as you're building it up
Updated by Dave Täht 11 months ago
<nbd> hey dtaht [20:20]
<nbd> i've been looking into that VI
queue txop mess some more
<nbd> looks like i found an
interesting bug (in addition to
the stuff you already uncovered)
[20:21]
<nbd> seems that the configured txop
limit in the hw is off by a
factor 32 ;)
<nbd> if i'm reading this stuff
correctly
<nbd> only affects VI and VO of course
\ [22:00]
<nbd> it would limit the maximum transmission duration for VI to 94 usec and
VO to 47 [22:02]
<nbd> maybe the hw adds some extra time on top of that
<nbd> my patches seem to work ;) [22:09]
<nbd> i didn't test pushing traffic to the VI queue
<nbd> but i tested adjusting the BE queue to the same txop limit
<nbd> committed [22:11]
<nbd> have fun with that, i'm going to get some sleep now [22:12]
<nbd> i'll submit this stuff to linux-wireless@ tomorrow [22:13]
Updated by Dave Täht 11 months ago
- Priority changed from Normal to Urgent
Well, I see better behavior in the VI queue, using
[1] Done netperf l 60 -Y CS0,CS0 -H 172.20.1.1 Done netperf -l 60 -Y CS5,CS5 -H 172.20.1.1
[2]
[3]+ Done netperf -l 60 -Y EF,EF -H 172.20.1.1
The BK queue does indeed drop packets according to tc, VO, VI weren't
Before I got that far, this:
BE BK VI VO
MPDUs Queued: 1308 161 1364 39399
MPDUs Completed: 1308 161 1364 39397
MPDUs XRetried: 0 0 0 2
Aggregates: 2975 962 9326 0
AMPDUs Queued HW: 14134 4331 282818 0
AMPDUs Queued SW: 6773 2314 31996 0
AMPDUs Completed: 19780 6470 313444 0
AMPDUs Retried: 34959 4825 84898 0
AMPDUs XRetried: 1126 175 1370 0
FIFO Underrun: 0 0 0 0
TXOP Exceeded: 0 0 0 0
TXTIMER Expiry: 0 0 0 0
DESC CFG Error: 0 0 0 0
DATA Underrun: 0 0 0 0
DELIM Underrun: 0 0 0 0
TX-Pkts-All: 22214 6806 316178 39399
TX-Bytes-All: 2431000 602745 27746600 3522718
hw-put-tx-buf: 1 1 1 1
hw-tx-start: 53190 10240 390746 39399
hw-tx-proc-desc: 53189 10240 390746 39399
TX-Failed: 0 0 0 0
txq-memory-address: 8280a1b4 8280a230 8280a138 8280a0bc
axq-qnum: 2 3 1 0
axq-depth: 1 0 0 0
axq-ampdu_depth: 1 0 0 0
axq-stopped 0 0 0 0
tx-in-progress 0 0 0 0
pending-frames 1 0 0 0
txq_headidx: 0 0 0 0
txq_tailidx: 0 0 0 0
axq_q empty: 0 0 0 0
axq_acq empty: 1 1 1 1
txq_fifo[0] empty: 1 1 1 1
txq_fifo[1] empty: 1 1 1 1
txq_fifo[2] empty: 1 1 1 1
txq_fifo[3] empty: 1 1 1 1
txq_fifo[4] empty: 1 1 1 1
txq_fifo[5] empty: 1 1 1 1
txq_fifo[6] empty: 1 1 1 1
txq_fifo[7] empty: 1 1 1 1
However doing stuff in the reverse direction crashes the router.
This could be an out of memory condition or something else with fq_codel or the driver...
Using netperf 2.6 - from my laptop, connected via mesh via ad-hoc mode, at rates around 120Mbit...
netperf -l 60 -Y EF,EF -H 172.20.1.1 -t TCP_MAERTS & netperf -l 60 -Y CS5,CS5 -H 172.20.1.1 -t TCP_MAERTS & netperf -l 60 -Y CS0,CS0 -H 172.20.1.1 -t TCP_MAERTS &
This thoroughly exercises the VO,VI,BE queues (which is not something that happens in real life)
Updated by David Taht 2 months ago
- Status changed from New to Closed