Things to debug!
----------------

* Implement a net80211 WAR for this upon association:

wlan0: [00:1b:b1:58:f6:f0] discard duplicate frame, seqno <0,0> fragno <0,0> tid 0

.. basically, initialise the seqno's to be 4095 upon initial association.
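
A minimal sketch of why 4095 helps, assuming a simplified duplicate check
(the real net80211 check also looks at the retry bit). The struct and
function names here are hypothetical, not the net80211 ones. With a
zero-filled state, the first frame <0,0> looks like a repeat of the
"last seen" frame; starting at 4095 means seqno 0 is the expected successor.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQNO_RANGE 4096	/* 802.11 sequence numbers are 12 bits */

/* Hypothetical per-TID receive state; not the real net80211 layout. */
struct rx_seq_state {
	uint16_t last_seqno;
	uint8_t  last_fragno;
};

/* Proposed workaround: call this on (re)association. */
static void
rx_seq_init(struct rx_seq_state *st)
{
	st->last_seqno = SEQNO_RANGE - 1;	/* 4095 */
	st->last_fragno = 0;
}

/*
 * Simplified duplicate check: a frame is a duplicate if it carries the
 * same seqno as the last accepted frame and a fragno that isn't newer.
 * Returns true if the frame should be discarded.
 */
static bool
rx_seq_is_dup(struct rx_seq_state *st, uint16_t seqno, uint8_t fragno)
{
	if (seqno == st->last_seqno && fragno <= st->last_fragno)
		return true;
	st->last_seqno = seqno;
	st->last_fragno = fragno;
	return false;
}
```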

* The txqactive bitmap (txeol, txurn, txok, etc) is set up at txq create
  time to a set of values, then it seems after a channel scan all of the
  bits are set to 1. I'm not yet sure why. Go in and fix these.
  Note that ath9k caches the IMR_S2 value and rewrites it where needed.
  It's possible that after a channel scan, these values are "initial"
  values rather than the values set up by the if_ath driver.

< adrian> oh that's a fun bug. if I trigger A-MPDU using an aggregate UDP TX, I'm out of buffers
< adrian> (as they're all locked up in pending queues)
< adrian> so I can't send the ADDBA. :P

* Write something in ath_tx_default_comp() that ensures the buffer is unlinked
  (ie, not part of an aggregate)
  - Done

* Write something in the aggr comp function which checks that the number of
  frames in the aggregate list matches bf_stats.bfs_nframes
  - Done
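
  Roughly what such a check looks like, as a standalone sketch. The
  struct below is a stand-in (not the real ath_buf); bf_next and
  bfs_nframes are flattened into it for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for ath_buf: just the aggregate link and the frame count. */
struct tx_buf {
	struct tx_buf *bf_next;		/* next sub-frame in the aggregate */
	int bfs_nframes;		/* only meaningful on the first buffer */
};

/* Walk the aggregate list and count the buffers in it. */
static int
aggr_count_frames(const struct tx_buf *bf_first)
{
	int n = 0;

	for (const struct tx_buf *bf = bf_first; bf != NULL; bf = bf->bf_next)
		n++;
	return n;
}

/* True if the list length matches what the first buffer claims. */
static bool
aggr_nframes_ok(const struct tx_buf *bf_first)
{
	return aggr_count_frames(bf_first) == bf_first->bfs_nframes;
}
```

  In the driver this would presumably be a debug printf or assertion in
  the completion path rather than a boolean helper.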

* Am I losing an ath_buf in the hardware TX queue code? Ie, are they not
  ever queued?
  - No

* Although it's a nice idea to run completion handlers in the ath task
  context (making scheduling and completion occur in a mutually exclusive
  setup within the same taskqueue), the fact that nodes can be flushed,
  drained and freed from -outside- that context puts a spanner in the works.
  Ideally, the only time completion handlers should be called from outside
  that context is when the node is being destroyed. Otherwise, we're going
  to have to lock -everything-.

  So, when going over the drain/flush/free code to see what's going on,
  please consider making -node free- the only time these completion
  functions are called from outside the ath taskqueue task, and ensure
  that they don't trample over any state which is going to cause contention
  (and thus need to be locked.)

* The RX side (AR9160, FreeBSD) seems to lock up from time to time, with
  stuck beacons and RXORN interrupts (ie, RX FIFO overflows.) Why?

* There's the possibility that A-MPDU packets have sequence numbers
  allocated but are flushed -before- they're added to the BAW.
  I guess this is why seqno allocation should be delayed until just before
  it's queued to the hardware.

  That should be .. well, fixed.

  Yes, it's this. The problem is the cleanup, flush, drain handling.
  What I need to do is:

  * for interface -teardown-, the frames can be completed.
  * for interface -flush-, complete the retries as failed?, and reschedule
    the frames on the queue.
    - the problem - what if that flush is a mode change?
    - or what else? 20/40 change? rate change? etc.
  * for interface -cleanup-, ?
  * drain?

Things that need doing!
-----------------------

* TDMA: need to fix this:

  ath0: ath_tx_update_baw: comp bf=0xc0856768, seq=3744; slot bf=0xc08571cc, seqno=3744

  This is because a buffer is cloned when it's currently marked busy and thus
  the ath_buf pointer won't match what's already in the BAW tracking window.
  It's harmless at the moment, but if it pops up whilst users are actively
  doing traffic, we know why.

  The solution is to also update the BAW pointer when a retried buffer
  is cloned.
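
  A sketch of that fix under simplified assumptions: the BAW is modelled
  as a ring of buffer pointers indexed by seqno offset from the window
  start. All names (tx_baw, baw_retarget, etc) are hypothetical and the
  slot count is an assumption; the driver's own tracking structure differs.

```c
#include <stddef.h>
#include <stdint.h>

#define SEQNO_RANGE	4096	/* 12-bit 802.11 sequence space */
#define BAW_SLOTS	128	/* assumed ring size for this sketch */

struct tx_buf {
	uint16_t seqno;
};

struct tx_baw {
	struct tx_buf *slots[BAW_SLOTS];
	uint16_t seq_start;	/* seqno at the left edge of the window */
	int head;		/* ring index of seq_start */
};

/* Map a seqno to its ring slot. */
static int
baw_slot(const struct tx_baw *baw, uint16_t seqno)
{
	int index = (uint16_t)(seqno - baw->seq_start) & (SEQNO_RANGE - 1);

	return (baw->head + index) % BAW_SLOTS;
}

/* Record bf as the owner of its seqno's slot. */
static void
baw_track(struct tx_baw *baw, struct tx_buf *bf)
{
	baw->slots[baw_slot(baw, bf->seqno)] = bf;
}

/* Call when a busy buffer 'old' has been cloned into 'clone' for retry. */
static void
baw_retarget(struct tx_baw *baw, struct tx_buf *old, struct tx_buf *clone)
{
	int slot = baw_slot(baw, old->seqno);

	if (baw->slots[slot] == old)
		baw->slots[slot] = clone;
}

/*
 * Completion: returns 1 and clears the slot if it holds bf; returns 0
 * (the "comp bf != slot bf" case logged above) and leaves it otherwise.
 */
static int
baw_complete(struct tx_baw *baw, struct tx_buf *bf)
{
	int slot = baw_slot(baw, bf->seqno);

	if (baw->slots[slot] != bf)
		return 0;
	baw->slots[slot] = NULL;
	return 1;
}
```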

* When off-channel, aggregate traffic should stay queued, but raw
  frames (eg probes) should be sent?

* Handle channel/mode changes which shouldn't flush packets in the
  software queue (and the packets in the hardware queue that can
  be recovered after TX DMA is stopped..)
  + eg a mode change which changes the channel (2/5ghz), or ht<->non-ht,
    or 40<->20mhz modes - these may end up with packets in the software
    queue which can't be supported by the new operating mode.
  + How to fix? :-)

* .. and since raw queued frames may have invalid rate information, enforce
  valid rate/flags when the packet is being hardware queued.
* .. this also will allow for rate lookups to be done on retried frames,
  which may help with reliability.
* Find where in the driver the rate table is updated, and do what? Trigger
  updating the software-queued frames? Or?

* Add a swretrysubframe and swretrysubframemax counter, use it

* Scheduling is wrong - the software TXQ needs more time to assemble
  frames together for aggregation.

  Based on Linux ath9k:

  + Don't base it completely on how "deep" the hardware queue is, that
    can involve multiple TIDs, not just one!
  + If the TID has say, more than two hardware queued aggregate frames,
    or more than say eight hardware-queued normal frames, don't schedule it.
  + When the completion function is called, check to see if there are frames
    in the queue and the number of frames queued to hardware has dropped.
    If so, schedule the TID. If not, give the TID some more time to gather
    packets.

  The logic:

  + The MAC has to be kept busy enough to not wait around for data to send.
    Not more, not less.
  + Immediately handling a TID when the hardware is busy sending aggregate
    frames is just plain silly - as it's already busy. So queuing the frames
    to the hardware doesn't buy you anything, it doesn't decrease latency, etc.
  + So instead, just feed it enough frames to keep the hardware queue busy,
    and let the TID software queue gather up more frames.
  + Finally, if the TID has a couple frames queued to the hardware
    -and- there's only one frame to send, don't just queue it.
    Wait until the completion handlers are called, then schedule the
    queue function again.
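
  The decision above can be sketched as a single predicate. This is one
  reading of the notes, not ath9k's or the driver's actual code: the
  struct, the field names and the "couple frames" threshold of two are
  all assumptions.

```c
#include <stdbool.h>

#define TID_MAX_HWQ_AGGR	2	/* "more than two" aggregates in flight */
#define TID_MAX_HWQ_FRAMES	8	/* "more than eight" normal frames */

/* Hypothetical per-TID counters. */
struct tid_state {
	int hwq_aggr;	/* aggregates from this TID queued to hardware */
	int hwq_frames;	/* non-aggregate frames queued to hardware */
	int swq_depth;	/* frames waiting in the software queue */
};

/* Should this TID be scheduled now, or left to gather more frames? */
static bool
tid_should_schedule(const struct tid_state *t)
{
	/* Nothing to send. */
	if (t->swq_depth == 0)
		return false;

	/* The hardware already has plenty from this TID. */
	if (t->hwq_aggr > TID_MAX_HWQ_AGGR ||
	    t->hwq_frames > TID_MAX_HWQ_FRAMES)
		return false;

	/*
	 * A couple of frames in flight and only one waiting: hold off
	 * until a completion handler reschedules us.
	 */
	if (t->hwq_aggr + t->hwq_frames >= 2 && t->swq_depth == 1)
		return false;

	return true;
}
```

  The completion handler would call the same predicate after decrementing
  the hardware-queued counters, so a TID that was held back gets another
  look once the hardware queue drains a bit.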

* RIFS? Do I care about supporting RIFS?

* Fast Frames? Do I care about supporting FF here, or is it done via suitable
  evilness in net80211?

* A-MPDU aggregation?
  + how many pending A-MPDU frames per hardware TXQ? As many as needed? One?

* Software retransmit when aggregation is enabled
  + Whether doing A-MPDU or not
    - done
  + Support rate updates and lookup on a retry; maybe a slower rate
    is needed?
    - not yet done, but would need changes to the way the rate decisions
      are made

* 20<->2040 mode change?
  + part of this project or not?
  + right now packets are simply flushed; why not just re-prod them into
    the software TXQ ?

* nodes w/ a lower BA window size; try to fill it in ath_tx_form_aggr() ?

* teach ath_rate about aggregate completion?
  - just pass in the number of failed+attempted packets?
  - there's only one rate for all aggregates, but we need to know which
    one was used?
  - what else would be helpful?

* Am I too aggressively scheduling things, and thus the aggregation code
  doesn't ever get a chance to form decent sized aggregates?

* I'm getting a lot of OOR packets at the receive end, is this because
  of a large number of software retries? or is it that, combined with
  TX'ing a long list of aggregates that mostly fill the BAW, meaning
  packets can stay out of order for longer (but still be within the BAW?)
  I bet this'd piss off TCP, if it had to wait for 20/30 packets
  before it received the next retransmit (and thus the AMPDU RX queue
  would get flushed.)

* How should channel scanning be handled? Right now it's causing both a HW TXQ
  and SW TXQ / node flush; this means the BAW will need to be slid along. Eek.

* When a node is flushed (but not being deleted) should the BAW also be updated?
  I don't think it is right now and this could be incorrect.

* Send BAR when needed
  + after TX failure
  + when else? When shutting down an aggregation session and flushing packets?
  + ieee80211_send_bar() will only work if IEEE80211_AGGR_RUNNING is set;
    what's that mean for trying to send BAR frames during session teardown?
  + it'll call raw_xmit to send the BAR, so the various bits of TX code
    are going to have to be recursive. How's that going to work out for us?
    (think: with all the TXQ node locks being held..)
  + ic->ic_bar_response(ni, tap, status) is called on BAR response, and
    ieee80211_ampdu_stop(ni, tap, IEEE80211_REASON_TIMEOUT) is called on
    repeated failure to ACK the BAR.

  - I've implemented this; the recursion into the TX path was fixed by
    causing all TX scheduling and completion to run in the ath task,
    ath_tx_start() / ath_tx_raw_xmit() just queues packets to the hardware
    or software queue. It doesn't run scheduling as well. This eliminated
    most of the locking issues and the recursion.

  - There's a dirty workaround to unpause the queue if BAR TX is definitely
    failing. This needs to be addressed before this work is merged back into
    -HEAD.


Stuff to do to the rate control code
------------------------------------

* Teach ath_rate_sample about the packet error rate when TX'ing aggregates

* Delay rate control lookup? Until the aggregate is being formed or a non-agg
  frame is being TXed?

* .. then we can re-do rate control lookups for retransmits? That should help
  with performance issues when bad MCSes have been chosen

* Tidy up ath_rate_sample and have it use the flags in the ath_rc_series array,
  rather than re-calculating what flags are used (ht20/ht40, shortgi, etc.)

  That way if the rate selection logic decides to use something besides what
  the node says it supports (eg sending a HT/20 frame to a HT/40 station,
  sending long-gi instead of short-GI) then the correct calculations can be
  made.

  The rate selection stuff doesn't do this -now-, but it may be useful later.

Problems in net80211 (ie, not necessarily this branch)
------------------------------------------------------

ieee80211_node.c:1940 is within:

static void
ieee80211_timeout_stations(struct ieee80211com *ic)

This initial lock: IEEE80211_NODE_LOCK(nt);

lock order reversal:
 1st 0xc08316cc ath0_node_lock (ath0_node_lock) @ /data/freebsd/mips/if_ath_tx/src/sys/net80211/ieee80211_node.c:1940
 2nd 0xc0830014 ath0_com_lock (ath0_com_lock) @ /data/freebsd/mips/if_ath_tx/src/sys/net80211/ieee80211_power.c:295
KDB: stack backtrace:
db_trace_thread+30 (?,?,?,?) ra 8038e35c sp c077da90 sz 24
db_trace_self+1c (?,?,?,?) ra 80074c1c sp c077daa8 sz 24
80074be8+34 (?,?,?,?) ra 801d1ba4 sp c077dac0 sz 416
kdb_backtrace+44 (?,?,?,?) ra 801e9660 sp c077dc60 sz 24
801e962c+34 (?,?,?,?) ra 801ea2a4 sp c077dc78 sz 32
witness_checkorder+954 (?,?,?,?) ra 8018a740 sp c077dc98 sz 88
_mtx_lock_flags+c4 (?,?,?,?) ra 802a5a30 sp c077dcf0 sz 48
802a59a8+88 (?,?,?,?) ra 8029e4a8 sp c077dd20 sz 40
8029e444+64 (?,?,?,?) ra 8029d4d8 sp c077dd48 sz 32
ieee80211_node_timeout+1a4 (?,?,?,?) ra 801b0300 sp c077dd68 sz 64
softclock+298 (?,?,?,?) ra 80172860 sp c077dda8 sz 88
intr_event_execute_handlers+158 (?,?,?,?) ra 80173868 sp c077de00 sz 40
8017375c+10c (?,?,?,?) ra 8016fb64 sp c077de28 sz 48
fork_exit+a8 (?,?,?,?) ra 80386aa0 sp c077de58 sz 40
fork_trampoline+10 (?,?,?,?) ra 0 sp c077de80 sz 0

A kernel panic, when a station is downed;

wlan0: [8c:7b:9d:d6:65:ba] station with aid 1 leaves
ath0: ath_addba_stop: called
ath0: ath_tx_tid_pause: paused = 1
ath0: ath_tx_cleanup: TID 0: called
ath0: ath_tx_cleanup: TID 0: cleanup needed: 2 packets
ath0: ath_tx_tid_cleanup: node 0xc08e8000: cleaning up
Trap cause = 2 (TLB miss (load or instr. fetch) - kernel mode)
[ thread pid 0 tid 100024 ]
Stopped at      _mtx_lock_flags+0x58:   lw      v1,16(a0)
db> bt
Tracing pid 0 tid 100024 td 0x80a07600
db_trace_thread+30 (?,?,?,?) ra 80072dc0 sp c766f7f8 sz 24
80072cac+114 (8018a6d4,?,ffffffff,?) ra 8007237c sp c766f810 sz 32
80071ff4+388 (?,?,?,?) ra 80072500 sp c766f830 sz 168
db_command_loop+70 (?,?,?,?) ra 80074bc4 sp c766f8d8 sz 24
80074ad0+f4 (?,?,?,?) ra 801d1818 sp c766f8f0 sz 424
kdb_trap+104 (?,?,?,?) ra 80382910 sp c766fa98 sz 40
trap+e58 (?,?,?,?) ra 8037a5e0 sp c766fac0 sz 168
MipsKernGenException+134 (c08ec3e4,0,803e053c,10c9) ra 8018a6d4 sp c766fb68 sz 200
_mtx_lock_flags+58 (?,?,?,?) ra 80079ea0 sp c766fc30 sz 48
ath_tx_update_ratectrl+5c (?,?,?,?) ra 80088644 sp c766fc60 sz 56
ath_tx_aggr_comp+660 (?,?,0,?) ra 8007dcc4 sp c766fc98 sz 232
8007d7e8+4dc (?,?,?,?) ra 8007e56c sp c766fd80 sz 72		(if_ath.c:4326) - processq?
8007e4e4+88 (?,?,?,?) ra 801dfb24 sp c766fdc8 sz 48		(if_ath.c:4491) - tx tasklet
801dfa3c+e8 (?,?,?,?) ra 801e05ec sp c766fdf8 sz 56		(subr_taskqueue.c:308)
taskqueue_thread_loop+60 (?,?,?,?) ra 8016fb64 sp c766fe30 sz 40
fork_exit+a8 (?,?,?,?) ra 80386aa0 sp c766fe58 sz 40
fork_trampoline+10 (?,?,?,?) ra 0 sp c766fe80 sz 0
pid 0

lock order reversal:
 1st 0xc0831784 ath0_scan_lock (ath0_scan_lock) @ /data/freebsd/mips/if_ath_tx/src/sys/net80211/ieee80211_node.c:2158
 2nd 0xc0830014 ath0_com_lock (ath0_com_lock) @ /data/freebsd/mips/if_ath_tx/src/sys/net80211/ieee80211_node.c:2510
KDB: stack backtrace:
 db_trace_thread+30 (?,?,?,?) ra 8035f3ec sp c7beb810 sz 24
 db_trace_self+1c (?,?,?,?) ra 80074ddc sp c7beb828 sz 24
 80074da8+34 (?,?,?,?) ra 801de92c sp c7beb840 sz 416
 kdb_backtrace+44 (?,?,?,?) ra 801f66a8 sp c7beb9e0 sz 24
 801f6674+34 (?,?,?,?) ra 801f7364 sp c7beb9f8 sz 32
 witness_checkorder+9cc (?,?,803e8c9c,9ce) ra 80197340 sp c7beba18 sz 80
 _mtx_lock_flags+c4 (?,?,?,?) ra 802aaea8 sp c7beba68 sz 48
 ieee80211_node_leave+b8 (?,?,?,?) ra 802a0248 sp c7beba98 sz 48
 802a01bc+8c (?,?,?,?) ra 802aa15c sp c7bebac8 sz 32
 ieee80211_iterate_nodes+dc (?,?,?,?) ra 802a10b0 sp c7bebae8 sz 48
 802a0f60+150 (?,?,?,?) ra 802a17fc sp c7bebb18 sz 64
 802a1764+98 (?,?,?,?) ra 802a1fc8 sp c7bebb58 sz 72
 802a1a74+554 (?,?,?,?) ra 802a3c84 sp c7bebba0 sz 128
 ieee80211_ioctl+2c8 (?,?,?,?) ra 802c9f7c sp c7bebc20 sz 48
 in_control+1c8 (?,?,?,?) ra 80262bf8 sp c7bebc50 sz 88
 ifioctl+13cc (?,?,80a937e0,8072e300) ra 801ff200 sp c7bebca8 sz 144
 soo_ioctl+3b0 (?,?,?,?) ra 801f9ac4 sp c7bebd38 sz 40
 kern_ioctl+248 (?,?,?,?) ra 801f9c6c sp c7bebd60 sz 64
 ioctl+130 (?,?,?,?) ra 803533ec sp c7bebda0 sz 56
 trap+8a4 (?,?,?,?) ra 8034b87c sp c7bebdd8 sz 168
 MipsUserGenException+10c (?,?,?,40658290) ra 0 sp c7bebe80 sz 0
pid 171

Fixed issues:
-------------

* Recursive TXQ lock on interface destruction:

  - Fixed by only locking the TXQ in ath_tx_node_flush if the TXQ
    isn't already locked by us.

drian-home-mips# ifconfig wlan0 destroy
ath1: ath_tx_node_flush: called
ar5212StopDmaReceive: dma failed to stop in 10ms
AR_CR=0x00000024
AR_DIAG_SW=0x42000020
wlan0: [00:1b:b1:58:f6:f0] send station disassociate (reason 8)
ath1: ath_tx_node_flush: called
panic: _mtx_lock_sleep: recursed on non-recursive mutex ath1_txq1 @ /data/freebsd/mips/if_ath_tx/src/sys/dev/ath/if_ath_tx.c:1854

KDB: enter: panic
[ thread pid 0 tid 100028 ]
Stopped at      kdb_enter+0x4c: lui     at,0x804c
db> bt
Tracing pid 0 tid 100028 td 0x80762900
db_trace_thread+30 (?,?,?,?) ra 80072de0 sp c7717748 sz 24
80072ccc+114 (0,?,ffffffff,?) ra 8007239c sp c7717760 sz 32
80072014+388 (?,?,?,?) ra 80072520 sp c7717780 sz 168
db_command_loop+70 (?,?,?,?) ra 80074be4 sp c7717828 sz 24
80074af0+f4 (?,?,?,?) ra 801cdf78 sp c7717840 sz 424
kdb_trap+104 (?,?,?,?) ra 8037ee00 sp c77179e8 sz 40
trap+bd8 (?,?,?,?) ra 80376d50 sp c7717a10 sz 168
MipsKernGenException+134 (0,4,803f6b90,109) ra 801ce204 sp c7717ab8 sz 200
kdb_enter+4c (?,?,?,?) ra 80196e38 sp c7717b80 sz 24
panic+f4 (?,80a0274c,803ddea4,73e) ra 80186c9c sp c7717b98 sz 40
_mtx_lock_sleep+68 (?,?,?,?) ra 80186f14 sp c7717bc0 sz 32
_mtx_lock_flags+138 (?,?,?,?) ra 80085f74 sp c7717be0 sz 48
ath_tx_node_flush+70 (?,?,?,?) ra 80086028 sp c7717c10 sz 40
ath_tx_tid_cleanup+10 (?,?,?,?) ra 8007c654 sp c7717c38 sz 24
8007c5fc+58 (?,?,?,?) ra 80298278 sp c7717c50 sz 32
8029815c+11c (?,?,?,?) ra 80298958 sp c7717c70 sz 24
ieee80211_free_node+13c (?,?,?,?) ra 80079c80 sp c7717c88 sz 48
ath_tx_freebuf+68 (?,?,?,?) ra 80079d2c sp c7717cb8 sz 40
ath_tx_default_comp+34 (?,?,?,?) ra 80079ef0 sp c7717ce0 sz 24
80079e3c+b4 (?,?,?,?) ra 0 sp c7717cf8 sz 0
pid 0

* DELBA - ie, downgrade existing packets in the SWQ
  + What about stuff in the HWQ?
  + This is done, just completely and totally untested at the moment

  - implemented and tested

* A device timeout during an active iperf causes TCP to stop, until something
  triggers a TX (say an ICMP ping.) Then it all keeps flowing.
  - I had messed up the blockack window tracking a bit, and there were some
    races in marking the queue scheduled/unscheduled. I've since fixed these.

* LOR between the net80211 node lock and the txqs
  - These have disappeared now that the locking has been reworked.

lock order reversal:
 1st 0x80a02738 ath1_txq1 (ath1_txq1) @ /data/freebsd/mips/if_ath_tx/src/sys/dev/ath/if_ath.c:4154
 2nd 0xc086c6cc ath1_node_lock (ath1_node_lock) @ /data/freebsd/mips/if_ath_tx/src/sys/net80211/ieee80211_node.c:1702
KDB: stack backtrace:
db_trace_thread+30 (?,?,?,?) ra 8038aacc sp c7713a88 sz 24
db_trace_self+1c (?,?,?,?) ra 80074c3c sp c7713aa0 sz 24
80074c08+34 (?,?,?,?) ra 801ce304 sp c7713ab8 sz 416
kdb_backtrace+44 (?,?,?,?) ra 801e5dc0 sp c7713c58 sz 24
801e5d8c+34 (?,?,?,?) ra 801e6a04 sp c7713c70 sz 32
witness_checkorder+954 (?,?,?,?) ra 80186ea0 sp c7713c90 sz 88
_mtx_lock_flags+c4 (?,?,?,?) ra 8029885c sp c7713ce8 sz 48
ieee80211_free_node+40 (?,?,?,?) ra 80079c80 sp c7713d18 sz 48
ath_tx_freebuf+68 (?,?,?,?) ra 80079d2c sp c7713d48 sz 40
ath_tx_default_comp+34 (?,?,?,?) ra 8007d664 sp c7713d70 sz 24
8007d2b0+3b4 (?,?,?,?) ra 8007df3c sp c7713d88 sz 64
8007deb4+88 (?,?,?,?) ra 801dc284 sp c7713dc8 sz 48
801dc19c+e8 (?,?,?,?) ra 801dcd4c sp c7713df8 sz 56
taskqueue_thread_loop+60 (?,?,?,?) ra 8016c2c4 sp c7713e30 sz 40
fork_exit+a8 (?,?,?,?) ra 80383210 sp c7713e58 sz 40
fork_trampoline+10 (?,?,?,?) ra 0 sp c7713e80 sz 0
pid 0

And another LOR:

lock order reversal:
 1st 0xc08316cc ath0_node_lock (ath0_node_lock) @ /data/freebsd/mips/if_ath_tx/src/sys/net80211/ieee80211_node.c:1702
 2nd 0x80a03738 ath0_txq1 (ath0_txq1) @ /data/freebsd/mips/if_ath_tx/src/sys/dev/ath/if_ath_tx.c:1854
KDB: stack backtrace:
db_trace_thread+30 (?,?,?,?) ra 8038ab6c sp c7bf9798 sz 24
db_trace_self+1c (?,?,?,?) ra 80074c3c sp c7bf97b0 sz 24
80074c08+34 (?,?,?,?) ra 801ce3a4 sp c7bf97c8 sz 416
kdb_backtrace+44 (?,?,?,?) ra 801e5e60 sp c7bf9968 sz 24
801e5e2c+34 (?,?,?,?) ra 801e6aa4 sp c7bf9980 sz 32
witness_checkorder+954 (?,?,?,?) ra 80186f40 sp c7bf99a0 sz 88
_mtx_lock_flags+c4 (?,?,?,?) ra 80085fc4 sp c7bf99f8 sz 48
ath_tx_node_flush+8c (?,?,?,?) ra 800860d0 sp c7bf9a28 sz 48
ath_tx_tid_cleanup+10 (?,?,?,?) ra 8007c654 sp c7bf9a58 sz 24
8007c5fc+58 (?,?,?,?) ra 80298318 sp c7bf9a70 sz 32
802981fc+11c (?,?,?,?) ra 80298914 sp c7bf9a90 sz 24
ieee80211_free_node+58 (?,?,?,?) ra 8029a110 sp c7bf9aa8 sz 48
8029a078+98 (?,?,?,?) ra 8029c044 sp c7bf9ad8 sz 40
ieee80211_sta_join+20c (?,?,?,?) ra 8028e720 sp c7bf9b00 sz 40
8028e688+98 (?,?,?,?) ra 80290148 sp c7bf9b28 sz 48
802900e4+64 (?,?,?,?) ra 80290948 sp c7bf9b58 sz 72
802903f4+554 (?,?,?,?) ra 80292604 sp c7bf9ba0 sz 128
ieee80211_ioctl+2c8 (?,?,?,?) ra 802b9e84 sp c7bf9c20 sz 48
in_control+1c8 (?,?,?,?) ra 802514d8 sp c7bf9c50 sz 88
ifioctl+13cc (?,?,80aaada0,80dda900) ra 801ee8c0 sp c7bf9ca8 sz 144
soo_ioctl+3b0 (?,?,?,?) ra 801e91c4 sp c7bf9d38 sz 40
kern_ioctl+23c (?,?,?,?) ra 801e936c sp c7bf9d60 sz 64
ioctl+130 (?,?,?,?) ra 8037eb6c sp c7bf9da0 sz 56
trap+8a4 (?,?,?,?) ra 80376fec sp c7bf9dd8 sz 168
MipsUserGenException+10c (?,?,?,40818e20) ra 0 sp c7bf9e80 sz 0
pid 1510

* Scheduler issues - add some statistics to track how many packets are going
  out as aggregates (looks like around 84%) -and- what the distribution of
  aggregate packets is like. Abuse a histogram - will only send up to 64
  aggregate packets at once, so track:

  + single packet returned from ath_tx_form_aggr()
  + single packet with no BAW returned from ath_tx_form_aggr()
  + single packet with non HT rate
  + aggregate packets (ie, how many times were aggregates sent)
  + aggregate sub-frame count histogram, 2->64 sub-frames

  I can't help but think we're sending very small aggregates.

  - done - sysctl dev.ath.X.txagg=1
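
  For reference, a sketch of the kind of counters listed above. The
  struct and function names are made up for illustration; they are not
  the fields behind the txagg sysctl.

```c
#include <stdint.h>

#define AGGR_MAX_SUBFRAMES 64	/* at most 64 sub-frames per aggregate */

/* Hypothetical aggregation statistics block. */
struct aggr_stats {
	uint64_t single;	/* single packet from ath_tx_form_aggr() */
	uint64_t single_nobaw;	/* single packet, no BAW */
	uint64_t single_nonht;	/* single packet, non-HT rate */
	uint64_t aggr_pkts;	/* number of aggregates sent */
	/* hist[n] = aggregates with n sub-frames; indices 0 and 1 unused */
	uint64_t hist[AGGR_MAX_SUBFRAMES + 1];
};

/* Record one aggregate of nframes sub-frames; ignore out-of-range counts. */
static void
aggr_stats_record(struct aggr_stats *s, int nframes)
{
	if (nframes < 2 || nframes > AGGR_MAX_SUBFRAMES)
		return;
	s->aggr_pkts++;
	s->hist[nframes]++;
}

/* Total sub-frames sent inside aggregates, derived from the histogram. */
static uint64_t
aggr_stats_subframes(const struct aggr_stats *s)
{
	uint64_t total = 0;

	for (int n = 2; n <= AGGR_MAX_SUBFRAMES; n++)
		total += s->hist[n] * (uint64_t)n;
	return total;
}
```

  Dividing aggr_stats_subframes() by aggr_pkts gives the mean aggregate
  size, which is a quick way to confirm or rule out the "very small
  aggregates" suspicion.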

* Delimiter calculation? is it right?
  port ath_compute_num_delims()
  - done

* rate series packet duration calculation - it needs to be the entire
  aggregate length for aggregates, not the individual frame?
  (carrier code uses the whole aggregate, what about newma/fusion/ath9k?)
  newma does this too (check ath_pkt_duration())
  - done - it uses the whole aggregate length, incl. delimiters.

* The txqactive bits are set in the interrupt handler context, and cleared
  in the TX completion process context. Since TX interrupts may occur
  during a TX completion process, it's unfortunately likely that this
  will be very racy and end up missing perfectly valid TX events.
  This should be resolved before things are merged into -HEAD.

  Maybe store the txqactive mask away in ath_softc and put the update
  of said ath_softc version behind an atomic operation or lock. That way
  the HAL doesn't have to change (for now).

  - done - this is currently protected by ATH_LOCK and shadowed in the
    ath_softc.
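
  To illustrate the race-free shape of this, here is a sketch of the
  "atomic operation" variant floated above, using C11 atomics. Note this
  is NOT what was implemented (that uses ATH_LOCK); the struct and
  function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shadow of the HAL's txqactive mask, kept in the softc. */
struct softc_shadow {
	atomic_uint txqactive;
};

/* Interrupt context: fold newly reported queue bits into the shadow. */
static void
txq_intr_mark_active(struct softc_shadow *sc, uint32_t hw_mask)
{
	atomic_fetch_or(&sc->txqactive, hw_mask);
}

/*
 * Completion context: atomically test and clear one queue's bit.
 * A bit set by an interrupt after this call stays set for the next
 * completion pass instead of being lost.
 */
static bool
txq_test_and_clear_active(struct softc_shadow *sc, unsigned int qnum)
{
	uint32_t bit = 1u << qnum;

	return (atomic_fetch_and(&sc->txqactive, ~bit) & bit) != 0;
}
```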

* Perhaps delay the rate lookup until the packet is being hardware queued,
  rather than doing the rate decision at ath_tx_raw_xmit() / ath_tx_start();
  - This is done - rate control lookup is done just before it's hardware
    queued.