Feature #329

Testing out the BFS cpu scheduler in openwrt

Added by Hector Ordorica over 1 year ago. Updated about 1 year ago.

Status: Closed
Start date: 01/16/2012
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Linux Kernel
Spent time: 0.50 hour
Target version: 1st Public Cerowrt release
Estimated time: 1.00 hour

Description

So, I decided to try patching OpenWrt with the BFS patch for kernel 3.1. The goal of BFS is to provide a low-latency CPU scheduler for the Linux kernel, whereas the default scheduler (CFS) is tuned more for servers with many cores.

Luckily, the default BFS patch applies cleanly to openwrt trunk 3.1. I just dropped the patch into /target/linux/generic/patches-3.1 and recompiled.

With the patch applied there is a BFS tunable at /proc/sys/kernel/rr_interval. It controls the time slice given to each process, in milliseconds. I set it to 1 (1 millisecond).
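For reference, the tunable can be read and changed from a shell on the router. This is a minimal sketch: only the /proc path comes from the post above, while the function name `set_rr_interval` and the `RR_INTERVAL_PATH` override are hypothetical conveniences added so the snippet can be tried against a scratch file.

```shell
# Location of the BFS timeslice tunable (overridable for dry runs; the override is an assumption).
RR_INTERVAL_PATH="${RR_INTERVAL_PATH:-/proc/sys/kernel/rr_interval}"

set_rr_interval() {
    # $1 = timeslice in milliseconds; write it, then read it back to confirm
    echo "$1" > "$RR_INTERVAL_PATH" && cat "$RR_INTERVAL_PATH"
}
```

Running `set_rr_interval 1` as root on a BFS kernel should echo back the new value. A value written this way does not survive a reboot.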

Anyways, with the default CFS:


root@OpenWrt /storage# netperf -f M -I 99,10 -c -C -H 192.168.1.123
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.123 (192.168.1.123) port 0 AF_INET : +/-5.000% @ 99% conf.
!!! WARNING
!!! Desired confidence was not achieved within the specified iterations.
!!! This implies that there was variability in the test environment that
!!! must be investigated before going further.
!!! Confidence intervals: Throughput      : 3.451%
!!!                       Local CPU util  : 0.112%
!!!                       Remote CPU util : 110.528%

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    MBytes  /s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00        11.79   99.97    0.83     82.832  5.485

root@OpenWrt /root# iperf -c localhost
------------------------------------------------------------
Client connecting to localhost, TCP port 5001
TCP window size: 49.5 KByte (default)
------------------------------------------------------------
[  4] local 127.0.0.1 port 5001 connected with 127.0.0.1 port 35935
[  3] local 127.0.0.1 port 35935 connected with 127.0.0.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   389 MBytes   326 Mbits/sec
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   389 MBytes   326 Mbits/sec
root@OpenWrt /root# iperf -c localhost
------------------------------------------------------------
Client connecting to localhost, TCP port 5001
TCP window size: 49.5 KByte (default)
------------------------------------------------------------
[  5] local 127.0.0.1 port 5001 connected with 127.0.0.1 port 35936
[  3] local 127.0.0.1 port 35936 connected with 127.0.0.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   375 MBytes   315 Mbits/sec
[  5]  0.0-10.0 sec   375 MBytes   314 Mbits/sec

And now with BFS:


root@OpenWrt /root# dmesg | grep -i bfs
[    3.675781] BFS CPU scheduler v0.415 by Con Kolivas.
[   55.996093] usbcore: registered new interface driver usbfs

root@OpenWrt /root# netperf -f M -I 99,10 -c -C -H 192.168.1.123
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.123 (192.168.1.123) port 0 AF_INET : +/-5.000% @ 99% conf.
!!! WARNING
!!! Desired confidence was not achieved within the specified iterations.
!!! This implies that there was variability in the test environment that
!!! must be investigated before going further.
!!! Confidence intervals: Throughput      : 9.218%
!!!                       Local CPU util  : 0.312%
!!!                       Remote CPU util : 114.725%

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    MBytes  /s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00        11.42   99.92    0.55     85.585  3.774

root@OpenWrt /root# iperf -c localhost
------------------------------------------------------------
[  4] local 127.0.0.1 port 5001 connected with 127.0.0.1 port 47075
Client connecting to localhost, TCP port 5001
TCP window size: 49.5 KByte (default)
------------------------------------------------------------
[  3] local 127.0.0.1 port 47075 connected with 127.0.0.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   460 MBytes   386 Mbits/sec
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   460 MBytes   386 Mbits/sec
root@OpenWrt /root# iperf -c localhost
------------------------------------------------------------
Client connecting to localhost, TCP port 5001
TCP window size: 49.5 KByte (default)
------------------------------------------------------------
[  5] local 127.0.0.1 port 5001 connected with 127.0.0.1 port 47076
[  3] local 127.0.0.1 port 47076 connected with 127.0.0.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   456 MBytes   382 Mbits/sec
[  5]  0.0-10.0 sec   456 MBytes   382 Mbits/sec

For netperf, with the client running on the WNDR3700v2, the results are practically the same (11.79 MBytes/s with CFS vs. 11.42 MBytes/s with BFS).

For iperf, running both the server and client on the router, the bandwidth did go up (from 315-326 Mbits/sec with CFS to 382-386 Mbits/sec with BFS). I'm guessing this is just because BFS is better able to allocate a fair share of resources to the server and client on the same router.
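To put numbers on the two comparisons above, here is a small sketch that computes the relative change between the CFS and BFS figures taken from the transcripts. The helper name `pct_change` is made up for illustration. Note that netperf itself reported throughput confidence intervals of 3.451% and 9.218% for the two runs, so a difference of about 3% is plausibly within run-to-run noise.

```shell
pct_change() {
    # $1 = baseline (CFS), $2 = new value (BFS); prints the signed percent change
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%+.1f\n", (b - a) / a * 100 }'
}

pct_change 11.79 11.42   # netperf throughput, MBytes/s   -> -3.1
pct_change 326 386       # iperf first run, Mbits/sec     -> +18.4
pct_change 315 382       # iperf second run, Mbits/sec    -> +21.3
```

With only two iperf runs per scheduler, the gain of roughly 18-21% is an indication rather than a measurement.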

Anyways, just an FYI post. Perhaps you can let me know how to test latency to see if it improves.

History

Updated by Dave Täht over 1 year ago

That's the third or fourth time I've seen remote CPU utilization go beyond 100% on a single-CPU box.

I don't think that better CPU scheduling is going to be too helpful, though it may help on things like Samba. I have found, however, that running at higher clock rates (256 Hz was the last I tried) seemed to improve responsiveness somewhat. I hope one day to quantify that with a non-debug build. (I was at one point achieving > 500 Mbit throughput through these boxes and now rarely crack 260, but I have tons of debug code on at present.)

The most productive thing I've found is to run a given benchmark through the router, not on the router, and look at the results with oprofile.
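A rough sketch of what such a profiling session could look like with the legacy `opcontrol` interface to oprofile, assuming oprofile is built into the image. The function name `profile_window` and the `VMLINUX` variable are hypothetical; the path to an uncompressed kernel image depends on the build and is not given in this thread.

```shell
profile_window() {
    # $1 = seconds to sample while another host pushes traffic through the router
    opcontrol --init                       # load the oprofile kernel support
    if [ -n "${VMLINUX:-}" ]; then
        opcontrol --vmlinux="$VMLINUX"     # resolve kernel symbols from an uncompressed image
    else
        opcontrol --no-vmlinux             # no image available: kernel samples stay unresolved
    fi
    opcontrol --reset                      # discard samples from earlier sessions
    opcontrol --start                      # begin sampling
    sleep "$1"
    opcontrol --dump                       # flush collected samples to disk
    opcontrol --shutdown                   # stop sampling and the daemon
    opreport -l                            # per-symbol report
}
```

For example, start `profile_window 30` on the router, then run netperf from a LAN host to a host on the other side of the router during that window, so the benchmark's own CPU cost does not land on the router.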

Updated by Dave Täht over 1 year ago

  • Category set to Linux Kernel
  • Status changed from Feedback to Closed
  • Target version set to 14

From a CPU perspective there are other things to try than the scheduler, but I would suggest oprofiling under your workload first, rather than anything else.

Updated by Dave Täht about 1 year ago

  • Target version changed from 14 to 1st Public Cerowrt release
