Why Broadcom 802.11 Linux STA driver sucks, and how to fix it

TL;DR – the Broadcom STA Linux driver always fails the first scan request after the interface is brought up, which produces a long delay when connecting to a wireless network. There’s an open source driver which does not have this problem, but its power management is worse. In this post I describe the steps I took to pinpoint the problem in the proprietary driver and to fix it.

The story begins when I updated Ubuntu from 11.10 to 12.04 on my MacBook Air. Everything worked fine after upgrading except one thing that bothered me a lot: when resuming the laptop after suspending it, it took around 30 seconds to connect to my wireless network. It wouldn’t have bothered me if it had been the same in 11.10, but in 11.10 the time to connect was barely 5 or 6 seconds, so having to wait 30 seconds was totally unacceptable.

Initially I thought it was a bug in NetworkManager, so I increased the debug level in its config file, only to conclude that I was using a different driver in 12.04 than in 11.10.

There are two drivers available for the Broadcom BCM4353 802.11 Wireless Controller: wl (the Broadcom proprietary driver) and brcmsmac (the open source driver).

Both were installed in my Ubuntu 11.10, but the open source driver was used by default, and this driver connected to the wifi network in 5 seconds.

In Ubuntu 12.04, the wl proprietary driver provided by the package bcmwl-kernel-source has been updated from version 5.100.82.38+bdcom-0ubuntu4 to version 5.100.82.38+bdcom-0ubuntu6.1 which includes the following fix:

bcmwl (5.100.82.38+bdcom-0ubuntu6.1) precise-proposed; urgency=low

  * debian/bcmwl-kernel-source.postinst:
    - Blacklist brcmfmac, brcmsmac and bcma so that they don't
      conflict with the closed driver (LP: #873117)
 -- Alberto Milone  Mon, 23 Apr 2012 16:11:56 +0200

This basically blacklists the open source brcmsmac module, forcing the wl proprietary driver to be used. When brcmsmac was not blacklisted, even if the wl driver was loaded it failed silently and brcmsmac was used instead.

So, the easy path to solve my problem would have been to blacklist the wl module, add brcmsmac to /etc/modules, and live happily with the 5 seconds needed to associate when resuming. *BUT* I compared both drivers, and the proprietary driver has better signal and way better power management, which makes my battery last longer, so I decided to go the long route. My goal was to achieve the lowest possible delay when connecting to a wireless network after resuming from suspend using the proprietary driver.

I went “down” one level and started looking at wpa_supplicant, as NetworkManager communicates with it using the D-Bus control interface (dbus-monitor showed the problem was not in the D-Bus communication). So I increased the debug level in wpa_supplicant by adding the ‘-dd’ switch in /usr/share/dbus-1/system-services/fi.w1.wpa_supplicant1.service, and looked through the logs, which quickly revealed the following:

May 20 11:49:46 maco wpa_supplicant[12610]: Scan requested (ret=0) - scan timeout 5 seconds
May 20 11:49:52 maco wpa_supplicant[12610]: Scan timeout - try to get results
May 20 11:49:52 maco wpa_supplicant[12610]: Failed to get scan results
May 20 11:49:52 maco wpa_supplicant[12610]: Failed to get scan results - try scanning again
May 20 11:50:07 maco wpa_supplicant[12610]: Scan requested (ret=0) - scan timeout 5 seconds

The first scan request (SIOCSIWSCAN) after the wireless interface was brought up always failed (?). Once its timeout expired, wpa_supplicant still tried to get the scan results (SIOCGIWSCAN), because some drivers do not deliver SIOCGIWSCAN events to notify when a scan is complete, but this failed too. So wpa_supplicant requested a second scan after another timeout, which properly delivered the results this time. This first failing scan was adding 21 seconds of delay to the network association process.
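
The 21-second figure can be read straight off the log timestamps above. A minimal C sketch of that arithmetic follows; the helper names hms_to_seconds() and seconds_between() are made up for illustration, and it assumes both timestamps fall on the same day.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Convert an "HH:MM:SS" syslog time into seconds since midnight.
 * Returns -1 if the string is malformed or out of range. */
static int hms_to_seconds(const char *hms)
{
    int h, m, s;

    assert(hms != NULL);
    if (sscanf(hms, "%d:%d:%d", &h, &m, &s) != 3)
        return -1;
    if (h < 0 || h > 23 || m < 0 || m > 59 || s < 0 || s > 59)
        return -1;
    return h * 3600 + m * 60 + s;
}

/* Seconds elapsed from start to end (same day assumed).
 * Returns -1 on bad input or if end is earlier than start. */
static int seconds_between(const char *start, const char *end)
{
    int a = hms_to_seconds(start);
    int b = hms_to_seconds(end);

    if (a < 0 || b < 0 || b < a)
        return -1;
    return b - a;
}
```

With the log above, seconds_between("11:49:46", "11:49:52") gives 6 (the first scan timing out) and seconds_between("11:49:46", "11:50:07") gives 21 (the gap until the second scan request).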

I googled the error and found I was not the only soul affected by this problem. Kalle Valo submitted 4 different patches to the hostap mailing list between October 2010 and March 2011, but the patches were never accepted upstream, nor included in the Ubuntu package. The wpa_supplicant code has changed a bit since Kalle submitted his patches, so I adapted them to the current wpa_supplicant version in Ubuntu. If you are curious, you can dig through Ubuntu bug #994739.

In short, the version 4 patch from Kalle patches the WEXT driver in wpa_supplicant to check the return value when trying to get scan results (SIOCGIWSCAN) from the wl driver. If errno is EINVAL on the first scan, it requests another scan; this one goes through (as only the first one fails), and the results are returned the next time wpa_supplicant tries to get them. This is far from perfect, but it works (and doesn’t seem to break anything), reducing the time needed to associate to the wireless network after the interface has been brought up from 30 seconds to 12 seconds.
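
The decision that patch makes can be sketched roughly as follows. This is not Kalle’s code nor anything from wpa_supplicant: the enum, the struct and the function name are hypothetical, and I’m assuming the patched path re-requests the scan right away instead of waiting for the usual timeout.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* What the supplicant does after trying to fetch scan results. */
enum scan_action {
    USE_RESULTS,          /* results were fetched, go on and associate */
    RESCAN_NOW,           /* known first-scan failure: request a new scan */
    RESCAN_AFTER_TIMEOUT  /* any other failure: the unpatched fallback */
};

/* Hypothetical per-interface state, not a wpa_supplicant structure. */
struct scan_state {
    bool first_fetch_done;
};

/* ret and err mimic the return value and errno of the SIOCGIWSCAN ioctl. */
static enum scan_action after_get_results(struct scan_state *st, int ret, int err)
{
    bool is_first;

    assert(st != NULL);
    is_first = !st->first_fetch_done;
    st->first_fetch_done = true;

    if (ret == 0)
        return USE_RESULTS;
    if (is_first && err == EINVAL)
        return RESCAN_NOW;  /* only the first scan is known to fail this way */
    return RESCAN_AFTER_TIMEOUT;
}
```

The special case is limited to the first fetch on purpose: a later EINVAL could be a genuine error, so it falls back to the normal retry path.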

But I was still unhappy with this result, so I patched the wpa_supplicant code to request a scan right after the driver init function, so that this would be the first “failing” scan and the real scan requested a bit later would return results. This was a very ugly patch, because it made wpa_supplicant request a scan in the INACTIVE state (when it should be SCANNING), but it worked and reduced the time from 30 seconds to 10 seconds.

So, still unhappy with the results, I decided to go down one more level and have a look at the GPL’d source of Broadcom’s proprietary Linux STA driver, and BINGO! This is how the wl_iw_set_scan() function ends:

        (void) dev_wlc_ioctl(dev, WLC_SCAN, &ssid, sizeof(ssid));
	
        return 0;

They always return 0, even when the dev_wlc_ioctl() function fails!! And WTH is the result cast to void?? It would have been easier to just return the result of this function! Patching it to do so shows that the first scan after the interface is up fails with errno EBUSY (device or resource busy), so I added a workaround here that keeps requesting the scan from the underlying hardware until it returns something other than EBUSY, which can then be correctly handled by wpa_supplicant, et voilà, time reduced to 10 seconds.
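
The idea behind that workaround looks roughly like this. It is a user-space sketch, not the actual driver patch: set_scan_with_retry(), the callback type and the fake device are all hypothetical stand-ins for the call to dev_wlc_ioctl(), and the retry cap is my own addition so the loop cannot spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for the WLC_SCAN ioctl call: returns 0 or a negative errno. */
typedef int (*scan_ioctl_fn)(void *ctx);

/* Request a scan, retrying while the device reports -EBUSY, and return
 * the real result instead of a hard-coded 0. A real driver would likely
 * sleep between attempts; that is left out here. */
static int set_scan_with_retry(scan_ioctl_fn do_scan, void *ctx, int max_attempts)
{
    int err = -EBUSY;
    int attempt;

    assert(do_scan != NULL && max_attempts > 0);
    for (attempt = 0; attempt < max_attempts; attempt++) {
        err = do_scan(ctx);
        if (err != -EBUSY)
            break;  /* success, or an error the caller must see */
    }
    return err;
}

/* Fake device that is busy for its first few scan requests. */
struct fake_dev {
    int busy_calls_left;
    int calls;
};

static int fake_scan(void *ctx)
{
    struct fake_dev *dev = (struct fake_dev *)ctx;

    dev->calls++;
    if (dev->busy_calls_left > 0) {
        dev->busy_calls_left--;
        return -EBUSY;
    }
    return 0;
}
```

A fake device that is busy once succeeds on the second call, which matches what I saw: only the first scan after bringing the interface up fails. If the device stays busy for every attempt, the caller gets -EBUSY back instead of a misleading 0.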

But hey, now that I had looked at their source, it turns out that there’s a newer version available on Broadcom’s website: 5.100.82.112. This version supports the new Linux cfg80211 wireless configuration API in addition to the older Wireless Extensions (WEXT), and you can choose between CFG80211 or WEXT at compile time. The Ubuntu package broadcom-sta-dkms in the development release for 12.10 ‘Quantal Quetzal’ has been updated to this version, but it still uses the old WEXT code, which is still broken (always returns 0, remember above?). But guess what: they have done it correctly this time in the new CFG80211 code. See the end of the function __wl_cfg80211_scan():

        err = wl_dev_ioctl(ndev, WLC_SCAN, &sr->ssid, sizeof(sr->ssid));
        if (err) {
                if (err == -EBUSY) {
                        WL_INF(("system busy : scan for \"%s\" "
                                "canceled\n", sr->ssid.SSID));
                } else {
                        WL_ERR(("WLC_SCAN error (%d)\n", err));
                }
                goto scan_out;
        }
	
        return 0;
	
scan_out:
        clear_bit(WL_STATUS_SCANNING, &wl->status);
        wl->scan_request = NULL;
        return err;

As you can see, they now return -EBUSY when the driver cannot perform the scan, and wpa_supplicant can manage this situation correctly. So I quickly backported the broadcom-sta-dkms package from Ubuntu 12.10 to Ubuntu 12.04 and added a patch to compile it with CFG80211 enabled, and finally I CAN HAS ONLY 8 SECONDS DELAY!!!!1 to associate to the wifi network after my laptop resumes from suspend using the wl driver, and I’m a happy camper! :D


19 Responses to Why Broadcom 802.11 Linux STA driver sucks, and how to fix it

  1. Iolanda says:

    I can assure you it’s been a hard road to get the Wifi connected in less than 10 sec. Proud of this achievement :)

  2. pof says:

    Ubuntu packages are available in poliva/pof ppa, sources on github:

    sudo add-apt-repository ppa:poliva/pof
    sudo apt-get update
    sudo apt-get install broadcom-sta-dkms

    You might want to remove the old wl module first, if you have it installed:

    sudo apt-get purge bcmwl-kernel-source

  3. Laurent says:

    This is brilliant work! thanks for sharing. Looks like my macbook is finally as snappy as under OSX :)

  4. Chris says:

    Ah so much better! Thanks for working on this.

  5. You shouldn’t have to patch broadcom-sta to build with the cfg80211 API; it’s supposed to do it automatically if you’re running a kernel newer than 2.6.32, which should be the case anyway on Precise.

    Have you found out that this in fact wasn’t working properly?

  6. Brian Kloppenborg says:

    Greetings. Thanks for putting this into a PPA. It makes maintenance so much easier.

    I’m running 3.2.0-29-generic on Ubuntu 12.04 64-bit with a BCM4313. I’ve noticed very inconsistent performance with this hardware setup and the broadcom-sta drivers. Ping times range from 0.1 ms to 9 seconds (although what I show below is more typical):

    $ ping 192.168.0.1
    PING 192.168.0.1 (192.168.0.1) 56(84) bytes of data.
    64 bytes from 192.168.0.1: icmp_req=1 ttl=61 time=124 ms
    64 bytes from 192.168.0.1: icmp_req=2 ttl=61 time=16.9 ms
    64 bytes from 192.168.0.1: icmp_req=3 ttl=61 time=115 ms
    64 bytes from 192.168.0.1: icmp_req=4 ttl=61 time=30.8 ms
    64 bytes from 192.168.0.1: icmp_req=5 ttl=61 time=56.9 ms
    64 bytes from 192.168.0.1: icmp_req=6 ttl=61 time=167 ms
    64 bytes from 192.168.0.1: icmp_req=7 ttl=61 time=54.7 ms
    64 bytes from 192.168.0.1: icmp_req=8 ttl=61 time=179 ms
    64 bytes from 192.168.0.1: icmp_req=9 ttl=61 time=42.1 ms
    64 bytes from 192.168.0.1: icmp_req=10 ttl=61 time=5.76 ms
    --- 192.168.0.1 ping statistics ---
    10 packets transmitted, 10 received, 0% packet loss, time 9012ms
    rtt min/avg/max/mdev = 5.766/79.453/179.395/59.439 ms

    On Windows the same hardware gets 1-2 ms ping times. The problem isn’t limited to the 5.100.82.112 driver; similar performance exists with the 5.100.82.38 version as well. Any thoughts on how I could troubleshoot this?

    • Brian Kloppenborg says:

      I got this sorted out. I just had to lower the level of power management (shutting it off fixed the issue entirely) using
      iwconfig wlan0 power off
      Some level of power management is, of course, preferable. Experiment with the different levels to find a good mix of power saving and usability.

  7. Aniello Del Sorbo says:

    Thanks for this!

    It helped here on my HP!

  8. Carl says:

    Great work! The long connect times irked me no end. A lot better now, thanks!

  9. fascht says:

    <3, after 20 hours useless things. 3 simple steps solves it.
    lenovo edge 335.
    12.04 =)

  10. Αναστάσης says:

    I have the same problem on Debian Wheezy/Testing 7. I’ll probably try it and see the results. If somebody has already tried it on Debian, please share your experience.

  11. Andrew says:

    At last a post that works!! Now my Inspiron R gets online faster than an 8 year old laptop!
    Many thanks for the repository all so easy :).

  12. Nathan Caldwell says:

    I just wanted to point out that the upcoming Raring driver has switched over to the cfg80211 API. I managed to get this version backported to Quantal. So now all that is required is to enable quantal-backports and install broadcom-sta-dkms/quantal-backports.

  13. Paul says:

    Thanks a lot ^^

  14. lotusbaba says:

    I tried the following on my HP laptop
    1. Installed Centos 6.4 which by default installs a 2.6 kernel that doesn’t have a Broadcom driver
    2. Tried installing the latest kernel 3.9.3, which contains the Broadcom driver, but the system just didn’t boot. Probably because I’ve configured it to boot in init 5 and maybe the GNOME desktop that comes with CentOS 6.4 isn’t yet compatible with the 3.9.3 kernel
    3. Next I compiled/installed the kernel sources containing the Broadcom driver but closest to the 2.6 kernel which happens to be 3.1.1
    4. When I booted back I saw that the
    5. I then went to “make menuconfig” ->Device Drivers ->Network device support -> Wireless LAN,
    and deselected “Support for BCMA bus” but retained selection for every option containing “Broadcom 43xx”
    6. When I booted back the wireless still wasn’t working. So I checked the dmesg and discovered that it didn’t load because of missing firmware in /lib/brcm
    7. I simply copied the two firmware files available in the link “http://code.deeproot.in/deepofix/browser/trunk/var/rootfs/lib/firmware/brcm?rev=693” into the /lib/brcm directory
    8. I rebooted and voilà, my wireless device works perfectly
    9. Now I’m gonna try upgrading the kernel bit by bit
    Cheers!
    LotusBaba

  15. John Rose says:

    Dear pof,

    I am amazed at your mastery of Linux. Personally I have very little technical knowledge but have a problem which has a bit in common with yours; I have been battling for two weeks to solve it and would greatly appreciate your advice.

    I just upgraded to Ubuntu 12.04 LTS (kernel 3.2.0-45) on my Dell Vostro 3700 with Broadcom BCM4313 wifi chip. Access points using channels 12 and 13 (allowed in the EU but not in the USA) are not recognised while those using channels 1-11 are. It is not a hardware problem since when I boot Windows XP (which I hate to do) on the same computer, it connects fine. I cannot change the access point channel (number 13 = 2472 MHz) where I am on vacation, but next week I will be home and can have access to my router, wired access, better back-up facilities, etc. Since I sometimes take this computer on travel, I would really like to get channels 12 and 13 to work. I would also like to keep 12.04 LTS which is good for 4 more years.

    I started with the Proprietary Broadcom STA Wireless driver (wl) which worked fine with channels 1-11. I found advice on the web that changing the ieee80211_regdom parameter to EU or to FR (my country, France) would do the trick. I could change this by setting it in an “options cfg80211” line in a conf file in /etc/modprobe.d/ or with iw reg set xx. In both cases the country domain was effectively changed, but there was no change in access to channel 13 (even after removing/reloading wl with modprobe or rebooting). My driver version was 6.20.155.1 (more recent than the one which was giving you trouble).

    I saw several references on the web saying that the open source brcmsmac is better (but others saying that the proprietary wl driver is better), and one reference to a fix to brcmsmac in kernel versions 3.2 and 3.3 to resolve a channel 12 & 13 problem, so I uninstalled wl (by unselecting it under System tools/System parameters/Additional drivers, can’t get it back right now since no internet access), and found that brcmsmac loaded upon boot. Exactly the same problem: access points using channels 1-11 are recognised, but not mine with channel 13. [Please note that I cannot be sure that brcmsmac is working for channels 1-11 since I do not right now have access to an access point using these channels, but it seems to be behaving correctly.] A difference relative to the wl driver is that the logical device is set to wlan0 instead of eth1. Another is that brcmsmac seems to set the country domain code to “00” (apparently meaning broadest international) when asked to set it to “EU” or “FR”:

    john@JOHN-PC:/etc/modprobe.d$ sudo iw reg set "FR"
    john@JOHN-PC:/etc/modprobe.d$ iw reg get
    country 00:
        (2402 - 2472 @ 40), (3, 20)
        (2457 - 2482 @ 20), (3, 20), PASSIVE-SCAN, NO-IBSS
        (2474 - 2494 @ 20), (3, 20), NO-OFDM, PASSIVE-SCAN, NO-IBSS
        (5170 - 5250 @ 40), (3, 20), PASSIVE-SCAN, NO-IBSS
        (5735 - 5835 @ 40), (3, 20), PASSIVE-SCAN, NO-IBSS

    The Linux Wireless page on brcmsmac says that the country domain code X2 is used (not 00), but iw cannot set to X2.

    A post in December 2011 (thus before 12.04) gives a fix for the channel.c program within brcmsmac, but says that the channel 12 & 13 problem should have been fixed by the above-mentioned kernel fix. At any rate I can’t find the channel.c source code in my system – seems that it is incorporated in the kernel?

    I would greatly appreciate it if you could give me some guidance.

    Thanks and best regards, John

  16. Ramiro says:

    Just passed by to say thank you!! I had signal strength problems with my Broadcom wireless under Ubuntu 12.04, and applying this patch improved the performance of the connection a lot. Thanks!

    This needs to be applied mainstream right now! :)

    Cheers!
