Analyze and fix selenium grid / RC stability issues with firefox (and other browsers)

By neokrates, written on June 24, 2010



We had one particular stability problem with Selenium Grid. On Linux, during particularly “heavy” tests with many Firefox instances running, some browsers seemed to disappear. We found several causes for that behavior and stabilized our tests. Here are some lessons we learned along the way.


✔ Hudson 1.355

✔ Firefox 3.5.9

✔ Firefox 3.6.3

✔ Swiftfox 3.6.3

✔ Ubuntu Linux 9.x


✔ Debian GNU/Linux 5.0.3


  • Selenium grid remote control standalone 1.0.4
  • Selenium server 1.0.1
  • Selenium grid hub standalone 1.0.4
  • Selenium grid tools standalone 1.0.4

Should also work for:

✔ Other Hudson, Unix system, browser and Grid combinations


Problem ONE. Memory issues, overcommit and general OOM


This problem is tied to memory overcommit, which is controlled by /proc/sys/vm/overcommit_memory (0 = heuristic overcommit, the kernel default; 1 = always overcommit; 2 = strict accounting, no overcommit beyond the commit limit). Overcommit means that Linux promises processes more memory than there actually is. In the rare case where all such processes use their committed memory at the same time, the kernel must decide which process to kill. It then writes “killed” and/or “out of memory” messages to /var/log/….

To be safe, make sure overcommit_memory is set to 0 (run as root):

echo "0" > /proc/sys/vm/overcommit_memory
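Before changing the value, it can help to see what is currently configured. The following is a minimal sketch, not part of our original setup: the function names `describe_overcommit_mode` and `current_overcommit_mode` are made up for illustration, and the optional path argument exists only so the reader can point the function at another file.

```shell
# Translate a vm.overcommit_memory value into a human-readable description.
describe_overcommit_mode() {
    case "$1" in
        0) echo "heuristic overcommit (kernel default)" ;;
        1) echo "always overcommit" ;;
        2) echo "strict accounting, no overcommit beyond the commit limit" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Print the current mode. The optional path argument is a hypothetical
# convenience; by default the real kernel setting is read.
current_overcommit_mode() {
    cat "${1:-/proc/sys/vm/overcommit_memory}"
}

# Example call (uncomment to run):
# describe_overcommit_mode "$(current_overcommit_mode)"
```

A value written with echo is lost at the next reboot. The same setting can be applied with `sysctl -w vm.overcommit_memory=0`, and it is usually made permanent by adding `vm.overcommit_memory = 0` to /etc/sysctl.conf.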


Out-of-memory (OOM) killer

When the system runs short on memory, the OOM killer kills a process that can look arbitrary from your point of view, say rpm or syslog, and the programmer is unable to do anything about it.

That is a risk factor. You can check whether your browsers were hit by it by looking at /var/log/messages:

less /var/log/messages
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444818] firefox invoked oom-killer: gfp_mask=0x280da, order=0, oomkilladj=0
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444824] firefox cpuset=/mems_allowed=0
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444828] Pid: 25669, comm: firefox Tainted: P 2.6.31-20-generic #58-Ubuntu
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444830] Call Trace:
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444841] [<c01b5a2f>] oom_kill_process+0x9f/0x250
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444845] [<c01b603e>] ? select_bad_process+0xbe/0xf0
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444848] [<c01b60c1>] __out_of_memory+0x51/0xa0
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444851] [<c01b6163>] out_of_memory+0x53/0xb0
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444854] [<c01b83f6>] __alloc_pages_slowpath+0x3f6/0x490
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444858] [<c01b859f>] __alloc_pages_nodemask+0x10f/0x120
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444861] [<c01ca1f6>] do_anonymous_page+0x66/0x200
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444865] [<c012d2fd>] ? kmap_atomic_prot+0xcd/0xf0
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444868] [<c01cc5e0>] handle_mm_fault+0x330/0x380
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444874] [<c05760f8>] do_page_fault+0x148/0x380
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444877] [<c0575fb0>] ? do_page_fault+0x0/0x380
Jun 21 18:41:23 diuw-desktop kernel: [2704707.444880] [<c0573fe3>] error_code+0x73/0x80
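Paging through a large log by hand is tedious, so the search can be scripted. This is a rough sketch that assumes the kernel messages look like the excerpt above ("NAME invoked oom-killer"). The function names `oom_invokers` and `oom_event_count` are hypothetical, and the log path may differ on your distribution.

```shell
# Print the unique names of processes whose allocation triggered the OOM killer.
# $1: log file to scan (defaults to /var/log/messages, as used above).
oom_invokers() {
    grep 'invoked oom-killer' "${1:-/var/log/messages}" \
        | sed -E 's/.*\] ([^ ]+) invoked oom-killer.*/\1/' \
        | sort -u
}

# Count how many times the OOM killer was invoked according to the log.
oom_event_count() {
    grep -c 'invoked oom-killer' "${1:-/var/log/messages}"
}

# Example calls (uncomment to run):
# oom_invokers /var/log/messages
# oom_event_count /var/log/messages
```

Note that the process that invoked the OOM killer is the one whose allocation failed; it is not necessarily the process that was killed.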

You might want to read this for more info:


Problem TWO. Firefox is unstable

On Debian Lenny, we had some stability issues with Firefox 3.5.7 and 3.6.3. This became clear when I saw that the Selenium Hub<->RC connection was still there, but the browser was gone.

After some tests and research we found that the Swiftfox build of Firefox is generally faster and didn’t crash. (It normally replaces ‘classic’ Firefox, and the user doesn’t notice that it is a “different kind of fox” :). The user simply gets more speed and stability with Swiftfox.)

💡 Simple way to install:

1. Go
2. Select your processor
3. Download
4. sudo dpkg -i swiftfox-YOUR-VERSION-HERE.deb

The general idea for any OS/browser combination is to try different browser versions, or to look through the bug lists for your particular browser version.


Problem THREE. Too many parallel builds (and open browsers) freeze the system.

When the system runs low on resources, the performance problem will most likely turn into a functional problem for the Grid and for all started browsers.

How to identify this kind of problem:

⭐ Use the Linux top command to see current CPU and MEM usage. If it goes too high, builds may slow down so much that they fail.

⭐ Just try to execute commands in the console. If the system reacts slowly (15-30 sec delay), it is under heavy load.

⭐ Check the build times. They tend to be 30-70% longer if the system is overloaded.

We reduced the number of parallel builds and running RCs to solve the system overload issue.
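The manual checks above can also be approximated in a script, for example before a build node accepts another job. This is only a sketch under assumptions: the function names `is_overloaded` and `slowdown_percent` are invented for illustration, and the "load average greater than number of cores" threshold is a common rule of thumb, not something we measured.

```shell
# Succeed (exit status 0) when the load average exceeds the number of CPU cores.
# $1: 1-minute load average, $2: number of cores.
# The "load > cores" threshold is an assumption; tune it for your machines.
is_overloaded() {
    awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load + 0 > cores + 0) }'
}

# Print by how many percent a build got slower compared to a baseline.
# $1: baseline duration in seconds, $2: current duration in seconds.
slowdown_percent() {
    awk -v base="$1" -v now="$2" 'BEGIN { printf "%d\n", (now - base) * 100 / base }'
}

# Example calls on Linux (uncomment to run):
# is_overloaded "$(cut -d ' ' -f1 /proc/loadavg)" "$(nproc)" && echo "system is overloaded"
# slowdown_percent 600 900
```

A slowdown in the 30-70% range reported by `slowdown_percent`, together with a positive `is_overloaded` result, would be a hint to lower the number of parallel builds or RCs on that machine.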


Problem FOUR. Browser and plug-ins

Does the browser disappear while testing particular pages?

Do you test pages with Flash, Silverlight, Windows Media etc.? If your browser tends to disappear while testing particular pages, the browser<->plugin interaction might be the root cause. Read through the bug reports, and maybe upgrade the browser or just the plug-in. We had some issues with Flash. They were gone once we introduced a newer Firefox version.

That’s it, have fun 8)
