[Beowulf] Re: Memory issue with Quad Dual-Core Opteron w/Two NICs

Jeremy Fleming jtfleming at gmail.com
Tue Sep 25 18:07:46 PDT 2007

I didn't realize the Xen kernel in RHEL 5 was not NUMA-aware; I figured this
out today after trying to run numactl --hardware.  I switched RHEL 5 to boot
the other default kernel, which is NUMA-aware, and everything works!

Hopefully this will be helpful for someone else in the future.
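
For anyone who wants a quick check without numactl, here is a minimal sketch.
It assumes the usual sysfs layout, where a NUMA-aware kernel exposes one
/sys/devices/system/node/nodeN directory per node; the function name
numa_nodes is mine, not part of any tool.

```python
import os
import re


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return the sorted NUMA node ids found under sysfs_root, or [] if none."""
    try:
        entries = os.listdir(sysfs_root)
    except OSError:
        # Directory missing or unreadable: treat as "no NUMA info exposed".
        return []
    ids = []
    for name in entries:
        match = re.fullmatch(r"node(\d+)", name)
        if match:
            ids.append(int(match.group(1)))
    return sorted(ids)


if __name__ == "__main__":
    nodes = numa_nodes()
    if len(nodes) > 1:
        print("kernel exposes NUMA nodes:", nodes)
    else:
        print("kernel exposes at most one node:", nodes)
```

On a four-node box like this one, I would expect a NUMA-aware kernel to list
four nodes; an empty or single-entry list suggests the running kernel is not
presenting the NUMA topology.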

On 9/24/07, Jeremy Fleming <jtfleming at gmail.com> wrote:
> I have a quad-node Opteron machine where each node is a dual-core
> processor with each core running at 2.0 GHz.  The machine has 64 GB of RAM,
> two Broadcom gigabit Ethernet cards, and two Intel gigabit cards, each
> supplying 2 ports and supported by the e1000 driver.  The machine is
> running the default install of Red Hat Enterprise Linux 5.0 (original
> release, no patches or updates).
> Remote machines are supplying ~512 megabit/sec streams over gigabit
> ethernet to this machine.  There are two streams on separate ethernet
> lines.  I have each stream connected to a different port on one of the intel
> cards.  The streams are sent via multicast, and there are 4 sub-streams per
> ethernet line.  Each substream is approximately 131.072 megabits/sec.
> On the opteron machine I have a process that can pull a substream off of
> an ethernet port and dump it to a ring buffer in shared memory.  To start,
> the process could never keep up with receiving the data via ethernet and
> then doing a memcpy to shared memory.  Then I found out about NUMA and
> decided to use sched_setaffinity to bind the process to a CPU.  I bound the
> process to the same CPU the ethernet card is bound to via its IRQ.
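
The bind-then-allocate order described above can be sketched as follows. This
is only an illustration in Python using os.sched_setaffinity, the standard
library wrapper around the same system call; the original process was not
necessarily written this way, and bind_to_cpu and the buffer size are
hypothetical. It relies on the usual Linux default of placing pages on the
node of the CPU that first touches them.

```python
import os


def bind_to_cpu(cpu, pid=0):
    """Restrict pid (0 = calling process) to one CPU; return the new affinity."""
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Placeholder choice: in practice this would be the CPU the NIC's IRQ is on.
    cpu = min(os.sched_getaffinity(0))
    print("now restricted to CPUs:", bind_to_cpu(cpu))

    # Allocate and touch the buffer only AFTER binding, so that first-touch
    # placement puts its pages on the bound CPU's node.
    ring = bytearray(1024 * 1024)
    for offset in range(0, len(ring), 4096):
        ring[offset] = 0
```
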
> I looked in /proc/interrupts and found "eth0" or "eth1", looked up its
> IRQ, then went into /proc/irq/<eth 0 IRQ>/smp_affinity, and checked which
> CPU the IRQ was bound to.  I bound the process to that processor and ran it
> again.  Luckily there was no data loss and it could keep up.  I bound the
> process before I allocated memory, so the memory was bound to the same
> processor too.  I was even able to run three more processes, bound to the
> same CPU, and have all 4 read the sub-streams from the ethernet device
> eth0 with no data loss.  I can even run another process which reads from
> the ring buffer and dumps the data to disk, and it causes no slowdowns or
> data loss.
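
The manual lookup above (interface name, then IRQ, then CPU mask) can be
scripted. A rough sketch, assuming the common /proc/interrupts layout where
the IRQ number is followed by per-CPU counts and the device names come last,
and that smp_affinity holds a hex bitmask (possibly in comma-separated 32-bit
groups). The helper names irqs_for and mask_to_cpus are made up for this
example.

```python
def irqs_for(interrupts_text, ifname):
    """Return the numeric IRQs whose /proc/interrupts line names ifname."""
    found = []
    for line in interrupts_text.splitlines():
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue  # header line, or named rows such as NMI/LOC
        # Shared IRQs list several devices separated by commas.
        if ifname in rest.replace(",", " ").split():
            found.append(int(label))
    return found


def mask_to_cpus(mask_text):
    """Decode an smp_affinity hex mask into a sorted list of CPU numbers."""
    value = int(mask_text.strip().replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as f:
            text = f.read()
        for irq in irqs_for(text, "eth0"):
            with open("/proc/irq/%d/smp_affinity" % irq) as f:
                print("IRQ", irq, "-> CPUs", mask_to_cpus(f.read()))
    except OSError as exc:
        print("could not read IRQ info:", exc)
```
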
> Now I want to read a substream from the other stream connected to eth1
> while reading from the other 4 sub-streams.  I start that up just by binding
> the same application to the processor associated with eth1, by checking
> "/proc/irq/<eth 1 IRQ>/smp_affinity".  When the process starts, the system
> can no longer keep up, just like in the beginning when I just had 1
> processor reading one stream without doing anything else.  I thought I was
> just trying to do too much work, so I turned off all streams and ran just
> two processes bound to two different processors, each bound to the same
> processor as the associated eth device.  I ran them both, and they lose
> data.  If I run them separately they work fine, but when I read 1
> sub-stream from each of the two unique streams they fail.
> Are the two ethernet devices dumping their multicast data into kernel
> buffers associated with different processors?
> How do I know what processor the kernel ethernet buffers are associated
> with?
> Is there a way to set CPU affinity for ethernet devices before they boot up
> so I know which processor's memory they are dumping data to?
> Any ideas on why there would be a problem with reading a stream from each
> eth device at the same time and not with reading 4 streams from one eth
> device?
> Do I need to turn a NUMA-aware scheduler on somehow, or is that on by
> default in RHEL 5?
> I also noticed that Linux assigns IRQs at bootup that vary with each boot;
> is there a way to statically assign IRQs to the ethernet cards?
> Any help or pointers at all would be great!
> Thanks in advance
> Jeremy
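
On the question of controlling which CPU services an ethernet IRQ: the
affinity can be changed at runtime by writing a hex mask to
/proc/irq/<IRQ>/smp_affinity as root. This does not fix the IRQ number itself
across boots, and a balancing daemon such as irqbalance may rewrite the mask
later. A minimal sketch, with hypothetical helper names and a proc_root
parameter added only so it can be tried against a scratch directory:

```python
import os


def cpus_to_mask(cpus):
    """Encode CPU numbers as an smp_affinity mask in 32-bit hex groups."""
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    words = []
    while True:
        words.append("%08x" % (value & 0xFFFFFFFF))
        value >>= 32
        if not value:
            break
    # Most significant group first, as the kernel prints it.
    return ",".join(reversed(words))


def set_irq_affinity(irq, cpus, proc_root="/proc"):
    """Write the mask for cpus to <proc_root>/irq/<irq>/smp_affinity."""
    path = os.path.join(proc_root, "irq", str(irq), "smp_affinity")
    with open(path, "w") as f:
        f.write(cpus_to_mask(cpus))
    return path


if __name__ == "__main__":
    # Only prints the mask; the real write needs root and a valid IRQ number.
    print("mask for CPU 2:", cpus_to_mask([2]))
```

Calling set_irq_affinity with the IRQ found for eth1 and a single CPU on the
desired node, before starting the receiving process, would be one way to make
the IRQ placement predictable from boot to boot.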