[Beowulf] Memory issue with Quad Dual-Core Opteron w/Two NICs

Bruce Allen ballen at gravity.phys.uwm.edu
Thu Sep 27 19:39:50 PDT 2007


Jeremy,

You might be better off posting to the Linux Kernel Mailing List (LKML) 
about this issue.  There are a few experts here (Don, Joe, Mike, ...) who 
might know, but LKML is more likely to give you correct guidance quickly.

cheers,
 	Bruce

On Mon, 24 Sep 2007, Jeremy Fleming wrote:

> I have a quad Opteron node machine where each node is a dual core with each
> core running at 2.0 GHz.  The machine has 64 GB of RAM, two Broadcom gigabit
> ethernet cards, and 2 other Intel gigabit cards, each supplying 2 ports and
> supported by the e1000 driver.  The machine is running the default
> install of Red Hat Enterprise Linux 5.0 (original release, no patches or updates).
>
>
> Remote machines are supplying ~512 megabit/sec streams over gigabit ethernet
> to this machine.  There are two streams on separate ethernet lines.  I have
> each stream connected to a different port on one of the intel cards.  The
> streams are sent via multicast, and there are 4 sub-streams per ethernet
> line.  Each substream is approximately 131.072 megabits/sec.
>
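For reference, the figures quoted above can be checked with a few lines of
arithmetic.  This is only a back-of-the-envelope sketch: the variable names
are made up for illustration, and it assumes "megabit" means 10^6 bits.

```python
# Rates taken from the description above; names are illustrative only.
SUBSTREAM_MBIT = 131.072      # one sub-stream, in megabits/sec
SUBSTREAMS_PER_LINE = 4       # sub-streams multicast on each ethernet line
LINES = 2                     # two separate gigabit lines

line_mbit = SUBSTREAM_MBIT * SUBSTREAMS_PER_LINE   # load on one gigabit port
total_mbit = line_mbit * LINES                     # load if every sub-stream is read
substream_mbytes = SUBSTREAM_MBIT / 8              # memcpy rate per reader process

print(f"per line:       {line_mbit:.3f} Mbit/s")
print(f"both lines:     {total_mbit:.3f} Mbit/s")
print(f"per sub-stream: {substream_mbytes:.3f} MB/s")
```

So each line carries a little over half of gigabit line rate, which agrees
with the "~512 megabit/sec" figure, and each reader process has to move
roughly 16 MB/s into the ring buffer.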
> On the opteron machine I have a process that can pull a substream off of an
> ethernet port and dump it to a ring buffer in shared memory.  To start, the
> process could never keep up with receiving the data via ethernet and then
> doing a memcpy to shared memory.  Then I found out about NUMA and decided
> to use sched_setaffinity to bind the process to a cpu.  I bound the process
> to the same cpu the ethernet card is bound to via its IRQ.
>
> I looked in /proc/interrupts and found "eth0" or "eth1", looked up its IRQ,
> then went into /proc/irq/<eth 0 IRQ>/smp_affinity, and checked which cpu the
> IRQ was bound to.  I bound the process to that processor and ran it again.
> Luckily no data loss and it could keep up.  I bound the process before I
> allocated memory so the memory was bound to the same processor too.  I was
> even able to run three more processes, bound to the same cpu and have all 4
> read the sub-streams from the ethernet device eth0, with no data loss.  I
> can even run another process which reads from the ring buffer and dumps the
> data to disk and it causes no slow downs or data loss.
>
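The manual lookup described above (find the interface in /proc/interrupts,
read /proc/irq/<IRQ>/smp_affinity, bind to that cpu) could be scripted
roughly as follows.  This is a sketch rather than a tested tool: the function
names are hypothetical, it assumes the interface name appears as its own
token at the end of its /proc/interrupts line, and it uses Python's
os.sched_setaffinity (Linux only) in place of the C call.

```python
import os

def irq_for_interface(interrupts_text, ifname):
    """Return the IRQ number of the /proc/interrupts line naming ifname, or None."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue                      # header line or blank line
        irq = fields[0].rstrip(":")
        # Device names come last; shared IRQs list several, comma-separated.
        devices = [f.rstrip(",") for f in fields[1:]]
        if irq.isdigit() and ifname in devices:
            return int(irq)
    return None

def cpus_from_mask(mask_text):
    """Decode an smp_affinity hex mask (e.g. "04" or "00000000,00000003")."""
    mask = int(mask_text.strip().replace(",", ""), 16)
    return {cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1}

def pin_to_interface_cpus(ifname):
    """Bind the calling process to the cpus that service ifname's IRQ."""
    with open("/proc/interrupts") as f:
        irq = irq_for_interface(f.read(), ifname)
    if irq is None:
        raise LookupError(f"no IRQ found for {ifname}")
    with open(f"/proc/irq/{irq}/smp_affinity") as f:
        cpus = cpus_from_mask(f.read())
    if not cpus:
        raise ValueError(f"IRQ {irq} has an empty affinity mask")
    os.sched_setaffinity(0, cpus)         # pid 0 means the calling process
    return cpus
```

Calling pin_to_interface_cpus("eth0") before attaching or allocating the
ring buffer would match the order described above, where the process is
bound first and the memory is touched afterwards.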
> Now I want to read a substream from the other stream connected to eth1 while
> reading from the other 4 sub-streams.  I start that up just by binding the
> same application to the processor associated with eth1, which I find by
> checking "/proc/irq/<eth 1 IRQ>/smp_affinity".  When the process starts, the
> system can no longer keep up, just like in the beginning when I
> just had 1 processor reading one stream without doing anything else.  I
> thought I was just trying to do too much work, so I turned off all streams,
> and ran just two processes bound to two different processors, each bound to
> the same processor as the associated eth device.  I ran them both, and they
> lose data.  If I run them separately they work fine, but when I read 1
> sub-stream from each of the two unique streams they fail.
>
> Are the two ethernet devices dumping their multicast data into kernel
> buffers associated with different processors?
> How do I know what processor the kernel ethernet buffers are associated
> with?
> Is there a way to set CPU affinity for ethernet devices before they boot up
> so I know which processor's memory they are dumping data to?
> Any ideas on why there would be a problem with reading a stream from each
> eth device at the same time and not with reading 4 streams from one eth
> device?
> Do I need to turn a NUMA aware scheduler on somehow, or is that on by
> default in RHEL 5?
> I also noticed that Linux assigns IRQs at bootup that vary with each boot,
> is there a way to statically assign IRQs to the ethernet cards?
>
> Any help or pointers at all would be great!
>
> Thanks in advance
> Jeremy
>
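On the affinity question: the same /proc/irq/<IRQ>/smp_affinity file that is
read above can also be written (as root) with a hex cpu mask, which changes
which cpus service that interrupt.  A rough sketch follows.  The helper names
and the proc_root parameter are made up for illustration, the mask format
shown is only meant for machines with up to 32 cpus, and this pins the cpu
for an IRQ but does not make the IRQ number itself static.  If an IRQ
balancing daemon is running, it may later change the mask again.

```python
def mask_from_cpus(cpus):
    """Encode cpu numbers as the hex bitmask smp_affinity expects."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")

def set_irq_affinity(irq, cpus, proc_root="/proc"):
    """Write the mask for cpus to <proc_root>/irq/<irq>/smp_affinity.

    proc_root is a hypothetical parameter so the function can be tried
    against a scratch directory; on a real system it needs root.
    """
    path = f"{proc_root}/irq/{irq}/smp_affinity"
    with open(path, "w") as f:
        f.write(mask_from_cpus(cpus))
    return path
```

For example, set_irq_affinity(16, [2]) would write "4" for a (hypothetical)
IRQ 16, steering it to cpu 2.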


