[Beowulf] A cluster of Arduinos

Lux, Jim (337C) james.p.lux at jpl.nasa.gov
Wed Jan 11 15:24:55 PST 2012

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Vincent Diepeveen
Sent: Wednesday, January 11, 2012 2:47 PM
To: Beowulf Mailing List
Subject: Re: [Beowulf] A cluster of Arduinos

Jim, your microcontroller cluster is not a very good idea.

Latency didn't keep up with the CPU speeds...

--- You're missing the point of the cluster. It's not for performance (I can't imagine that even the slowest single-CPU PC out there wouldn't blow the figurative doors off it). It's to provide a very inexpensive way to experiment with, play with, and demonstrate loosely coupled multiprocessor systems.

--> For example, you could experiment with redundant message routing across a fabric of nodes. The algorithms are fairly simple, and this gives you a testbed that is qualitatively different from just simulating a bunch of nodes on a single PC. There is pedagogical value in a system where you can force a link error by just disconnecting a cable, and the blinky lights on each node show what's going on.
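
--> As a rough illustration of how simple the routing logic can be, here is a minimal host-side sketch in Python. It is only a sketch under stated assumptions: the four-node fabric, the cable list, and the names make_fabric, unplug, and find_route are all made up for this example, and pulling a cable is modeled by deleting a link from a table. On real nodes the same breadth-first search would run against whatever link table the nodes maintain.

```python
from collections import deque

def make_fabric(cables):
    """Build an undirected link table from (node, node) cable pairs (hypothetical layout)."""
    links = {}
    for a, b in cables:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links

def unplug(links, a, b):
    """Model disconnecting the cable between nodes a and b."""
    links[a].discard(b)
    links[b].discard(a)

def find_route(links, src, dst):
    """Breadth-first search for a fewest-hops route over the links that still work."""
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            route = []
            while node is not None:
                route.append(node)
                node = previous[node]
            return route[::-1]
        for neighbor in sorted(links.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # no working path remains

# Assumed fabric: four nodes cabled in a ring, 0-1-3-2-0.
fabric = make_fabric([(0, 1), (1, 3), (0, 2), (2, 3)])
primary = find_route(fabric, 0, 3)   # route with every cable in place
unplug(fabric, 0, 1)
backup = find_route(fabric, 0, 3)    # route after one cable is pulled
unplug(fabric, 2, 3)
isolated = find_route(fabric, 0, 3)  # both paths from node 0 to node 3 are now cut
print(primary, backup, isolated)
```

--> The interesting part on real hardware is everything this sketch leaves out: detecting that the link has failed, and getting every node to agree on the new route.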

There is still too much 1980s and 1990s software out there, written by the guys who wrote books about how to parallelize, which simply doesn't scale at all on modern hardware.

--> I think that a lot of the theory of parallel processes is speed independent. While some historical approaches might not be used in a modern system for good implementation reasons, students and others still need to learn about them, if only as the canonical approach. Sure, you could do a simulation on a single PC (and I've seen them, in Simulink and in other more specialized tools), but there's a lot of appeal to a hands-on, cheap-hardware approach to learning.

--> To take an example, if you set a student the problem of lighting an LED on each node in a specified node order at specified intervals, where the node interconnects are not specified in advance, that's a fairly interesting homework problem. You have to discover the network connectivity graph, then figure out how to pass the message to the appropriate node at the appropriate time. This is a classic "hot plug network discovery" kind of problem, and in the face of intermittent links, it's of great interest.
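
--> A bare-bones version of that homework can be sketched on a host in Python. Everything here is an assumption made for illustration: the Node class, the wire, discover, and run_schedule helpers, the four-node wiring, and the schedule are all hypothetical, links are taken to be reliable, and "asking a node who it can hear" is reduced to reading a set. The sketch only shows the two steps named above: discover the graph from a root node, then route each "light your LED" message along the discovered tree.

```python
from collections import deque

class Node:
    """A simulated node that knows only which neighbors its own ports reach."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.neighbors = set()
        self.led_log = []  # times at which this node was told to light its LED

def wire(nodes, a, b):
    """Connect two simulated nodes with a cable."""
    nodes[a].neighbors.add(b)
    nodes[b].neighbors.add(a)

def discover(nodes, root):
    """Query outward from root; return the connectivity graph and a parent tree."""
    parent = {root: None}
    graph = {}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        graph[current] = sorted(nodes[current].neighbors)  # stand-in for a neighbor query
        for neighbor in graph[current]:
            if neighbor not in parent:
                parent[neighbor] = current
                queue.append(neighbor)
    return graph, parent

def run_schedule(nodes, root, schedule):
    """For each (time, target) pair, route a message from root and light the target's LED."""
    graph, parent = discover(nodes, root)
    paths = []
    for when, target in schedule:
        if target not in parent:      # target was never discovered
            paths.append(None)
            continue
        path = []
        hop = target
        while hop is not None:        # walk the parent tree back to the root
            path.append(hop)
            hop = parent[hop]
        path.reverse()
        nodes[target].led_log.append(when)
        paths.append(path)
    return graph, paths

# Assumed wiring, unknown to the "student" code above: 0-1, 1-2, 1-3.
nodes = {i: Node(i) for i in range(4)}
for a, b in [(0, 1), (1, 2), (1, 3)]:
    wire(nodes, a, b)

graph, paths = run_schedule(nodes, 0, [(0.0, 2), (0.5, 0), (1.0, 3)])
print(graph)
print(paths)
```

--> Handling intermittent links would mean re-running discovery, or repairing the tree, whenever a hop stops answering, which is where the problem gets interesting.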

--> While that particular problem isn't exactly HPC, it DOES relate to HPC in a world where you cannot assume perfect processor nodes and perfect communications links. And that gets right to the whole "scalability" thing in HPC. It wasn't until the implementation of Error Correcting Codes in logic that something like the Q7A computer was even possible, because it was so large that you couldn't guarantee that all the tubes would be working all the time. Likewise with many other aspects of modern computing.

--> And, of course, in the spaceflight world, this kind of thing is even more important. A concept of growing importance is the "fractionated spacecraft", where the functions that would have lived in one physical vehicle are spread across many smaller pieces. One might also reallocate those pieces between different virtual spacecraft. Maybe right now you need a lot of processing power to do image compression and analysis, so you want to allocate a lot of "processing pieces" to the job, with an ad hoc network connection among them. Later, you don't need them, so you can release them to other uses. The pieces might be in the immediate vicinity, or they might be some distance away, which affects the data rate of the link and its error rate.

--> You can legitimately ask whether this sort of thing (the fractionated spacecraft) is a Beowulf (defined as a cluster supercomputer built of commodity components). I would say it shares many of the same properties, especially with the early Beowulf days, before multicores and fancy interconnects were fashionable for multi-thousand-processor clusters. It's that idea of building a large, complex device out of many basically identical subunits, using open source/simple software to manage it.

-->> In summary, it's not about performance. It's about a teaching tool for networking in the context of cluster computing. You claim we need to cast off the shackles of old programming styles and get some new blood and ideas. Well, you need to get people interested in parallel computing and learning the basics (so at least they don't reinvent the square wheel). One way might be challenges such as parallelization of game play; another might be working with a parallelized database; the way I propose is experimenting with message-passing parallelization using dirt cheap hardware.
