[Beowulf] how Google warps your brain
Mark Hahn hahn at MCMASTER.CA
Thu Oct 21 15:04:44 PDT 2010
> parallel jobs on massive datasets when you have a simple interface like
> MapReduce at your disposal. Forget about complex shared-memory or message
> passing architectures: that stuff doesn't scale, and is so incredibly brittle
> anyway (think about what happens to an MPI program if one core goes offline).

this is a bit unfair - the more honest comment would be that for
data-parallel workloads, it's relatively easy to replicate the work a bit,
and gain substantially in robustness. you _could_ replicate the work in a
traditional HPC application (CFD, chem/md, etc), but it would take a lot of
extra bookkeeping because the dataflow patterns are complex and iterative.

> The other Google technologies, like GFS and BigTable, make large-scale
> storage essentially a non-issue for the developer. Yes, there are tradeoffs:

well, I think storage is the pivot here: it's because disk storage is so
embarrassingly cheap that Google can replicate everything (3x?). once you've
replicated your data, replicating work almost comes along for free.

> So, printf() is your friend. Log everything your program does, and if
> something seems to go wrong, scour the logs to figure it out. Disk is cheap,
> so better to just log everything and sort it out later if something seems to

this is OK for data-parallel, low-logic kinds of workflows (like Google's).
it's a long way from being viable for any sort of traditional HPC, where
there's far too much communication and everything runs too long to log
everything. interestingly, logging might work if the norm for HPC clusters
were something like gigabit-connected uni-core nodes, each with 4x 3TB disks.

so in a sense we're talking across a cultural gulf: disk/data-oriented
vs compute/communication-oriented.
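
to make the "replicate the work a bit" point concrete, here is a minimal sketch in Python. it is a toy simulation, not MapReduce or anything Google actually runs: the function name run_replicated, the round-robin placement, and the failed_workers set are all assumptions made up for illustration. each independent chunk is handed to several workers, and a chunk's result survives as long as at least one of them is alive.

```python
from typing import Callable, Dict, List, Set


def run_replicated(
    chunks: List[List[int]],
    work: Callable[[List[int]], int],
    replicas: int,
    n_workers: int,
    failed_workers: Set[int],
) -> Dict[int, int]:
    """Simulate replicated data-parallel work (hypothetical helper).

    Chunk i is assigned round-robin to `replicas` distinct workers.
    The first assigned worker that has not failed produces the result;
    if every assigned worker has failed, the chunk is lost.
    """
    if not 1 <= replicas <= n_workers:
        raise ValueError("replicas must be between 1 and n_workers")

    results: Dict[int, int] = {}
    for i, chunk in enumerate(chunks):
        assigned = [(i + r) % n_workers for r in range(replicas)]
        for worker in assigned:
            if worker in failed_workers:
                continue  # this copy of the work died with its worker
            results[i] = work(chunk)
            break
    return results


if __name__ == "__main__":
    chunks = [[1, 2], [3, 4], [5, 6], [7, 8]]
    dead = {1}  # pretend worker 1 went offline

    # no replication: the chunk that lived only on worker 1 is lost
    print(sorted(run_replicated(chunks, sum, 1, 4, dead)))  # [0, 2, 3]

    # 3x replication: every chunk still has a live copy
    print(sorted(run_replicated(chunks, sum, 3, 4, dead)))  # [0, 1, 2, 3]
```

this only works so cheaply because the chunks never talk to each other. in an iterative CFD or MD code each step depends on neighbours' results from the previous step, so the same trick would need the extra bookkeeping mentioned above.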
