[Beowulf] What class of PDEs/numerical schemes suitable for GPU clusters
Joe Landman landman at scalableinformatics.com
Thu Nov 20 07:43:15 PST 2008
Quick intervention from the SC08 show

Mark Hahn wrote:
>> As we know by now GPUs can run some problems many times faster than CPUs
>
> it's good to cultivate some skepticism.  the paper that quotes 40x
> does so with a somewhat tilted comparison.  (I consider this comparison
> fair: a host with 2x 3.2 GHz QC Core2 vs 1 current high-end GPU card.
> former delivers 102.4 SP Gflops; latter is something like 1.2 Tflop.
> those are all peak/theoretical.  the nature of the problem determines
> how much slower real workloads are - I suggest that as not-suited-ness
> increases, performance falls off _faster_ for the GPU.)

Not always.

[shameless plug]

A project I have spent some time with is showing 117x on a 3-GPU machine
over a single core of a host machine (3.0 GHz Opteron 2222).  The code is
mpihmmer, and the GPU version of it.  See http://www.mpihmmer.org for more
details.  Ping me offline if you need more info.

[/shameless plug]

>> what I understand GPUs are useful only with certain classes of numerical
>> problems and discretization schemes, and of course the code must be
>
> I think it's fair to say that GPUs are good for graphics-like loads,

... not entirely true.  We are seeing good performance with a number of
calculations that share similar features.  Some will not work well on
GPUs: those with lots of deeply nested if-then or other conditional
constructs.  If you can refactor these so that the conditionals are
hoisted out of the inner loops, this is a good thing for GPUs.

> or more generally: fairly small data, accessed data-parallel or with
> very regular and limited sharing, with high work-per-data.

... not small data.  You can stream data.  High work per data is advisable
on any NUMA-like machine with penalties for data motion (cache-based
architectures, NUMA, MPI, ...).  You want as much data reuse as you can
get, or to structure the stream to leverage the maximum bandwidth.

[...]

>> than others? Given the very substantial speed improvements with GPUs,
>> will there be a movement to GPU clusters, even if there is a substantial
>> cost in problem reformulation? Or are GPUs only suitable for a rather
>> narrow range of numerical problems?
>
> GP-GPU tools are currently immature, and IMO the hardware probably needs
> a generation of generalization before it becomes really widely used.

Hrmm... CUDA is pretty good.  Still needs some polish, but people can use
it, and are generating real apps from it.  We are seeing pretty wide use
... I guess the issue is what one defines as "wide".

> OTOH, GP-GPU has obviously drained much of the interest away from eg
> FPGA computation.  I don't know whether there is still enough interest

There is still some of it on the show floor.  Some things FPGAs do very
well.  But the cost for this performance has been prohibitive, and GPUs
are basically decimating the business model that has been in use for
FPGAs.

> in vector computers to drain anything...

Hmmm....  There is a (micro)vector machine in your CPU anyway.

Joe

> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
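
The remark in the message about hoisting conditionals out of inner loops can be illustrated with a minimal sketch. This is plain Python standing in for a kernel body, not code from mpihmmer; the function names and the "scale"/"shift" modes are hypothetical, chosen only to show the shape of the refactoring. It assumes the condition does not change between iterations, which is the simplest case.

```python
# Hypothetical example: apply one of two per-element updates to a sequence.
# None of these names come from mpihmmer or CUDA; they are illustrative only.

def update_branch_inside(data, mode, a):
    """Test `mode` on every iteration: the conditional sits in the inner loop."""
    out = []
    for x in data:
        if mode == "scale":
            out.append(a * x)
        else:
            out.append(a + x)
    return out


def update_branch_hoisted(data, mode, a):
    """Test `mode` once, then run an inner loop with no conditional in it."""
    if mode == "scale":
        return [a * x for x in data]
    return [a + x for x in data]
```

Both functions return the same result for the same inputs; only the position of the test differs. On a GPU, one plausible counterpart is to pick a specialized kernel on the host instead of testing a flag per element. Conditionals that depend on the data itself need a different treatment and are not covered by this sketch.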
