[Beowulf] High Performance for Large Database
Mark Hahn hahn at physics.mcmaster.ca
Wed Oct 27 10:25:58 PDT 2004
> relationships regarding access to the HPC-generated data, a DB is needed
> just to permit search and retrieval of your OWN results, let alone
> somebody else's.

right. the distinction here is that HPC and filesystems tend to have
a very simple DB schema ;)

> Writing a PARALLEL SQL database server is even MORE nontrivial, and
> while yes, some reasons for this are shared by the HPC community, the
> bulk of them are related directly to locking and the file system and to
> SQL itself.

depends. for instance, it's not *that* uncommon to have DBs which
see almost nothing but read-only queries (and updates, if they happen at all,
can be batched during an off-time.) that makes a parallel version
quite easy, actually: imagine a bunch of 8GB dual-opterons running queries
on a simple NFS v3 server over Myrinet. for a read-mostly load, especially
one with enough locality to make 8GB caches effective, this would probably
*fly*. tweak it with iSCSI and go to 64GB quad-opterons. how many
tables out there wouldn't have a good hit rate in 64GB?

> NONtrivial parallelizations are things like distributing the execution
> of actual SQL search statements across a cluster. Whether there is any

it's easy to imagine that a stream of SQL queries could actually be
handled in sort of an adaptive data refinement manner, where most of the
thought goes into managing division of the query labor (distributed indices
searched in parallel, etc.), and in placement of data (especially
ownership/locking of writable data). I have no idea whether Oracle-level
DBs do this, but it wouldn't surprise me. the irony is that most of the
thought that goes into advanced Beowulf applications is doing exactly this
sort of labor/data division/balancing.

I'd hazard a guess that the place to start putting parallelism in a DB
is the underlying isam-like table layer...
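
to make the "distributed indices searched in parallel" idea a bit more concrete, here is a minimal sketch in Python. it is only an illustration under stated assumptions, not how any real DB does it: the names (Shard, build_shards, parallel_range_query) are made up for this example, keys are assumed to be integers, the data is assumed read-only (so no locking), and threads in one process stand in for cluster nodes. each shard owns a slice of the table plus a sorted key index; a range query is scattered to every shard, each shard binary-searches its own index, and the partial results are merged back into key order.

```python
import heapq
from bisect import bisect_left, bisect_right
from concurrent.futures import ThreadPoolExecutor


class Shard:
    """One node's slice of a read-only table, with a sorted key index.

    Hypothetical stand-in for an isam-like table partition.
    """

    def __init__(self, rows):
        # rows are (key, value) pairs; keep them sorted by key
        self.rows = sorted(rows, key=lambda r: r[0])
        # the "index": just the sorted keys, searched with bisect
        self.keys = [k for k, _ in self.rows]

    def range_query(self, lo, hi):
        """Return all rows with lo <= key <= hi, in key order."""
        start = bisect_left(self.keys, lo)
        end = bisect_right(self.keys, hi)
        return self.rows[start:end]


def build_shards(rows, n_shards):
    """Partition rows across n_shards by key modulo (assumes integer keys)."""
    buckets = [[] for _ in range(n_shards)]
    for key, value in rows:
        buckets[key % n_shards].append((key, value))
    return [Shard(b) for b in buckets]


def parallel_range_query(shards, lo, hi):
    """Scatter the query to every shard, then gather and merge by key."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda s: s.range_query(lo, hi), shards)
        # each partial result is already sorted, so a k-way merge suffices
        return list(heapq.merge(*partials, key=lambda r: r[0]))


if __name__ == "__main__":
    table = [(k, f"row{k}") for k in range(100)]
    shards = build_shards(table, n_shards=4)
    print(parallel_range_query(shards, 10, 15))
```

the read-only assumption is what keeps this simple: once writes are allowed, each shard needs an owner and some locking or versioning scheme, which is the hard part mentioned above.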
