[Beowulf] High Performance for Large Database
hanzl at noel.feld.cvut.cz
Wed Oct 27 02:42:15 PDT 2004
> > I'm currently working on a project that will require fast access to
> > data stored in a postgreSQL database server. I've been told that a
> > ...
> > and 64+ GB of RAM, the prices sky rocket. If I can achieve high
> > performance with a cluster, using 15-20 dual processor machines, that
> > would be great.
>
> This sort of cluster isn't a "beowulf" cluster; rather it is a variant
> of a high availability cluster. It's Extreme Linux, just not beowulf.
> The beowulf design (and focus of this list) is "high performance
> computing" clusters, aka supercomputing clusters.

I think that while this is true in many particular cases, it is far
from being true in general. There are applications which involve
databases and are as beowulfish as it gets.

I know researchers who work with extremely large and complex graphs and
use a database for this. If they had, say, an MPI-based database with
all data in RAM, they could get tremendous speedups. They would be happy
to copy the database into the distributed cluster RAM, do a few zillion
operations on it and then copy some results back.

I do agree that a database might not be the best tool for their job and
that a complete rewrite of all the code they have might help :-) However,
I consider programming against a db API to be an important form of
knowledge reuse. It also splits their problem nicely into two parts.
Together the two parts take more computer time than one monolith would,
but one of them (the db searches) is a problem with commodity solutions.

(And I might even argue that high availability databases may very well
use The True Beowulf as a component doing searches on mostly read-only
data cached in cluster RAM or even on local hard disks.)

The only difference I can see is the application (which is not CFD or
galactic evolution or similar). From the point of view of interconnects,
OS types, parallel libraries used, RAM, processors, cluster management
etc., I see no reason why databases and beowulf could not overlap.
Best Regards

Vaclav Hanzl
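
The "copy the graph into distributed RAM, run many operations, copy results back" idea from the message can be illustrated with a small sketch. This is only an illustration under loud assumptions: the cluster nodes are simulated as in-process dictionaries rather than real MPI ranks, and the names `partition_edges` and `reachable` are hypothetical, not taken from any database or MPI library. It shows one possible layout (hash-partitioning vertices across nodes) and a scatter/gather search over it.

```python
from collections import defaultdict


def partition_edges(edges, n_nodes):
    """Hash-partition an undirected edge list across n_nodes simulated nodes.

    Each shard stands in for the RAM of one cluster node and holds the
    adjacency sets of the vertices it owns (owner = hash(vertex) % n_nodes).
    """
    shards = [defaultdict(set) for _ in range(n_nodes)]
    for src, dst in edges:
        shards[hash(src) % n_nodes][src].add(dst)
        shards[hash(dst) % n_nodes][dst].add(src)
    return shards


def reachable(shards, start, max_hops):
    """Return all vertices within max_hops of start.

    Each hop is one scatter/gather round: the frontier is grouped by owning
    shard (scatter), every shard looks up only its own vertices, and the
    neighbour sets are merged into the next frontier (gather).
    """
    n_nodes = len(shards)
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        # Scatter: decide which simulated node answers for which vertex.
        by_owner = defaultdict(list)
        for vertex in frontier:
            by_owner[hash(vertex) % n_nodes].append(vertex)

        # Gather: on a real cluster each shard would do this lookup locally
        # and in parallel; here the shards are simply visited in turn.
        next_frontier = set()
        for owner, vertices in by_owner.items():
            for vertex in vertices:
                next_frontier |= shards[owner].get(vertex, set())

        frontier = next_frontier - seen
        if not frontier:
            break
        seen |= frontier
    return seen


if __name__ == "__main__":
    edges = [(1, 2), (2, 3), (3, 4), (5, 6)]
    shards = partition_edges(edges, n_nodes=3)
    print(sorted(reachable(shards, start=1, max_hops=2)))
```

In a real setup the per-shard lookups would run on separate machines and the scatter and gather steps would be network messages, so the interconnect matters just as it does for any other beowulf workload.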
