RE: phasing out Solaris/Oracle/Netscape with Linux/PostgreSQL/Apache


Thu Jun 12 22:07:40 PDT 2014



<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 5.5.2650.12">
<TITLE>RE: phasing out Solaris/Oracle/Netscape with Linux/PostgreSQL/Apache</TITLE>
</HEAD>
<BODY>

<P><FONT SIZE=2>Let me know if I can help beyond what I write below. Additional comments from the crowd are welcome. See below...</FONT>
</P>

<P><FONT SIZE=2>> -----Original Message-----</FONT>
<BR><FONT SIZE=2>> From: Eugene.Leitl at lrz.uni-muenchen.de</FONT>
<BR><FONT SIZE=2>> [<A HREF="mailto:Eugene.Leitl at lrz.uni-muenchen.de">mailto:Eugene.Leitl at lrz.uni-muenchen.de</A>]</FONT>
<BR><FONT SIZE=2>> Sent: Thursday, February 08, 2001 10:43 PM</FONT>
<BR><FONT SIZE=2>> To: linux-elitists at zgp.org; pigdog-l at bearfountain.com;</FONT>
<BR><FONT SIZE=2>> beowulf at beowulf.org</FONT>
<BR><FONT SIZE=2>> Subject: phasing out Solaris/Oracle/Netscape with</FONT>
<BR><FONT SIZE=2>> Linux/PostgreSQL/Apache</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> This is somewhat off-topic, so bear with me.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> Um, I need some advice. The company I'm with has a Solaris Sun with</FONT>
<BR><FONT SIZE=2>> Netscape Enterprise Server. It is running a mix of C, Perl, Oracle</FONT>
<BR><FONT SIZE=2>> and Daylight. The latter beast currently requires 1.5 GByte RAM,</FONT>
<BR><FONT SIZE=2>> but is still creaking at all seams. We're going to kick Daylight</FONT>
<BR><FONT SIZE=2>> sooner or later, so the memory footprint will drop, possibly a lot.</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> I don't yet understand the architecture of the chemical database</FONT>
<BR><FONT SIZE=2>> we're running, so I don't know where the bottlenecks are, but I need</FONT>
<BR><FONT SIZE=2>> to build a Linux machine which could outperform the Sun at a fraction</FONT>
<BR><FONT SIZE=2>> of the price. I will set it up locally, and hammer it with queries,</FONT>
<BR><FONT SIZE=2>> doing measurements, stability tests, etc.</FONT>
</P>

<P><FONT SIZE=2>This could trip you up. It is critical to understand some basic things about the existing application. Exporting the data is your first concern: you need to know how to do that, and it could very well set the pace of the conversion.</FONT></P>

<P><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> I would rather like to use Athlons, especially recent DDR Dual Athlons,</FONT>
<BR><FONT SIZE=2>> but this doesn't seem to go too well with required stability. This</FONT>
<BR><FONT SIZE=2>> means I should use a dual Pentium III motherboard, right? Does Linux</FONT>
<BR><FONT SIZE=2>> handle these well? I heard these can take up to 8 GByte RAM, I think</FONT>
<BR><FONT SIZE=2>> I should start with 2 GBytes. I don't think the application is CPU</FONT>
<BR><FONT SIZE=2>> bound, but I don't know for sure yet. It certainly seems to exercise</FONT>
<BR><FONT SIZE=2>> disks strongly, so here is another question:</FONT>
</P>

<P><FONT SIZE=2>Linux does handle them well, but I'm partial to FreeBSD. Since you're using Sun's OS already, you could continue with Sun's x86 version instead of making the big switch all at once - it's free and works well enough. It would make a fine part of a beowulf, and allow you to get used to Linux over a longer period of time. If you can make the transition in baby steps, it's better.</FONT></P>

<P><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> Do I absolutely, positively need SCSI? I was thinking about putting a</FONT>
<BR><FONT SIZE=2>> second 100 EIDE host adapter in, and run disk striping plus mirroring</FONT>
<BR><FONT SIZE=2>> over 4 EIDE hard drives (the better models from IBM). Or should I</FONT>
<BR><FONT SIZE=2>> use a Dual-Pentium mumboard with onboard SCSI, and buy several fast,</FONT>
<BR><FONT SIZE=2>> hot &amp; noisy scuzzys, soft-RAIDing them? Perhaps even hardware RAID?</FONT>
<BR><FONT SIZE=2>> I don't think the disks need to be very large, but they better be fast.</FONT>
<BR><FONT SIZE=2>> </FONT>
</P>

<P><FONT SIZE=2>SCSI is GREAT, and you should set up redundant hot-swappable disks, so that if one crashes, you insert a new disk, type &quot;boot&quot;, and you're back online with a node. I think Sun stations outperform the Intel boards on disk throughput, but you could check.</FONT></P>

<P><FONT SIZE=2>> I would like to use ReiserFS, so this only allows me RAID 0/1, right?</FONT>
<BR><FONT SIZE=2>> Higher stability, lack of fscking delay in case machine needs to be</FONT>
<BR><FONT SIZE=2>> rebooted and no 2 GByte file size limit would seem to be needed.</FONT>
<BR><FONT SIZE=2>> Since this is ReiserFS, I should obviously go with the latest stable</FONT>
<BR><FONT SIZE=2>> kernel. I've been using Mandrake for my desktop and small-time testing,</FONT>
<BR><FONT SIZE=2>> but for this application this is probably not the way to go. Which</FONT>
<BR><FONT SIZE=2>> distro should I choose? Debian?</FONT>
</P>

<P><FONT SIZE=2>I'm not sure on this one; I haven't had enough exposure to those filesystems. I know: use MFS, and load up the machines with 2 GB of RAM! Just kidding.</FONT></P>

<P><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> We seem to be moving towards an architecture consisting of a bunch</FONT>
<BR><FONT SIZE=2>> of Perl programs (it's not settled, but the few routines we have are</FONT>
<BR><FONT SIZE=2>> in Perl) communicating via sockets. (Right now it's a mix of C and</FONT>
<BR><FONT SIZE=2>> Perl, talking via named pipes). The queries (a chemical structure</FONT>
<BR><FONT SIZE=2>> drawn within a browser, using a Java applet or a plug-in) are</FONT>
<BR><FONT SIZE=2>> handled with a dispatcher (a cgi-bin Perl thing). Sooner or</FONT>
<BR><FONT SIZE=2>> later we will spread the query to a farm of boxes, each containing</FONT>
<BR><FONT SIZE=2>> a database, or segmenting a database across several cheap, redundant</FONT>
<BR><FONT SIZE=2>> boxes. (But we're not there yet). There's no alternative to sockets,</FONT>
<BR><FONT SIZE=2>> right?</FONT>
</P>
<BR>

<P><FONT SIZE=2>You'll take a big performance hit running Perl too much - it's interpreted. I know it's popular, but with a big project you should get the training required to do it in a compiled language like C/C++. If you insist on an interpreted language, use Java - it's a 'cleaner' language to manage on a large scale. For socket work I would definitely stick with C, and create a library specific to your needs that your Perl programs can call.</FONT></P>

<P><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> Right now a query can take up to minutes, so I don't think mod_perl</FONT>
<BR><FONT SIZE=2>> is needed. We don't get a lot of query hits, at least not yet. Should</FONT>
<BR><FONT SIZE=2>> I try using Apache mod_perl instead of the Netscape Server</FONT>
<BR><FONT SIZE=2>> nevertheless?</FONT>
</P>

<P><FONT SIZE=2>How much data do the longer queries access? If you're not seeing sub-minute response times, it's a red flag. That's just a rule of thumb I use. I've seen five-second return times on several gigs of data with a Sun and Oracle.</FONT></P>

<P><FONT SIZE=2>From the standpoint of managing the code, regardless of what you use, you should be using as much of one language as possible. mod_perl would be my choice - you'll have more flexibility with mod_perl because Apache is more accessible overall to developers, IMHO.</FONT></P>

<P><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> The database question. The database is large, but entirely static.</FONT>
<BR><FONT SIZE=2>> Would PostgreSQL be able to handle this comparably to Oracle? The</FONT>
<BR><FONT SIZE=2>> licenses for Oracle are not so very expensive anymore, but if we're</FONT>
<BR><FONT SIZE=2>> going towards a farm of boxes, this will start to cumulate. Otoh,</FONT>
<BR><FONT SIZE=2>> some Oracle stuff seems to be able to handle parallel databases</FONT>
<BR><FONT SIZE=2>> natively. Hmm.</FONT>
</P>

<P><FONT SIZE=2>I think it might! I'm working with PostgreSQL, and I'm seeing comparable performance, but you'll need to do some benchmarking. If it doesn't compare now, it eventually will. Watch out for joins in PostgreSQL - they can hurt run times.</FONT></P>

<P><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> Sorry for the bunch of clueless questions, but I'm not exactly a</FONT>
<BR><FONT SIZE=2>> computer person, nor is the company extremely competent in technical</FONT>
<BR><FONT SIZE=2>> questions (they're all a bunch of chemists, which just have been</FONT>
<BR><FONT SIZE=2>> working with computers for a long time). The budget for hardware seems</FONT>
<BR><FONT SIZE=2>> to be there, but I'm a new guy, and I can't afford buying a bunch</FONT>
<BR><FONT SIZE=2>> of expensive crap which will just sit there gathering dust.</FONT>
</P>

<P><FONT SIZE=2>You're not clueless, because you're asking the right questions - this is the same type of thing directors and managers must contend with. You have a chance to sell beowulf to your higher-ups. Go for it!</FONT></P>
<BR>
<BR>

<P><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> I would also welcome some pointers towards lists where questions</FONT>
<BR><FONT SIZE=2>> such as these are handled holistically. I'd settle for a bunch</FONT>
<BR><FONT SIZE=2>> of dedicated lists too, though.</FONT>
</P>
<BR>

<P><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> &lt;off-topic/&gt;</FONT>
<BR><FONT SIZE=2>> </FONT>
<BR><FONT SIZE=2>> _______________________________________________</FONT>
<BR><FONT SIZE=2>> Beowulf mailing list, Beowulf at beowulf.org</FONT>
<BR><FONT SIZE=2>> To change your subscription (digest mode or unsubscribe)</FONT>
<BR><FONT SIZE=2>> visit <A HREF="http://www.beowulf.org/mailman/listinfo/beowulf" TARGET="_blank">http://www.beowulf.org/mailman/listinfo/beowulf</A></FONT>
<BR><FONT SIZE=2>> </FONT>
</P>

</BODY>
</HTML>