quick question

Robert Pratte pratte at lincweb.com
Tue Jun 20 17:45:26 PDT 2000


You have probably gone through this list already, but sometimes it is
helpful to check off the basics at least.

1) Examine the hardware you are using.  I wouldn't be surprised if the
biggest bottleneck you are facing is disk access.  What type of file
system, OS, etc. are you using?  I would guess that you are dealing with
disk-intensive processes (unless you are stuffing 1.5 gig of data into
your free memory... :)), so increasing processor throughput by threading
the application, running it in parallel, etc. may not gain much.  I have
seen HUGE differences with processes like this, though, by upgrading
drive arrays.  If you aren't using them, look at EMCs or similar
products.

2) How is the data set up?  Are you processing one giant log for a small
group of processes, or is this some concatenated/conglomerated log(s)
that can easily be divided?  In the case of the latter, distributing the
logs (not necessarily the process) may be a quick answer.

3) Examine the script running the process.  You probably have some
regular expression matching going on if this is a shell
script/perl/python/etc.  Read the O'Reilly Regular Expression book, if
you haven't already... quite elucidating.  If you are using a binary,
check the source code (if available); there are lots of performance
tweaks for C/C++/etc. that may be useful.  Recompiling with different
flags may also help.

4) Examine the processes running on the box.  Unnecessary daemons, etc.
just drag down performance... and create security hazards.  Is the
kernel optimized?

5) See #1... I am really suspicious that disk may be chewing up a lot of
your time.
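
One rough way to test the disk suspicion from #1 and #5 is to time a
plain read of the log against a read that also does the regular
expression matching from #3.  What follows is only a minimal sketch in
Python, not the actual processing script: the function names, the chunk
size, and the example pattern are all hypothetical, and the real job may
do far more work per line than a single search.

```python
import re
import time


def time_read_only(path, chunk_size=1 << 20):
    """Read the whole file in large chunks and do nothing with the data.

    The elapsed time approximates the raw cost of getting the bytes off disk.
    """
    start = time.perf_counter()
    total_bytes = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    return total_bytes, time.perf_counter() - start


def time_read_and_match(path, pattern):
    """Read the file line by line and apply one precompiled regex per line.

    `pattern` must be a bytes pattern because the file is opened in binary
    mode. The elapsed time approximates read cost plus matching cost.
    """
    regex = re.compile(pattern)  # compile once, outside the loop
    start = time.perf_counter()
    hits = 0
    with open(path, "rb") as f:
        for line in f:
            if regex.search(line):
                hits += 1
    return hits, time.perf_counter() - start
```

If the two timings come out close, most of the time is going to disk and
faster drives or splitting the logs across machines is the likelier win.
If the second is much larger, the script itself deserves the attention.
Keep in mind that the OS will cache a file after the first read, so
compare runs on a file larger than free memory, or on a fresh file each
time; otherwise the second pass will look faster than it really is.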

Kurt Brust wrote:

> Hello, I am sure you are busy, so I will not take up much of your time.
>
> In regards to clustering, is it possible to set up a beowulf cluster to
> help process a log file (txt based) over multiple processors to help
> distribute the load? Right now it's at 1.5 gigs a day and takes 12 hours
> to process; I am looking to cut that down as much as possible.
>
> Thanks for your time!!!
>
> _______________________________________________
> Beowulf mailing list
> Beowulf at beowulf.org
> http://www.beowulf.org/mailman/listinfo/beowulf




