Stock Trading &c

Sat Jun 3 16:37:40 PDT 2000

> Does anyone have info on the porting of stock trading 
> software to clusters?  
> For example, there is a list of financial/stock programs at:
> 
> http://linux.com/links/Software/Financial/
> 
> How many of these programs are worth parallelizing?  Who has actually
> tried it?

It depends on your exact needs, really. The software on the page you
mention is all for monitoring the performance of stocks. As such, it
requires very little processing power, so a cluster is not a particularly
useful platform to port it to.

There are other things you can do on a cluster, though.

I am currently working on a stock market trading and signalling system,
and when you think about it the right way, the parallelism is very
obvious. If you consider that there are in excess of 10,000 companies
being traded worldwide, then analysing the trends in those can be
performed in parallel as 10,000 jobs running at the same time, each using
whatever your method of choice is, be it ridge/least squares regression,
support vector machines, or neural networks.

The point is, if that is the sort of thing you are working on, then you
could quite simply run all of these in parallel. The tasks involved in
detailed analysis, such as the methods mentioned above, are extremely CPU
intensive, but cause very little I/O traffic to the disk, and hence to the
network. This means that your spawning/migration times are going to be
negligible compared to the CPU time consumed.
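
As a rough illustration of the one-job-per-company idea, here is a minimal
Python sketch. It is not the system described above: the names
(trend_slope, analyse, analyse_all), the sample data, and the choice of a
plain least squares trend line are all assumptions made for the example,
and it uses a process pool on a single machine rather than Mosix. The
point it shows is that each company's analysis is an independent,
CPU-bound task that takes a small input and returns a small result.

```python
from multiprocessing import Pool


def trend_slope(prices):
    """Ordinary least squares slope of price against time index 0..n-1.

    Assumes at least two prices; a single point has no trend.
    """
    n = len(prices)
    mean_t = (n - 1) / 2
    mean_p = sum(prices) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in enumerate(prices))
    var = sum((t - mean_t) ** 2 for t in range(n))
    return cov / var


def analyse(item):
    """One independent job: (symbol, prices) in, (symbol, slope) out."""
    symbol, prices = item
    return symbol, trend_slope(prices)


def analyse_all(series, workers=4):
    """Run one job per company across a pool of worker processes."""
    with Pool(workers) as pool:
        return dict(pool.map(analyse, series.items()))


if __name__ == "__main__":
    # Hypothetical price histories, purely for illustration.
    series = {
        "AAA": [1.0, 2.0, 3.0, 4.0],
        "BBB": [10.0, 9.0, 8.0, 7.0],
    }
    print(analyse_all(series))
```

Because the jobs share nothing, the same structure would carry over to
separate processes that a load balancer is free to move between machines.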

Seeing as that is the case, you might as well just slap a few machines
together and use Mosix to load balance the tasks.

If you are comparing the performance of companies, and comparing each one
of them with each of the others, then you again have the situation where
you are running a bunch of identical tasks in parallel on different data.
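
To make the pairwise case concrete, here is a small Python sketch under
the same caveats: pair_jobs, compare, and the use of Pearson correlation
as the comparison are assumptions for illustration only, not a method
taken from the system described above. Each pair is one independent job,
and for n companies there are n*(n-1)/2 of them, so 10,000 companies give
49,995,000 pairs.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def pair_jobs(symbols):
    """Every unordered pair of companies, each one an independent job."""
    return list(combinations(sorted(symbols), 2))


def compare(series, pair):
    """Same code for every job; only the two data series differ."""
    a, b = pair
    return pair, pearson(series[a], series[b])


if __name__ == "__main__":
    # Hypothetical price histories, purely for illustration.
    series = {
        "AAA": [1.0, 2.0, 3.0],
        "BBB": [2.0, 4.0, 6.0],
        "CCC": [3.0, 2.0, 1.0],
    }
    for pair in pair_jobs(series):
        print(compare(series, pair))
```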

One thing you could potentially save on is memory, by sharing a single
code section between tasks that differ only in their data sections. This
is often quite effective in conserving memory on a single CPU system, but
when you start trying to spread the program over the entire cluster, you
need the program code to be running on all machines, so you will either
not save anything, or you will cause enough I/O traffic between machines
to make the whole exercise not worth your while due to horrendous
overheads.

As far as the stock trading problem goes, the explanation given here is
rather trivial, but I hope that it does illustrate the kind of problem you
are likely to be facing.

Hope this helps.

Gordan