[Beowulf] Generation of strings MPI fashion..

John Hearns John.Hearns at xma.co.uk
Sun Oct 9 15:04:06 PDT 2016

This sounds interesting.

I would question this step though, as it seems intrinically a bottlneck:
> Removes duplicate lines:
> $ sort filename.txt | uniq

First off - do you need a sorted list?  If not do not perform the sort...

Secondly - how efficient is the Unix 'uniq' utility?
How about, every time you generate a string, check if it has already been generated.  I guess this then starts to depend on the efficiency of searching through a file.

Thinking about this problem, would it be better to allocate each MPI rank a unique set of string to generate?
Say there are N characters in the set A-Z a-z 0-9 plus symbols . This is likely to be 255 as I'm assuming you are talking the ASCII character set.
So then if you have 256  MPI ranks then allocate rank 1 to generate all strings commencing with 'A'  then rank 2 'B' etc.
If you are lucky enough to have 256x256 cores to allocate then it goes 'AA' 'AB' 'AC'  etc. etc.

Which further makes me think - why generate random strings in the first place? If you want a sorted list, then set the MPI ranks off to generate sequential strings,
and as I say make sure there is no duplicated effort somehow by allocating unique subsections to each rank.
All you have to do is gather in the already sorted lists at the end, starting at AA , AB etc etc
This is quite thought provoking - maybe the random approach IS more efficient ???

Thinking further though, if you have processors with AVX externsions then is there an efficient way to bring those to bear on the problem - 256 and 512 vector lengths mapping rather well to ASCII 255 characters.

I know this is an exercise in learning MPI, but I also idly winder if the Micron Automata could do this rather well

From: Beowulf [beowulf-bounces at beowulf.org] on behalf of Skylar Thompson [skylar.thompson at gmail.com]
Sent: 08 October 2016 15:20
To: darren at wisecorp.co.uk
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] Generation of strings MPI fashion..

If you haven't used MPI before, I would break this up into chunks:

1. Write a MPI program that gathers a fixed amount of /dev/urandom
(Brian's suggestion is wise) data and sends it back to the master. This
will get you used to the basic MPI_Send/MPI_Recv commands.

2. Use the same program, but use MPI_Gather rather than MPI_Recv to
assemble the final array. That will get you used to collective

3. Find a parallel sorting algorithm and implement it. You'll still need
to have rank 0 do some work, but the point of the algorithm would be to
reduce the amount of work it has to do. I found a good description of
parallel merge sort here:


Good luck!


On 10/07/2016 10:19 AM, Darren Wise wrote:
> Heya folks,
> This may seem really simple to some and it is fairly simple from the
> terminal using bash of generation and sorting (examples below)
> What I would like to do, to get started with making a program to run in
> MPI on my cluster.. I thought of a fairly simple bash script and this
> works fine for a single machine but what is the best way to go around it
> or converting this simple notion in to an MPI runable command.
> Generates random strings of chars:
> $ tr -cd '[upper:]' < /dev/random | fold -w9 | head -c${1:-1000} | tee
> somefile.txt
> Removes duplicate lines:
> $ sort filename.txt | uniq
> This is fine for generation as I mention for a single machine, but
> what's the best way to turn this around and use MPI with a shared NFS
> mounted filesystem..
> While this is an example im going to generate a bash script to allow me
> to generate every possible permutation of a desired string length along
> with types of chars a-z, A-Z, 0-9 along with symbols..
> So any advice is welcome really and it's just an educational way for me
> to transpire into generating code that can be used within my small beowulf.
>> Kind regards,
>> Darren Wise Esq.
>> www.wisecorp.co.uk <http://www.wisecorp.co.uk>
>> www.wisecorp.co.uk/babywise <http://www.wisecorp.co.uk/babywise>
>> www.darrenwise.co.uk <http://www.darrenwise.co.uk>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
Any views or opinions presented in this email are solely those of the author and do not necessarily represent those of the company. Employees of XMA Ltd are expressly required not to make defamatory statements and not to infringe or authorise any infringement of copyright or any other legal right by email communications. Any such communication is contrary to company policy and outside the scope of the employment of the individual concerned. The company will not accept any liability in respect of such communication, and the employee responsible will be personally liable for any damages or other liability arising. XMA Limited is registered in England and Wales (registered no. 2051703). Registered Office: Wilford Industrial Estate, Ruddington Lane, Wilford, Nottingham, NG11 7EP

More information about the Beowulf mailing list