[Beowulf] transcode Similar Video Processing on Beowulf?

Mark Hahn hahn at mcmaster.ca
Wed Apr 16 09:54:55 PDT 2014


I'm trying to understand this from a perspective of conventional HPC.

> cop-out but we're not keen to reinvent the wheel. It provides
> statekeeping and job queues in one package; replacing it wouldn't be

"statekeeping" is just tracking queued/running/done jobs, right?

> trivial but wouldn't be a massive task; the cost of using it is tiny,
> though, and it made our life a lot easier. It's all written in terms
> of deciders, which make decisions based on the list of events associated
> with the workflow (e.g. a "finished activity" event will have the details
> about the activity starting, being scheduled, and being completed,
> output status etc),
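
for reference, roughly how I picture one of these deciders working -
the queue API and event/field names below are invented for
illustration, not your code:

import json

# purely illustrative decider sketch; poll_decision_task()/respond()
# and the event types are assumptions, not the poster's actual API
def decide(events):
    """Look at the event history for one workflow and pick the next step."""
    types = [e["type"] for e in events]
    if "activity_finished" in types:
        # the finished-activity event carries the scheduling/start/
        # completion details and the output status
        return {"decision": "complete_workflow"}
    if "activity_failed" in types:
        return {"decision": "retry_activity"}
    return {"decision": "schedule_activity", "activity": "transcode"}

def decider_loop(queue):
    while True:
        task = queue.poll_decision_task()   # blocks until a decision is due
        queue.respond(task["token"], json.dumps(decide(task["events"])))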

is the workflow complicated - a directed graph with nontrivial 
structure, rather than a series of discrete jobs, each a simple 
chain/pipeline?
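
concretely, the distinction I'm asking about:

# a simple chain, e.g. one transcode job:
pipeline = ["fetch", "transcode", "package", "upload"]

# versus a general DAG, where steps fan out and join:
dag = {
    "transcode_1080p": ["fetch"],
    "transcode_720p":  ["fetch"],
    "package":         ["transcode_1080p", "transcode_720p"],
    "upload":          ["package"],
}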

> maintained by passing JSON blobs around as messages; there'll be a
> blog post or two explaining things on our website soonish and I'll
> post them across if there's interest.

a reference would be interesting.
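
in the meantime I'm guessing one of those JSON messages looks roughly
like this (field names invented for illustration, not your schema):

{
  "job_id": "example-123",
  "event": "activity_finished",
  "activity": "transcode",
  "scheduled_at": "<timestamp>",
  "started_at": "<timestamp>",
  "completed_at": "<timestamp>",
  "output_status": "ok"
}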

> It's being used in production on a regular basis and has had quite a
> lot of content processed through it so far; these tasks on average run
> for 2-6 hours and involve ~1GB of data going in and a few megabytes
> out.

that's unexceptional from an HPC perspective.

> The APIs are all simple HTTPS RESTful ones, storage can be cloud
> provider storage or local shared drive storage.

one premise usually found in HPC is that the job, at least the main part,
should be compute-bound.  how do you ensure that your compute resources
are not idle or starved by external IO bottlenecks?
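
in a conventional HPC setup I'd expect the staging to be hidden behind
the compute, something like this sketch (fetch() and transcode() are
placeholders, not your code):

from concurrent.futures import ThreadPoolExecutor

def fetch(job):
    """Stage ~1GB of input to local scratch; placeholder for the real download."""
    return "/scratch/%s.input" % job

def transcode(path):
    """The CPU-bound part; placeholder for the real transcode."""
    pass

def run(jobs):
    # overlap staging of the next input with transcoding the current one,
    # so the cores aren't idle waiting on external storage
    with ThreadPoolExecutor(max_workers=1) as staging:
        pending = staging.submit(fetch, jobs[0])
        for i, job in enumerate(jobs):
            local_path = pending.result()            # wait for staged input
            if i + 1 < len(jobs):
                pending = staging.submit(fetch, jobs[i + 1])
            transcode(local_path)                    # keep the CPU busy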

> interprocess communication performance is less important and
> robustness and dynamic scalability play a major role.

well, I think that's a bit disingenuous, since HPC is highly tuned
for robustness and dynamic scalability...

thanks, mark hahn.


