[Beowulf] help for metadata-intensive jobs (imagenet)
hahn at mcmaster.ca
Fri Jun 28 10:47:00 PDT 2019
I wonder if anyone has comments on ways to avoid metadata bottlenecks
for certain kinds of small-I/O-intensive jobs. For instance, ML on ImageNet,
which seems to be a massive collection of trivial-sized files.
A good answer is "beef up your MD server, since it helps everyone".
That's a bit naive, though (no money trees here).
How about things like putting the dataset into squashfs or some other
image that can be loop-mounted on demand? sqlite? perhaps even a format
that can simply be mmapped as a whole?
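
To make the "single mmap-able file" idea concrete, here is a rough sketch in
Python. Everything in it is an assumption for illustration: the names pack and
PackedDataset, and the file layout (an 8-byte header holding the index offset,
then the raw payloads, then a JSON index) are made up, not an existing format.
The point is only that the job opens one file and does one mmap, so the
metadata server sees a single lookup instead of one per image.

```python
import json
import mmap
import os
import struct
import tempfile

HEADER = struct.Struct("<Q")  # 8-byte little-endian offset of the JSON index


def pack(samples, path):
    """Write all samples into one file: [header][payloads...][JSON index].

    `samples` maps a name (e.g. the original file name) to its raw bytes.
    """
    index = {}
    with open(path, "wb") as f:
        f.write(HEADER.pack(0))  # placeholder, patched once the offset is known
        for name, data in samples.items():
            index[name] = (f.tell(), len(data))  # (offset, length) of payload
            f.write(data)
        index_offset = f.tell()
        f.write(json.dumps(index).encode("utf-8"))
        f.seek(0)
        f.write(HEADER.pack(index_offset))


class PackedDataset:
    """Read-only view of a packed file, backed by a single mmap."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        (index_offset,) = HEADER.unpack(self._map[:HEADER.size])
        self._index = json.loads(self._map[index_offset:].decode("utf-8"))

    def read(self, name):
        """Return the bytes of one sample; no per-sample open() or stat()."""
        offset, length = self._index[name]
        return self._map[offset:offset + length]

    def close(self):
        self._map.close()
        self._file.close()


if __name__ == "__main__":
    # Tiny demo with fake "images"; a real run would pack the dataset once.
    path = os.path.join(tempfile.mkdtemp(), "dataset.pack")
    pack({"img0.jpg": b"\xff\xd8fake0", "img1.jpg": b"\xff\xd8fake1"}, path)
    ds = PackedDataset(path)
    print(len(ds.read("img1.jpg")))
    ds.close()
```

This keeps the whole index in memory as a dict, which is probably fine for a
million or so entries but is not tuned. It also does no compression or
checksumming, which squashfs would give you for free.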
personally, I tend to dislike the approach of having a job stage tons of
stuff onto node storage (when it exists) simply because that guarantees a
waste of cpu/gpu/memory resources for however long the stagein takes...
thanks, mark hahn.
operator may differ from spokesperson. hahn at mcmaster.ca