Would it help to create a cron job on each of the nodes, set up so that only one node at a time is working on the files on the head node?<br><br>
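A minimal sketch of that idea, assuming every node runs the same script from cron and the lock directory lives on the filesystem shared with the head node. The names LOCK_DIR, do_work and run_once are hypothetical, and the default lock path is only a local placeholder. mkdir is used as the lock because it either creates the directory or fails; it is generally treated as atomic on NFS as well, but that is worth checking on your own setup.<br>
<pre>
```shell
#!/bin/sh
# Sketch only: every node runs this from cron, but only the node that
# manages to create the lock directory does the work.
# LOCK_DIR is a hypothetical path; on a real cluster it would point at
# a directory on the NFS export shared with the head node.
LOCK_DIR="${LOCK_DIR:-${TMPDIR:-/tmp}/cluster-job.lock}"

# mkdir either creates the directory or fails, so only one caller wins.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

# Remove the lock directory so the next run can take it.
release_lock() {
    rmdir "$1"
}

# Placeholder for the real work on the shared files.
do_work() {
    echo "working on the shared files from $(uname -n)"
}

# Take the lock, do the work, release the lock; skip if the lock is held.
run_once() {
    if acquire_lock "$1"; then
        do_work
        release_lock "$1"
        return 0
    fi
    echo "another node holds the lock, skipping this run"
    return 1
}

run_once "$LOCK_DIR"
```
</pre>
Each node would then call the script from its own crontab at whatever interval suits the job. One caveat with this sketch: if a node dies while holding the lock, the directory stays behind and has to be removed by hand or by some timeout logic.<br><br>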
<div class="gmail_quote">On Wed, Aug 26, 2009 at 2:11 AM, <span dir="ltr">&lt;<a href="mailto:madskaddie@gmail.com">madskaddie@gmail.com</a>&gt;</span> wrote:<br>
<blockquote style="BORDER-LEFT: #ccc 1px solid; MARGIN: 0px 0px 0px 0.8ex; PADDING-LEFT: 1ex" class="gmail_quote">Greetings,<br><br><br>I am relatively new to cluster environments and I was given a small<br>(7 nodes + 1 head) cluster to administer. So far I have only had to maintain what<br>
was already installed, so there were few problems to solve (or to think about). But<br>new, different machines (AMD Opteron vs. Intel Xeon) have arrived and I have to<br>expand the cluster (and so think about and solve problems). The (old) cluster is<br>semi-diskless (all machines do have disks, but they boot from a single<br>
image on a central server) with NFS for filesystem sharing. The main<br>problems I had were:<br> * if the /var filesystem is shared, race conditions happen (all nodes<br>want to write to the same files). I had this problem and moved to a<br>
local /var filesystem.<br> * if /var is local (which it can be, because the disks do exist), the<br>whole point of having a central point for easy admin vanishes, because I would<br>have to create all the /var structure that packages need in order to work, on<br>
each node (it would be easier to run "for node in $nodes; do ssh $node $install_cmd; done"<br>than to guess which dirs I need to create or which files to copy).<br> * if /var is tmpfs, all forensic data is certainly gone after a failure<br>(Murphy told me this one ;).<br>
<br>Everything I have read on the subject underlines the advantages of<br>diskless approaches but fails to warn about this problem and/or to solve<br>it. On the other hand, the tools for the distributed approach (where every<br>node is autonomous) seem to be stalled (like SystemImager, which is used<br>
in the OSCAR project) or discontinued, or truly overkill for my<br>scale (IBM's xCAT); so it really seems that I'm missing<br>something.<br><br>The question is: what do you do about this?<br><br>Gil Brandao<br>
_______________________________________________<br>Beowulf mailing list, <a href="mailto:Beowulf@beowulf.org">Beowulf@beowulf.org</a> sponsored by Penguin Computing<br>To change your subscription (digest mode or unsubscribe) visit <a href="http://www.beowulf.org/mailman/listinfo/beowulf" target="_blank">http://www.beowulf.org/mailman/listinfo/beowulf</a><br>
</blockquote></div><br><br clear="all">
<div></div><br>-- <br>Jonathan Aquilina<br>