[Beowulf] User resource limits
Prentice Bisbal prentice at ias.edu
Mon Jun 9 08:41:29 PDT 2008
- Previous message: [Beowulf] A couple of interesting comments
- Next message: [Beowulf] User resource limits
This is slightly off topic, since it's not a Beowulf-specific problem, but it is HPC-related:

I have several fat servers with 4 cores and 32 GB of RAM, for jobs that aren't very parallel and need large amounts of RAM. They are not clustered in any way. At the moment, users ssh into these systems to run large jobs. Eventually, I will have these nodes managed by a queuing system.

The problem: every couple of days, one of these systems becomes unresponsive due to OOM errors. If we wait long enough, the offending job will complete and everything will return to normal. Since these are multi-user shared resources, I don't have the luxury of waiting for the systems to clear themselves up, and I often have to hit the power button.

I would like to impose hard CPU and memory limits on users, limits that the users themselves can't change or override. What is the best way to do this? All I know of are environment variables or shell commands run as the user (ulimit, for example).

--
Prentice
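
For reference, the ulimit shell builtin mentioned in the post is a front end to the setrlimit(2) system call, which keeps a soft and a hard value for each limit. An unprivileged process may lower its hard limit but cannot raise it again, which is what makes a hard limit stick. The following is a minimal sketch, assuming Linux and Python 3; the helper names cap_address_space and run_capped are hypothetical and are not taken from the post.

```python
import resource
import subprocess
import sys


def cap_address_space(max_bytes):
    """Lower both the soft and hard RLIMIT_AS of the calling process.

    Once the hard limit is lowered, an unprivileged process cannot
    raise it again, and child processes inherit the lowered value.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # Never ask for more than the hard limit already in force,
    # since raising a hard limit requires privilege.
    if hard != resource.RLIM_INFINITY:
        max_bytes = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))


def run_capped(cmd, max_bytes):
    """Run cmd in a child whose virtual address space is capped."""
    return subprocess.run(
        cmd,
        # Runs in the child after fork() and before exec().
        preexec_fn=lambda: cap_address_space(max_bytes),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    one_gib = 1024 ** 3
    # The child tries to allocate 2 GiB under a 1 GiB cap.
    probe = [sys.executable, "-c", "x = bytearray(2 * 1024**3)"]
    result = run_capped(probe, one_gib)
    print("exit status:", result.returncode)
```

With the cap in place, the oversized allocation fails inside the offending process instead of pushing the whole machine into an out-of-memory state. Note that RLIMIT_AS applies per process, not per user, and that this sketch only covers commands started through the wrapper. Applying limits to every login session is usually done outside the user's control, for example through the pam_limits module and /etc/security/limits.conf on systems that use PAM.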
