From benson_muite at emailplus.org Mon May 3 06:53:32 2021
From: benson_muite at emailplus.org (Benson Muite)
Date: Mon, 3 May 2021 09:53:32 +0300
Subject: [Beowulf] Ethernet switch OSes
In-Reply-To: <20210423040730.GA22794@rd.bx9.net>
References: <20210423040730.GA22794@rd.bx9.net>
Message-ID: <9071ed2d-e3d3-1e0b-d011-7a221e54c1fe@emailplus.org>

On 4/23/21 7:07 AM, Greg Lindahl wrote:
> I'm buying a 100 gig ethernet switch for my lab, and it seems that the
> latest gear is intended to run a switch OS. Being as cheap as I've
> always been, free software sounds good.
>
> It looks like Open Network Linux is kaput.
>
> It looks like SONiC is doing pretty well.

There is an OpenCompute HPC group. May wish to indicate your future
networking requirements to help shape purchase possibilities:
https://www.opencompute.org/wiki/HPC

>
> And there are several commercial offerings.
>
> Does anyone have experience with these OSes? Initially I'm going to
> just have a single 32-port switch here and there, but I may have to
> build much larger systems in the 6-12 month timeframe.
>
> Thanks in advance!
>
> -- greg
>
> p.s. I've left Silicon Valley and I'm working at the Event Horizon
> Telescope, those people with the image of the ring around the black
> hole. I'll be attending Supercomputing again! Surely there will be a
> Beowulf Bash!
>
> p.p.s. and if any vendor wants to *sell* me aforesaid switch, please
> send me a direct email. After 24 years of not having to get 3 quotes
> for smallish purchases, my purchasing department is driving me nuts.
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
>

From engwalljonathanthereal at gmail.com Tue May 4 19:53:42 2021
From: engwalljonathanthereal at gmail.com (Jonathan Engwall)
Date: Tue, 4 May 2021 12:53:42 -0700
Subject: [Beowulf] A quick ML investigation
Message-ID:

Hello Beowulf,
After this brief OCaml video https://youtu.be/a3kVJKt9mq4 I created a
larger data set, one with 5,000 of 10,000 numbers in random order. It
illuminates the workings of a red-black binary tree.

Simply copy&paste (ctrl k ctrl v) the large set to the command line. OCaml
immediately builds several BLACK root nodes searching for a "balance point."
I give a link to the red-black implementation and larger data set. My
source receives credit as well.

Earlier experience with simple routines like Fibonacci had me worried
that OCaml is slow, but the red-black tree's uniformly short branches look
quite fast.

Jonathan Engwall

From engwalljonathanthereal at gmail.com Thu May 6 02:11:02 2021
From: engwalljonathanthereal at gmail.com (Jonathan Engwall)
Date: Wed, 5 May 2021 19:11:02 -0700
Subject: [Beowulf] A quick ML investigation
In-Reply-To:
References:
Message-ID:

More exciting (if ML is your thing) news: when I reinstated the
definition of mem,

    let rec mem x = function
      | Leaf -> false
      | Node (_, y, left, right) ->
          x = y || (x < y && mem x left) || (x > y && mem x right)
    ;;

emptied the tree, and re-loaded the 5,000 or so of 10,000 unsorted
numbers, I got this:

    # mem 7178 s;;
    - : bool = true
    # mem 285 s;;
    - : bool = true
    # mem 6009 s;;
    - : bool = true
    # mem 6008 s;;
    - : bool = false

6008 is in the chopped out section of the original 10,000.
OCaml reads this list of ints in the blink of an eye; the entire list!
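
For anyone who wants to reproduce the session above, here is a minimal sketch of a red-black set. The implementation that was actually run is not shown in this thread, so the insert and balance functions below are the common textbook (Okasaki-style) formulation and only an assumption about what it looked like. The type name rbtree, the height helper, and the small data set (1..100 with 40..60 removed, standing in for the 5,000-of-10,000 list) are hypothetical. The node shape is chosen to match what mem pattern-matches on.

```ocaml
(* Sketch only: an Okasaki-style red-black set whose node shape matches
   the mem function in the post: Node (color, value, left, right). *)
type color = Red | Black

type 'a rbtree =
  | Leaf
  | Node of color * 'a * 'a rbtree * 'a rbtree

(* Membership test, as given in the post. *)
let rec mem x = function
  | Leaf -> false
  | Node (_, y, left, right) ->
      x = y || (x < y && mem x left) || (x > y && mem x right)

(* Repair a black node that has a red child with a red grandchild.
   All four shapes are rewritten to the same balanced subtree. *)
let balance = function
  | Black, z, Node (Red, y, Node (Red, x, a, b), c), d
  | Black, z, Node (Red, x, a, Node (Red, y, b, c)), d
  | Black, x, a, Node (Red, z, Node (Red, y, b, c), d)
  | Black, x, a, Node (Red, y, b, Node (Red, z, c, d)) ->
      Node (Red, y, Node (Black, x, a, b), Node (Black, z, c, d))
  | color, v, l, r -> Node (color, v, l, r)

(* Insert as a red leaf, rebalance on the way back up, and finally
   force the root to be black. Duplicates leave the tree unchanged. *)
let insert x s =
  let rec ins = function
    | Leaf -> Node (Red, x, Leaf, Leaf)
    | Node (color, y, a, b) as t ->
        if x < y then balance (color, y, ins a, b)
        else if x > y then balance (color, y, a, ins b)
        else t
  in
  match ins s with
  | Node (_, y, a, b) -> Node (Black, y, a, b)
  | Leaf -> failwith "unreachable: ins never returns Leaf"

(* Number of nodes on the longest root-to-leaf path. *)
let rec height = function
  | Leaf -> 0
  | Node (_, _, l, r) -> 1 + max (height l) (height r)

(* Hypothetical data: 1..100 with the section 40..60 chopped out,
   inserted in ascending order (the worst case for an unbalanced tree). *)
let s =
  List.init 100 (fun i -> i + 1)
  |> List.filter (fun n -> n < 40 || n > 60)
  |> List.fold_left (fun t n -> insert n t) Leaf

let () =
  Printf.printf "mem 7 s = %b, mem 50 s = %b, height = %d\n"
    (mem 7 s) (mem 50 s) (height s)
```

A red-black tree with n elements has height at most 2*log2(n+1), which is why the branches stay short even when the input arrives sorted.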
Jonathan Engwall

On Tue, May 4, 2021 at 12:53 PM Jonathan Engwall <
engwalljonathanthereal at gmail.com> wrote:

> Hello Beowulf,
> After this brief OCaml video https://youtu.be/a3kVJKt9mq4 I created a
> larger data set, one with 5,000 of 10,000 numbers in random order. It
> illuminates the workings of a red-black binary tree.
>
> Simply copy&paste (ctrl k ctrl v) the large set to the command line. OCaml
> immediately builds several BLACK root nodes searching for a "balance point."
> I give a link to the red-black implementation and larger data set. My
> source receives credit as well.
>
> Earlier experience with simple routines like Fibonacci had me worried
> that OCaml is slow, but the red-black tree's uniformly short branches look
> quite fast.
>
> Jonathan Engwall
>

From pc7 at sanger.ac.uk Tue May 25 11:06:39 2021
From: pc7 at sanger.ac.uk (Peter Clapham)
Date: Tue, 25 May 2021 11:06:39 +0000
Subject: [Beowulf] [External] head node abuse [EXT]
In-Reply-To:
References: <3769e64d-155f-5bde-ad0b-02d961ec0867@pppl.gov>,
Message-ID: <265d45773d2c42adaff30602f8c47c84@sanger.ac.uk>

+1 to this. Relying on tech to solve issues like this tends to fall foul of
newly required software stacks and their requirements on head nodes, e.g.
Nextflow, Cromwell, etc.

So a combined people/process and education/assistance approach, together
with some tech safety nets, seems appropriate as long as there is agreement
on what is proportionate.

Another $0.02 in the pot.

Pete

________________________________
From: Beowulf on behalf of Adam DeConinck
Sent: 26 March 2021 14:41:13
To: beowulf
Subject: Re: [Beowulf] [External] head node abuse [EXT]

I agree with Chris D that this is more of a human problem than a technical
problem. I have actually had a lot of success with user education -- people
don't often think about the implications of having lots of people logged
into the same head node, but get the idea when you explain it.
Especially when you explain it along the lines of, "if we let all these
other people test their MPI jobs on the head node, it would slow down YOUR
work!"

Granted, people don't tend to read that explanation in the onboarding doc,
and I often have to re-explain it when it comes up in practice. ;-) But in
general I rarely see "repeat offenders", and when it happens removing
access is the right policy.

We do ALSO enforce some per-user limits with cgroups (auto-generating the
user-{UID}.slice as part of the user onboarding process). But in practice
this mostly protects against accidental abuse ("whoops, I launched mpirun
in the wrong terminal!"). The rare people who intentionally misuse the head
node will find work-arounds.

Arbiter looks really interesting but I haven't had a chance to play with it
yet. Need to bump that further up the priority list...

On Fri, Mar 26, 2021 at 8:27 AM Prentice Bisbal via Beowulf wrote:

Yes, there's a tool developed specifically for this called Arbiter that
uses Linux cgroups to dynamically limit resources on a login node based on
its current load. It was developed at the University of Utah:

https://dylngg.github.io/resources/arbiterTechPaper.pdf

https://gitlab.chpc.utah.edu/arbiter2/arbiter2

Prentice

On 3/26/21 9:56 AM, Michael Di Domenico wrote:
> does anyone have a recipe for limiting the damage people can do on
> login nodes on rhel7. i want to limit the allocatable cpu/mem per
> user to some low value. that way if someone kicks off a program but
> forgets to 'srun' it first, they get bound to a single core and don't
> bump anyone else.
>
> i've been poking around the net, but i can't find a solution, i don't
> understand what's being recommended, and/or i'm implementing the
> suggestions wrong. i haven't been able to get them working.
> the most succinct answer i found is that per user cgroup controls have
> been implemented in systemd v239/240, but since rhel7 is still on v219
> that's not going to help. i also found some wonkiness that runs a
> program after a user logs in and hacks at the cgroup files directly,
> but i couldn't get that to work.
>
> supposedly you can override the user-{UID}.slice unit file and jam in
> the cgroup restrictions, but I have hundreds of users, so clearly
> that's not maintainable
>
> i'm sure others have already been down this road. any suggestions?

--
The Wellcome Sanger Institute is operated by Genome Research Limited, a
charity registered in England with number 1021457 and a company registered
in England with number 2742969, whose registered office is 215 Euston Road,
London, NW1 2BE.
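
The thread above mentions overriding user-{UID}.slice per user and auto-generating those slices during onboarding, without showing what that looks like. As a rough sketch, and very much an assumption rather than anyone's actual tooling, an onboarding script could render one systemd drop-in per UID. CPUQuota= and MemoryLimit= are systemd resource-control properties; the file name 50-limits.conf, the limit values, and the UIDs below are hypothetical. Whether these settings take effect on a given RHEL 7 / systemd 219 install is untested here.

```ocaml
(* Sketch only: render a per-user systemd slice drop-in.
   File name, limits and UIDs are illustrative assumptions. *)

(* Drop-in location for a given numeric UID. *)
let dropin_path uid =
  Printf.sprintf "/etc/systemd/system/user-%d.slice.d/50-limits.conf" uid

(* Drop-in contents: cap CPU at cpu_percent of one core and memory at
   mem_limit (e.g. "4G"). MemoryLimit= is the cgroup-v1 era setting. *)
let dropin_body ~cpu_percent ~mem_limit =
  Printf.sprintf "[Slice]\nCPUQuota=%d%%\nMemoryLimit=%s\n"
    cpu_percent mem_limit

let () =
  (* Hypothetical UIDs; a real script would read them from the user
     database and write the files instead of printing them. *)
  List.iter
    (fun uid ->
      Printf.printf "# %s\n%s\n" (dropin_path uid)
        (dropin_body ~cpu_percent:100 ~mem_limit:"4G"))
    [ 1001; 1002 ]
```

A CPUQuota of 100% corresponds to one core's worth of CPU time, which matches the "bound to a single core" goal in the original question. As noted in the thread, this guards against accidents more than against determined misuse.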