[Beowulf] A careful exploit?

Robert G. Brown rgb at phy.duke.edu
Tue Jun 11 20:18:32 PDT 2019


On Tue, 11 Jun 2019, Jonathan Engwall wrote:

> Update: CentOS assures me that all hosts are down.
> 
> On Sun, Jun 9, 2019, 1:41 PM Jonathan Engwall
> <engwalljonathanthereal at gmail.com> wrote:
>       Hello Beowulf,
> Recently we had serious trouble with the internet. A technician had to
> climb the pole. Another technician, an IT specialist in Mexico City,
> could not resolve the issue, sent the man here.
> Now trouble is back. What does this mean? Where are the missing IPs?
> From the pole to the modem, to my repeater, to my machine, and then my
> VM gives this using nmap:

I think you'll have to give more detail than that to get any help. For
starters:

* Where are the hosts in 192.168.0.x physically located?

* Where is the host that is probing for them physically located?

* In particular, are they on the same network (that is, is the scanning
host also in 192.168.0.x)?

* What is the output of ifconfig on a host that is up (e.g.
192.168.0.1)?  What is the output of ifconfig on the host that is
probing?  What is the output of ifconfig on a host that is marked "down"
(assuming you can log in to it directly via a console -- this may not be
possible depending on your cluster configuration)?

* How are you assigning IP addresses to the cluster nodes/hosts with the
missing IP numbers?  DHCP?  What is acting as your DHCP server?  How is
it configured?

* Are these real hosts, each with their own network interface (wired or
wireless), or are these virtual hosts?  Can you put a console of some
sort onto one of the hosts that is supposedly "down" and see what IT
thinks its network is doing?

The problem as I see it could be many, many things.  For example, a
faulty netmask might be able to produce the odd pattern of working and
not working you see.  On a private internal network of this sort, the
netmask should PROBABLY be ffffff00.  For example, on the network I'm
typing this reply on, my laptop produces the line:

inet 192.168.0.130  netmask 255.255.255.0  broadcast 192.168.0.255

(note that the host bits of the broadcast address are the bitwise
inverse of the netmask: 255.255.255.0 inverts to 0.0.0.255).
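
To make the netmask arithmetic concrete, here is a minimal sketch using
Python's standard ipaddress module.  The address and the ffffff00 mask
come from the ifconfig line above; the 255.255.255.248 mask and the
helper names describe() and on_same_subnet() are my own hypothetical
illustrations of what a "faulty" netmask would do, not anything taken
from your network.

```python
import ipaddress

def describe(addr, netmask):
    """Derive the network facts that ifconfig reports from addr + netmask."""
    net = ipaddress.ip_interface(f"{addr}/{netmask}").network
    return {
        "netmask_hex": format(int(net.netmask), "08x"),
        "hostmask": str(net.hostmask),          # bitwise inverse of the netmask
        "network": str(net.network_address),
        "broadcast": str(net.broadcast_address),
    }

def on_same_subnet(addr, netmask, other):
    """True if `other` falls inside the subnet implied by addr + netmask."""
    net = ipaddress.ip_interface(f"{addr}/{netmask}").network
    return ipaddress.ip_address(other) in net

# Values from the ifconfig line above.
print(describe("192.168.0.130", "255.255.255.0"))

# With the expected mask, 192.168.0.9 is a local neighbor.
print(on_same_subnet("192.168.0.130", "255.255.255.0", "192.168.0.9"))

# With a (hypothetical) wrong mask, the same address looks off-network.
print(on_same_subnet("192.168.0.130", "255.255.255.248", "192.168.0.9"))
```

With the wrong mask the host would try to reach 192.168.0.9 through its
gateway instead of directly, which is one way a subset of addresses can
appear to be "down" from a single vantage point.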

A minor problem is that you should almost certainly not assign any of:
192.168.0.0, 192.168.0.254, 192.168.0.255 as host IP numbers, certainly
not without knowing exactly what you are doing.  Note well that the
latter is the broadcast address.  The 254 address is often reserved for
the router itself, if you are relying on a router/switch like a Netgear
to do DHCP.  Some switches also use 192.168.0.0 (or 192.168.0.1) as the
router control address.  Assigning these with DHCP can then have some
"interesting" side effects.
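
A rough way to list the addresses worth keeping out of a DHCP pool is
sketched below, again with Python's ipaddress module.  The function
names are hypothetical, and treating the first and last usable
addresses as "router" addresses is only the common convention described
above, not a rule; check what your own router actually uses.

```python
import ipaddress

def addresses_to_avoid(cidr):
    """Return addresses in `cidr` that are risky to hand out to ordinary hosts."""
    net = ipaddress.ip_network(cidr)
    return {
        "network": str(net.network_address),
        "broadcast": str(net.broadcast_address),
        # Convention only: routers often sit on the first or last usable address.
        "common_router": [
            str(net.network_address + 1),
            str(net.broadcast_address - 1),
        ],
    }

def is_risky(addr, cidr):
    """True if `addr` is the network, broadcast, or a conventional router address."""
    avoid = addresses_to_avoid(cidr)
    risky = {avoid["network"], avoid["broadcast"], *avoid["common_router"]}
    return addr in risky

print(addresses_to_avoid("192.168.0.0/24"))
print(is_risky("192.168.0.254", "192.168.0.0/24"))
print(is_risky("192.168.0.130", "192.168.0.0/24"))
```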

An interesting possibility (if you are using some sort of "canned"
cluster software) is that you are running two DHCP servers -- one inside
of your router, and one inside of your head node -- without realizing
it.  That would produce spectacularly inconsistent results as they raced
to reply to the broadcast requesting an IP number, especially if one of
the two is in a default configuration giving out IP numbers in, for
example, 192.168.1.x.

The problem here is that networks are indeed difficult to manage if you
don't know what you are doing, and yes, even "experts" working for an ISP
are often startlingly incompetent.  I imagine that there are people on
list that can help you, but you'll have to give us a LOT of information,
way more than just the results of a single nmap scan.  If I were
debugging this, I'd look first for some glaring problem (like two DHCP
servers, a ridiculous netmask, broken hardware) and then home in on a
single host that has no IP number but should have an IP number, and look
at the logging produced by whatever is doing the DHCP serving and the
dmesg output from the boot on the host in question to see if I could
figure out what is going wrong.  But lacking the access to your network
or the privileges required to debug in this way or a knowledge of how
things are laid out and what you are trying to do with them, how can I
help?
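
As a small aid for the "pick a single host" step, the sketch below
tabulates a pasted nmap ping-scan report into up and down lists.  It
assumes the report format shown in the quoted scan (a "[host down]"
suffix for down hosts, and no suffix for hosts that are up), tolerates
the "> " mail quoting, and would capture a hostname instead of an
address if nmap printed one.  The function name summarize() is my own.

```python
import re

# Matches "Nmap scan report for <target>" with an optional "[host down]" suffix.
REPORT = re.compile(r"Nmap scan report for (\S+)( \[host down\])?")

def summarize(scan_text):
    """Split nmap scan report lines into (up, down) lists of targets."""
    up, down = [], []
    for line in scan_text.splitlines():
        m = REPORT.search(line)   # search() skips any leading "> " quoting
        if not m:
            continue              # e.g. "Host is up (...)" detail lines
        (down if m.group(2) else up).append(m.group(1))
    return up, down

# A few lines copied from the quoted scan.
sample = """\
> Nmap scan report for 192.168.0.0 [host down]
> Nmap scan report for 192.168.0.1
> Host is up (0.0080s latency).
> Nmap scan report for 192.168.0.2
> Host is up (0.00068s latency).
> Nmap scan report for 192.168.0.3 [host down]
"""

up, down = summarize(sample)
print("up:  ", up)
print("down:", down)
```

Comparing the "down" list against what the DHCP server believes it has
leased is one way to choose which host to look at first.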

     rgb

> 
> Starting Nmap 6.40 ( http://nmap.org ) at 2019-06-09 13:30 PDT
> Initiating Ping Scan at 13:30
> Scanning 256 hosts [2 ports/host]
> Completed Ping Scan at 13:31, 6.64s elapsed (256 total hosts)
> Initiating Parallel DNS resolution of 256 hosts. at 13:31
> Completed Parallel DNS resolution of 256 hosts. at 13:31, 0.04s
> elapsed
> Nmap scan report for 192.168.0.0 [host down]
> Nmap scan report for 192.168.0.1
> Host is up (0.0080s latency).
> Nmap scan report for 192.168.0.2
> Host is up (0.00068s latency).
> Nmap scan report for 192.168.0.3 [host down]
> Nmap scan report for 192.168.0.4 [host down]
> Nmap scan report for 192.168.0.5
> Host is up (0.063s latency).
> Nmap scan report for 192.168.0.6
> Host is up (0.00068s latency).
> Nmap scan report for 192.168.0.7 [host down]
> Nmap scan report for 192.168.0.8 [host down]
> Nmap scan report for 192.168.0.9 [host down]
> Nmap scan report for 192.168.0.10 [host down]
> Nmap scan report for 192.168.0.11 [host down]
> 
> Is this a new exploit?
> Thank you,
> Jonathan Engwall
> 
> 
>

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu



