[Beowulf] Re: OT: PXE boot with no control over DHCP?
Donald Becker
becker at scyld.com
Thu Sep 22 09:17:31 PDT 2005
On Wed, 21 Sep 2005, David Mathog wrote:
> > 2) Make sure these hosts are on the same router or switch as your dhcp
> > server so your server manages to offer an address first, before the
> > campus dhcp that you don't manage.
>
> Here's where things go south. I don't see any evidence
> of the dhcp packets from the booting workstation reaching
> the server.
I'm probably a good person to answer this, having written the Scyld BeoPXE
server. This is a full PXE server written from scratch to meet the
specifications, and then modified to match reality ;->
First, a bit of baseline info:
PXE is *approximately* DHCP followed by TFTP (Note 1)
DHCP is built on BootP
BootP is based on broadcast UDP packets
Part of the reason for this design is that booting and boot
parameters are explicitly a network-local activity. You don't want the
boot information, such as default route and local printer, from some host
on the other side of the country.
If you want to have a centralized, multi-network boot server you can
explicitly configure BootP forwarding agents. A correctly written agent
(it's easy to get it correct -- they are trivial) will work correctly
with BootP, DHCP and PXE. The agent doesn't need to implement policy
(e.g. "only respond to these MAC addresses") if your underlying servers
only respond when desired.
My PXE server can be configured to respond to only PXE requests with the
'-p' flag. You can identify PXE requests -- they have "PXEClient" as the
first characters in one of the dhcpClassID options. (Note 2) DHCP
servers that "just happen" to work with PXE generally don't check for
this.
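As a rough illustration of that check (a minimal sketch, not the BeoPXE code): the class ID is DHCP option 60, and the options start after the 236-byte BootP header and the 4-byte magic cookie. The function names below are made up for this example.

```python
DHCP_MAGIC = b"\x63\x82\x53\x63"  # magic cookie that opens the options area
BOOTP_FIXED_LEN = 236             # fixed BootP header that precedes the cookie
OPT_PAD, OPT_END = 0, 255
OPT_CLASS_ID = 60                 # vendor class identifier ("dhcpClassID")


def iter_options(packet: bytes):
    """Yield (code, value) for each DHCP option in a raw BootP/DHCP packet."""
    if packet[BOOTP_FIXED_LEN:BOOTP_FIXED_LEN + 4] != DHCP_MAGIC:
        return  # plain BootP or truncated packet: no DHCP options to walk
    i = BOOTP_FIXED_LEN + 4
    while i < len(packet):
        code = packet[i]
        if code == OPT_END:
            return
        if code == OPT_PAD:  # pad is a single byte with no length field
            i += 1
            continue
        if i + 1 >= len(packet):
            return  # truncated option header
        length = packet[i + 1]
        yield code, packet[i + 2:i + 2 + length]
        i += 2 + length


def is_pxe_request(packet: bytes) -> bool:
    """True if any class-ID option starts with b"PXEClient"."""
    return any(code == OPT_CLASS_ID and value.startswith(b"PXEClient")
               for code, value in iter_options(packet))
```

Checking every class-ID option, rather than only the first one found, is a deliberate choice here; see Note 2 for why that matters.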
Note that if you use a BootP forwarding agent, the implementation must
fill in the IP addresses correctly. This is trivial for a network-local
PXE server, but not so easy when you consider that you have a bunch
of different machines involved:
- the forwarding agent for BootP packets
- the IP address of the PXE-DHCP server
- the IP address for the off-network router (default route)
- the IP address of an optional intermediate PXE agent
- the IP or multicast address for the PXE-TFTP server
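For the forwarding agent alone, the address work amounts to the BootP 'giaddr' field plus the hop count. The sketch below shows only that step on a raw request, under my own assumptions: relay_request is a hypothetical name, and the default hop limit and the "drop when the limit is reached" policy are illustrative choices, not taken from any particular agent.

```python
import socket

# Byte offsets in the fixed BootP header
OP_OFFSET = 0        # 1 = BOOTREQUEST, 2 = BOOTREPLY
HOPS_OFFSET = 3      # incremented by each forwarding agent
GIADDR_OFFSET = 24   # 4-byte gateway (forwarding agent) IP address
BOOTREQUEST = 1
BOOTP_MIN_LEN = 236


def relay_request(packet: bytes, agent_ip: str, max_hops: int = 4):
    """Return the request to unicast to the server, or None to drop it.

    agent_ip is the agent's address on the network where the broadcast
    was heard; the server uses it to choose a subnet and to route replies.
    """
    if len(packet) < BOOTP_MIN_LEN or packet[OP_OFFSET] != BOOTREQUEST:
        return None
    hops = packet[HOPS_OFFSET]
    if hops >= max_hops:  # assumed policy: stop forwarding at the limit
        return None
    out = bytearray(packet)
    out[HOPS_OFFSET] = hops + 1
    # Only the first agent fills in giaddr; later agents leave it alone.
    if out[GIADDR_OFFSET:GIADDR_OFFSET + 4] == b"\x00\x00\x00\x00":
        out[GIADDR_OFFSET:GIADDR_OFFSET + 4] = socket.inet_aton(agent_ip)
    return bytes(out)
```

Nothing in this step looks at MAC addresses or options, which is why the agent can leave policy to the servers behind it.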
Your implementation should also implement boot rate control and UDP flow
limiting, especially when multiple networks and routing are involved. Or
you can skip this if reliable booting isn't important :-O (Note 3)
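One generic way to approximate boot rate control is a token bucket in front of the reply path. This is a substitute technique chosen for illustration, not a description of how the Scyld server does it; the class name, the rate and burst values, and the injectable clock are all assumptions of this sketch.

```python
import time


class TokenBucket:
    """Allow at most `burst` replies at once, refilled at `rate` per second."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.clock = clock           # injectable so the logic can be tested
        self.tokens = float(burst)   # start full: an idle server can burst
        self.last = clock()

    def allow(self) -> bool:
        """Return True if one more boot reply may be sent now."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A server loop would call allow() before answering a request and stay silent when it returns False, relying on the client to retry.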
* References:
* Preboot Execution Environment (PXE) Specification v2.1.
* RFC951 describes the BootP request and response; RFC2131 describes DHCP.
* DHCP options and values are in RFC1533 (obsoleted by RFC2132).
* Also See RFC1350, RFC2090, RFC2347, RFC2348, RFC2349, etc.
Note 1
PXE isn't exactly DHCP+TFTP. If you implement according to the
specification, you must write your own combined service. The most
obvious example is using multicast, where the PXE DHCP information must be
gotten from the TFTP server, although there are other places as well.
Luckily for sleazy implementations, essentially all PXE client
implementations use the ubiquitous code provided by Intel which will fall
back to DHCP+TFTP in a compatible way.
Note 2
Pssst... want to break stuff? Build a PXE client that uses multiple
dhcpClassID options, and put the PXEClient option in the middle. Conform
to the standard and work with no one! But the generic DHCP servers, which
are not checking, will still blindly respond. In this case a hack works
while the more-correct implementations often fail.
Note 3
Come on.. reliable booting is important. The common "PXE" servers out
there are basically hacks. PXE is an ugly protocol and the clients are
dumb, but matching it with a dumb server and accepting unreliable booting
is not the answer. You can make PXE reliable by understanding
the common failures and bugs, and carefully designing a server to avoid
them.
> I also tried booting knoppix on the machine, because
> it uses dhcp to find it's IP address, but the one it came up
> with was from the campus DHCP server and not my DHCP server.
I'm guessing that you don't have a BootP forwarding agent.
But turning on firewall rules is also a very common reason that PXE
servers don't see traffic.
> The workstations in question have an MBA which offers
> 4 network boot options: PXE, tcp/ip, netware, and RPL.
The Netware and RPL boot modes are (marginally) usable, but there is a
reason that PCs were not considered to have a standard network boot until
PXE.
BTW, in some cases where locally-administered DHCP servers are
prohibited, we suggest the use of the Scyld-developed Beoboot system which
usually slips past the rules. We developed Beoboot as a network boot
system before PXE was common, but it was designed only for a local
cluster environment. Beoboot uses an ad-hoc extension of RARP packets,
and thus always requires a network-local server rather than a
standard forwarding agent. For a similar approach see RFC1931: Dynamic
RARP Extensions for Automatic Network Address Acquisition.
--
Donald Becker becker at scyld.com
Scyld Software Scyld Beowulf cluster systems
914 Bay Ridge Road, Suite 220 www.scyld.com
Annapolis MD 21403 410-990-9993