[Beowulf] Questions about upgrading InfiniBand
prentice at ias.edu
Wed Apr 18 12:45:50 PDT 2012
I just thought of something else... All of my current IB devices
(switch, HCAs) are copper with CX4 connectors. It looks like all the
Mellanox QDR and FDR cards use QSFP connectors, so that's something else
I'll have to consider with my upgrade plans.
On 04/18/2012 11:05 AM, Prentice Bisbal wrote:
> I'm planning on adding some upgrades to my existing cluster, which has
> 66 compute nodes plus the head node. Networking consists of a Cisco
> 7012 IB switch with 6 out of 12 line cards installed, giving me a
> capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet
> switches that have only six extra ports between them.
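
The port arithmetic in the paragraph above can be checked with a short sketch. The per-card port count is derived from the stated figures (72 ports across 6 cards), not quoted from a datasheet, and the sketch assumes each node uses exactly one IB port. The function name free_ports is made up for illustration.

```python
# Figures taken from the message above; ports_per_card is derived, not stated.
TOTAL_SLOTS = 12        # line-card slots in the chassis
INSTALLED_CARDS = 6     # line cards currently installed
INSTALLED_PORTS = 72    # DDR ports available today
NODES = 66 + 1          # compute nodes plus the head node (assumes one IB port each)

ports_per_card = INSTALLED_PORTS // INSTALLED_CARDS
max_ports = TOTAL_SLOTS * ports_per_card


def free_ports(cards: int) -> int:
    """Ports left over after every existing node is connected."""
    return cards * ports_per_card - NODES


if __name__ == "__main__":
    print("ports per line card:", ports_per_card)
    print("fully populated chassis:", max_ports)
    print("free ports today:", free_ports(INSTALLED_CARDS))
    print("free ports with one more card:", free_ports(INSTALLED_CARDS + 1))
```

Under those assumptions the chassis has only a handful of spare IB ports today, which is why a Lustre deployment plus new nodes forces some kind of expansion.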
> I'd like to add a Lustre filesystem (over InfiniBand) to my cluster,
> and then begin adding/replacing nodes in the cluster. Obviously, I'll
> need to increase capacity of both my IB and ethernet networks. The
> questions I have are about upgrading my InfiniBand.
> 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox
> the only game in town these days?
> 2. Due to the size of my cluster, it looks like buying just a
> core/enterprise IB switch with capacity for ~100 ports is the best
> option (I don't expect my cluster to go much bigger than this in the
> next 4-5 years). Based on those criteria, it looks like the Mellanox
> IS5100 is my only option. Am I overlooking other options?
> 3. In my searching yesterday, I didn't find any FDR core/enterprise
> switches with > 36 ports, other than the Mellanox SX6536. At 648 ports,
> the SX6536 is too big for my needs. I've got to be overlooking other
> products, right?
> 4. Adding an additional line card to my existing switch looks like it
> will cost me only ~$5,000, and give me the additional capacity I'll need
> for the next 1-2 years. I'm thinking it makes sense to do that, and wait
> for affordable FDR switches to come out with the port count I'm looking
> for instead of upgrading to QDR right now, and start buying hardware
> with FDR HCAs in preparation for that. Please feel free to
> agree/disagree. This brings me to my next question...
> 5. FDR and QDR should be backwards compatible with my existing DDR
> hardware, but how exactly does that work? If I have, say, an FDR switch
> with a mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down
> to the lowest common denominator, or will the slow-down be based on the two
> nodes involved in the communication only? When I googled for an answer,
> all I found were marketing documents that guaranteed backwards
> compatibility, but didn't go to this level of detail. I searched the
> standard spec (v1.2.1), and didn't find an obvious answer to this question.
> 6. I see some Mellanox docs saying their FDR switches are compliant with
> v1.3 of the standard, but the latest version available for download is
> 1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is
> that correct?
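
For question 5, here is a rough model of the "per-pair" alternative, offered as a sketch and not as a confirmed answer. It assumes each link trains independently to the fastest speed both ends support, so a path runs at the rate of its slowest link. The rates are nominal 4x figures (signalling rate times encoding efficiency), and the names link_speed, data_rate, and path_rate are made up for illustration. Real hardware and firmware may impose further limits.

```python
# Nominal 4x link signalling rates in Gb/s (4 lanes per link).
SIGNAL_GBPS = {"DDR": 20.0, "QDR": 40.0, "FDR": 56.25}
# Line encoding efficiency: 8b/10b for DDR and QDR, 64b/66b for FDR.
ENCODING = {"DDR": 8 / 10, "QDR": 8 / 10, "FDR": 64 / 66}
ORDER = ["DDR", "QDR", "FDR"]  # slowest to fastest


def link_speed(end_a: str, end_b: str) -> str:
    """Assumed rule: a link runs at the slower of its two endpoints."""
    return min(end_a, end_b, key=ORDER.index)


def data_rate(speed: str) -> float:
    """Usable data rate of a 4x link in Gb/s, before protocol overhead."""
    return SIGNAL_GBPS[speed] * ENCODING[speed]


def path_rate(hca_a: str, hca_b: str, switch: str = "FDR") -> float:
    """Rate between two HCAs on one switch: the slower of the two links."""
    return min(
        data_rate(link_speed(hca_a, switch)),
        data_rate(link_speed(hca_b, switch)),
    )


if __name__ == "__main__":
    for pair in [("FDR", "FDR"), ("FDR", "QDR"), ("FDR", "DDR"), ("DDR", "DDR")]:
        print(pair, round(path_rate(*pair), 2), "Gb/s")
```

In this model an FDR pair keeps its full rate even with DDR HCAs on the same switch, and only paths that touch a DDR link drop to DDR speed.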