[Beowulf] Problem with Single RAID disk larger than 2TB and Linux

Scott Atchley atchley at myri.com
Wed Oct 3 08:14:05 PDT 2007


Is someone using a signed int to represent the 1 KB blocks?

2 * 1024 * 1024 * 1024 * 1024 = 2199023255552
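
A quick check of that arithmetic, as a minimal sketch (the helper name max_bytes is made up for illustration). A signed 32-bit count of 1 KB blocks and an unsigned 32-bit count of 512-byte sectors run out at the same byte total, and the second case matches the 4294967296 512-byte sectors in the dmesg output quoted below, so either could be the limit being hit:

```python
def max_bytes(counter_bits: int, signed: bool, unit_bytes: int) -> int:
    """Bytes addressable by a counter of the given width and unit size.

    Hypothetical helper for illustration; not taken from any driver.
    """
    # A signed counter loses one bit to the sign.
    usable_bits = counter_bits - 1 if signed else counter_bits
    return (2 ** usable_bits) * unit_bytes


# Signed 32-bit count of 1 KB blocks (the guess above).
signed_1k = max_bytes(32, signed=True, unit_bytes=1024)

# Unsigned 32-bit count of 512-byte sectors (what the log suggests).
unsigned_512 = max_bytes(32, signed=False, unit_bytes=512)

print(signed_1k, unsigned_512)  # both are 2199023255552
```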

Scott

On Oct 3, 2007, at 7:29 AM, Anand Vaidya wrote:

> Dear Beowulfers,
>
> We ran into a problem with large disks which I suspect is fairly
> common; however, the usual solutions are not working. IBM and Red Hat
> have not been able to provide any useful answers, so I am turning to
> this list for help. (Emulex is still helping, but I am not sure how
> far they can go without access to the hardware.)
>
> Details:
>
> * Linux Cluster for Weather modelling
>
> * IBM BladeCenter blades and an IBM x3655 Opteron head node,
> FC-attached to Hitachi TagmaStore SAN storage through a dual-port
> PCI-Express Emulex LightPulse FC HBA
>
> * RHEL 4 Update 5, x86_64, kernel 2.6.9-55 SMP, with the
> RHEL-provided Emulex driver (lpfc); lpfcdfc is also installed
>
> * GPT partition created with parted
>
> There is one 2TB LUN, which works fine.
>
> There is also a 3TB LUN on the Hitachi SAN, which is reported as
> "only" 2199GB (2.1TB).
>
> We noticed that, when the emulex driver loads, the following error  
> message is reported:
>
>             Emulex LightPulse Fibre Channel SCSI driver 8.0.16.32
>             Copyright(c) 2003-2007 Emulex.  All rights reserved.
>             ACPI: PCI Interrupt 0000:2d:00.0[A] -> GSI 18 (level, low) -> IRQ 185
>             PCI: Setting latency timer of device 0000:2d:00.0 to 64
>             lpfc 0000:2d:00.0: 0:1305 Link Down Event x2 received Data: x2 x4 x1000
>             lpfc 0000:2d:00.0: 0:1305 Link Down Event x2 received Data: x2 x4 x1000
>             lpfc 0000:2d:00.0: 0:1303 Link Up Event x3 received Data: x3 x1 x10 x0
>             scsi5 : IBM 42C2071 4Gb 2-Port PCIe FC HBA for System x on PCI bus 2d device 00 irq 185 port 0
>             Vendor: HITACHI   Model: OPEN-V*3          Rev: 5007
>             Type:   Direct-Access                      ANSI SCSI revision: 03
>             sdb : very big device. try to use READ CAPACITY(16).
>             sdb : READ CAPACITY(16) failed.
>             sdb : status=1, message=00, host=0, driver=08
>             sdb : use 0xffffffff as device size
>             SCSI device sdb: 4294967296 512-byte hdwr sectors (2199023 MB)
>             SCSI device sdb: drive cache: write back
>             sdb : very big device. try to use READ CAPACITY(16).
>             sdb : READ CAPACITY(16) failed.
>             sdb : status=1, message=00, host=0, driver=08
>             sdb : use 0xffffffff as device size
>             SCSI device sdb: 4294967296 512-byte hdwr sectors (2199023 MB)
>             SCSI device sdb: drive cache: write back
>
> The problem appears to be the "READ CAPACITY(16) failed" message,
> but we are unable to find the cause of this error.
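
The log above suggests the following fallback, shown here as a rough model only (this is not the kernel's sd code, and the function and constant names are made up). It assumes READ CAPACITY(10) can report at most a 32-bit last LBA, that a value of 0xffffffff means "too big, ask with READ CAPACITY(16)", and that the LUN sizes are decimal terabytes, which the original message does not state:

```python
SECTOR_BYTES = 512
RC10_MAX_LBA = 0xFFFFFFFF  # largest last-LBA a 32-bit field can hold


def sectors_seen_by_host(true_sectors: int, rc16_works: bool) -> int:
    """Model of the sector count the host ends up with (illustrative only)."""
    last_lba = true_sectors - 1
    # READ CAPACITY(10): the 32-bit field saturates for large devices.
    rc10_last_lba = min(last_lba, RC10_MAX_LBA)
    if rc10_last_lba < RC10_MAX_LBA:
        return rc10_last_lba + 1  # small device, 32 bits are enough
    if rc16_works:
        return last_lba + 1  # 64-bit answer from READ CAPACITY(16)
    # "use 0xffffffff as device size"
    return RC10_MAX_LBA + 1


# Assumed sizes: 2 TB and 3 TB taken as 2e12 and 3e12 bytes.
two_tb = 2 * 10**12 // SECTOR_BYTES
three_tb = 3 * 10**12 // SECTOR_BYTES

seen_2tb = sectors_seen_by_host(two_tb, rc16_works=False)
seen_3tb = sectors_seen_by_host(three_tb, rc16_works=False)
seen_3tb_mb = seen_3tb * SECTOR_BYTES // 10**6

print(seen_2tb == two_tb)     # the 2TB LUN is reported in full
print(seen_3tb, seen_3tb_mb)  # 4294967296 sectors, 2199023 MB
```

Under those assumptions the 2TB LUN never needs READ CAPACITY(16), which would be consistent with it working, while the 3TB LUN is clamped to exactly the figure in the log.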
>
> We conducted several experiments without success:
>
> - Tried compiling the latest driver from Emulex (8.0.16.32) - same
> error
> - Tried Knoppix (2.6.19), Gentoo LiveCD (2.6.19), and CentOS 4.4 -
> same error
> - Tried to boot Belenix (Solaris 32-bit live CD); it failed to boot
> completely (may be an unrelated issue)
>
> We have a temporary workaround in place: we created 3 x 1TB disks
> and used LVM to create a striped 3TB volume with an ext3 filesystem.
> This works fine.
>
> Red Hat claims ext3 and RHEL4 support disks up to 8TB and 16TB
> respectively (since RHEL4u2).
>
> I would like to know if anyone on the list has any pointers that  
> can help us solve the issue.
>
> Regards
> Anand Vaidya
>
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit  
> http://www.beowulf.org/mailman/listinfo/beowulf
