memory bandwidth on Athlon systems
Brian D. Haymore
brian at chpc.utah.edu
Wed Jul 19 08:19:57 PDT 2000
Guignon Thomas wrote:
>
> Hello,
> We recently received an Athlon 700 system (Abit KA7 + 2 x 512M
> pc100), and the stream benchmark gives poor results (even when using
> turbo mode) compared to my old K7M with an Athlon 600. Here
> are the results:
>
> zephyr: 700 kx133 pc100 turbo (2x512) ABIT KA7
> -------------------------------------------------------------
> This system uses 8 bytes per DOUBLE PRECISION word.
> -------------------------------------------------------------
> Array size = 999936, Offset = 0
> Total memory required = 24.9 MB.
> Each test is run 10 times, but only
> the *best* time for each is used.
> -------------------------------------------------------------
> Your clock granularity/precision appears to be 1 microseconds.
> Function      Rate (MB/s)   RMS time   Min time   Max time
> Copy:            302.8281     0.0529     0.0528     0.0530
> Scale:           300.8701     0.0533     0.0532     0.0533
> Add:             342.1093     0.0703     0.0701     0.0714
> Triad:           338.7202     0.0709     0.0709     0.0709
>
> copernic: 600 amd750 pc100 (1x128) ASUS K7M
> -------------------------------------------------------------
> This system uses 8 bytes per DOUBLE PRECISION word.
> -------------------------------------------------------------
> Array size = 999936, Offset = 0
> Total memory required = 24.9 MB.
> Each test is run 10 times, but only
> the *best* time for each is used.
> -------------------------------------------------------------
> Your clock granularity/precision appears to be 1 microseconds.
> Function      Rate (MB/s)   RMS time   Min time   Max time
> Copy:            358.9613     0.0447     0.0446     0.0450
> Scale:           436.5540     0.0367     0.0366     0.0368
> Add:             447.4549     0.0537     0.0536     0.0537
> Triad:           451.6738     0.0533     0.0531     0.0535
>
> As you can see, the old system is far superior to the new one.
> I wonder if anyone has experience with the Abit KA7 and can help me
> solve this problem?
>
> A+
> --
> Thomas Guignon (guignon at asci.fr)
> Laboratoire ASCI, ORSAY(France), www.asci.fr
Move up to the RY BIOS release. It is beta, but it has proven to be fairly
stable. With RY installed, set the memory interleaving to 4-way; that
should help a lot. I have 34 950 MHz Athlon systems with ABIT KA7
boards and 256MB PC-133 modules in them. When I set them to PC-133
with the 10ns SDRAM timings and 4-way interleaving, this is what I get:
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 10000000, Offset = 0
Total memory required = 228.9 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 244740 microseconds.
(= 244740 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:            458.0563     0.3512     0.3493     0.3550
Scale:           457.9331     0.3503     0.3494     0.3518
Add:             526.3619     0.4586     0.4560     0.4595
Triad:           537.3515     0.4503     0.4466     0.4519
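
As a rough cross-check, the rates above can be reproduced from the array size and the best (minimum) time. This is only a sketch. It assumes the usual STREAM accounting, in which Copy and Scale touch two arrays per iteration and Add and Triad touch three, and that MB here means 10^6 bytes. The names stream_rate_mb_s and ARRAYS_TOUCHED are made up for illustration and are not part of the benchmark.

```python
# Sketch: recompute STREAM rates from array size and best time.
# Assumptions (not stated in the benchmark output itself):
#   - Copy/Scale move 2 arrays per iteration, Add/Triad move 3
#   - 1 MB = 10**6 bytes
BYTES_PER_WORD = 8  # "8 bytes per DOUBLE PRECISION word"
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def stream_rate_mb_s(kernel, array_size, min_time_s):
    """Return the bandwidth in MB/s for one kernel and its best time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * array_size
    return bytes_moved / min_time_s / 1e6


# Best times from the 950 MHz KA7 run (array size 10000000).
min_times = {"Copy": 0.3493, "Scale": 0.3494, "Add": 0.4560, "Triad": 0.4466}

for kernel, t in min_times.items():
    print(f"{kernel:6s} {stream_rate_mb_s(kernel, 10_000_000, t):9.2f} MB/s")
```

The recomputed values differ slightly from the printed rates because the times in the table are rounded to four decimal places.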
--
Brian D. Haymore
University of Utah
Center for High Performance Computing
155 South 1452 East RM 405
Salt Lake City, Ut 84112-0190
Email: brian at chpc.utah.edu - Phone: (801) 585-1755 - Fax: (801) 585-5366