[Beowulf] Has anyone actually seen/used a cell system?
diep at xs4all.nl
Mon Oct 2 20:39:36 PDT 2006
----- Original Message -----
From: "Eric W. Biederman" <ebiederm at xmission.com>
To: "Vincent Diepeveen" <diep at xs4all.nl>
Cc: "Andrew Shewmaker" <agshew at gmail.com>; <beowulf at beowulf.org>;
<J.A.Delcorso at larc.nasa.gov>
Sent: Tuesday, October 03, 2006 2:35 AM
Subject: Re: [Beowulf] Has anyone actually seen/used a cell system?
> "Vincent Diepeveen" <diep at xs4all.nl> writes:
>> ----- Original Message -----
>> From: "Eric W. Biederman" <ebiederm at xmission.com>
>> To: "Vincent Diepeveen" <diep at xs4all.nl>
>>>> Which is my bottom line point. You'll need 1000 compiler programmers to
>>>> supported well.
>>> Well I believe this 1000 compiler programmers number is quite high.
>>> Certainly that isn't needed for a simple language like C. One or
>>> two man years is sufficient to write a usable and decent optimizing
>>> C compiler.
>> The crucial word you use is 'decent'. That's open to judgement.
>> The only reason some years ago for me to move from DOS to windows, was
>> the hard
>> fact that visual c++ generated executables executing time 20% faster than
>> C++ (at the time the fastest DOS compiler) did do.
>> If i can program 1 full week very hard and speedup my chessprogram 0.5%
>> by that,
>> then i consider that a very good deal.
>> If 'decent' in your eyes is a factor 3 slower than the best C compilers,
>> then so
>> be it, but no one is going to use that C-compiler in that case.
> Not at all. What I mean by decent is a compiler that gets all of the easy
> low hanging fruit but doesn't necessarily hit every corner case just so.
> So within a factor of 1%-5% on most codes of the best compilers. But
> certainly doing poorly on the few codes that totally pound the corner
> cases you could have handled better.
Windows is what matters of course.
For windows there are 2 compilers. There is visual c++ 2005 and there is
I'll ignore GUI now, as obviously that is the most speed relevant part of
the software but our only option is to compile it in visual c++. There is
simply nothing else that has the same speed.
In fact visual c++ generates 486 code for it and that 486 code is still
a lot faster.
Speed matters for opengl. The graphics card is not the only important issue
there; we cannot help it when someone has a slow graphics card. The majority
regrettably has a slow graphics card. So all the polygons that are in RAM
need to get sorted there and so on. All in RAM. Very slow.
I'll skip to the engine which works both for linux and windows.
Fastest at AMD64 is visual c++ 2005. Amazingly, that is with a 32 bits pgo-ed
exe. I'm not sure whether it actually generates anything newer than 486 code.
Last time i checked it didn't generate CMOV, despite that at AMD64 this is
just 1 cycle, versus 7 cycles at intel chips like the P4.
90% of all speed freaks in computerchess has right now K8 based chips of
I'm convinced they will all switch to core2 when it releases, say within
1 year. Unless AMD very quickly has something even faster than that on the
market. For a part it won't matter; if that quad core chip of intel releases
at 2.67Ghz, they buy it the moment it is available in the shops.
Talking about most gflops per dollar. That chip kicks everything of course
in a practical manner.
Borland for graphics is just not interesting. It always was like a factor 2
slower than other compilers.
I remember 1 test in the past where it was "only" 45% slower. Long time ago.
No chance for improvements there.
Open Watcom C++ is there now. Honestly i didn't even test it too long. A
single compile took me hours of fixing things in order to get it compiled; it
hasn't improved much since its latest 10.0 release.
Extremely slow. Roughly a factor 2 slower. Plus or minus 40% is no big deal
then.
At AMD64 of course the third in speed after intel.
Even with 64 bits pgo under linux, GCC has major problems getting close to
visual c++ 2005.
I remember a snapshot some time ago that was pretty ok. That snapshot was of course
but the released 4.1 is 20% slower than visual c++ 2005.
They can be lucky then that visual c++ isn't even using CMOV type
instructions, nor making any effort
to be fast at AMD. Obviously visual c++ just stays ahead.
Some compilers that get mentioned here are difficult to use. For example
PGI is not compiling Diep at all. I downloaded the evaluation version and
it cannot compile programs that use linux function calls nor windows
Hey, i need SOMETHING to time my program and i need SOMETHING to share memory
between processes. In *nix i do that with shmget/shmat and in windows i use
functions like MapViewOfFile. See MSDN. Windows is extremely primitive
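
For readers who have not used these calls, here is a minimal sketch of the *nix side only, assuming System V shared memory (shmget/shmat) and POSIX clock_gettime for the timing. The helper names now_seconds and share_one_value are made up for this illustration and are not taken from Diep; the windows side (CreateFileMapping/MapViewOfFile) is not shown.

```c
#define _XOPEN_SOURCE 700   /* expose the SysV IPC and clock_gettime declarations */
#include <assert.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Wall-clock seconds from a monotonic clock, good enough to time a search. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical helper: a child process writes `value` into a shared
 * segment and the parent reads it back after the child has exited.
 * Returns the value the parent sees, or -1 on any failure. */
long share_one_value(long value)
{
    assert(value >= 0);                      /* -1 is reserved for errors */

    int id = shmget(IPC_PRIVATE, sizeof(long), IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    long *cell = shmat(id, NULL, 0);         /* map the segment into this process */
    if (cell == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    *cell = -1;

    pid_t pid = fork();                      /* the child inherits the attachment */
    if (pid == 0) {
        *cell = value;                       /* visible to the parent: same memory */
        shmdt(cell);
        _exit(0);
    }

    long seen = -1;
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        seen = *cell;

    shmdt(cell);
    shmctl(id, IPC_RMID, NULL);              /* release the segment */
    return seen;
}
```

A caller would take t0 = now_seconds() before the work and subtract it from now_seconds() afterwards. A compiler that cannot build these few system calls cannot build a parallel search either.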
From some company coders who managed to use it for some apps i heard bad
reports. Those guys will never post to any group. We're talking about big
companies here who want to keep everybody their friend. Still i tried.
I emailed helpdesk about the problem. Took them 1 week to answer. I emailed
another question asking why the f**** those simple functions are not working.
Next week another answer. After 3 emails my evaluation trial period expired.
No more testing possible. Still nothing working, of course.
There are reports that their commercial compiler is faster. However, if their
evaluation version doesn't compile my code, not even a simple small test
program which i attached here, then i'm powerless against claims. If they
aren't willing to get even a public open source program compiled with
pgi, then i can't do anything.
Where are they on specint/specfp with their compiler?
My guess: slower than GCC.
Which basically is the beginner's qualification. GCC is the lowest standard.
Everything underneath it is utter amateurs. GCC literally is a bunch of
very nice guys. Great amateurs who in their spare time donate time to work
Sometimes a company (AMD) pays a tad and then GCC speeds up magnificently
I'm under the impression some P4 friends have been busy at GCC. Why do ANY
effort for intel at GCC? There is intel c++ already for that.
Then another compiler called Pathscale is there. It is slower than GCC. I
know they have put a lot of effort into all kinds of floating point
benchmarks and all kinds of specint programs, so i wish them good luck and
hope one day they produce something very fast and efficient which is good in
register handling and not solving everything by inlining everything.
The Diep code is about 2.2 megabyte C code. That entire 2.2MB gets used
during search. Of that 2.2MB the interface is about 30KB code.
If you realize that visual c++ with just generating 486 code is totally
outgunning all other compilers, you can perhaps *start* to realize the big
gap there when running on AMD processors.
When running on intel chips, there is no argument possible. Intel c++
totally dominates there.
Visual c++ doesn't even get close to that speed and is second best.
GCC i didn't actually test at intel chips. No one i know who is doing number
crunching owns any intel at the moment.
At itanium2 the difference between gcc and intel c++ is so big that it's not
> Getting a factor of 3 slower than the best C compilers on most codes
> is almost impossible. C hardly leaves you that much room to optimize.
> You almost have to write a pessimizing compiler to get that.
You really have no fu**in idea it seems about compiler technology and how
far you can go
to optimize code and/or mess it up.
I regularly see factor 3 differences in codes between C compilers.
In fact i found 1 case of a 200 line program where net2003 is exactly a
factor 3.0 slower than GCC at AMD64.
Simply because GCC is generating CMOV's and visual isn't.
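
To make the CMOV point concrete, here is a small sketch (not the 200 line program mentioned above) of the kind of source pattern involved. Whether a given compiler turns max_branch into a conditional jump or a CMOV depends on the compiler, its flags and the target, so inspect the generated assembly yourself. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Branchy form: a compiler may emit a compare plus a conditional jump,
 * which is expensive when the branch is hard to predict. */
int32_t max_branch(int32_t a, int32_t b)
{
    if (a > b)
        return a;
    return b;
}

/* Branch-free form written by hand with a mask. It computes the same
 * result without any jump, which is the effect a CMOV gives for free. */
int32_t max_mask(int32_t a, int32_t b)
{
    int32_t mask = -(int32_t)(a > b);   /* all ones if a > b, else zero */
    return (a & mask) | (b & ~mask);
}

/* Sanity check that both forms agree for one pair of inputs. */
void check_agree(int32_t a, int32_t b)
{
    assert(max_branch(a, b) == max_mask(a, b));
}
```

In an inner loop with unpredictable data, a difference like this in the generated code is where big gaps between compilers can come from.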
In specfp at a certain point, the Sun team managed to speed up 1 program by a
factor 6 or so.
So i challenge you, write me a compiler that is within 5% of speed of the
latest visual c++ and a compiler other than GCC for my AMD64 and i'll pay
you some big bucks for your source code.
Work with 2 persons at it for 2 years.
I'll compile my Diep code then at it, which most DEFINITELY is optimized to
run as fast as possible.
Hey, my move generator i put under the GPL, so you can already start testing
on that code with your compiler, where visual completely knocks GCC silly;
then you can become a millionaire within 2 years.
You can become a millionaire.
Already for that generator that is in gpl the diff is > 5% between GCC and
I'm sure some intel c++ CEO is prepared to bet you 100 million euro, with a
1 to 100 payout (you get paid 100 euro for each euro you bet), that you
can't get within 5% of the
intel c++ compiler at montecito itanium2/core2 hardware for diep's source
I repeat : Intel does NOT have my source code.
It would be amazing if you can get within the 25% range in fact. Quite
impossible in fact.
>> For a good support of a compiler you most definitely need thousands of
>> In case of a console only compiler with support, you'll already pass the
>> quite quickly.
> If you can come back and tell me how many more compilers you have written
> then we can have a reasonable discussion. Since you chosen to argue
> with someone was largely agreeing with you, and was simply putting
> things into perspective somehow I doubt there is much of any substance
> we can exchange.
A total underestimation and lack of understanding of how much effort
companies are making to
get fast at certain chips (especially intel chips). Your 2 guys doing the
work of 2000.
Your first problem when analyzing software will be writing a program that
can analyze the problem.
You realize there are not many profilers giving accurate information?
AMD and Intel both have one. Especially VTUNE is good.
Which profiler are you going to use to get accurate information?
GCC tools are not gonna help you much there.
How about 64 bits debugging tools?
Have one for me?
Sir, you need to write a compiler blindfolded i'm afraid, unless you are
going to use software from others. But well, you mentioned 2 guys can solve
all the problems...