[Beowulf] A start in Parallel Programming?

Robert G. Brown rgb at phy.duke.edu
Tue Mar 13 11:55:34 PDT 2007


On Tue, 13 Mar 2007, Peter St. John wrote:

> Brown Dai-Sensei-Sama,
>
> Regarding "...Nobody knows why CPS departments no longer teach students to
> code in C (and instead teach a bizarre mix of C++, java, lisp, and
> god-knows-what else first -- at one time they just LOVED pascal and where is
> THAT now I aske you), ..."
>
> Pascal, C, C++, Java, and LISP are not 5 languages, really; let's say, they
> are spanned by a lower dimensional basis set. They are really two languages
> (C and LISP) with two (or more) conceptual paradigms (Procedural vs Object
> Oriented, say). It would be insulting to say that PASCAL is merely C with
> BEGIN, END instead of { and }, but...
> So I think that CS departments just agree with me, that you understand
> programming better if you learn two.

[Small-yield blast nearby.  Ground zero shakes.  A few small flakes fall
from the ceiling...]

It would be insulting and >>incorrect<< to say that PASCAL is C with
BEGIN and END.  Not that this wasn't an annoyance.  I type like the
wind, but typing B-E-G-I-N and E-N-D instead of {} is just dumb.  Almost
as dumb as

       DO 100 I=1,N
         ....
100   CONTINUE

pairs, but nothing can be quite THAT dumb except maybe forcing all code
sectioning to be done by indentation alone [as he snugs up his suit and
checks to make sure that the anti-snake-constriction reinforcing rods
are all in place].

Pascal was, and remains, the German of compilers.  All sentences must at
the end a verb have.  Declarations and definitions of all entities must
occur in strict order.  It fusses over the differences between passing
by reference and by value, especially for functions, a la your example
below.  Pascal made me shudder (back when I enthusiastically tried it).
I could almost hear the jackbooted heels tromping up to my door when it
delivered its
compiler warnings about some silly little infraction I'd made in syntax.

All this is why CPS departments loved it.  It FORCED you, by jiminy, to
learn structured programming -- it wouldn't compile unless you'd
structured your code according to its inflexible and precise rules.
This is why developers hated it, and why every commercial application
ever written with it died a horrible death (or was backported to C or
C++ and rescued in the nick of time).  This is why after one or two
college-try programs, I vowed never again.

Beyond that your point is well taken -- there are at least three
distinct paradigms among the languages listed, which were not intended
to be an exhaustive list (especially if one includes interpreters and
scripting languages as well as compilers) but rather exemplary.  I'd
argue that the general COMPILER categories are procedural, object
oriented, and list oriented, although it wouldn't be very hard for
somebody who wanted to add a category or to further distinguish these.  I'd
also argue -- quite vehemently -- that there is little "fundamental"
difference between procedural and OO languages -- the primary
operational requirement for an OO language is some sort of
implementation of a struct/union.  Everything else is a question of just
how cooked you want the struct to be at the compiler level, how much
protection you want from the compiler, how much freedom the compiler
gives you to bend its rules and adopt different coding styles as
appropriate to a problem, and how tight the compiler is relative to
assembler/machine language.
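
To put some flesh on that claim, here's a throwaway sketch of my own
(made-up names, not anything from the discussion above): a bare C struct
plus a function pointer already gives you the operational core of an
"object"; everything a C++ class adds on top is compiler-cooked
convention.

  #include <stdio.h>

  /* "object" = struct with member data plus a function pointer for behavior */
  typedef struct particle {
      double x, y, z;                          /* member data            */
      void (*print)(const struct particle *);  /* a "method"             */
  } particle;

  static void particle_print(const particle *p)
  {
      printf("particle at (%g, %g, %g)\n", p->x, p->y, p->z);
  }

  int main(void)
  {
      particle p = { 1.0, 2.0, 3.0, particle_print };
      p.print(&p);   /* roughly what p.print() means in an OO language */
      return 0;
  }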

The more important differences are associated with efficiency and
convenience and how well the syntax matches the problem.  OO languages
are good for problems with fairly obvious OO structure, where
inheritance is more than just a tedious and somewhat silly concept, or
MAYBE for projects where protection matters, although I personally think
that having to work through an OO interface to set a trivial parameter
is akin to being asked to go around to the little window in front to get
helped by the person you're talking to through an open door.  When the
person WORKS for you, goddammit.  I'll tell YOU when I want to have to
work through the little window (which is so close to never as makes no
difference:-).

Fortran is good because it tells YOU how matrices are going to look and
what their indices will run over, and by jolly darn it you'd better
learn to live with it.  That makes compiler-writers very happy, as they
can optimize the hell out of matrix operations because they KNOW that
the matrix is rectangular and starts at 1 and has a fixed layout in
memory of its rows and its columns.  It also has a binary exponentiation
operator, which annoys C purists because it is properly a library
function and not an operation (a transcendental call no less) but is
convenient as all hell when writing code and I miss it.  I do NOT miss
doing anything at all to characters in fortran (noting well, Jeff, that
I last wrote in Fortran -- willingly -- back when F77 was embarrassingly
new).

C++ is good because it has a great library.  Not that a good library
makes the compiler any better, not that you can't find great libraries
for C, but C++ users swear by it.  And yeah, it does all that
inheritance and protection stuff (at the expense of a lot of obfuscation
in the way those objects are laid out -- trying to trace just what some
control variable is through layers of include files sucks).  And it
provides fast and easy I/O commands (as if printf isn't good enough,
sniff).  However, C++ coders (if they didn't start out in C) are
completely blind to what pointers are, what they can do, why they are,
really, amazingly useful.  Seriously -- I've had multiple C++-trained
students who wanted to work on projects for me who were completely
clueless about them.  And C++ tends to be fussy in other ways as well.
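
For the record, the sort of thing I mean is nothing profound -- a toy of
my own devising (C99-isms and all): the pointer itself is the iterator,
and there is no container class standing between you and the memory.

  #include <stdio.h>
  #include <stdlib.h>

  int main(void)
  {
      size_t n = 5;
      double *v = malloc(n * sizeof *v);    /* a raw block, nothing more  */
      if (v == NULL) return 1;

      for (double *p = v; p < v + n; p++)   /* walk it with the pointer   */
          *p = 1.5 * (double)(p - v);

      for (size_t i = 0; i < n; i++)
          printf("v[%zu] = %g\n", i, v[i]);

      free(v);
      return 0;
  }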

> Re: FORTRAN, for awhile there we didn't really compile it, but translate it
> to C and then invoke the C compiler. That gets you the beauty of the IMSL

Which didn't really work terribly well, of course.  Sometimes (for a lot
of code) it didn't work at all.  It isn't really a linear map.

> libraries and the efficiency of very sharply maintained C compilers, at the
> same time. Is there a good extant FORTRAN compiler? I wonder why, fortran is

Except that it gives up all of the much greater efficiency of Fortran
for numerical code.  Has C >>ever<< beaten fortran on non-trivial
numerical code?  I doubt it. I seem to recall Greg remarking on this in
flame wa.. I mean "discussions" past;-).  IIRC his point from that time
was that C's very flexibility makes it much more difficult to optimize.
When I
create a typedef for a struct with three doubles and two ints as
contents, malloc a block of memory for a vector of the structs, and then
try to do linear algebra using the struct's second (y) double component,
the compiler simply CANNOT know that the y's are in a simple vector with
char stride 3*8 + 2*4 -- if that is indeed correct for the hardware in
question.  Fortran arm-twists one to allocate the y vector as a
standalone vector of stride 8 (or 1, double precision, using completely
standard and built in offset arithmetic) and even starts the offset
predictably, whereas I can make the offset into the C struct anything I
like, or can drop whole vectors of variable length at EVERY point inside
the struct by making the struct hold pointers and descriptors for the
vectors, like length and type.  Compilers can even align the variable
favorably in memory where that matters -- what can they do with my
struct and its mallocs (or C++'s even more hidden equivalents)?
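
To make that concrete (names invented for illustration, and the sizes
assume the usual 8-byte doubles and 4-byte ints with no padding):

  #include <stdio.h>
  #include <stdlib.h>

  typedef struct point {
      double x, y, z;     /* three doubles                                */
      int    tag, flag;   /* two ints: sizeof(point) == 3*8 + 2*4 == 32   */
  } point;

  /* Every y is sizeof(point) bytes from the next; the compiler cannot
     treat them as the unit-stride vector that a Fortran
     DOUBLE PRECISION Y(N) is guaranteed to be. */
  double sum_y(const point *p, size_t n)
  {
      double s = 0.0;
      for (size_t i = 0; i < n; i++)
          s += p[i].y;
      return s;
  }

  int main(void)
  {
      size_t n = 1000;
      point *p = malloc(n * sizeof *p);   /* the malloc'd block of structs */
      if (p == NULL) return 1;
      for (size_t i = 0; i < n; i++) {
          p[i].x = 0.0; p[i].y = (double)i; p[i].z = 0.0;
          p[i].tag = 0; p[i].flag = 0;
      }
      printf("sum of y = %g\n", sum_y(p, n));
      free(p);
      return 0;
  }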

In some cases the C (or C++) can be much, much easier to write and think
about because it lets you craft arbitrary objects that match the problem
instead of taking a much smaller set of built-in objects that can be
handled efficiently and then matching them, however crudely, to the
problem.  C++ extends the latter still further, at still greater cost in
efficiency (although in many cases it can be programmed to be relatively
efficient, as can the C for that matter).

> easy to express in C (unlike conceptually variant languages, like APL or
> LISP).

And then, as you note, there are really different languages -- APL,
LISP, TCL, mathematica, python, perl -- languages where one "can" often
program anything you like, but where the language itself is very, very
far from the machine code produced and very, very difficult to optimize.
Basically, if you use one of them for a problem, you are acknowledging
that you don't much care about the low level efficiency of the language,
but that instead you REALLY care about how hard the program is to write,
how tightly it fits the problem.  So ultimately what I was recommending
was to not use them to do parallel programming the first time around,
and outside of that was simply advancing my own manifest biases among
procedural/OO compilers in favor of the arguably most fundamental one.

I say fundamental because I personally like to be able to "see" through
the compiler to the underlying assembler.  With C that is quite simple
-- I have a fairly good idea of just how each loop is implemented, how
memory is laid out (in considerable detail), how loops roll or unroll.
I can exert even more control than I usually do by inlining assembler or
using pointers even more heavily than I do.  (In perl, on the other
hand, I have a very difficult time even imagining what it is doing with
memory, because that memory is allocated and deallocated according to an
incredibly arcane ritual (one I've actually looked at while thinking
about writing C extensions to perl).  Basically, you will NOT step out
of its memory management rituals or your code will die horribly, and
they are crafted to be a) amazingly stable in interpreted code; b)
efficient, in that order.)

These latter are the reasons that C is heavily favored for writing
systems code and operating systems and kernels and so on.  You need to
"know" what various code fragments are going to compile to within
spitting distance.  That's where fortran's exponentiation operator is a
bad thing, as it represents a library insertion of unknown etiology to
be able to handle x**1.74 correctly when x could be negative.  In C it
is either x*x or pow(x,1.74), where you know exactly where pow() lives
and how to handle its error codes on return.
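
Or, in toy form (my own example, not Peter's; link with -lm): you can
see at a glance which case is a couple of multiplies and which is a trip
into libm, and what happens when the base goes negative.

  #include <stdio.h>
  #include <math.h>

  int main(void)
  {
      double x = -2.0;

      double sq = x * x;          /* x**2: a multiply, no library call      */
      double y  = pow(x, 1.74);   /* x**1.74: a libm call, and NaN here     */
                                  /* (negative base, non-integer exponent)  */

      if (isnan(y))
          printf("pow(%g, 1.74) -> NaN (domain error)\n", x);
      printf("x*x = %g\n", sq);
      return 0;
  }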

     rgb

[As he staggers to the refrigerator, takes out a strange looking shiny
cylinder and presses it into a specially designed hopper on his suit.  A
bead or two of sweat is visible on his brow as he takes a hasty swig of
cooling liquid through a special fitting in the front of his mask.
Overhead, drones fly relentlessly, looking for some sort of
signature, a ventilation pipe perhaps, at which they can launch their
diabolical payload.  Cripes!  He jumps up and pumps the handle to try to
retract the ventilation pipe, sour foam spewing from the fitting and
drooling off of his face and down towards his toes.  Was he in time?]

> Peter
>
> P.S.
> I don't write in LISP myself (or ALGOL) but I respect it's expression of
> conceptual frameworks that are awkward in C; LISP takes the von Neuman idea
> of code segment within data segment to fruition, so the list of arguments
> for a function may well be itself a function, since everything is a list. In
> C, the address passed to a function might be the entry point of a function,
> but you can't actually refer to that abstractly; although
> printf("The entry point of \"printf\" is %x\n", printf);
> will work unless your compiler is paranoid.
>

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu




