[Beowulf] Java vs C++ for interfacing to parallel library

Sun Aug 20 20:50:31 PDT 2006

On Mon, 21 Aug 2006, Jonathan Ennis-King wrote:

> One way to alleviate the performance hit is of course to use a 90% Java
> strategy, where the computationally intensive 10% (here, parallel sparse
> matrix inversion) is handled in C.
> It's the mixed language part that worries me with Java, especially in
> the light of rgb's comments. It is claimed by some that Java and C++ are
> largely incompatible. Or is this all solved in the Java native interface
> (JNI)?
>
> My specific question was whether anyone out there was running parallel
> codes either written completely in Java, or with Java wrappering some
> big numerical library for the hard part. Are there any additional issues
> with parallel performance, or is this just a subcase of Java-C
> interfacing in a scalar setting?

Well, GIYF.  For example,

   http://www.math.ucla.edu/~anderson/JAVAclass/JavaInterface/JavaInterface.html

turns up on the search string "java C interface", along with a few
zillion other hits and howto articles.

I think that the example given in the web hit above should serve to
guide you as a template on that route if that's what you want to try.
However, I wouldn't count on the result being horribly portable
cross-platform -- you'll face the double barrier of having to get the
same result from your C or C++ compiler on both platforms/OS
architectures AND have similar behavior from and versioning of the java
ports to both platform/OS's.

> The other option is the Unix-like strategy suggested by rgb, where for
> example the computational part is completely written in C, and then the
> pre and post-processing which benefit from a GUI are written in some
> other language (e.g. Java), or strung together from other unix tools and
> wrapper languages.

Or a library-based strategy.  If you take your core code and develop a
plain old reusable library out of it with a fairly straightforward API
(which takes a lot of programming discipline and a certain amount of
practice, I think, but isn't particularly difficult) then you can USE
the library in a variety of UIs.  You can write a simple
vanilla-variable C interface to embed in java or perl.  You can write a
tty/ascii UI for command line users.  You can write a Gtk/Gnome
interface using glade and callbacks for native X/linux.  The UI code may
or may not be portable (tty/ascii I/O code using standard posix and
libc and libm calls is pretty much lowest common denominator and tends
to compile and run "anywhere", including on Windows boxes with minimal
tweaking; more complex UIs become successively harder to port, as a
rule) but the library itself, if written in "boring" C with a very
clear and simple API, should be able to support all the UIs and
interactive languages in a straightforward manner.
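
As a rough illustration of what I mean by a "vanilla-variable" interface,
here is a minimal sketch.  All of the names (corelib_solve_tridiag and the
CORELIB_* codes) are made up for this example, and a tridiagonal solve
stands in for the real sparse solver purely to keep it short.  The point
is only that the entry point takes ints and plain arrays of doubles and
returns an int status, which is about the easiest thing there is to call
from a tty front end, a GUI callback, or a foreign-language wrapper.

```c
/* Hypothetical library interface: plain C types only, no globals. */
#define CORELIB_OK        0
#define CORELIB_EBADARG   1
#define CORELIB_ESINGULAR 2

/*
 * Solve a tridiagonal system A x = rhs in place (Thomas algorithm,
 * no pivoting, so it assumes a well-conditioned system).
 *   n      number of unknowns
 *   lower  sub-diagonal   (lower[0] is unused)
 *   diag   main diagonal  (overwritten during elimination)
 *   upper  super-diagonal (upper[n-1] is unused)
 *   rhs    right-hand side on entry, solution x on exit
 * Returns CORELIB_OK or one of the error codes above.
 */
int corelib_solve_tridiag(int n, const double *lower, double *diag,
                          const double *upper, double *rhs)
{
    int i;

    if (n <= 0 || !lower || !diag || !upper || !rhs)
        return CORELIB_EBADARG;

    /* Forward elimination: remove the sub-diagonal row by row. */
    for (i = 1; i < n; i++) {
        double m;
        if (diag[i - 1] == 0.0)
            return CORELIB_ESINGULAR;
        m = lower[i] / diag[i - 1];
        diag[i] -= m * upper[i - 1];
        rhs[i]  -= m * rhs[i - 1];
    }
    if (diag[n - 1] == 0.0)
        return CORELIB_ESINGULAR;

    /* Back substitution: the solution overwrites rhs. */
    rhs[n - 1] /= diag[n - 1];
    for (i = n - 2; i >= 0; i--)
        rhs[i] = (rhs[i] - upper[i] * rhs[i + 1]) / diag[i];

    return CORELIB_OK;
}
```

In a real project the three #defines and the prototype would live in the
library's own header and the body in the library proper; they are shown
together here only so the sketch compiles on its own.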

That's actually what I'd recommend if you really want UI flexibility and
code maintainability.  The latter is a very important consideration that
hasn't been touched on yet.  If you actually WRITE your application
integrated with a complex top level language/tool like perl, python, or
java, then maintaining it becomes much more complicated (as I've learned
very much the hard way, alas). If the basic sparse matrix routine
library changes, it will likely lead to emergent bugs.  If your C/C++
encapsulation of that routine changes or your application goals change
over time along with your core code, it's another thing to debug.  If
java (or any other UI base) changes, it's yet another thing to debug.
If you get enough layers (and languages and data interfaces) in there,
running down bugs and deciding "whose fault" they are gets to be really
quite difficult.  Fault aside, just finding them and fixing them can
become painful in the extreme.

For that reason I think it is a really good idea to have as few distinct
layers as possible to work with and to SEPARATE those so that they can
be separately debugged with a clean layer (API) in between.  If your
program is wrapped up in a library with a very simple C tty/ascii UI,
you control it all and can be pretty certain that any errors you
encounter are in YOUR code in ONE language and ONE data representation.
In other words, you have a decent chance of efficiently debugging
things.

OTOH if the only way you observe the failure is by accessing your core
routines through a big, complicated GUI written in an entirely different
language with its own data representation and with all sorts of stuff
happening at the callback level and with multiple layers of event loops
or even multiple threads (GUI-based programs have the nasty habit of
blocking when they are in their core work loops UNLESS they are written
with multiple threads, and multiple threads of course are far more
complex and enable far more subtle bugs to surface) then you're looking
at a LOT more work to debug any problems.  I personally would rather
tattoo the windows logo on my left bicep with a dull needle and food
coloring than mess with it at all.

Worse still, if you write a GUI-based program WITHOUT a relatively clean
API between the UI part and the work part and have NO other UI or
encapsulation to work on just the actual work routines, then god help
you if you ever have to debug the code OR rework the GUI.  Even "simple"
stuff like adding a new graphical display of some result can become
nightmarish if the UI and operational code are all entangled together.

So even if you ultimately want to integrate with java, with R, with
perl, or write a native GUI for some platform or another, I'd strongly
suggest writing your core code as a de facto library with its own
#include files that define the shared interface and all externally
visible data structures.  Develop this code with a simple ASCII front
end -- basically a command line parser to input program parameters and
perform any needed initialization, a work routine that takes the input
data and calls the core library subroutine(s) that do the required work
and produce a desired result (and does no significant work itself), and
a minimal output layer that pulls the result out of the standard
interface variables altered or returned by the library calls according
to the program API and dumps it onto either stdout or into a
command-line specified file where it can be verified for selected input
test data.
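
The three-part front end described above might look something like the
following sketch.  Everything in it is hypothetical: corelib_work() is a
trivial stand-in for the real library call, and frontend_run() is just a
name I picked.  It writes to a caller-supplied stream rather than
handling a command-line specified output file, to keep it short.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the core library (hypothetical): sum of squares 1..n. */
double corelib_work(int n)
{
    double s = 0.0;
    int i;
    for (i = 1; i <= n; i++)
        s += (double)i * (double)i;
    return s;
}

/* Input layer: turn "prog N" into a plain int, rejecting junk. */
static int parse_args(int argc, char **argv, int *n)
{
    char *end;
    long v;

    if (argc != 2)
        return -1;
    v = strtol(argv[1], &end, 10);
    if (end == argv[1] || *end != '\0' || v < 0 || v > 1000000)
        return -1;
    *n = (int)v;
    return 0;
}

/* Work layer: does no significant work itself, only calls the library. */
static double do_work(int n)
{
    return corelib_work(n);
}

/* Output layer: dump the result in a form that is easy to diff. */
static void write_result(FILE *out, int n, double result)
{
    fprintf(out, "n=%d result=%.1f\n", n, result);
}

/* The whole front end; returns 0 on success, 1 on a usage error. */
int frontend_run(int argc, char **argv, FILE *out)
{
    int n;

    if (parse_args(argc, argv, &n) != 0) {
        fprintf(stderr, "usage: %s N\n", argc > 0 ? argv[0] : "prog");
        return 1;
    }
    write_result(out, n, do_work(n));
    return 0;
}
```

A real main() would then be a one-liner that returns
frontend_run(argc, argv, stdout).  Keeping the three layers as separate
functions is what makes it cheap to swap any one of them for a GUI widget
or a wrapper-language binding later on.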

With this minimal encapsulation and debugging system, you can then do
whatever you like with the core work routines, quickly and easily.  For
example it is absolutely straightforward to replace the command line
interface with a glade-constructed set of GUI input widgets, the work
interface with a callback on a "run" button, and to add whatever kind of
output interface you like (graphical or otherwise).  Or you can fancy up
your command line interface and wrap it up in python or perl.  Or you
can modify the command line interface so that it is suitable for turning
the library calls into java or perl subroutine calls and obtaining the
input from java variables and delivering the output back to java
variables.  If you always take care to maintain your minimal C/tty UI
along with the library, you can easily isolate any problems that emerge
to JUST one layer in the initialization, execution, postprocessing,
presentation sequence, in particular keeping the execution part isolated
from the rest (which are more likely to be tied to some particular UI
environment with quirks, an API and data representation, and even a
language of its own).

    rgb

>
>
> Robert G. Brown wrote:
>> On Sun, 20 Aug 2006, Joe Landman wrote:
>>
>>> Jonathan:
>>>
>>> Jonathan Ennis-King wrote:
>>>
>>>> Does anyone have experience writing parallel Java code (using MPI) with
>>>> calls to C libraries which also use MPI? Is this possible/sensible? Is
>>>> there a big performance hit relative to doing the same in C++?
>>>
>>>
>>> Unless all of the important optimizable calculation is done in libraries
>>> that you are stitching together with Java glue, the compiled languages
>>> are likely to be quite a bit faster.
>>>
>>> There is a sizeable abstraction penalty associated with OO languages.
>>> Many of the design patterns that they encourage (object factories,
>>> inheritance chains, etc) are anathema to high performance.
>>
>>
>> Hear, hear!
>>
>>>> I'm considering writing some parallel code to do fluid flow in porous
>>>> media, the heart of which is solving systems of sparse linear equations.
>>>> There are some good libraries in C which provide the parallel solver
>>>> (e.g. PETSC), but I'm trying to resolve which language to use for my
>>>> code. The choice is between C++ and Java, and although I'm favouring
>>>> Java at present, I'm not sure about its performance in this context.
>>>
>>>
>>> Hmmm.  For this, C or Fortran may be far more appropriate.  Depends upon
>>> what it is you want to do with the code.  High performance using MPI
>>> depends upon many factors.  If there is one particular part of the code
>>> that is better served by an OO based language, then I might suggest
>>> designing/implementing all the speed sensitive bits in a language which
>>> lets you achieve high performance, and then interfacing them to your OO
>>> language so that the OO system isn't being used for the critical time
>>> sensitive portions.
>>
>>
>> <disclaimer>Parts of the stuff below are editorial comment and religious
>> belief and can be ignored or sniffed at by those of differing
>> belief.</disclaimer>
>>
>> Remember well the observation that you can write object oriented code in
>> a procedural language (and ditto, you can write procedural code in an OO
>> language).  Matching the language to the kind of code -- or more
>> likely, the personal taste of the coder -- simply makes development a
>> bit more simple and natural.
>>
>> Ultimately, OO vs procedural code is a matter of style as much as
>> anything else.  I write "real" code exclusively in C.  I'm in the
>> process of (re)writing a random number testing program (dieharder) into
>> a library-based tool that was originally (first pass) quite procedural
>> in its design.  In the second pass, as I came to fully understand the
>> data objects better in practice and could start to see how the code
>> could be simplified and compressed, I began to introduce a set of "lazy"
>> shared objects for certain parts of the code.
>>
>> In the third (current) pass I'm splitting off all of the actual testing
>> code, as opposed to the startup/results/presentation UI code, into a
>> library.  Since most of the tests share a very similar implementation
>> structure and certain control variables in common, I can now see
>> precisely how to make the code very object oriented with a set of "test
>> objects" (structs and similarly structured test implementations that
>> read from them and fill them in) and a single set of "shell" code for
>> calling a standard test.  This reduces writing a UI to nothing but
>> simple, repetitive boilerplate for calling the actual tests and
>> displaying the returned results -- one can focus on the human side of
>> the UI and stop worrying about the tests, and one can relatively easily
>> and scalably add more tests or RNGs to test.
>>
>> Since the code is still both lazy OO and C, I can freely intersperse the
>> use of pointers, can choose to treat variables (including all
>> structs/objects) as "opaque" or not as makes sense in the code, and keep
>> the code as efficient as C can make it, which is to say damn near as
>> efficient as assembler.  The "objectness" of the encapsulated tests just
>> permits me to write a relatively clean API to the library (without too
>> many test specific global/shared variables or the even greater hassle of
>> dealing with passing variable length argument lists through layers of
>> encapsulating subroutines) so that when I'm done, adding a UI or GUI or
>> implementing the tests natively inside e.g. R or octave or whatever will
>> be fairly straightforward.
>>
>> The point being that one CAN write non-lazy OO code in C or even in
>> Fortran -- that's more a question of program design and an understanding
>> of the basic data objects that a program requires, although it certainly
>> helps if the language permits the definition of a struct of one sort or
>> another.  One has the choice in C, though, of writing fully OO, lazy
>> (mixed) OO or fully procedural code when and where that is appropriate
>> for either ease of coding or program efficiency.  I suppose that choice
>> exists to some extent for at least some non-fascist OO environments
>> (e.g. C++ as a sort-of superset of C) but I think that the only people
>> who even know how to do so are those who have learned to code in a
>> non-OO language first -- people who learn C++ as their primary language
>> tend to be pretty clueless about pointers or the performance advantages
>> of NOT using protection and inheritance in your structs but just letting
>> everything access them directly.  C provides few safety nets but rather
>> permits you to do pretty much anything you like, at your own risk, in
>> code that is ultimately transparent.
>>
>> Now, I personally believe that all nontrivial programs go through stages
>> like the three described above no matter what language they are written
>> in.  This is one of the reasons that Wirth's Pascal had its day and that
>> it passed -- whether one starts at the top or at the bottom or both, one
>> is likely to encounter mismatches that require rethinking all or part of
>> the memory hierarchy one begins with in any difficult project.  In that
>> SECOND pass and beyond, both strict-topdown and strict-bottomup
>> languages tend to require MORE work to fix than one that is less
>> hierarchically prestructured.
>>
>> Perhaps there are OO ubercoders that can just "see" what the data
>> objects appropriate to a complex application are from the beginning and
>> can start off with the right top level, mid, AND bottom level objects
>> all perfectly enmeshed and integrated but I have yet to meet one.  One
>> of the great (IMO) illusions promoted by OO fanatics is that by using an
>> OO language (per se) to write the code in the first place one can
>> somehow shorten this process and home in on the correct hierarchy of
>> data structures (objects or not) that optimally support the
>> application's efficient implementation from top to bottom.  This is not
>> my experience, but hey, the world is a big place and there may be people
>> who just think that way and for them it may be true.
>>
>> For code like the specific stuff you want to implement above, which has
>> efficient libraries written in C, my guess is that you would do best
>> using C -- this is pretty much a no-brainer.  It is highly probable that
>> in C you have the best access to example programs using the library,
>> UIs, human support in the form of others who use the libraries in their
>> C code, and more.  Even communicating with the author/maintainers of the
>> library is bound to be simplest if you are implementing in C.  Second
>> best would almost certainly be C++, as C++ can (I believe) call C
>> libraries fairly transparently or with a minimal C++ encapsulation of
>> the C prototypes and data structures.
>>
>> OTOH Fortran and C tend to have somewhat different subroutine call
>> mechanisms so binding a C library into fortran code or VV tends to be a
>> PITA -- for example, C always passes subroutine arguments by value,
>> fortran by reference.  In addition, C and fortran use slightly different
>> conventions for other simple stuff e.g. terminating a string.  Some of
>> the issues associated with the port are mentioned here:
>> http://star-www.rl.ac.uk/star/dvi/sun209.htx/node4.html as well as
>> elsewhere on the web.  Basically, calling C libraries in fortran code is
>> possible but requires some work and code encapsulation (and vice versa
>> for calling fortran routines from inside C code, IIRC -- fortran/C
>> compiler folks can check me on this:-).
>>
>> Java, octave, matlab, python, perl etc. are MUCH WORSE in this regard.
>> All require NONTRIVIAL encapsulation of the library into the interactive
>> environment.  I have never done an actual encapsulation into any of
>> them, but I'll wager that it is really quite difficult because each of
>> them has their very own internal data types that are REALLY opaque
>> objects that bear little overt resemblance to the simple "all data
>> objects can be viewed as a projection onto a block of memory with either
>> typed or pointer driven offset arithmetic" view of data in C or for that
>> matter C++ or Fortran (with slightly different projective views in both
>> cases).
>>
>> These languages typically permit you to allocate memory by just using a
>> named variable.  This is marvelously convenient for an interactive
>> environment -- it is marvelously expensive in terms of program
>> efficiency because the underlying environment has to manage allocating
>> the memory transparently and extensibly (most of the languages permit
>> you to allocate whole vectors or matrices of variables by just
>> referencing them), tracking instances of the memory in code, and freeing
>> the memory when it is no longer referenced or being used.  They do this
>> conservatively, keeping things if there is ANY CHANCE of their ever being
>> referenced, which typically makes them memory hogs almost as bad as a C
>> program would be if every memory reference in the program was to static
>> global memory -- no memory allocation or freeing at all, beyond whatever
>> goes on stack/heap in the course of subroutine calls or internal
>> function execution.  Complicated hashes or advanced list structures are
>> used to keep the execution itself moderately efficient (but highly
>> INefficient compared to a decent compiler with flat memory layouts).
>>
>> The point being that you have to interface these opaque and not
>> obviously documented data types to the C library calls.  This is surely
>> possible -- it is how all those perl libraries, matlab toolboxes, java
>> interfaces come about.  It will probably require that you learn WAY more
>> about how the language itself is implemented at the source level than
>> you are likely to want to know, and it is probably not going to be
>> terribly easy...
>>
>>    rgb
>>
>>>
>>>>
>>>>
>>>>   Jonathan Ennis-King
>>>>
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org
>>> To change your subscription (digest mode or unsubscribe) visit
>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>>
>>>
>>
>
>
> - --
>  Jonathan Ennis-King
>  email: Jonathan.Ennis-King at csiro.au
>  post: CSIRO Petroleum, Private Bag 10, Clayton South, Victoria, 3169,
> Australia
>  ph: +61-3-9545 8355 fax: +61-3-9545 8380
>
>
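
For what it's worth, the "test objects plus a single set of shell code"
arrangement mentioned in the quoted message above can be sketched in a
few lines of plain C.  This is NOT the actual dieharder code or API; the
struct layout, the two toy tests and every name in it are invented for
illustration, assuming samples that are supposed to lie in [0,1).

```c
#include <stddef.h>

/* A "test object": shared control variables plus slots for results. */
typedef struct test_obj {
    const char *name;      /* label a UI can display */
    size_t nsamples;       /* control variable common to all tests */
    double statistic;      /* filled in by run() */
    int passed;            /* filled in by run(): 1 = pass, 0 = fail */
    void (*run)(struct test_obj *self, const double *x);
} test_obj;

/* Toy test 1: is the sample mean close to 0.5? */
static void mean_test_run(test_obj *self, const double *x)
{
    double sum = 0.0;
    size_t i;
    for (i = 0; i < self->nsamples; i++)
        sum += x[i];
    self->statistic = sum / (double)self->nsamples;
    self->passed = self->statistic > 0.4 && self->statistic < 0.6;
}

/* Toy test 2: how many samples fall outside [0,1)? */
static void range_test_run(test_obj *self, const double *x)
{
    size_t i, bad = 0;
    for (i = 0; i < self->nsamples; i++)
        if (x[i] < 0.0 || x[i] >= 1.0)
            bad++;
    self->statistic = (double)bad;
    self->passed = (bad == 0);
}

/* Constructor-like helpers that fill in a test object. */
test_obj make_mean_test(size_t nsamples)
{
    test_obj t = { "mean", nsamples, 0.0, 0, mean_test_run };
    return t;
}

test_obj make_range_test(size_t nsamples)
{
    test_obj t = { "range", nsamples, 0.0, 0, range_test_run };
    return t;
}

/*
 * The single piece of "shell" code: it works for every test object.
 * Returns 1 (pass), 0 (fail) or -1 (bad arguments).
 */
int run_standard_test(test_obj *t, const double *x)
{
    if (t == NULL || t->run == NULL || x == NULL || t->nsamples == 0)
        return -1;
    t->run(t, x);
    return t->passed;
}
```

A UI then only has to loop over an array of test_obj, call
run_standard_test() on each, and print name, statistic and passed.
Adding a new test means adding one run function and one helper, with no
change to the shell code.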

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu