[Beowulf] May 1st 2013 (10 weeks) Coursera High Performance Scientific Computing

Eugen Leitl eugen at leitl.org
Wed Apr 10 07:38:18 PDT 2013


High Performance Scientific Computing

Randall J. LeVeque

Programming-oriented course on effectively using modern computers to solve
scientific computing problems arising in the physical/engineering sciences
and other fields. Provides an introduction to efficient serial and parallel
computing using Fortran 90, OpenMP, MPI, and Python, and software development
tools such as version control, Makefiles, and debugging.

Next Session: May 1st 2013 (10 weeks long)
Workload: 10-12 hours/week

About the Course

Computation and simulation are increasingly important in all aspects of
science and engineering. At the same time writing efficient computer programs
to take full advantage of current computers is becoming increasingly
difficult. Even laptops now have 4 or more processors, but using them all to
solve a single problem faster often requires rethinking the algorithm to
introduce parallelism, and then programming in a language that can express
this parallelism.  Writing efficient programs also requires some knowledge of
machine arithmetic, computer architecture, and memory hierarchies.
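
As a rough illustration of what "rethinking the algorithm to introduce
parallelism" can mean, the sketch below (not taken from the course materials)
splits a midpoint-rule estimate of pi into independent chunks of work. The
function names `partial_sum` and `estimate_pi` are made up for this example.
Threads are used only to show the decomposition: in standard CPython they will
not speed up a pure-Python loop like this one, which is one reason the course
turns to OpenMP and MPI.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(start, stop, n):
    """Midpoint-rule contribution of subintervals start..stop-1
    to the integral of 4 / (1 + x^2) over [0, 1], which equals pi."""
    h = 1.0 / n
    total = 0.0
    for i in range(start, stop):
        x = (i + 0.5) * h
        total += 4.0 / (1.0 + x * x)
    return total * h


def estimate_pi(n, workers=4):
    # Split the index range 0..n-1 into one contiguous chunk per worker.
    bounds = [n * k // workers for k in range(workers + 1)]
    chunks = list(zip(bounds[:-1], bounds[1:]))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is independent, so the chunks can be computed in any order.
        parts = pool.map(lambda c: partial_sum(c[0], c[1], n), chunks)
        # Combining the partial results is the only step that needs all of them.
        return sum(parts)


pi_serial = partial_sum(0, 100_000, 100_000)
pi_chunked = estimate_pi(100_000, workers=4)
print(pi_serial, pi_chunked)
```

The two results agree only up to rounding error, because the chunked version
adds the same terms in a different order.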

Although parallel computing will be covered, this is not a class on the most
advanced techniques for using supercomputers, which these days have tens of
thousands of processors and cost millions of dollars. Instead, the goal is to
teach tools that you can use immediately on your own laptop, desktop, or a
small cluster. Cloud computing will also be discussed, and students who don't
have a multiprocessor computer of their own will still be able to do projects
using Amazon Web Services at very low cost.

Along the way there will also be discussion of software engineering tools
such as debuggers, unit testing, Makefiles, and the use of version control
systems. After all, your time is more valuable than computer time, and a
program that runs fast is totally useless if it produces the wrong results.

High performance programming is also an important aspect of high performance
scientific computing, and so another main theme of the course is the use of
basic tools and techniques to improve your efficiency as a computational
scientist.

Course Syllabus

The use of a variety of languages and techniques will be integrated
throughout the course as much as possible, rather than taught linearly. The
topics below will be covered at an introductory level, with the goal of
learning enough to feel comfortable starting to use them in your everyday
work. Once you've reached that level, abundant resources are available on the
web to learn the more advanced features that are most relevant for you.

Working at the command line in Unix-like shells (e.g. Linux or a Mac OSX
terminal).

Version control systems, particularly git, and the use of Github and
Bitbucket repositories.

Work habits for documentation of your code and reproducibility of your
results.

Interactive Python using IPython, and the IPython Notebook.

Python scripting and its uses in scientific computing.

Subtleties of computer arithmetic that can affect program correctness.

How numbers are stored: binary vs. ASCII representations, efficient I/O.

Fortran 90, a compiled language that is widely used in scientific computing.

Makefiles for building software and checking dependencies.

The high cost of data communication.  Registers, cache, main memory, and how
this memory hierarchy affects code performance. 

OpenMP on top of Fortran for parallel programming of shared memory computers,
such as a multicore laptop.

MPI on top of Fortran for distributed memory parallel programming, such as
on a cluster.

Parallel computing in IPython.

Debuggers, unit tests, regression tests, verification and validation of
computer codes.

Graphics and visualization of computational results using Python.
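
Two of the topics listed above, the subtleties of computer arithmetic and
binary vs. ASCII storage of numbers, can be illustrated with a short Python
sketch. It is not from the course materials; it uses only the standard `math`
and `struct` modules, and the variable names are chosen for this example.

```python
import math
import struct

# 0.1, 0.2 and 0.3 have no exact binary representation, so an exact
# equality test fails even though the values agree to rounding error.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True

# Rounding errors accumulate in a naive running sum.
values = [0.1] * 10
naive = 0.0
for v in values:
    naive += v
accurate = math.fsum(values)  # tracks the lost low-order bits
print(naive == 1.0, accurate == 1.0)  # False True

# Binary storage: the 8 bytes of an IEEE double round-trip exactly.
x = 1.0 / 3.0
as_binary = struct.pack("<d", x)
round_trip_binary = struct.unpack("<d", as_binary)[0]
print(len(as_binary), round_trip_binary == x)  # 8 True

# ASCII storage with a fixed number of digits loses information.
as_text = f"{x:.6f}".encode("ascii")
round_trip_text = float(as_text)
print(as_text, round_trip_text == x)  # b'0.333333' False

# Writing enough digits (as repr does) recovers the value, at the
# cost of more bytes and a text-to-number conversion on every read.
print(float(repr(x)) == x)  # True
```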

Recommended Background

Experience writing and debugging computer programs is required, preferably
in scientific, mathematical, or statistical computing, for example in Matlab
or R. (Previous knowledge of Fortran, Python, or parallel computing languages
is not assumed.)

Students should also be comfortable with undergraduate mathematics,
particularly calculus and linear algebra, which are pervasive in scientific
computing applications. Many of the examples used in lectures and assignments
will require this background. Past exposure to numerical analysis is a plus.

All of the software used in this course is open source and freely available.
A Virtual Machine will be provided that can be used to create a Linux desktop
environment (with all of the required software pre-installed) that can be run
on any operating system using the free VirtualBox software. An Amazon Web
Services AMI will also be provided to allow doing the course work in the
cloud.

Suggested Readings

Course notes will be provided to complement lectures. The notes and slides
from lectures will also contain many references to other free resources on
the web, along with some recommended books on the topics covered.

Course Format

The class will consist of lecture videos with integrated quiz questions.
There will also be programming assignments separate from the lectures, and
optional reading material.

About the Instructor

Randall J. LeVeque

University of Washington


