[Beowulf] DARPA issues 20 MUSD grant to nVidia to go from 1 GFLOPS/Watt to 75 GFLOPS/Watt

Eugen Leitl eugen at leitl.org
Mon Dec 17 05:21:00 PST 2012


http://www.networkworld.com/community/blog/darpa-awards-20m-nvidia-stretch-achilles-heel-advanced-computing-power


DARPA awards $20M to Nvidia to stretch "Achilles Heel" of advanced computing:
Power

Processor-maker Nvidia will get a chance to boost chip power efficiency from
1 GFLOPS/watt to 75 GFLOPS/watt

By Layer 8 on Fri, 12/14/12 - 2:43pm.


Nvidia said this week it got a contract worth up to $20 million from
the Defense Advanced Research Projects Agency (DARPA) to develop chips for
sensor systems that could boost power efficiency from today's 1 GFLOPS/watt
to 75 GFLOPS/watt.
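
To put the two efficiency figures in perspective, here is a minimal sketch of
the arithmetic. The 1 TFLOPS workload is a hypothetical number chosen purely
for illustration and does not come from the article; the function names are
likewise made up for this example.

```python
# Efficiency figures quoted in the article.
CURRENT_GFLOPS_PER_WATT = 1.0
TARGET_GFLOPS_PER_WATT = 75.0


def watts_needed(workload_gflops, gflops_per_watt):
    """Power in watts needed to sustain a workload at a given efficiency."""
    return workload_gflops / gflops_per_watt


def improvement_factor(current, target):
    """How many times more work is done per watt at the target efficiency."""
    return target / current


if __name__ == "__main__":
    workload = 1000.0  # hypothetical 1 TFLOPS workload (assumption, not from the article)
    print("improvement factor:", improvement_factor(CURRENT_GFLOPS_PER_WATT,
                                                    TARGET_GFLOPS_PER_WATT))
    print("watts today:", watts_needed(workload, CURRENT_GFLOPS_PER_WATT))
    print("watts at target:", watts_needed(workload, TARGET_GFLOPS_PER_WATT))
```

Under that assumption, the same workload that draws about a kilowatt today
would draw roughly 13 watts at the target efficiency.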

The five-year award was made under DARPA's Power Efficiency Revolution For
Embedded Computing Technologies (PERFECT) program, which aims to produce a
revolutionary approach to processing power efficiency, which has become the
Achilles Heel of increased computational capability.

From DARPA: "This approach includes near threshold voltage operation, massive
heterogeneous processing concurrency, and novel architectural developments
combined with techniques to effectively utilize the resulting concurrency and
tolerate the resulting increased rate of soft errors.  The PERFECT program
will leverage anticipated industry fabrication geometry advances to 7 nm.
Research and development will specifically address embedded systems
processing power efficiencies and performance, and are not concerned with
developments that focus on exascale processing issues.  No operational
hardware is to be built in this program, instead a simulation capability will
be developed that will measure and demonstrate progress."

In the past, computing systems could rely on increasing computing performance
with each processor generation.  Following Moore's Law, each generation
brought with it double the number of transistors.  And according to Dennard
scaling, clock speed could increase 40% each generation without increasing
power density.  This allowed increased performance without the penalty of
increased power, DARPA stated.
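
The relationship described above can be sketched with the textbook form of
ideal Dennard scaling, in which linear dimensions and supply voltage shrink by
a factor k while frequency rises by k. This is an illustrative model, not
something taken from the DARPA text; the function name and the choice of
k = 1.4 (matching the quoted 40%) are assumptions.

```python
def dennard_generation(k=1.4):
    """Relative changes after one process generation under ideal Dennard scaling.

    k is the linear scaling factor (assumed; 1.4 matches the article's 40%).
    All returned values are ratios relative to the previous generation.
    """
    capacitance = 1.0 / k   # gate capacitance shrinks with linear dimensions
    voltage = 1.0 / k       # supply voltage is scaled down by the same factor
    frequency = k           # smaller transistors switch faster
    area = 1.0 / k ** 2     # each transistor occupies less area

    # Dynamic switching power per transistor is proportional to C * V^2 * f.
    power_per_transistor = capacitance * voltage ** 2 * frequency

    return {
        "transistors_per_area": 1.0 / area,
        "frequency": frequency,
        "power_per_transistor": power_per_transistor,
        "power_density": power_per_transistor / area,
    }


if __name__ == "__main__":
    for name, ratio in dennard_generation().items():
        print(f"{name}: {ratio:.2f}x")
```

With k = 1.4 the transistor count per unit area roughly doubles and the clock
rises 40%, while power per transistor falls by the same factor as area, so
power density stays flat. Once supply voltage can no longer be scaled down,
that cancellation stops working.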

DARPA said PERFECT system development will address five areas including:

    Architecture: to address hardware and software power efficiency
innovation and development. Example areas anticipated for development include
near threshold voltage operation, heterogeneous massive concurrent
architectural approaches, and novel hardware architectural approaches such as
new memory hierarchies, application-specialized cores, and data movement
minimizing techniques. At the software level, the goal is to develop
technologies and techniques that tolerate and exploit new hardware
capabilities and overcome the associated limitations. This specifically will
include addressing concurrency and reliability.

    Concurrency: This element will address the hardware and software to
support high levels of concurrency - thousands to millions of concurrent
execution streams. Hardware efforts in this area may include processing cores
and data stores of varying capabilities and efficiencies and perhaps include
automatically synthesized processing elements that are optimized for the
embedded platform's workload. Software efforts in this area may include
language development or augmentation, compilers, and support software to
specify and manage concurrent threads.

    Resiliency: Will focus on the issue of soft errors. Such errors are
expected to increase for near threshold voltage operation.

    Locality: Will focus on minimizing run-time data communication by
managing data location and availability. In particular, the memory hierarchy
and the software to manage data are included. Languages and language
annotations that enable programmer control of data allocation as well as
automatic control of data allocation will be investigated.

    Algorithms: Software techniques to minimize energy consumption. In
addition, algorithmic approaches to enable the tolerance of hardware faults
will be investigated, at both the kernel and the system levels.
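
As one illustration of what an algorithmic approach to tolerating hardware
faults can look like, here is a minimal sketch of checksum-based fault
detection for a matrix-vector product. It is a generic, well-known technique
and is not described in the DARPA announcement as part of PERFECT; all
function names and the `inject_fault` parameter are hypothetical.

```python
def matvec(a, x):
    """Plain matrix-vector product y = A x for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def column_checksum(a):
    """Sum each column of A; the result dotted with x must equal sum(A x)."""
    return [sum(col) for col in zip(*a)]


def checked_matvec(a, x, inject_fault=None, tol=1e-9):
    """Compute y = A x and verify it against an independent checksum.

    inject_fault is an optional (index, delta) pair that corrupts one output
    element, standing in for a soft error (for demonstration only).
    Returns (y, ok) where ok is False if the checksum does not match.
    """
    y = matvec(a, x)
    if inject_fault is not None:
        index, delta = inject_fault
        y[index] += delta  # simulate a corrupted result element

    expected = sum(c * x_j for c, x_j in zip(column_checksum(a), x))
    ok = abs(sum(y) - expected) <= tol * max(1.0, abs(expected))
    return y, ok


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    x = [5, 6]
    print(checked_matvec(a, x))
    print(checked_matvec(a, x, inject_fault=(0, 1)))
```

The check costs one extra dot product per multiply and detects a corrupted
output element. It does not locate or correct the error; a real scheme would
add recomputation or a second checksum for that.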

Nvidia calls its program Project Osprey, and the company says it will research
low-power circuits and extremely efficient architectures and programming
systems that enable 75 gigaflops per watt, using process technologies as
advanced as 7 nanometer (nm) compared with today's 28-nm process.

Project Osprey will utilize the company's heterogeneous computing and
parallel processing technology, which enables more efficient processing than
traditional CPUs, the firm says.  Nvidia said its processors are used in a
wide variety of embedded applications, including automobiles made by Audi,
BMW, Tesla and Lamborghini, aircraft including the F-22 Raptor, and US Army
tanks.

 

