Benchmarks
November 17, 2010
Benchmarks are ubiquitous. Starting at a very personal level, there's your Body Mass Index (BMI), your SAT score, and possibly your GRE, among other post-graduate admission tests. Corporations have measures such as return-on-assets (ROA), and countries have their gross domestic product (GDP). Academics have their infamous citation analysis, which puts a new spin on the "publish or perish" paradigm. Supercomputers have benchmarks associated with two standard software libraries, the Fortran programs known as LINPACK and LAPACK.
LINPACK, a contraction of "linear algebra package," is a software library for doing linear algebra. It was written in Fortran, the language of choice for scientific computation, in the late 1970s. It was ported from Fortran 77 to Fortran 90, and it provides the essentials for mathematical analysis. There is a benchmark based on LINPACK that assesses the floating point performance of a computer using a type of linear algebra problem that's common in engineering. The resultant metric is floating point operations per second (FLOPS). As computer architecture has evolved to include features such as huge cache memories, LINPACK has been superseded by LAPACK, which runs much faster on the newer architectures. LAPACK contains all the essentials, including least-squares analysis and many matrix operations, for both real and complex numbers in single and double precision. The LINPACK and LAPACK benchmarks have been used to rate supercomputer speed for decades, and there's a web site, TOP500, that publishes a list of the 500 fastest computers based on a LINPACK derivative called hpl (High Performance Linpack).[1] At this writing, a Tianhe-1A system at China's National Supercomputer Center in Tianjin holds the record at 2.57 petaflops.[1] It runs Linux, of course.
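The essence of the LINPACK benchmark is easy to sketch: time the solution of a dense n-by-n linear system Ax = b, then convert the elapsed time to FLOPS using the standard operation count of (2/3)n^3 + 2n^2. Here is a minimal, illustrative sketch in Python with NumPy; it is not the tuned, distributed hpl code that TOP500 rankings actually use.

    import time
    import numpy as np

    def linpack_style_flops(n=2000, seed=0):
        """Time the solution of a dense n x n system Ax = b and
        report performance in FLOPS, LINPACK-style."""
        rng = np.random.default_rng(seed)
        A = rng.standard_normal((n, n))
        b = rng.standard_normal(n)

        t0 = time.perf_counter()
        x = np.linalg.solve(A, b)    # LU factorization plus triangular solves
        elapsed = time.perf_counter() - t0

        flop_count = (2.0 / 3.0) * n**3 + 2.0 * n**2   # standard LINPACK count
        residual = float(np.linalg.norm(A @ x - b))    # sanity check on the answer
        return flop_count / elapsed, residual

    gflops, resid = linpack_style_flops()
    print(f"{gflops / 1e9:.2f} GFLOPS, residual {resid:.2e}")

Even this toy version illustrates the benchmark's character: it is almost pure floating point arithmetic on dense data, exactly the regime the Graph500 proposal below argues is not representative of every workload.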
The Cray-1, an early commercial supercomputer, was hot stuff in more ways than one when it was introduced in 1976: Freon was needed as a coolant for its power-hungry ECL logic chips. The Cray-1's floating point performance is estimated at about 80 MFLOPS. As can be seen from the ratio of 2.57 PFLOPS to 80 MFLOPS (a factor of 32 million), times have changed, so much so that Sandia National Laboratories has proposed a new supercomputer rating system called Graph500. If all goes according to schedule, it will be announced today at the Supercomputing Conference 2010 in New Orleans.[3]
The purpose of this new benchmark is to rank computers according to their ability to analyze large, graph-based structures that occur outside the physics realm of typical scientific calculations. These calculations forge relationships between data points in such areas as biological and social systems. The Graph500 benchmark is intended as a reminder to supercomputer vendors that their machines are used for more than just physics. Graph500 got its start in informal conversations at the Supercomputing Conference in 2009, and it has become an international effort by more than thirty supercomputing professionals, with a website at www.graph500.org. Richard Murphy of Sandia says that the idea isn't to replace LINPACK/LAPACK but to have a complementary test that addresses these other areas of science. One potential problem is that supercomputer manufacturers may ignore a new test, especially if the results on current machine architectures are not that good. Murphy says that Intel, IBM, Advanced Micro Devices, and NVIDIA have shown interest. NVIDIA GPU chips are used in the 2.57 petaflops Chinese supercomputer.[1]
Figure: Synthetic graph generated by Kronecker multiplication, a small-scale version of one of the Graph500 tests. (Sandia National Laboratories/Jeremiah Willcock, Indiana University)
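Kronecker multiplication builds such graphs recursively: each edge is placed by repeatedly choosing one quadrant of the adjacency matrix according to a small table of probabilities, which yields the skewed, self-similar connectivity of real networks. A toy sketch of the idea in Python follows; the probabilities (0.57, 0.19, 0.19, 0.05) and the edge factor of 16 are the parameters commonly quoted for the Graph500 generator, but treat this as an illustration rather than the official reference code.

    import numpy as np

    def rmat_edges(scale=12, edge_factor=16,
                   probs=(0.57, 0.19, 0.19, 0.05), seed=0):
        """Sample edges of a synthetic Kronecker (R-MAT) graph by
        recursive quadrant selection; scale is log2 of the vertex count."""
        rng = np.random.default_rng(seed)
        a, b, c, d = probs
        n_edges = edge_factor * (1 << scale)
        edges = np.zeros((n_edges, 2), dtype=np.int64)
        for e in range(n_edges):
            src = dst = 0
            for _ in range(scale):       # choose one quadrant per bit level
                r = rng.random()
                if r < a:                # top-left quadrant
                    s_bit, d_bit = 0, 0
                elif r < a + b:          # top-right quadrant
                    s_bit, d_bit = 0, 1
                elif r < a + b + c:      # bottom-left quadrant
                    s_bit, d_bit = 1, 0
                else:                    # bottom-right quadrant
                    s_bit, d_bit = 1, 1
                src = (src << 1) | s_bit
                dst = (dst << 1) | d_bit
            edges[e] = src, dst
        return edges

    edges = rmat_edges(scale=12)         # 4,096 vertices, 65,536 edges

Because the top-left quadrant is heavily favored, low-numbered vertices accumulate most of the edges, mimicking the hubs seen in social and biological networks.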
Where can we apply Graph500 supercomputers? The Graph500 realm is the analysis of large, sparse datasets where searching is important and computation is only secondary. As a consequence, fast random access to huge memories is essential. In short, exascale computing requires exascale memory. Sandia lists the following applications (a sketch of the search kernel that such workloads stress appears after the list):[2]
• Cybersecurity. Some organizations may create 15 billion log entries per day.
• Medical informatics. There are an estimated 50 million patient records, with 20 to 200 records per patient, resulting in billions of individual pieces of information.
• Data enrichment. One example is maritime domain awareness, where there are hundreds of millions of individual transponders, tens of thousands of ships, and tens of millions of pieces of cargo.
• Social networks; e.g., Facebook.
• Symbolic networks. As an example, the human cortex has 25 billion neurons with about 7,000 connections each.
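Where hpl measures arithmetic, a benchmark of this kind measures traversal: run a breadth-first search from a chosen root and count traversed edges per second (TEPS) instead of FLOPS. Below is a minimal sketch reusing the Kronecker edge list generated above; bfs_teps is a hypothetical helper, and the real benchmark runs many searches from random roots on distributed machines.

    import time
    from collections import deque, defaultdict

    def bfs_teps(edges, root=0):
        """Breadth-first search from root; returns traversed edges
        per second (TEPS), the figure of merit for graph search."""
        adj = defaultdict(list)
        for s, d in edges:               # build an undirected adjacency list
            adj[s].append(d)
            adj[d].append(s)

        t0 = time.perf_counter()
        parent = {root: root}
        frontier = deque([root])
        traversed = 0
        while frontier:
            v = frontier.popleft()
            for w in adj[v]:
                traversed += 1           # count every scanned edge
                if w not in parent:
                    parent[w] = v
                    frontier.append(w)
        elapsed = time.perf_counter() - t0
        return traversed / elapsed

    # Reusing the rmat_edges() sketch from above:
    print(f"{bfs_teps(rmat_edges(scale=12)) / 1e6:.2f} MTEPS")

Note how little arithmetic the kernel contains; its speed is dominated by pointer chasing through memory, which is exactly why fast random access to huge memories, rather than floating point throughput, is the scarce resource here.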
As present examples, Murphy says that "There's been good graph-based analysis of pandemic flu. Facebook shows tremendous social science implications. Economic modeling this way shows promise. Many of us on the steering committee believe that these kinds of problems have the potential to eclipse traditional physics-based HPC [high performance computing] over the next decade."[2]
References:
1. Top 500 Supercomputer Sites.
2. Neal Singer, "New standard proposed for supercomputing," Sandia News Release, November 15, 2010.
3. Richard Murphy, David Bader and Marc Snir, "Unveiling the First Graph 500 List," Supercomputing Conference 2010 (New Orleans), November 17, 2010.
4. John Markoff, "Technology; Measuring How Fast Computers Really Are," New York Times, September 22, 1991.