Tikalon Header

Benchmarks

November 17, 2010

Benchmarks are ubiquitous. Starting at a very personal level, there's your Body Mass Index (BMI), your SAT score, and possibly your GRE among other post-graduate admission tests. Corporations have things such as return-on-assets (ROA), and countries have their gross domestic product (GDP). Academics have their infamous citation analysis that puts a new spin on the "publish or perish" paradigm. Supercomputers have benchmarks associated with two standard software libraries, Fortran programs known as LINPACK and LAPACK.

LINPACK, an acronym for Linear Algebra Package, is a software library for doing linear algebra. It was written in Fortran, the language of choice for scientific computation, in the late 1970s. It was ported from Fortran 77 to Fortran 90, and it provides the essential routines for numerical linear algebra. There is a benchmark based on LINPACK that assesses the floating point performance of a computer by solving a dense system of linear equations, a type of problem that's common in engineering. The resultant metric is floating point operations per second (FLOPS).

As computer architecture has evolved to include features such as large cache memories, LINPACK has been superseded by LAPACK, which runs much faster on the newer architectures. LAPACK contains all the essentials, including least-squares analysis and many matrix operations, for both real and complex numbers in single and double precision. The LINPACK and LAPACK benchmarks have been used to rate supercomputer speed for decades, and there's a web site, TOP500, that publishes a list of the 500 fastest computers based on a LINPACK derivative called HPL (High Performance Linpack).[1] At this writing, the Tianhe-1A system at China's National Supercomputer Center in Tianjin holds the record at 2.57 petaflop/sec.[1] It runs Linux, of course.
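
For the curious, here is a minimal Python sketch of what a LINPACK-style measurement involves: time the solution of a dense n-by-n linear system and convert the elapsed time into FLOPS using the conventional flop count of (2/3)n^3 + 2n^2. The function name and problem size are mine, and this is only an illustration of the idea, not the official benchmark, which has strict rules about problem size and implementation.

# Toy illustration of a LINPACK-style measurement: time a dense solve of
# Ax = b and convert the elapsed time to GFLOPS. NumPy's solve() calls
# LAPACK under the hood, which fits the story above.
import time
import numpy as np

def linpack_like_gflops(n=2000, seed=0):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    b = rng.standard_normal(n)

    t0 = time.perf_counter()
    x = np.linalg.solve(A, b)                  # LU factorization + triangular solves
    elapsed = time.perf_counter() - t0

    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2    # conventional LINPACK flop count
    residual = np.linalg.norm(A @ x - b)       # sanity check on the answer
    return flops / elapsed / 1e9, residual

gflops, res = linpack_like_gflops()
print(f"~{gflops:.1f} GFLOPS (residual {res:.2e})")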

The Cray-1, an early commercial supercomputer, was hot stuff in more ways than one when it was introduced in 1976. Freon was needed as a coolant for its power-hungry ECL logic chips. The Cray-1's floating point performance is estimated to have been about 80 megaflops. As can be seen from the ratio of 2.57 petaflops to 80 megaflops (a factor of about 32 million), times have changed; so much so that Sandia National Laboratories has proposed a new supercomputer rating system called Graph500. If all goes according to schedule, it will be announced today at the Supercomputing Conference 2010 in New Orleans.[3]

The purpose of this new benchmark is to rank computers according to their ability to analyze large, graph-based structures that occur outside the physics realm of typical scientific calculations. These calculations trace relationships between data points in areas such as biological and social systems. The Graph500 benchmark is intended as a reminder to supercomputer vendors that their machines are used for more than just physics. Graph500 got its start in informal conversations at the Supercomputing Conference in 2009, and it has become an international effort of more than thirty supercomputing professionals, with a website at www.graph500.org. Richard Murphy of Sandia says that the idea isn't to replace LINPACK/LAPACK but to have a complementary test that addresses these other areas of science. One potential problem is that supercomputer manufacturers may ignore a new test, especially if their current machine architectures don't score well on it. Murphy says that Intel, IBM, Advanced Micro Devices and NVIDIA have shown interest. NVIDIA GPU chips are used in the 2.57 petaflop/sec Chinese supercomputer.[1]
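
To give a feel for the difference, here is a minimal Python sketch of the kind of kernel Graph500 exercises: a breadth-first search across a sparse graph, scored in traversed edges per second (TEPS) rather than floating point operations. The function name and the tiny example graph are mine; the real benchmark runs on generated graphs with billions of edges.

# Sketch of a Graph500-flavored kernel: breadth-first search over an
# adjacency-list graph, measured in traversed edges per second (TEPS).
# The workload is dominated by irregular memory accesses, not arithmetic.
import time
from collections import deque

def bfs_teps(adjacency, source):
    """adjacency: dict mapping vertex -> list of neighbor vertices."""
    parent = {source: source}
    frontier = deque([source])
    edges_traversed = 0

    t0 = time.perf_counter()
    while frontier:
        v = frontier.popleft()
        for w in adjacency[v]:             # irregular, cache-unfriendly accesses
            edges_traversed += 1
            if w not in parent:
                parent[w] = v
                frontier.append(w)
    elapsed = time.perf_counter() - t0
    return parent, edges_traversed / elapsed   # traversed edges per second

# Example: a tiny toy graph (Graph500 graphs have billions of edges).
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
tree, teps = bfs_teps(graph, source=0)
print(tree, f"{teps:.0f} TEPS")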

Synthetic graph generated by Kronecker multiplication, a small-scale version of one of the Graph500 tests. (Sandia National Laboratories/Jeremiah Willcock, Indiana University).
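
The "Kronecker multiplication" mentioned in the caption can be illustrated in a few lines of Python: a small initiator matrix of edge probabilities is raised to a Kronecker power, and edges are sampled from the result. The initiator values below are illustrative only, and the actual Graph500 generator uses a scalable stochastic variant that never builds the full matrix.

# Minimal sketch of Kronecker graph generation. A small initiator matrix of
# edge probabilities is Kronecker-multiplied with itself k times; edges are
# then sampled from the resulting probabilities.
import numpy as np

def kronecker_graph(initiator, k, seed=0):
    """k-fold Kronecker power of the initiator, sampled into a 0/1 adjacency matrix."""
    rng = np.random.default_rng(seed)
    probs = initiator
    for _ in range(k - 1):
        probs = np.kron(probs, initiator)          # Kronecker multiplication
    return (rng.random(probs.shape) < probs).astype(np.uint8)

# A skewed 2x2 initiator (values are illustrative, not the official Graph500
# parameters) yields the "communities within communities" structure pictured above.
initiator = np.array([[0.9, 0.5],
                      [0.5, 0.1]])
A = kronecker_graph(initiator, k=4)                # 16x16 adjacency matrix
print(A.shape, int(A.sum()), "edges")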

Where can we apply Graph500 supercomputers? The Graph500 realm is the analysis of large, sparse datasets where searching is important and computation is only secondary. As a consequence, fast random access to huge memories is essential. In short, exascale computing requires exascale memory. Sandia lists the following applications:[2]
• Cybersecurity. Some organizations may create 15 billion log entries per day.
• Medical informatics. There are an estimated 50 million patient records, with 20 to 200 records per patient, resulting in billions of individual pieces of information.
• Data enrichment. One example is maritime domain awareness, where there are hundreds of millions of individual transponders, tens of thousands of ships, and tens of millions of pieces of cargo.
• Social networks; e.g., Facebook.
• Symbolic networks. As an example, the human cortex has 25 billion neurons with about 7,000 connections each.

As present examples, Murphy says, "There's been good graph-based analysis of pandemic flu. Facebook shows tremendous social science implications. Economic modeling this way shows promise. Many of us on the steering committee believe that these kinds of problems have the potential to eclipse traditional physics-based HPC [high performance computing] over the next decade."[2]

References:

  1. Top 500 Supercomputer Sites.
  2. Neal Singer, "New standard proposed for supercomputing," Sandia News Release, November 15, 2010.
  3. Richard Murphy, David Bader and Marc Snir, "Unveiling the First Graph 500 List," Supercomputing Conference 2010 (New Orleans), November 17, 2010.
  4. John Markoff, "Technology; Measuring How Fast Computers Really Are," New York Times, September 22, 1991.