Supercomputing & The CyberInfrastructure
Thoughts on High Performance Computing: from 1989 to the present and beyond
I have been
involved in technical computing for over 40 years, starting with minicomputers
that were used by this community. This is a collection of articles,
memos, talks, and testimony covering that period.
- Paths and cul-de-sacs on the endless road to supercomputing. (2.5 MB PPT). Talk given at the John Vincent Atanasoff centenary at Iowa State University, 30 October-1 November 2003, at their Symposium on Modern Computing. This talk is substantially along the lines of my criticism of U.S. efforts to regain the title of world's fastest computer from Japan. Our approach, which uses loosely coupled, commodity-based uni- and multi-processor computers, is still not a match for the very high speed cross-point switch interconnecting shared-memory vector processors that NEC has evolved based on the "Cray" formula.
- Global Grid Forum Keynote, 25 June 2003, Seattle. (12 MB PPT) Basic messages, using six examples: enable your applications as web services now; Grid developers need to adopt users in order to understand how to create the Grid; and we see a real drive to Community- and Data-Centric Computing. This new world will center on databases and data mining, where discoveries come from analyzing large datasets. Over time, the web services programming model may become so pervasive that these services will be the way to communicate within the computer. Furthermore, web services begin to open up the 20-year promise of distributed computing.
- DOE Science Computing Conference 2003, 19 June 2003, includes my talk entitled Seven Paths to High Performance (and the Petaflops) (2 MByte PDF). Evolution is the most likely path. Any idea of a magic bullet that will outpace Moore's Law is foolhardy. The DARPA HPC program is certain to require more than six years to achieve any significant performance gain. The tried and true vector architecture with tight coupling among nodes must remain a component, as the Japanese have shown in their use at NEC and Fujitsu, especially with their compilers. Finally, we must go back and work on the programming environment for clusters, including both the compilers and run-time systems.
- International Conference on Computational Science (ICCS2003) presentation (PPT and PDF), "Progress, Prizes, and Community-centric Computing," Melbourne, 2 June 2003. The presentation has three parts: the history of the Gordon Bell Prize and the computers that have enabled the winners; a very brief look at the misguided efforts of GRID computing to supply more operations, as epitomized by the TeraGrid project; and a strong plea for Community- and data-centric computing versus Centers-centric computing. In this regard, planning has to revert to the various scientific communities that have the need. Whether such communities are capable of planning and operating as a community is a key question. However, the SkyServer that the astronomy community constructed provides a model, as does NCAR.
- Testimony to the NRC/CSTB Committee on High Performance Computing, 22 May 2003. (1 MB PPT).
- Interview in IEEE Software Engineering, July 1987, discussing NSF research needs aimed at parallelism while I was Assistant Director for the Computer and Information Science and Engineering (CISE) Directorate. Also, the Gordon Bell Prize was posited. I stated: "Our goal is obtaining a factor of 100 in the performance of computing, not counting vectors, within the decade and a factor of 10 within five years. I think 10 will be easy because it is inherently there in most applications right now. The hardware will clearly be there if the software can support it or the users can use it. Many researchers think this goal is aiming too low. They think it should be a factor of 1 million within 15 years. However, I am skeptical that anything more than our goal will be too difficult in this time period. Still, a factor of 1 million may be possible through the SIMD approach. The reasoning behind the NSF goals is that we have parallel machines now and on the near horizon that can actually achieve these levels of performance. Virtually all new computer systems support parallelism in some form (such as vector processing or clusters of computers). However, this quiet revolution demands a major update of computer science, from textbooks and curriculum to applications research."
- Funding_Alternatives_for_NSF_Supercomputing_Centers_870821 (Word, HTML). A memo written as the founding Assistant Director of NSF's CISE (Computer and Information Science and Engineering) directorate. It outlines the basic problem of how funds are allocated to form computing centers. It should be noted that, as of 2003, nothing had fundamentally changed in this allocation process.
- CACM_Future_HPC_8909 article
positing the race to the Teraflops.
- CACM_Ultracomputers_A_Teraflop_Before_Its_Time_9208 argues that, given the state of software standards and the ability to exploit the peak performance of large, parallel computers, spending over $100 million for a computer is not prudent. Just wait for Moore's Law to provide a less expensive machine later, and invest instead in software tools and research.
- A policy for how to conduct computing research, entitled GB_Report_Policy-Best Computer_R&D_support_1994-1995.doc, was written in 1994 based on observing the results of numerous computing research projects, nearly all of which failed.
- NRC_Brooks-Sutherland_HPC_Supercomputing_Review_Panel_Talk_940311 is a talk to a panel chaired by Fred Brooks and Ivan Sutherland that I helped form to try to effect a change in the supercomputing scene. Another supportive supercomputing report was issued.
- IEEE_Parallel_and_Dist_Proc_Why_there_won't_be_MPP_Apps_9409
discusses the search for a programming model to enable portable
applications.
- CACM_Observations_on_Supercomputing_Alternatives_9603 asks: having reached a Teraflops, what will be the approach to get to a Petaflops?
- IEEE_Proc_DSM_Perspective_Another_Point_of_View_9903 is a perspective on Distributed Shared Memory architectures that I felt were essential for programmability, but were unattainable because clustered commodity computers, a.k.a. multicomputers, were driving them out.
- The Next Ten Years in Supercomputing, SC99CD PowerPoint (2 MB), keynote talk given at the 14th Mannheim Supercomputer Conference, 10 June 1999; Supercomputing 99 Conf. Proc. CD. Click to view the talk with a browser.
- CACM_What's_Next_in_HPC_Bell_&_Gray_0202: for the time being, it is all over: clustered Beowulfs using the Intel architecture are "the way". The good: everyone can buy and build their own, and we have a common platform that enables apps. The bad: programs that exploit the inherent performance of large clusters to deliver over 10% of peak performance are still elusive.
- Petascale Computational Systems: Balanced CyberInfrastructure in a Data-Centric World (pdf). Response to an NSF invitation regarding the CyberInfrastructure, October 2005, by Alex Szalay (Johns Hopkins), Jim Gray, and myself. Abstract: Computational science is becoming data intensive. NSF should support balanced systems, not just CPU farms but also petascale I/O and networking. NSF should allocate resources to support a balanced Tier-1 through Tier-3 national cyber-infrastructure.
- Department of Energy ASCAC (Advanced Scientific Computing Advisory Committee) Panel Report on Petascale Performance Metrics for Centers and Projects, submitted to Dr. Ray Orbach, the Under Secretary of Energy for Science, on 27 February 2007 by panelists F. Ronald Bailey, Gordon Bell (chair), John Blondin, John Connolly, David Dean, Peter Freeman, James Hack (co-chair), Steven Pieper, Douglass Post, and Steven Wolff. The 10 appendices from presentations to the performance panel, 2006-07-23 (35.5 Mbytes).
Historical
- History of Supercomputers PowerPoint talk (PDF) and a one-hour video (.wmv) of the talk, given at Lawrence Livermore National Laboratory on 24 April 2013. Abstract: My first visit to Livermore in 1961 to see the LARC, recalling the elegance of the 6600, and simply observing the evolution of this computer class have been high points of my life as an engineer and computing observer. Throughout their early evolution, supercomputer architecture “trickled down” in various ways for use in other computers. In the mid 1990s the flow reversed, when large computers became scalable and were constructed from clusters of microprocessors. Unlike the two paths of Bell’s Law that account for the birth, evolution, and death of other computer classes, e.g. minicomputers (http://ieeeghn.org/wiki/index.php/STARS:Rise_and_Fall_of_Minicomputers), supercomputers have doubled in performance every year for the last 50 years just by building larger structures. While computer performance is the first-order term to track their high performance, many other factors, e.g. FORTRAN, LINPACK, government funding policy, and applications, have contributed to the extraordinary progress. This talk traces the trajectory of, and contributors to, this exciting class.
- My perspectives on Seymour Cray's contributions to computing. Talk and script in html and PowerPoint. This talk was presented at the University of Minnesota Cray Lecture Series, 10 November 1997. The html presentation contains about 90 slides, with diagrams and photos of all of the "Cray" computers. The PowerPoint slides with the text of the talk are about 2 Mbytes and can be downloaded for reading.
- Supercomputing-A_Brief_History_1965_2002 (.doc, html) is an expanded draft of the article by Jim Gray and myself, but focuses more on the history and especially traces the evolution of the "Cray" brand: starting with Seymour Cray at Univac, going on to help form CDC (1957), forming Cray Research (1972) where he created the vector supercomputer, then moving on to Cray Computer Corp. and SRC, with the brand finally bought by Tera, which renamed itself Cray Inc. This rather tragic trajectory shows how government policy helped wipe out the U.S. supercomputing industry, a.k.a. Cray, and on the other hand enabled NEC to provide the highest-performance computer, measured in real application performance (RAP), in the 2002-2005 time frame.
- A 1984 Critique of an NRC/OSTP Plan for Numerical Computing was written, and conditions are virtually unchanged. The critique starts: "I believe the report greatly underestimates the position and underlying strength of the Japanese in regard to Supercomputers. The report fails to make a substantive case about the U. S. position, based on actual data in all the technologies from chips (where the Japanese clearly lead) to software engineering productivity." Note that a set of heuristics is given for managing such an effort.