A survey of problems and preliminary results concerning parallel processing and parallel processors1
Summary After an introduction which discusses the significance of a trend to the design of parallel processing systems, the paper describes some of the results obtained to date in a project which aims to develop and evaluate a unified hardware-software parallel processing computing system and the techniques for its use.
1. Multiprogramming, multiprocessing, and parallel processing
A brief review of the literature, of which a partial listing is given in the bibliography, reveals an active and growing interest in multiprogramming, multiprocessing, and parallel processing. These three terms distinguish three modes of usage and also serve to indicate a certain historical development. We cannot here attempt to trace this history in detail and so must rely on the bibliography to credit the contributions from industrial, university, and other research and development organizations.
The emergence of autonomous input-output devices first suggested [Gill, 1958] the time-sharing of the processing and peripheral units of a computing system among several jobs. Thus surplus capability that could not be applied to the processing of the leading job in a batch processing load, at any stage of the computation, could be usefully applied to successor jobs in the work load. In particular, while any computation was held up for some I/O activity, the single main processor could be used for other computation. The necessary decision-taking, scheduling, and allocation procedures were vested in a supervisor program, within which the user-jobs were embedded, and the resultant mode of operation was termed Multiprogramming.
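The essence of this multiprogramming mode, overlapping one job's I/O wait with another job's computation, can be sketched in modern terms. The following fragment is an illustration only, not part of the original paper: the job names are invented, and the supervisor's role is played here by the operating system's thread scheduler.

```python
import threading
import time

def io_bound_job(results):
    # Stands in for a job held up by an autonomous I/O device;
    # the sleep models the interval during which this job makes
    # no demand on the main processor.
    time.sleep(0.2)
    results["io"] = "record read"

def compute_bound_job(results):
    # A successor job in the work load, able to absorb the
    # surplus processing capability during the I/O wait.
    results["compute"] = sum(i * i for i in range(10_000))

results = {}
t = threading.Thread(target=io_bound_job, args=(results,))
t.start()                   # job 1 begins and blocks on "I/O"
compute_bound_job(results)  # the processor meanwhile serves job 2
t.join()                    # job 1 completes once its I/O is done
```

Both jobs finish sooner than if they were run strictly one after the other, which is precisely the surplus-capacity argument made above.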
The use of computers in on-line control situations, and for other applications giving rise to ever-more stringent reliability and availability specifications, resulted in the construction of systems including two or more central processing units [Leiner et al., 1959; Bright, 1964; Desmonde, 1964; McCullough et al., 1965]. Under normal circumstances, with all units operational, each could be assigned a specific activity within an overall control program. As a result of the multiplicity of units in such Multiprocessing Systems, failure of any one would degrade, but not immobilize, the system, since a supervisor program could re-assign activities and configure the failed unit out of the system. Subsequently, it was recognized that such systems had advantages over a single-processor system in a more general environment, with each processor in the system having a multiprogramming capability as well.
Finally, following from ideas first exploited in the Gamma 60 Computer [Dreyfus, 1958], there has come the realization that multi-instruction counter systems can speed up computation, particularly of large problems, when these may be partitioned into sections which are substantially independent of one another, and which may therefore be executed concurrently, that is, in parallel. When the several units of a multiprocessing system are utilized to process, in parallel, independent sections of a job, we exploit the macro-parallelism [Lehman, 1965] of the job, which is to be distinguished from micro-parallelism [Lehman, 1965], the relative independence of individual machine instructions, exploited in look-ahead machines. This mode of operation is termed Parallel Processing and, as in PL/I [IBM OS/360, PL/I Language Specification, Form C28-6571, p. 74], the execution of any program string is termed a Task. We note that parallel processing may, and normally will, include multiprocessing activity.
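The macro-parallelism described above, partitioning a job into substantially independent sections and executing each as a task, can be sketched as follows. This is a modern illustration under our own assumptions, not the paper's notation: the sections are partial sums over disjoint slices, and a thread pool stands in for the several processing units.

```python
from concurrent.futures import ThreadPoolExecutor

def section(data):
    # One substantially independent section of the job: a partial
    # sum over its own slice, with no dependence on the others.
    return sum(data)

job = list(range(100))
slices = [job[i:i + 25] for i in range(0, 100, 25)]

# Each slice becomes a task; since the tasks are mutually
# independent, they may be executed concurrently on the
# available processing units.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(section, slices))

total = sum(partials)  # a final sequential step combines the sections
```

Note that the combining step re-introduces a sequential dependence, so only the sectioned portion of the job benefits from the parallelism.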
2. The approach to parallel processing system design
In the previous section we indicated that the prime impetus for the development of parallel processing systems arose from their potential for high performance and reliability. These systems may operate as pools of resources organized in symmetrical classes and it is this property that promises High Availability. They also possess a great reserve of power which, when applied to a single problem with the appropriate degree of parallelism, can yield high
1Proc. IEEE, vol. 54, no. 12, pp. 1889-1901, December, 1966.