468 EVOLUTION OF COMPUTER BUILDING BLOCKS

of structure and multiprocessors is that all the ALUs in the multi-ALU processor support a single instruction stream, as shown in Figure 4, while each of the processors in the multiprocessor supports its own instruction stream.

Figure 4. Multi-ALU processor.

If we define a processor to be a unit capable of both decoding and executing instructions, then the multi-ALU processor is not really a multiple-processor system. However, multi-ALU organizations are often considered as alternatives to multiprocessors and derive the same benefits from advances in LSI technology as multiprocessors.

A number of well-known computer systems fall into the multi-ALU category. Classical examples include the CDC 6600, with its ten functional units (specialized ALUs), and the IBM 360/91, with independent and pipelined floating-point add/subtract and multiply/divide units. Array or vector processors such as ILLIAC IV and CRAY-1 also fall into this category, but use a specialized vector instruction stream to direct the execution of an array of arithmetic units or a highly pipelined arithmetic unit.

Comparing Alternative Multiple Processor Structures

Networks, multiprocessors, and multi-ALU computers have been presented as three generic methods of organizing processors to build highly parallel computer systems. The three classes can be thought of as varying along a single dimension: the degree of coupling between processors in the system. This term is often used in a general way, but let us define it as the worst-case processor's minimum access time to a global data structure in the system. For example, in the computer network of Figure 3, the minimum data access time for a processor is the access time to local memory. Assuming that the global data structure in this particular network resides in the primary memory of computer 1, an access to global data by computer 1 would take a single memory fetch (on the order of 1 microsecond), while computer 5 would have to send a message to computer 1 requesting the necessary information (on the order of 50 milliseconds). However, the worst-case access time is seen by computer 4, which must access the data in computer 1 via a three-hop sequence involving computers 3 and 2, and this might take more than 100 milliseconds.
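The definition of coupling given above can be made concrete with a small calculation. The following is a minimal sketch, not a model of any real network. It uses only the links the example describes (5 to 1, and 4 to 3 to 2 to 1), and it assumes that every message hop costs the same 50 milliseconds. The names `LINKS`, `HOP_S`, `LOCAL_FETCH_S`, `hops_from`, `access_times`, and `coupling` are hypothetical, chosen for illustration.

```python
from collections import deque

# Links taken from the example in the text: 5-1, 2-1, 3-2, 4-3.
LINKS = [(1, 2), (2, 3), (3, 4), (1, 5)]
LOCAL_FETCH_S = 1e-6   # one local memory fetch, on the order of 1 microsecond
HOP_S = 50e-3          # ASSUMPTION: each message hop costs about 50 milliseconds


def hops_from(home, links):
    """Breadth-first search: hop count from `home` to every reachable computer."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    dist = {home: 0}
    queue = deque([home])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def access_times(home, links):
    """Minimum access time (seconds) to the global data for each computer."""
    return {c: LOCAL_FETCH_S if h == 0 else h * HOP_S
            for c, h in hops_from(home, links).items()}


def coupling(home, links):
    """Worst-case processor's minimum access time, and which processor sees it."""
    times = access_times(home, links)
    worst = max(times, key=times.get)
    return worst, times[worst]


# Global data structure resides in the primary memory of computer 1.
worst, seconds = coupling(home=1, links=LINKS)
print(f"computer {worst}: {seconds * 1e3:.0f} ms")
```

Under these assumptions the worst case falls on computer 4 at roughly 150 milliseconds, which agrees with the estimate of more than 100 milliseconds for the three-hop path. Changing `HOP_S` or the link list shows how the coupling figure depends on the slowest path rather than on the typical one.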

In a multiprocessor, each processor has direct access to global data stored in primary memory. Since interprocessor communication occurs by sharing primary memory, the interaction times are on the order of 1 to 50 microseconds. In a multi-ALU computer, the analog of interprocessor communication is the transfer of control information that occurs between the control unit and its associated processing elements. Typically, this information is transferred over direct control lines and does not involve memory fetches, making it considerably faster than interprocessor communication in a multiprocessor.
