Learnings

Saturday, September 08, 2007

Computer Evolution

Summary of Chapter 2 of William Stallings's Computer Architecture book:
The first generation of computers used vacuum tubes, the second generation transistors, and the third generation integrated circuits.

Two architecture styles evolved: one followed by the IBM 7094 and later mainframes, and another used in the PDP-8 and subsequent minicomputers and PCs.

The IBM machines had Data Channels (a precursor to modern direct memory access), each with a dedicated processor for handling I/O requests and managing data transfers to one or more peripheral devices. The CPU sent a control signal to the Data Channel along with the memory location containing the I/O instructions; the Data Channel then executed the requests (i.e., transferred data into and out of main memory) and signalled the CPU when done. This greatly reduced the burden on the main CPU. Since multiple processors now contend for main memory, a multiplexor is needed to schedule access to memory.

The PDP-8 from DEC followed an omnibus architecture, in which the CPU, main memory, console controller and I/O modules were all connected to a common Omnibus. The CPU had to manage all the coordination and use of the shared bus (does this make the CPU slower?). The Omnibus had 96 separate signal paths to carry data, control and address signals. This architecture is highly flexible, allowing modules to be plugged into the bus to create various configurations.

Semiconductor Memory
Initially, computer memory was constructed from tiny magnetic rings called cores, strung on grids of fine wires suspended on a small screen. Reading from core memory was destructive, though (the data was lost when read and had to be re-written!). This changed in the 1970s, when semiconductor memory was introduced.

Microprocessors
Intel achieved a major breakthrough when it developed the 4004, the first chip to contain all the components of a CPU: the microprocessor was born. The chip could move only 4 bits at a time between the CPU and memory, and its registers were 4 bits wide, so two 4-bit numbers could be added at a time. It evolved into the 8008, 8080, 286, 386, 486, Pentium and Itanium processors, which went from 4-bit up to 8-bit, 16-bit, 32-bit and then 64-bit.

Microprocessor Speed
Moore predicted that the number of transistors that could be put on a chip would double every 18 months, and this has held true. Chip designers can pack more and more into a chip, but processor designers have to innovate to make use of all that awesome power. Some techniques used are:
  1. Branch prediction - The processor looks ahead in the instruction code, predicts which branches are likely to be executed next, and pre-fetches them. If the guesses are mostly right, the processor is kept busy.
  2. Data flow analysis - Instructions are analyzed for dependencies and an optimized execution schedule is created (which need not match the original program order).
  3. Speculative execution - Using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in program execution and hold the results in temporary locations.
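The branch-prediction idea in point 1 can be illustrated with a classic 2-bit saturating-counter predictor, a common textbook scheme (this is a minimal sketch, not the mechanism of any particular processor; the function names are made up for illustration):

```python
# Sketch of a 2-bit saturating-counter branch predictor (illustrative).
# Counter states: 0,1 -> predict not-taken; 2,3 -> predict taken.

def simulate(outcomes):
    """Predict each branch outcome in turn and return the hit rate."""
    counter = 2  # start in the weakly "taken" state
    hits = 0
    for taken in outcomes:
        prediction = counter >= 2
        if prediction == taken:
            hits += 1
        # Saturating update: nudge the counter toward the actual outcome.
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return hits / len(outcomes)

# A typical loop branch: taken 9 times, then falls through once.
history = [True] * 9 + [False]
print(f"hit rate: {simulate(history):.0%}")  # 9 of 10 predictions correct
```

The 2-bit counter tolerates a single surprise (such as a loop exit) without flipping its prediction, which is why loop-heavy code predicts so well.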

Performance Balance
While processor speed is racing ahead, other critical components of the computer have not kept up. System designers hence have to adjust the architecture and organization to balance the mismatch among different components.
The biggest mismatch is in the interface between the CPU and main memory. Dynamic RAM density has gone up 1,000 times in 15 years and CPU speed 200 times, but dynamic RAM speed has gone up only 2 times! The interface between main memory and the CPU is the most crucial pathway in the entire computer, because it carries the instructions and data for processing. If the memory or the interface is slow, the processor stalls in a wait state.
Also, dynamic RAM density is increasing faster than main memory requirements are, so systems typically need fewer memory chips over time. This has a negative impact because it reduces the opportunity for parallelism: fewer chips means fewer independent banks across which memory accesses can be interleaved.
The CPU-memory interface problem is solved by using one or more of these techniques:
  • Make the DRAMs wider rather than deeper, and use a wider data bus.
  • Increase the interconnect bandwidth by using higher-speed buses and a hierarchy of buses to buffer and structure data flow.
  • Change the DRAM interface to include some sort of cache on the DRAM chip.
  • Introduce fast, efficient cache structures on the CPU chip and between the CPU and memory.
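The payoff of the cache techniques in the last two bullets can be seen with the standard average memory access time (AMAT) formula, AMAT = hit time + miss rate × miss penalty (the cycle counts below are made-up illustrative numbers, not figures from the book):

```python
# Rough sketch of why on-chip caches bridge the CPU-DRAM speed gap.
# All latencies are illustrative, not measurements of real hardware.

def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time in cycles."""
    return hit_time + miss_rate * miss_penalty

dram_only = 100  # without a cache, every access pays the full DRAM latency
with_cache = amat(hit_time=1, miss_rate=0.05, miss_penalty=100)  # 1 + 0.05*100

print(f"no cache:   {dram_only} cycles/access")
print(f"with cache: {with_cache} cycles/access")
```

Even with a 5% miss rate, the average access cost drops from 100 cycles to 6, which is why a small, fast cache in front of slow DRAM keeps the processor out of wait states most of the time.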
Another problem area is the handling of I/O devices. As computers become faster, the demand for I/O goes up rapidly. For example, a graphics application needs about 30 MB/s, video about 70 MB/s, a disk controller about 10 MB/s, a network about 30 MB/s, and so on. While processors are capable of handling these data volumes, the problem is moving the data between the processor and the peripherals. Again, the strategies used are buffering, dedicated processors for I/O, and higher-speed interconnection buses.

Designers have to build systems that adapt both to the different rates at which system components are evolving and to different application demands (is this why specialised systems are built, such as the Pentium III for 3D graphics, the Pentium II for multimedia, etc.?).

IBM, along with Motorola, developed the PowerPC chips used in Apple Macintoshes, RS/6000s and numerous embedded applications (like network management chips). The PowerPC chips had both an L1 and an L2 cache on chip, and this seems to have evolved much, much more now.

