
Computer

Definition and delimitation

The invention of the computer formed the basis for the establishment of computer science as a scientific discipline. A computer is a mathematical machine that can automatically execute sequences of instructions formulated as programmes. A universal computer is one that can, in principle, solve in a finite amount of time any problem that can be described by an algorithm and formulated as a programme. Most modern computers are universal computers with one or more processors for executing programmes, a working memory and an input/output system. The components of a computer communicate with each other via a suitable interconnection structure.

Each processor consists of a control unit, one or more arithmetic logic units (ALUs), a register set for data operands and control information, and usually a hierarchy of fast caches, which reduce the access time to repeatedly required instructions and data from the main memory. Every processor has an instruction set that, in addition to arithmetic and logical operation instructions, contains instructions for loading and storing data operands, for conditional and unconditional jumps to other parts of a programme, and for processor and system control. A modern processor is also able to process programme interruptions (interrupts) that are triggered by external signals (for example from sensors or other computers) or by the operating system.

The computer’s working memory (often referred to as main memory) contains the programmes and data required at runtime. The input/output system provides the connection to mass storage devices and enables communication with other computers and systems in computer networks. The transition from individual computers to computer networks is seamless; within a computer, however, data is usually exchanged at much higher speeds than in computer networks.

The majority of computers in existence today work with digital logic, which is realised in the form of integrated electronic circuits on microchips. Depending on the desired computing power, complete computers can consist of a single chip or be implemented as very complex systems with thousands of communicating chips.

The hardware structure and organisation of its processors are decisive for a computer’s processing speed and flexibility. The control unit of a processor steers each programme instruction through the phases of its life cycle. Instruction processing is roughly divided into the phases fetch instruction (memory access), decode instruction (by decoding units in the control unit), fetch operands (by register or memory access), execute instruction (by a functional unit of the arithmetic logic unit) and write back result (by register or memory access). Processors are usually clocked synchronously, and an instruction-processing phase lasts one clock cycle. The higher the clock frequency, the more (and correspondingly shorter) phases instruction processing must be divided into.
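These phases can be illustrated with a deliberately simplified software model of a hypothetical register machine. The instruction format, mnemonics and register names below are invented for illustration only; the sketch also shows the instruction classes mentioned above (loading, arithmetic, conditional jump):

```python
# A deliberately simplified model of the instruction-processing phases on a
# hypothetical register machine; instruction format and mnemonics are invented.

# Programme: sum the numbers 1..5 into register R0 using a conditional jump.
program = [
    ("LOADI", 0, 0),   # R0 <- 0   (accumulator)
    ("LOADI", 1, 5),   # R1 <- 5   (loop counter)
    ("ADD",   0, 1),   # R0 <- R0 + R1   (loop body starts here)
    ("SUBI",  1, 1),   # R1 <- R1 - 1
    ("JNZ",   1, 2),   # if R1 != 0, jump back to instruction 2
    ("HALT",  0, 0),
]

registers = [0] * 4
pc = 0                                  # programme counter

while True:
    instruction = program[pc]           # phase 1: fetch instruction
    opcode, a, b = instruction          # phase 2: decode instruction
    pc += 1
    if opcode == "LOADI":
        registers[a] = b                # immediate operand; execute and write back
    elif opcode == "ADD":
        op1, op2 = registers[a], registers[b]   # phase 3: fetch operands
        result = op1 + op2                      # phase 4: execute
        registers[a] = result                   # phase 5: write back result
    elif opcode == "SUBI":
        result = registers[a] - b
        registers[a] = result
    elif opcode == "JNZ":
        if registers[a] != 0:           # conditional jump changes the programme counter
            pc = b
    elif opcode == "HALT":
        break

print(registers[0])                     # prints 15
```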

If a single processor overlaps these phases across several instructions, so that different instructions are in different phases at the same time, this is referred to as phase pipelining. If a single processor can additionally process several instructions simultaneously in each pipeline phase, this is referred to as superscalar processing. Modern superscalar cores control the order of instruction execution in such a way that as many independent instructions as possible are executed simultaneously (Doweck 2017).
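A simple, idealised calculation (ignoring pipeline hazards and stalls, which reduce the gain in practice) shows why pipelining raises instruction throughput so markedly; the symbols k (number of pipeline stages) and n (number of instructions) are introduced here purely for illustration:

```latex
% Idealised comparison of non-pipelined and pipelined execution:
% a k-stage pipeline needs k cycles to fill, then completes one instruction per cycle.
T_{\text{non-pipelined}} = n \cdot k \text{ cycles}, \qquad
T_{\text{pipelined}} = k + (n - 1) \text{ cycles}

S(n) = \frac{n \cdot k}{k + n - 1} \;\xrightarrow{\; n \to \infty \;}\; k

% Example: k = 5 stages, n = 1000 instructions:
S(1000) = \frac{5000}{1004} \approx 4.98
```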

If a superscalar processor can simultaneously process instructions from several different programmes, processes or threads (concurrent execution threads of a process), this is referred to as multithreading. The individual or combined use of these three forms of organisation (pipelining, superscalar processing, multithreading) significantly increases instruction throughput, as at any given time there are many more instructions in flight in the processor than without these acceleration techniques.

If even more computing power is required, multiprocessor systems are used. A distinction is made between memory-coupled systems, in which all processors have direct access to the main memory and can use shared data, and message-coupled systems, in which each processor controls its own memory address space and communication between processors takes place by sending and receiving messages via a connection network.
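The programming difference between the two coupling styles can be sketched in a few lines. The following example is a minimal illustration only, not real multiprocessor code: it uses Python threads with a lock for the memory-coupled case and operating-system processes with a message queue for the message-coupled case; all function and variable names are invented:

```python
# Minimal sketch contrasting memory-coupled and message-coupled processing;
# worker functions and variable names are illustrative only.
import threading
import multiprocessing

# Memory-coupled: all workers access the same address space.
shared = {"total": 0}
lock = threading.Lock()

def shared_memory_worker(values):
    for v in values:
        with lock:                      # shared data needs synchronisation
            shared["total"] += v

# Message-coupled: each worker keeps private data, results travel as messages.
def message_worker(values, queue):
    queue.put(sum(values))              # send a message instead of sharing memory

if __name__ == "__main__":
    data = list(range(10))

    t1 = threading.Thread(target=shared_memory_worker, args=(data[:5],))
    t2 = threading.Thread(target=shared_memory_worker, args=(data[5:],))
    t1.start(); t2.start(); t1.join(); t2.join()
    print("shared memory:", shared["total"])        # 45

    q = multiprocessing.Queue()
    p1 = multiprocessing.Process(target=message_worker, args=(data[:5], q))
    p2 = multiprocessing.Process(target=message_worker, args=(data[5:], q))
    p1.start(); p2.start()
    result = q.get() + q.get()                      # receive the two messages
    p1.join(); p2.join()
    print("message passing:", result)               # 45
```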

As a result of advances in microelectronics, it is now standard to accommodate complete multiprocessors on one chip. These are referred to as multicore processors or chip multiprocessors. Each core can be regarded as a complete processor in its own right, but several cores can share common resources, such as a very large cache that is also located on the multicore chip. In addition to processor cores and caches, parts of the main memory and components such as signal processors and sensors can also be located on a single chip. Such designs are referred to as systems-on-chip (SoCs).

History

The first ideas and design plans for a universal computer emerged as early as the 19th century. In 1834, the English philosopher, mathematician and inventor Charles Babbage devised a concept for the so-called Analytical Engine, a mechanical construction consisting of columns of decimal wheels to represent and process individual digits in the arithmetic unit (mill) and the memory (store). Communication was to take place via transport mechanisms consisting of ropes and cogwheels. Programme commands and data were to be read from punched cards. Only a small part of the Analytical Engine was completed, as a demonstration model. Babbage’s role as the “great uncle of the computer” (Wilkes 1995) is also due to his involvement with algorithmic programming. Together with the writer Ada Lovelace, he developed the basics of a programming language which, in addition to operation instructions and repetition loops, even contained the concept of the conditional jump. This made his computability concept more universal and powerful than that of the first digital computers of the 20th century.

Konrad Zuse, the inventor of the first digital computers, the purely mechanical Zuse Z1 (1938) and its relay-based successor Zuse Z3 (1941), already used the binary number system and a circuit algebra he had developed himself. Programme commands were punched into 35 mm film rolls, which were read out by a reading mechanism. Instead of a fixed number format, he also invented the floating-point format with sign, mantissa and exponent, which greatly simplified numerical calculations. The instruction set comprised only nine instructions. However, it lacked the conditional jump, which is why the Zuse computers, like the first computers developed in the USA, the Harvard Mark I relay computer (by Aiken, 1944) and the ENIAC tube computer (by Eckert/Mauchly, 1945), were not yet universal computers (Märtin 2001).

The theoretical foundations of the concept of computability had already been laid between 1931 and 1936 by Kurt Gödel and Alan Turing. In 1944, the Hungarian-born mathematician John von Neumann at the University of Pennsylvania, together with Arthur Burks and Herman Goldstine, built on the theoretical work of Gödel and Turing and designed the EDVAC, the first universal and so-called stored-programme computer. The finished design paper was published in 1946. However, the concept was not realised in computer hardware until 1952, with the IAS computer at Princeton University.

With knowledge of the EDVAC design, the first executable universal computers were developed in England. As early as 1948, Williams and Kilburn completed the Manchester Mark I at the University of Manchester, which still had a bit-serial arithmetic unit. One year later, Maurice Wilkes realised the first universal computer with a bit-parallel arithmetic unit, the EDSAC computer, at the University of Cambridge.

Application and examples

After that, things continued apace and the computer became a commercial product. Eckert and Mauchly built the first commercial universal computer, the UNIVAC I, for Remington Rand in 1951. IBM delivered its first commercial computer, the IBM 701, in 1952. With the IBM System/360 and System/370 mainframe computer families, the company achieved a major breakthrough in the marketing of computers from 1964 onwards. The extensive, complex instruction sets of what are now known as CISC systems (Complex Instruction Set Computers) made programming easier but slowed down processing. At the same time, the mainframe computer families from Control Data Corporation (CDC), with simple, speed-optimised instruction sets, were successful in the technical field. They were the forerunners of the RISC systems (Reduced Instruction Set Computers) that appeared on the market in the mid-1980s. From the mid-1970s, the first vector supercomputers from Cray Research, NEC, Fujitsu and Hitachi were built on an architecturally similar basis, enhanced by vector processing.

Digital Equipment Corporation (DEC) began producing inexpensive minicomputers such as the PDP-8 and PDP-11 in the mid-1960s. The 32-bit VAX-11/780 machine announced in 1978, with a large memory and an extensive CISC instruction set, was frequently used in the following years as a flexible and compact departmental computer and in the scientific field.

The invention of the microprocessor by Intel in 1971, the introduction of PCs by IBM and others, the 32-bit CISC processors from Intel and Motorola and the appearance of RISC microprocessors from companies such as MIPS Computer Systems, IBM, Sun Microsystems and ARM gradually pushed classic mainframes and minicomputers with proprietary instruction sets out of the market. Today, manufacturers such as Intel (Core i and Xeon), AMD (Ryzen and EPYC), IBM (Power) and ARM licensees such as Apple and Samsung dominate the computer market with their 64-bit multicore processors. CISC and RISC systems have converged in terms of their architectures and provide the desired computing power in all types of computers, from single-chip microcomputers to smartphones, laptops, desktops, workstations and servers through to supercomputers with several million processor cores.

Criticism and problems

For a long time, the triumph of microprocessors went hand in hand with the performance gains in microelectronics achieved through the miniaturisation of switching elements, as described by Moore’s Law. Gordon Moore, one of the founders of Intel, postulated as early as 1965 that the number of transistor switching elements on a given chip area would double approximately every two years. This trend continues to this day, albeit at a somewhat slower pace. Along with the miniaturisation of the transistors, their switching speed, and thus in principle the clock frequency of the entire processor, increased by a factor of 1.4 from one technology generation to the next until around 2004, without the electrical power required for the entire chip increasing, because the smaller transistors allowed the supply voltage of the chips to be reduced to 0.7 times its previous value. Each new processor generation therefore offered 2.8 times the computing power without increased energy consumption. This trend is known as Dennard scaling, after Robert H. Dennard, the inventor of DRAM memory at IBM. However, lowering the supply voltage below the limit of one volt (1 V), which was already reached in 2004, is virtually impossible with today’s CMOS technology (Märtin 2014).
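The figures quoted above follow from the dynamic switching power of CMOS logic under classical Dennard scaling; the short derivation below is a simplified sketch that ignores leakage currents:

```latex
% Dynamic power of CMOS logic: P \propto C \cdot V^2 \cdot f.
% Per technology generation (linear scaling factor of about 0.7): capacitance C
% and supply voltage V shrink to 0.7x, clock frequency f rises 1.4x, and the
% number of transistors per chip area doubles.
\frac{P_{\text{new}}}{P_{\text{old}}}
  \approx \underbrace{2}_{\text{transistors}} \cdot
          \underbrace{0.7}_{C} \cdot
          \underbrace{0.7^{2}}_{V^{2}} \cdot
          \underbrace{1.4}_{f}
  \approx 0.96 \approx 1

% Computing power per chip nevertheless grows:
\frac{\mathit{Perf}_{\text{new}}}{\mathit{Perf}_{\text{old}}}
  \approx 2 \cdot 1.4 = 2.8
```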

In order to further increase computing power without increasing the energy consumption per chip area, the industry therefore switched to multicore chips from 2004. Since then, the base clock frequency has been left at around 3 GHz and many identical (symmetrical architecture) or different (asymmetrical architecture) cores have been realised on the chip. This increases the throughput for parallel software loads. In the case of sequential or less parallel software loads, all but one or a few cores can be temporarily switched off and the available electrical power used to temporarily boost the clock frequency of the remaining cores (dynamic architecture). The embedding of energy-saving hardware accelerators for computationally intensive algorithms (e.g. for AI, image processing or cryptography) in a multicore chip (heterogeneous architecture) is also common today. However, none of these options prevents multicore chips from consuming more and more power in the post-Dennard-scaling era if all the resources installed on the chip are to be operated simultaneously (Esmaeilzadeh 2013). The power supply of today’s multicore chips is organised in tile-like domains. It is therefore possible to switch off parts of the chip that are not currently needed within fractions of a microsecond, from individual functional units of a core through to complete cores and parts of the cache. This is referred to as dark silicon. In order not to exceed a given threshold of electrical power, with each new generation of CMOS technology only half as many cores as in the previous generation can be active at full computing power at the same time (Taylor 2013).
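Extrapolating the factor-of-two estimate from Taylor (2013) over several generations gives a rough impression of how quickly the dark fraction of the chip grows; the formula below is a simplified illustration, not a precise model:

```latex
% Fraction of the chip that can be active at full power after g further
% post-Dennard technology generations, assuming the active share halves
% with each generation:
f_{\text{active}}(g) = \left(\tfrac{1}{2}\right)^{g}, \qquad
f_{\text{dark}}(g) = 1 - \left(\tfrac{1}{2}\right)^{g}

% Example: after three generations only one eighth of the chip can run at
% full power at the same time:
f_{\text{active}}(3) = 0.125, \qquad f_{\text{dark}}(3) = 0.875
```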

Research

In addition to research into evolutionary and alternative, more densely packed and three-dimensionally arranged transistor technologies and integration techniques, which aim to keep Moore’s Law alive and to reduce the energy requirements of future computers, research in the field of computer architecture is focussing primarily on new types of hardware that make machine learning and the training and execution of artificial neural networks even more effective and efficient through high computational parallelism and/or processing resources placed close to the data. Commercial and research approaches based on classic GPUs (Graphics Processing Units) (Hennessy 2019), such as the Tensor Core GPU systems from Nvidia (Nvidia), are contrasted with brain-inspired computing (Boahen 2017), which relies on the realistic replication of neural networks in computer chips. Examples of the latter are the AI chips TrueNorth and NorthPole from IBM (IBM) and Loihi and Loihi 2 from Intel (Intel).

Major research efforts have also been underway for years in the field of quantum computers. In the future, it is hoped that quantum computers will be able to solve a number of economically relevant and challenging problems much faster than today’s conventional supercomputers.

Further links and literature

Research and development trends in computer architecture:

A widely used standard work on computer architecture:

Hennessy, J. L./Patterson, D. A. (2017). Computer Architecture: A Quantitative Approach. 6th Edition. Morgan Kaufmann.

Sources

Boahen, K. (2017). A Neuromorph’s Perspective. In: Computing in Science and Engineering 19(2), 17-28.

Doweck, J. et al. (2017). Inside 6th-Generation Intel Core: New Microarchitecture Code-Named Skylake. In: IEEE Micro, 52-62.

Esmaeilzadeh, H. et al. (2013). Power Challenges May End the Multicore Era. In: Communications of the ACM 56(2), 93-102.

Hennessy, J. L./Patterson, D. A. (2019). A New Golden Age for Computer Architecture. In: Communications of the ACM 62(2), 48-60.

IBM. IBM NorthPole. [26.02.2024].

Intel. Intel Loihi 2. [26.02.2024].

Märtin, C. (ed.) (2001). Rechnerarchitekturen. Hanser.

Märtin, C. (2014). Multicore Processors: Challenges, Opportunities, Emerging Trends. In: Proceedings Embedded World Conference, Nuremberg.

Nvidia. Nvidia H100 Tensor Core GPU. [26.02.2024].

Taylor, M.B. (2013). A Landscape of the New Dark Silicon Design Regime. In: IEEE Micro, 8-19.

Wilkes, M. V. (1995). Computing Perspectives. Morgan Kaufmann.