
Internet

Definition and delimitation

The term internet is nowadays used colloquially as a generic term for a multitude of different technologies, services, (entertainment) offerings, social networks and much more. Originally, the term referred exclusively to the technical basis of what is now subsumed under it. Internet or internetworking means the connection of independent and autonomously operated computers or computer networks with the help of standardised protocols; the internet in its original definition is thus a network of autonomous networks.

History

The history of the Internet goes back to the 1960s [1]. At that time, time-sharing of computers, i.e. the simultaneous use of a computer by several people, as well as the change from circuit-switched communication to packet switching were topics of scientific discourse. The American Department of Defense, through its military research agency, the Defense Advanced Research Projects Agency (DARPA), promoted both research topics. In 1966/67, funding of the network project “Advanced Research Projects Agency Network” (ARPANET) began.

However, the development of ARPANET must also be seen against the background of the Cold War. According to ARPA Deputy Director Stephen Lukasik, ARPANET was also promoted “to meet the needs of military command and control of nuclear threats [and] to achieve survivable control of US nuclear forces” [2]. The stability and resilience of the network were therefore important design criteria: the network was designed to withstand the failure of large (sub)networks and to compensate for the unreliability of nodes and network links.

In this early phase, circuit-switched versus packet-switched communication for the ARPANET was already under discussion. In the circuit-switched case, all mainframe computers were to be connected directly via dedicated lines. In the packet-switched approach, messages were to be broken up into packets and exchanged between the computers. For this purpose, packet-switching nodes, the so-called Interface Message Processors (IMPs), were introduced to connect individual computers or entire networks to the ARPANET. In the early phase of ARPANET, when the network consisted of only a few nodes (see Figure 1), it would still have been possible to switch dedicated lines between the computing nodes. But it was the technology of packet switching and the introduction of specialised switching computers – the IMPs of the early phase were later called routers – that enabled the extreme growth, scalability, robustness and ultimately the success of the modern Internet.

Figure 1: Network topology of the ARPANET in December 1972 [3]
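
The principle can be illustrated in a few lines of code. The following Python sketch is purely illustrative and assumes nothing about the historical IMP software (the names Packet, packetize and reassemble are made up for this example): a message is cut into small, addressed packets, the packets are shuffled to simulate arrival over different routes, and the receiver restores the original order using sequence numbers.

```python
# Illustrative toy model of packet switching, not actual router software.
from dataclasses import dataclass
import random

PAYLOAD_SIZE = 8  # bytes of payload per packet (tiny, for illustration)

@dataclass
class Packet:
    src: str        # sender address
    dst: str        # receiver address
    seq: int        # sequence number used for reassembly
    payload: bytes

def packetize(message: bytes, src: str, dst: str) -> list:
    """Split a message into small, independently routable packets."""
    return [Packet(src, dst, seq, message[off:off + PAYLOAD_SIZE])
            for seq, off in enumerate(range(0, len(message), PAYLOAD_SIZE))]

def reassemble(packets: list) -> bytes:
    """Restore the message, whatever order the packets arrived in."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))

packets = packetize(b"Packets may travel over different routes.",
                    "host-a", "host-b")
random.shuffle(packets)  # simulate out-of-order arrival
assert reassemble(packets) == b"Packets may travel over different routes."
```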

The communication networks at that time were proprietary and dependent on the manufacturer of the corresponding computers. In order to couple these heterogeneous networks, it was necessary to agree on a uniform set of rules and data structures called protocols.

The specifications of these protocols and technologies were published as so-called RFCs (Request for Comments), put up for discussion and subjected to peer review by the community. This procedure democratised development, served to improve the protocols and was intended to ensure that the best solution was developed jointly. The approach deliberately set itself apart from the classical standardisation processes of heavyweight standardisation organisations. To this day, anyone can participate in this process and publish an RFC. RFC number 1 was published by Stephen D. Crocker in April 1969 [4], and to date (2021) more than 9,000 RFCs have been published. Often the authors also published a reference implementation of the corresponding protocol together with the RFC. This made it easy to use the protocol on one’s own computer, which also contributed to the success of the Internet protocols.

The most important and eponymous Internet protocols TCP, IP and UDP were first published during this period. Vinton Cerf published the Internet Transmission Control Protocol (TCP, 1974 [5]) and is therefore considered one of the “fathers of the internet”. TCP realised a reliable end-to-end connection between any two hosts on a packet-switched network. The User Datagram Protocol (UDP, [6]) defined a connectionless and unreliable data exchange. In September 1981, DARPA published the Internet Protocol (IP, [7]), a protocol that enables the connectionless, packet-oriented and non-guaranteed transmission of data packets (also known as datagrams) from a sender across several networks to a receiver. This RFC already introduced the basic principle of the internet addressing scheme that is still valid today.
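
The contrast between the transport protocols is easy to demonstrate. The following minimal sketch, using Python’s standard socket module on the local machine, shows the connectionless exchange that UDP provides: a datagram is simply sent, without connection setup and without any delivery guarantee.

```python
# Minimal sketch of UDP's connectionless, unreliable datagram exchange,
# using Python's standard socket API on the local machine. There is no
# handshake and no acknowledgement: had the datagram been lost, the
# sender would never have known.
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # UDP socket
receiver.bind(("127.0.0.1", 0))            # let the OS pick a free port
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello, internet", addr)    # fire and forget, no handshake

data, peer = receiver.recvfrom(1024)       # read one datagram
print(data, "received from", peer)

sender.close()
receiver.close()
```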

University networks were established in the early 1980s: BITNET (Because It’s Time NETwork) and CSNET (Computer Science Network) in 1981, EUnet (European Unix Network) shortly after in 1982. In the same year, ARPA and the Department of Defense (DoD) declared TCP/IP to be the standard protocols in their networks. This led to the first definition of the term Internet: on the one hand because of the (Internet) protocol family TCP/IP, on the other hand because the autonomous subnets were joined by a connecting “Internet”.

In 1983, EARN (European Academic and Research Network) was founded, which was very similar to BITNET, and the ARPANET was split into a civilian and a military part (MILNET). Then, in 1984, the Domain Name System (DNS) was specified, with the help of which (computer) names could be translated into IP addresses (and vice versa). This meant that it was no longer necessary to remember 32-bit IP addresses in order to address a computer; computers could now be addressed by name. It is much easier for humans to remember www.lrz.de than 129.187.254.9 (or in the case of IPv6: 2001:4ca0:0:103::81bb:fe09). In the same year, 1984, the threshold of 1,000 networked hosts (computers) was exceeded.
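
What DNS provides can be reproduced with standard-library calls. The following sketch, which assumes network access and that the records quoted above still exist, resolves the name www.lrz.de to its addresses and performs the reverse lookup for 129.187.254.9.

```python
# Minimal sketch of forward and reverse DNS lookups using Python's
# standard library. Requires network access; the reverse lookup may fail
# if the PTR record for this example address no longer exists.
import socket

# Name -> addresses: getaddrinfo returns IPv4 and/or IPv6 results.
for family, _, _, _, sockaddr in socket.getaddrinfo("www.lrz.de", None):
    print(family.name, sockaddr[0])

# Address -> name: the reverse lookup for the example from the text.
name, _, _ = socket.gethostbyaddr("129.187.254.9")
print(name)
```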

The Munich Science Network (MWN) and the Leibniz Computing Centre (LRZ) were connected to EARN and BITNET in 1985 [8]. This made it possible for the first time to offer an e-mail service between Europe and the USA. The following year, the high-speed backbone NSFNET of the National Science Foundation (NSF) was established. It connected five supercomputing centres in the USA at the then legendary speed of 56 kbps (kilobits per second); for comparison, an average DSL connection in 2021 has a download speed of 136 Mbps [9]. In 1987, the first connection between China and Germany was established, and on 20 September 1987 the first e-mail was sent from China to Germany. In 1989, the number of hosts exceeded the 100,000 mark.

In 1990, the ARPANET ceased to exist. There were research networks around the world that were interconnected, and these formed the early Internet. In 1991, the National Science Foundation (NSF) lifted the restrictions on commercial use of its network; the first organisation to benefit was the Commercial Internet eXchange (CIX), which became the namesake for a great many Internet exchange points. DE-CIX, which operates one of Europe’s largest Internet exchange points in Frankfurt, carries CIX in its own name. Only three years after the 100,000 mark of 1989, the next milestone was reached in 1992 with 1 million hosts.

Starting in 1989, Tim Berners-Lee developed a so-called hypertext system and the protocol HTTP at the CERN research centre near Geneva, as well as the language HTML for representing electronic documents. The first programme created by Berners-Lee to display HTML and follow hypertext links (known today as a browser) was named WorldWideWeb.

The Munich Science Network was connected to the Internet in 1992 via the national research network (WiN) operated by the German research network association (DFN-Verein) [10].

In 1993, the CERN Directorate released the World Wide Web to the public free of charge, and thus began the boom of the Internet as we know it today. In the same year, the first graphics-capable web browser, Mosaic, was released. Mosaic was followed by many other browsers, and the World Wide Web became the “killer application” of the Internet. The number of users as well as the number of web pages on offer increased rapidly, and the commercialisation of the internet began to gain significant momentum. In July 1995, for example, a website called Amazon was launched and sold books worldwide via the internet.

All content on the web, i.e. web pages, documents etc., can be accessed via so-called Uniform Resource Locators (URLs). Information about new, interesting content and its URLs was shared or published in other media. Finding (or re-finding) a particular piece of content could thus become a very time-consuming task. Then, on 15 September 1997, a search engine called Google went online, making it easy to search the entire Internet.
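
The structure of a URL, i.e. the individual components a browser needs in order to locate a resource, can be made visible with Python’s standard urllib.parse module; the path, query and fragment in the following sketch are a made-up example.

```python
# Short sketch of the components of a URL, using Python's standard
# urllib.parse module. The path, query and fragment are hypothetical.
from urllib.parse import urlparse

url = urlparse("https://www.lrz.de/wir/kontakt/index.html?lang=en#anfahrt")
print(url.scheme)    # https                    -> protocol to use
print(url.netloc)    # www.lrz.de               -> host, resolved via DNS
print(url.path)      # /wir/kontakt/index.html  -> resource on the server
print(url.query)     # lang=en                  -> request parameters
print(url.fragment)  # anfahrt                  -> position in the document
```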

Until the beginning of the 2000s, the web had a small number of providers who created (mostly static) content, compared to a very large number of users who “consumed” this content; this is referred to as Web 1.0. With Web 2.0, this changed dramatically: content became more dynamic, and suddenly users could create their own content and actively participate in the web. This is also referred to as the Participatory Web or Social Web. Examples include Wikipedia (2001), Facebook (2004), YouTube (2005), Twitter (2006), Instagram (2010), TikTok (2016) and many more.

Application and examples

In the days of ARPANET, Unix was widely used as an operating system at research institutions and universities; it was characterised by the fact that the Internet protocol family was an integral part of the operating system and could therefore be used easily. The most important applications in the 1970s were the transfer of files (File Transfer Protocol, FTP) between computers and the sending of messages and news. In Unix, a programme called uucp (Unix-to-Unix copy) was used for this. On this basis, the Usenet system was set up, a worldwide distributed system for discussing any topic. Users could write posts or news that could be read and commented on by everyone else. Thematic, hierarchically structured newsgroups provided a content-related and organisational structure. The newsgroups served as professional discussion forums for a wide range of topics and represented an important collection of knowledge.

When the MWN was connected to EARN and later to the Internet, the main Internet services were: e-mail, news, file transfer and Gopher. Gopher is a protocol developed in 1991 that allowed documents to be accessed over the internet; it was completely superseded by the advent of the World Wide Web and no longer plays a role today. In global terms, the File Transfer Protocol (FTP) and Telnet were the most widely used Internet applications until the early 1990s. Telnet is a protocol for logging on to a remote computer and executing commands there. Both protocols were overtaken by the World Wide Web in terms of frequency of use in 1994/95. This also marked the beginning of the transformation of the Internet from a network of researchers, computer scientists and administrators towards Web 2.0, commercialisation and widespread use.

The Internet protocol family was used as the basis for many other services and applications. Radio stations began to “broadcast” over the internet. Even services that had traditionally been circuit-switched for decades moved to the internet and packet switching. One example is classical telephony. With ISDN, analogue telephony was digitised, but a “line” still had to be switched for the connection between the caller and the called party. Protocols for Voice over IP (VoIP) broke with circuit switching and made telephone calls possible over packet-switched networks such as the internet. This means that internet and telephony providers no longer have to operate two infrastructures – one for the internet and one for ISDN – in parallel, but can use a single network infrastructure – based on IP – for both, or indeed any, services. This convergence of the networks saves providers both investment and operating costs.

Criticism and problems

The ARPANET and the basis of the Internet were conceived as open research networks. For a long time, the issue of IT security played no role at all, because on the Internet equals communicated with equals and no one meant any harm. In 1988, the student Robert T. Morris painfully demonstrated how vulnerable the Internet was with the Internet worm he had developed. Within a few days, the worm infected about 6,000 hosts, which corresponded to about 10 % of the Internet at that time. As a result, CERTs (Computer Emergency Response Teams) were founded worldwide, and the topic of security moved into focus and became increasingly important.

The Internet and its protocols are an evolved infrastructure whose basis goes back to the 1970s. At that time, no one could have imagined that there would one day be billions of computers or internet-capable devices. For example, an addressing scheme with 32-bit addresses was initially defined, in which a maximum of a good 4 billion (2³²) addresses could be assigned. On 25 November 2019, the last free IPv4 address was allocated [11]. The security mechanisms in IP protocol version 4 (IPv4) were also rudimentary at best. This led to the development of IPv6 and a sometimes bitter dispute about which protocol was better or even necessary. Today, both versions are operated in parallel in a great many networks.
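
The difference in size between the two address spaces, and the two notations, can be checked with a few lines of Python; the following sketch uses the standard ipaddress module with the example addresses quoted earlier in this article.

```python
# Small sketch of the two address spaces, using Python's standard
# ipaddress module and the example addresses quoted in this article.
import ipaddress

print(2 ** 32)     # 4294967296 -> a good 4 billion IPv4 addresses
print(2 ** 128)    # about 3.4 * 10^38 IPv6 addresses

v4 = ipaddress.ip_address("129.187.254.9")
v6 = ipaddress.ip_address("2001:4ca0:0:103::81bb:fe09")
print(v4.version, int(v4))      # 4, the address as a 32-bit integer
print(v6.version, v6.exploded)  # 6, the full uncompressed notation
```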

Questions of data protection became increasingly important. The revelations by Edward Snowden in the USA and the lawsuits by Max Schrems against Facebook acted as a catalyst for this discussion. The EU General Data Protection Regulation, which became binding for all member states in 2018, attempts to achieve a uniform level of data protection for the whole of Europe.

Since Web 2.0 and the sharp increase in the use of social media, it has become increasingly important for providers (and especially the marketing industry) to present users with content that is tailored to them as closely as possible and matches their interests. This leads to algorithms that are used to predict what information users want to see. As a result, there is a danger that users will only be shown information that corresponds to their own point of view and that they will become informationally isolated. This is also referred to as the filter bubble in which users move on the web.

Research

The internet is huge, and so are the research questions that deal with it. A complete list or classification is impossible at this point. Therefore, only a few research areas are listed selectively as keywords:

  • Research on hardware, network components, cabling, transmission techniques, mobile radio, radio technologies
  • Protocols
  • High-speed networks
  • Energy efficiency
  • Autonomous systems
  • Internet of Things, Smart Homes
  • Standardisation
  • IT management, process-oriented management
  • Self-X: Self-Healing, Self-Operation, Self-Management etc.
  • IT and network security
  • Robustness and resilience
  • Automation
  • Further development of the network: clean-slate approaches versus evolutionary development
  • and much more

Further links and literature

At this point we would like to refer to the internet as an almost inexhaustible source of information, literature and knowledge.

Sources

[1] R. Zakon. Hobbes’ Internet Timeline. RFC 2235, Nov. 1997.

[2] S. Lukasik. Why the Arpanet Was Built. In: IEEE Annals of the History of Computing 33, 3, 4-21, March 2011. DOI: 10.1109/MAHC.2010.11

[3] N. Neigus. Network Logical Map. RFC 432, Dec. 1972.

[4] S. Crocker. Host Software. RFC 1, April 1969.

[5] V. Cerf. Specification of Internet Transmission Control Program. RFC 675, Dec. 1974.

[6] J. Postel. User Datagram Protocol. RFC 768, Aug. 1980.

[7] Defense Advanced Research Projects Agency. Internet Protocol. RFC 791, Sept. 1981.

[8] H.-G. Hegering, H. Reiser, D. Kranzlmüller. Das Münchner Wissenschaftsnetz – Vergangenheit, Gegenwart, Zukunft. In: Bode, A., Broy, M., Bungartz, H.-J., Matthes, F. (Hrsg.). 50 Jahre Universitäts-Informatik in München. Springer Vieweg, Berlin, 2017 (E-Book). DOI: 10.1007/978-3-662-54712-0

[9] statista. Durchschnittliche Verbindungsgeschwindigkeit der Internetanschlüsse (Festnetz) in Deutschland von Oktober 2020 bis Oktober 2021.

[10] H.-G. Hegering. 50 Jahre LRZ – Das Leibniz-Rechenzentrum der Bayerischen Akademie der Wissenschaften. Chronik einer Erfolgsgeschichte. Garching, 2012.

[11] RIPE NCC. The RIPE NCC Has Run out of IPv4 Addresses. Nov. 2019.