Chapter 1

 Internet Basics

1.1 Introduction

You have probably heard of the Internet, given all the publicity in the newspapers, and wondered what this computer-related phenomenon is and how you could use it. Now that you have this book, the answers to your questions about the Internet are close at hand.

Here are some of the ways the Internet can be useful to you:


The uses enumerated above are only those commonly known at this time. They are like the tip of an iceberg: no one knows everything that is possible. The important thing to realize is that the Internet places very few restrictions on what you can do with it. Your imagination is the main limit.

1.2 A Brief History of the Internet

The word Internet flashes many images upon the canvas of the mind. The dominant one may be hundreds or thousands of computers and computer networks connected with each other, exchanging information. This is the hardware aspect of the Internet. Its application aspect is the multitude of different services the Internet offers, e.g. E-mail and others, listed and discussed in detail later in this book. Yet another image is that of everyone doing their own thing. Governments have attempted to control and legislate it, but have so far largely failed.

Strangely, the Internet is the product of a military undertaking. The Pentagon's Advanced Research Projects Agency (ARPA) funded its creation in 1969, as ARPAnet. The initial intention was simple: to develop a geographically dispersed, reliable communication network for military use that would not be disrupted in case of partial destruction from a nuclear attack. That aim was accomplished by splitting the data being transmitted into small packets, which can take different routes to their destination. Such a "packet-switched" network can survive an attack because the packets can take a different route if one route is knocked out. Because the packets of one message may travel by different routes, it is also harder to intercept a complete message at any single point, although this alone does not make eavesdropping impossible.

The procedure developed for interconnecting ARPAnet computers and communicating the data was called TCP/IP, an acronym for Transmission Control Protocol/Internet Protocol. ARPAnet allowed engineers and scientists working on military contracts all over America to share computers and computer resources. As an afterthought, the computer scientists developed a way to exchange messages. This feature, "E-mail", turned the network into a new communication link. The ARPAnet was at first confined to organizations and individuals having US government security clearance and working on government contracts. It was soon linked with non-governmental, parallel academic networks such as Usenet News, launched in 1979, and the combined network grew and eventually became known as the Internet. In the mid-1980s, the American government, through its agency the National Science Foundation (NSF), set up five supercomputer centers, which became the main nodes of the Internet, and to which the university and research lab networks were connected.

The number of computers connected to the Internet has been growing exponentially. In 1983 there were fewer than 500 "host" computers, mostly at government laboratories and academic computer science departments. The rest of the academic community got a whiff of its information-exchanging ability, and by 1987 there were about 30,000 host computers at different universities and research labs. By 1995 this number had increased to 5 million hosts.

In the early 1980s, using the Internet was still difficult. However, its power was obvious. No other method of connecting universities and research labs around the world was as fast, convenient, and flexible. So the Internet users at universities came up with software to participate in discussions over the network. They created document and software libraries on the network, which were accessible to all users. During this period, the Internet remained within the narrow confines of the academic and research-lab world.

In the 1980s, another computer-related event happened: the personal computer became very popular. Prior to that, businesses used either a minicomputer or a mainframe. Now they moved in a very big way to microcomputers, which are also known as desktop computers, personal computers, or PCs. These were stand-alone machines that lacked the capability of sharing data and resources. To remedy that shortcoming, the concept of local area networks (LANs) became important. For large companies, wide area networks (WANs) also came into being. With these in place, E-mail became a means of daily communication.

With the prices of PCs coming down, more and more individuals also had computers, and these people wanted to connect their machines with other machines. In response to that demand, on-line services like Telenet (not to be confused with telnet) and CompuServe came into being. For a fee, individuals could connect to them and communicate with other users on the same service, as well as use their repository of information and software, which people could download. Along with on-line services came the concept of the Bulletin Board System (BBS), in which individuals connect to another computer in their vicinity to exchange information, share software, etc.

Initially these private networks, both corporate and commercial, had different hardware and software platforms and could not talk to each other, but very quickly they came to use TCP/IP. Interconnection of these networks, BBS communities, and individual PCs in homes and offices by adoption of the Internet technology, TCP/IP, gave birth to the Internet as we know it today. All that is required to connect any network or computer to the Internet is the capability to use TCP/IP for exchanging information. This is how the Internet became the Network of Networks.


During its evolution, the Internet was supported and controlled to a greater or lesser degree by American government agencies, first ARPA and then the NSF, but now it has become a diversified, in some sense uncontrollable, global entity. Its nodes are supported by diverse sources. In the 1980s ARPA was reorganized, its funding was cut, and the American defense networks were mostly detached from the Internet. Funding of the Internet continued through the NSF, and, until recently, the NSF paid for connecting the computers of academic institutions and government agencies everywhere in the USA to the Internet. Slowly, the NSF permitted commercial networks to be connected to the Internet, initially for educational and research purposes, while forbidding primarily commercial use. This started the rapid growth of the Internet. Gradually, commercial use increased as the restrictions were eased. In 1995, companies passed universities as the main users. In April 1995, the NSF moved out of the scene, and today the American government has no part in running and maintaining the Internet. It is now self-sustaining.

Two other important developments underlie the present explosive growth of the Internet. The first took place at CERN, the European high energy physics lab near Geneva. There, in 1990, researchers developed software for publishing, searching, and accessing information on the Internet, as a way for scientists to share documents with their colleagues at large. This came to be known as the World Wide Web (WWW).

The second occurred at the University of Illinois, where a young student named Marc Andreessen developed a graphical browser called Mosaic, to access information from the WWW. These two developments have catapulted the Internet from the laboratory to the mainstream of life. In the last year the growth of the WWW has been even faster than the exponential growth of the Internet.

1.3 How does the Internet work?

If you were asked how the Internet works, chances are you would say that, since it is a global computer network, it is run by some central organization called Internet, which collects the fee for your use, and with which you would have to make arrangements if you wanted to put up any information. If you said that, you would be dead wrong.

We Indians, like people in other parts of the world where the ideas of central planning and socialism have held sway, tend to assume that any complicated activity like a national economy or a global computer network can only function through a central authority.

However, in the case of the Internet this is not so. All the computers and wires that make up the thousands of smaller networks connected by the Internet work because they follow a simple set of rules, TCP/IP, mentioned above. Put simply, TCP/IP says that all data shall be broken up into small packets, and that the first part of each packet carries the address where the packet is meant to go. That is about it. How it should work is not laid down in a master plan. There is no central computer or authority. Instead of going to a central computer and then on to its destination, data on the Internet has many ways to get from one point to another over the web of computers.

For transmission hardware, the Internet is dependent on the existing infrastructure developed by long-haul telephone companies and other telecommunications companies. Internet service providers lease data circuits from the telephone networks and have dedicated computers at the end points or nodes. These rely on the distributed intelligence of networking equipment known as "routers", thus bypassing the telephone company's expensive switching computers, while using their transmission lines.

All the content of the Internet is held on computers known as "servers", which are owned by the organizations and companies (e.g. the University of Kansas, Microsoft, etc.) that want to distribute the information.

When a request is made to one of these servers for information, it bundles the requested information into small packets, each marked with the address to which it is to be sent, and sends them to the nearest connection to the Internet. When they arrive at the Internet, the packets are read by a router, which is nothing more than a traffic cop, and sent on in the general direction of the address. A similar thing happens at the next junction on the Internet. This goes on until each packet is delivered to the right address, where it is put together again with the other packets to make up the original information.
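The "traffic cop" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node names (server, router_a, router_b, client) and the NEXT_HOP table are hypothetical, and real routers build and update their tables with routing protocols that are not shown here. The point is simply that no node knows the whole route; each one only knows where to send a packet next.

```python
# Hypothetical topology for illustration only: each node knows just the
# next hop for a given destination, never the complete path.
NEXT_HOP = {
    "server": {"client": "router_a"},
    "router_a": {"client": "router_b"},
    "router_b": {"client": "client"},
}


def forward(packet, start):
    """Pass a packet from node to node until it reaches its destination.

    Returns the list of nodes the packet visited, in order.
    """
    path = [start]
    node = start
    while node != packet["dest"]:
        # Each node reads only the address and picks the next junction.
        node = NEXT_HOP[node][packet["dest"]]
        path.append(node)
    return path


packet = {"dest": "client", "data": b"requested information"}
route = forward(packet, "server")
print(" -> ".join(route))
```

If one of the table entries were changed, for example because a line went down, the packet would simply follow the new entry at that junction; nothing else in the sketch needs to know about it.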

Say, for example, you are sending a message from Mumbai to Palo Alto, California, to a server named svpal.org. The message will be broken up into packets of approximately 1500 bytes. Some may travel from VSNL here to the MCI router in the US, some may travel to Madras and then to the MCI router, and so forth. There is no predetermined path, and even individual packets of the same message may follow different paths. It all depends on the traffic at each node at that moment in time. As the packets reach svpal.org, they are put together again into the original message and delivered to the given address.
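The splitting and reassembly described above can be tried out with a small Python sketch. It is a simplification, not how TCP/IP is actually implemented: the packets are plain dictionaries, the function names are made up for this example, and the sequence number ("seq") is an assumption added so that the receiver can restore the order. The 1500-byte size and the svpal.org address are taken from the example in the text.

```python
import random

PACKET_SIZE = 1500  # approximate packet size used in the example above


def split_into_packets(message, dest, size=PACKET_SIZE):
    """Break a message into packets, each carrying the destination address.

    The "seq" field is an assumption of this sketch: it records the
    position of each piece so the message can be rebuilt later.
    """
    return [
        {"dest": dest, "seq": i, "data": message[offset:offset + size]}
        for i, offset in enumerate(range(0, len(message), size))
    ]


def reassemble(packets):
    """Put the packets back in order, whatever order they arrived in."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["data"] for p in ordered)


message = b"x" * 4000
packets = split_into_packets(message, "svpal.org")

# Packets may take different routes, so simulate out-of-order arrival.
random.shuffle(packets)

restored = reassemble(packets)
print(len(packets), restored == message)
```

A 4000-byte message yields two full packets of 1500 bytes and a final one of 1000 bytes, and the receiver recovers the original message regardless of the order of arrival.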

In order to accomplish the task of messaging across a network, computers use a networking protocol. To take an analogy from diplomacy, the relations and interactions between the representatives of different countries follow a set of rules laid down by tradition and treaty, which is called diplomatic protocol. Similarly, all computers wanting to talk to each other have to conform to a standard set of rules defined in the networking protocol. This enables different types of computers running different types of operating systems to communicate efficiently. The de facto standard today is TCP/IP (Transmission Control Protocol/Internet Protocol). The delivery itself is carried out by fast, dedicated computers known as routers, which work in unison.

Every organization has its own network, and every individual user has his own system and setup. What kind they are does not matter, as long as they talk the same protocol to the external world.

1.4 The Domain Name System

In order to use TCP/IP for transferring data from one computer to another, an addressing system has to be in place. When the number of computers on the Internet was small, this was not a problem. But now, with 5 million hosts, it is a serious matter. How does one assign and keep track of the unique number given to each computer, so that every other computer knows of its existence and can send data to it? All this information cannot reside on just one computer and be accessed every day by all the other computers to update their address books. The Domain Name System (DNS) was developed to solve this problem.

DNS is a distributed database. This allows local control of the overall database, and yet the data in each small segment is available across the entire network.

Besides its distributed nature, the other main attribute of this system is its hierarchical structure. This allows responsibility for maintaining a domain to be distributed, and also allows the information about the hosts to reside on different computers.

Since the Internet was conceived and developed in the USA, Americans defined the top level domains. Initially these were designated as follows:

DOMAIN   ORGANIZATION

com      Commercial organizations (i.e. businesses)
edu      Educational organizations (universities, secondary schools, etc.)
gov      Governmental organizations (non-military)
mil      Military (army, navy, etc.)
net      Network resources (e.g. Internet service providers)
org      Other organizations


Initially, the success of the Internet was not anticipated, and hence no provision was made to include other countries. Now, because of the overwhelming global success of the Internet, new top level domains have been reserved, but not necessarily created, to correspond to individual countries. These national domain names follow an existing official international standard of two-letter abbreviations for every country in the world.

Examples of countries represented by their own domains include:

au       Australia
ca       Canada
fr       France
uk       The United Kingdom
in       India

Fig. 1-1 gives a graphical representation of the original top level domains in the USA and the later addition of countries to the top level domains. This primary domain list is maintained by InterNIC, one of several loosely-knit voluntary organizations that oversee various aspects of the Internet.


Fig. 1-1 The tree structure of primary domains and countries.

Let us discuss our top level domain "in" and its sub-domains, shown in Fig. 1-2. This top level domain is maintained by the National Center for Software Technology (NCST), as they were the first Internet node in India.

Fig. 1-2 Domains under "in" with net, ernet and co.

This scheme distributes the responsibility of keeping track of all the new computers added to the Internet. Within each level of the hierarchy, this is done by the designated domain administrator or authority. For example, under the sub-domain "net" is the sub-domain "vsnl", which is responsible for all of its own hosts, e.g. giasbma, giasdl01, giascl01, etc. VSNL comes under "net" because it is a network service provider. The server or host on which our Terminal accounts reside is "giasbm01", under "vsnl".
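The hierarchy just described can be pictured as a tree that is walked from the top level domain downwards. The Python sketch below is only a model of that idea, assuming a nested dictionary (DOMAIN_TREE) and a made-up function name; the real DNS spreads this information over many name servers and answers queries over the network. The domain and host names are the ones mentioned in this section.

```python
# Simplified, hypothetical model of the "in" domain tree from this section.
# In the real DNS each level is kept by a different administrator on
# different computers; here it is a single nested dictionary.
DOMAIN_TREE = {
    "in": {
        "net": {
            "vsnl": {
                "giasbm01": {},
                "giasbma": {},
                "giasdl01": {},
                "giascl01": {},
            },
        },
        "ernet": {},
        "co": {},
    },
}


def delegation_path(hostname):
    """List the domains consulted, from the top level down to the host.

    Raises KeyError if some label is not known at its level of the tree.
    """
    labels = hostname.split(".")
    node = DOMAIN_TREE
    consulted = []
    # Names are resolved from right to left: "in", then "net", and so on.
    for i in range(len(labels) - 1, -1, -1):
        node = node[labels[i]]
        consulted.append(".".join(labels[i:]))
    return consulted


for domain in delegation_path("giasbm01.vsnl.net.in"):
    print(domain)
```

Each step corresponds to one authority: the administrator of "in" only needs to know about "net", "ernet" and "co", while the list of individual hosts is kept by VSNL alone.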

An example of a full Internet address under this arrangement is:

surekha@giasbm01.vsnl.net.in

Reading from left to right, this is the mail address for the user known as surekha, on the server called giasbm01. This server is in the organizational sub-domain vsnl.net. This organization is located in the national domain "in" (India).

VSNL's server in Delhi, "giasdl01", is in the same sub-domain, "vsnl.net.in", and an address on that machine will look like:

ravi@giasdl01.vsnl.net.in
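Taking such an address apart is easy to show in code. The following Python sketch splits an address into the parts named above; the function name and the dictionary keys are chosen for this example only, and it assumes the simple user@host.sub-domain.country form used in this section, not every address format found on the Internet.

```python
def parse_address(address):
    """Split a mail address of the form user@host.sub-domain.country.

    Assumes the simple layout used in this section, with the server name
    first and the national domain last.
    """
    user, _, host_part = address.partition("@")
    labels = host_part.split(".")
    return {
        "user": user,
        "host": labels[0],                      # the server itself
        "sub_domain": ".".join(labels[1:-1]),   # the organization
        "national_domain": labels[-1],          # the country
    }


for address in ("surekha@giasbm01.vsnl.net.in", "ravi@giasdl01.vsnl.net.in"):
    parts = parse_address(address)
    print(parts["user"], parts["host"], parts["sub_domain"], parts["national_domain"])
```

Both addresses share the sub-domain vsnl.net and the national domain in; only the user and the server differ.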

If an organization has a large number of systems, the organizational domain might itself contain sub-domains, each of which contains a number of systems. This is often done on a departmental or service basis. In our case, the systems in Mumbai are giasbm01, giasbma, etc., and similarly there may be many systems in Delhi. In the future VSNL may have thousands of systems connected to its network.