What The Tech

TCP/IP

The Backbone Of The Internet

Mike Hansell

Issue 7, January 2018

We use it every day, but there may be some interesting things about the TCP/IP protocol that you didn't know.

TCP/IP? What on Earth is “tic pip”? Well, it’s pronounced letter by letter, like RPM, and like so many things related to computers, it’s an acronym. “TCP/IP” stands for “Transmission Control Protocol/Internet Protocol”. I can almost hear you thinking - “it’s no clearer yet”.

TCP/IP is a very large suite of protocols and programs. A protocol is a way to communicate. We’re using one right now. It’s called English. There are certain rules to English, and certain words that form part of the language, although the vocabulary keeps evolving the more you learn - just ask your kids. There are rules that come into play; for example, if you don’t understand what is being said or maybe you missed part of the conversation, you could say “pardon me?” and you would expect the other person to repeat what was said. There is some error detection and, to a degree, a mechanism to check that your message got through to the other person. All of this is part of TCP/IP.

A LITTLE HISTORY

Way back before the internet was created, a number of universities wanted to share data but there was a variety of hardware involved and not all systems could communicate together. Bright sparks at these universities and the US Department of Defence worked on this for many years and several projects were developed and abandoned along the way. TCP/IP was developed in the 1970s but the internet as we know it didn’t exist at this stage. In 1983 the research networks of the day switched over to TCP/IP, and at the end of the 1980s research by Tim Berners-Lee turned into the World Wide Web, which, for better or worse, brought the internet to the rest of us. It should be noted that back in those early days there were few users and they were considered trustworthy. Users just expected that things would work. Security was lax. Now with so many users, it has become common for some “enthusiasts” (aka hackers) to try to access systems that they have no right to, or to use “things” in ways that they were not designed to be used. This generally leads to failures but occasionally a way is found around security or other systems (i.e., a “hack” is discovered). Let’s just hope it’s never related to your credit card or bank account. This has caused the gradual evolution and hardening of the TCP/IP suite. Despite this, data is not always as secure as it could be.

HOME NETWORKING

If your home (or office) computer can access the internet then you have a Local Area Network (LAN). Local because the network exists in the confines of your premises. When you connect to the internet you have become a node in a Wide Area Network (WAN), best known as the “internet”. Before we talk more about the internet, let’s discuss some basic concepts and protocols related to your LAN. For the purposes of this article, we are discussing IPV4, which is older and more established than IPV6.

Let’s talk about your home address (i.e., where your house is located). If you live at 37 Bloggs Street, your address is similar but different to 48 Bloggs Street. You live in the same “local area” but not in the same house. You have a unique home address. In contrast, 137 Nerk Avenue is not in your “local area”.

Each node (computer, printer, router, IoT device) must have a unique address assigned to it. IP addresses at home typically look like this: 10.0.0.32 (which is pronounced as “ten dot zero dot zero dot thirty two”), 192.168.0.55 or 192.168.1.42. In these cases, 10.0.0, 192.168.0 and 192.168.1 are the “local area” component, or more correctly, the network address component. Devices whose IP addresses share the same network address can generally “see” or communicate with each other. Even if you have just one computer (or tablet or smartphone) connected to a router via WiFi, you have a local area network. All of your devices need to have an IP address on the same network. An IPV4 address is composed of four numbers. In our case, the first three numbers (10.0.0, 192.168.0, 192.168.1) form our network address. We could say “we are on the 10.0.0 network”.

Within most home networks you could typically have up to 254 devices. On a 10.0.0 network these would be 10.0.0.1 to 10.0.0.254. 10.0.0.0 and 10.0.0.255 are reserved for special purposes, which we’ll discuss later.

If your ISP is Telstra/Bigpond, you probably have a 10.0.0 network. The router is generally set to 10.0.0.138. Now, if all devices on the network need unique addresses, how do the devices get their address?

There are two methods for this: static assignment and dynamic assignment. A static IP address is manually assigned (entered into the device) and does not change. This would be the case for your router. For devices other than a router, this would most likely be used with a printer. Issuing dynamic addresses is normally a function assigned to your router. As all of your devices ultimately connect to the router, it knows each one. It can say, “Oh look, there’s Mike’s computer starting up. It needs an IP address. I have 10.0.0.55 available so I will tell his device to use that address”. Now Mike’s wife starts her computer, which goes through the procedure to get an IP address. The router won’t assign the same address to two different devices. You are not guaranteed to get the same address one session to the next.

DHCP (Dynamic Host Configuration Protocol)

The process of assigning IP addresses is controlled by DHCP, which stands for Dynamic Host Configuration Protocol. There’s that “protocol” term again. DHCP has the properties we discussed earlier. It has standard rules of how it works that all devices understand. Note that your router should be told about static address assignments, so that it doesn’t issue that same address to another device on your network.
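
The router's bookkeeping can be pictured with a small sketch. This is a toy model only, not real DHCP: the `AddressPool` class, the device names and the printer address 10.0.0.10 are made up for illustration, and real DHCP also involves lease times and a message exchange that are left out here.

```python
import ipaddress

class AddressPool:
    """Toy model of a router handing out addresses (not real DHCP)."""

    def __init__(self, network, reserved):
        # Every usable address on the network, minus the static ones
        # the router has been told about.
        self.free = [str(host) for host in ipaddress.ip_network(network).hosts()
                     if str(host) not in reserved]
        self.leases = {}  # device name -> address handed out

    def request(self, device):
        # A device that already holds an address keeps it;
        # a new device gets the next free one.
        if device not in self.leases:
            self.leases[device] = self.free.pop(0)
        return self.leases[device]

# Static addresses: the router itself and a (hypothetical) printer.
pool = AddressPool("10.0.0.0/24", reserved={"10.0.0.138", "10.0.0.10"})

mike = pool.request("mikes-computer")
wife = pool.request("wifes-computer")
print(mike, wife)  # two different addresses, neither of them reserved
```

Because the reserved addresses are removed from the pool up front, the router in this sketch can never hand the printer's address to another device.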

DHCP normally assigns two more IP-related parameters to your device. These are “subnet mask” and “default gateway”. The subnet mask is a deep subject in its own right and we won’t be delving deeply into it here. Suffice it to say that in a home network your subnet mask is quite likely to be 255.255.255.0, which means that the first three numbers of your IP address represent your network address (i.e., 10.0.0 or 192.168.1) and you can have up to 254 devices connected. Your address range is 10.0.0.1 to 10.0.0.254, or maybe 192.168.0.1 to 192.168.0.254. Of course we can’t use the same address as the router, but I’m sure 253 addresses will be adequate for any maker. If you have more, please send a picture to Ripley’s Believe It or Not. You may have a single device or six computers, four tablets, seven smartphones, two WiFi printers, a WiFi-connected burglar alarm, and a WiFi GPS tracker on your dog; but you are unlikely to exceed the limit of this class of network.
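
For the curious, here is a minimal sketch of what the mask actually does: each of the four numbers is one byte, and the mask is combined with the address using a bitwise AND so that only the network part survives. The helper names `to_int` and `to_dotted` are made up for this example.

```python
def to_int(dotted):
    """Convert dotted text such as '10.0.0.32' into one 32-bit number."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def to_dotted(value):
    """Convert a 32-bit number back into dotted text."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

address = to_int("10.0.0.32")
mask = to_int("255.255.255.0")

# AND keeps the bits where the mask is 1 and clears the rest.
network = to_dotted(address & mask)
print(network)  # 10.0.0.0
```

With a mask of 255.255.255.0 the last number is always cleared, which is why the first three numbers identify the network.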

There are three classes of networks: Class C networks can have 254 devices, Class B networks can have 65,534 devices, and Class A networks can support 16,777,214 devices. Can you imagine the power bill?
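
Those three figures come from simple arithmetic. A Class C network leaves 8 bits of the address for devices, Class B leaves 16 and Class A leaves 24; in each case two addresses (the network itself and the broadcast address) are set aside. The function name `usable_hosts` is just for illustration.

```python
def usable_hosts(host_bits):
    """Addresses available to devices when host_bits bits are left over."""
    return 2 ** host_bits - 2  # minus the network and broadcast addresses

for name, host_bits in (("C", 8), ("B", 16), ("A", 24)):
    print(f"Class {name}: {usable_hosts(host_bits):,} devices")
# Class C: 254 devices
# Class B: 65,534 devices
# Class A: 16,777,214 devices
```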

DHCP can also issue the “default gateway”. This is generally the IP address of your router. The default gateway is the IP address that a network request (a device on the network trying to access another device) will use if it can’t find the required address on your LAN. You use this transparently when you try to access www.google.com.au for example. The Google servers (probably) aren’t on your LAN, so to get to them the request is “routed” to the outside world of the internet.
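
The decision a device makes can be sketched in a few lines. The function name `next_hop` is made up, and 203.0.113.10 is a placeholder from an address range reserved for documentation, standing in for "some server out on the internet".

```python
import ipaddress

def next_hop(destination, local_network, gateway):
    """Return where a packet should be handed first."""
    if ipaddress.ip_address(destination) in ipaddress.ip_network(local_network):
        return destination   # on our LAN: deliver directly
    return gateway           # anywhere else: hand it to the router

lan = "10.0.0.0/24"
router = "10.0.0.138"

print(next_hop("10.0.0.55", lan, router))     # 10.0.0.55
print(next_hop("203.0.113.10", lan, router))  # 10.0.0.138
```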

DNS (Domain Name System)

Neither your computer (tablet, etc) nor your router knows what www.google.com.au is. Computers don’t think in English like we do. The name in the URL needs to be translated to an IP address, but what is the IP address of Google? What if it should change? This is where DNS (Domain Name System) comes into play. Those clever boffins that put the internet together, and keep developing it, have thought of this common situation. DNS provides the IP address for whatever name you give it. Well, that’s assuming that it actually exists.

For example, enter http://www.gribbitwaffle.com/ into your browser and see what happens. Depending on your device and browser you’ll get some error relating to DNS, as in “This site can’t be reached. www.gribbitwaffle.com’s server DNS address could not be found.”

It couldn’t be found because it doesn’t exist (although why I don’t understand - I mean, the name is so enticing!)
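
You can ask DNS the same question yourself. The sketch below uses Python's standard `socket.gethostbyname` call; the wrapper name `resolve` is made up, and the results depend on your own network connection, so no particular address is shown. The name `no-such-host.invalid` uses the `.invalid` ending, which is reserved so that it never exists.

```python
import socket

def resolve(hostname):
    """Ask DNS for the IPV4 address of a name; None if it can't be found."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # includes the "name not found" error
        return None

if __name__ == "__main__":
    # Needs a working internet connection to return an address.
    print(resolve("www.google.com.au"))
    # A name that does not exist comes back as None.
    print(resolve("no-such-host.invalid"))
```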

Earlier we mentioned that there are two IP addresses out of the 256 that you can’t use. In our example 10.0.0 network, 10.0.0.0 refers to the actual network itself. It’s sort of like saying “I live in Duckburg”, but not specifying your home address. 10.0.0.255 is the broadcast address, meaning that sending data to this address sends the same data to all devices on the network. Devices on your network use broadcasts to find the DHCP server.
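
Python's standard `ipaddress` module can confirm these numbers. Writing the network as `10.0.0.0/24` is an assumption that matches the 255.255.255.0 subnet mask used in this article.

```python
import ipaddress

network = ipaddress.ip_network("10.0.0.0/24")

total = network.num_addresses     # every address from .0 to .255
usable = list(network.hosts())    # leaves out the two reserved ones

print(total)                       # 256
print(network.network_address)     # 10.0.0.0
print(network.broadcast_address)   # 10.0.0.255
print(len(usable))                 # 254
print(usable[0], usable[-1])       # 10.0.0.1 10.0.0.254
```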

DATA PACKETS

There are strict rules regarding the format of the data that zooms around the internet. You can’t just send what you like when you like. It all happens transparently, but data is “packetised”. Your data (maybe you are watching a movie on Netflix) is divided into chunks called packets. Packets generally carry extra data for error checking/correction, as well as the source (who created it) and the destination (where it’s going).
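
To make the idea concrete, here is a simplified sketch. It is not the real packet layout: the field names, the tiny 4-byte chunk size and the use of a CRC32 checksum are all assumptions chosen to keep the example short, and 203.0.113.10 is a placeholder address.

```python
import zlib

def packetise(data, source, destination, size=4):
    """Split data into numbered chunks, each with addresses and a checksum."""
    packets = []
    for seq, start in enumerate(range(0, len(data), size)):
        chunk = data[start:start + size]
        packets.append({
            "source": source,                # who created it
            "destination": destination,      # where it's going
            "seq": seq,                      # position in the original data
            "checksum": zlib.crc32(chunk),   # lets the receiver spot damage
            "payload": chunk,
        })
    return packets

def reassemble(packets):
    """Put packets back in order and check each one before joining them."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    for p in ordered:
        if zlib.crc32(p["payload"]) != p["checksum"]:
            raise ValueError(f"packet {p['seq']} is corrupted")
    return b"".join(p["payload"] for p in ordered)

packets = packetise(b"hello world", "10.0.0.55", "203.0.113.10")
print(len(packets))                 # 3
print(reassemble(packets[::-1]))    # b'hello world', even when received backwards
```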

WHAT IS IPV6?

IPV6 has been designed to replace IPV4. Sounds like LP records being replaced by cassette tapes, being replaced by CDs, being replaced by MP3s and now Spotify. It’s just the evolution of the technology of the internet. IPV6 has several advantages over IPV4, mainly a much expanded address range. However, IPV4 is not inter-operable with IPV6. Several conversion mechanisms have been developed, but this is not something that will worry any of us. It’s a problem for ISPs upward; we are low on the pecking order in this regard.

IPV4 can provide 4.3 billion addresses. You may think that is more than adequate forever, but no. IPV6 provides approximately 3.4 × 10^38 addresses. Can you imagine how many addresses that is? Let’s see: the earth’s surface area is about 510 million square kilometres, or 5.1 × 10^20 sq mm. Yes, I know, who is interested in the earth’s surface in square mm? Well, spread the 3.4 × 10^38 addresses that IPV6 is capable of over the earth’s surface and that’s 6.7 × 10^17 per sq mm. Again, it’s hard to understand 6.7 × 10^17; it’s hundreds of millions of billions. Hundreds of millions of billions of addresses per sq mm! At this rate microbes will be using Google soon!
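
The arithmetic is easy to check. An IPV4 address is 32 bits long and an IPV6 address is 128 bits long, and one kilometre is a million millimetres, so one square kilometre is a million million square millimetres. The 510 million square kilometre figure is the approximate value quoted above.

```python
ipv4_addresses = 2 ** 32
ipv6_addresses = 2 ** 128

earth_km2 = 510e6                 # approximate surface area in sq km
mm2_per_km2 = 1_000_000 ** 2      # 1 km = 1,000,000 mm, squared
earth_mm2 = earth_km2 * mm2_per_km2

per_mm2 = ipv6_addresses / earth_mm2

print(f"{ipv4_addresses:,}")      # 4,294,967,296
print(f"{ipv6_addresses:.1e}")    # 3.4e+38
print(f"{earth_mm2:.1e}")         # 5.1e+20
print(f"{per_mm2:.1e}")           # 6.7e+17
```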

This is a valid IPV6 address: 2001:cdba:0000:0000:0000:0000:3257:9652. IPV6 addresses are made of eight groups (quartets) of four hexadecimal characters (0-9 and A-F, where A = 10, B = 11, etc), separated by colons. Any four-digit group of zeros within an IPV6 address may be reduced to a single zero, and one run of consecutive all-zero groups may be omitted altogether and written as a double colon (2001:cdba::3257:9652). We prefer 10.0.0.41 and it is doubtful that you will need to be concerned with this in the foreseeable future.
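
Python's standard `ipaddress` module understands both the long and the shortened spellings, which makes it a handy way to check that they really are the same address.

```python
import ipaddress

full = ipaddress.IPv6Address("2001:cdba:0000:0000:0000:0000:3257:9652")
single_zeros = ipaddress.IPv6Address("2001:cdba:0:0:0:0:3257:9652")

print(full.compressed)        # 2001:cdba::3257:9652
print(full.exploded)          # 2001:cdba:0000:0000:0000:0000:3257:9652
print(full == single_zeros)   # True
```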

ADDRESS BLOCK ALLOCATIONS

There are over four billion IP addresses available via the IPV4 protocol. Of these, almost 600 million are reserved and cannot be used on the internet. The rest are allocated by a body called the Internet Assigned Numbers Authority (IANA), which hands out blocks to regional registries that in turn serve the countries in their region.

With so many addresses available you may wonder “Do certain countries have their own block of IP addresses?” The answer is “yes”. For instance, the USA had a population of 313 million at the time the blocks were constructed. They had 1.5 billion IP addresses available. That’s nearly five for each person. This doesn’t mean that each person in the USA, even on average, uses five IP addresses, but that’s the number available. Now consider North Korea with a population of 24.6 million. Do you think they would have maybe five million connections? No, how about a total of 1024!

We also found it fascinating to see that 20% of IP addresses used on the internet are fake or wrong. They have not been assigned by the relevant authorities. They could be produced by misconfigured equipment, but more likely by malicious (read “hacker”) users. As Australians we found it quite amusing to see that these IP addresses are referred to as “bogons”. It seems that bogans are everywhere - even the internet!

HOW DOES MY ROUTER WORK?

You’ll recall that data is sent and received in packets on any network. Each packet has a source address (IP address of the device that is sending the packet), and a destination IP address (where is it going?). You can’t use your internal LAN IP address on the internet because there are literally millions of users using 10.0.0 addresses, etc. The router knows this so it substitutes the IP address of the source device with the IP address that has been assigned by your ISP, your external IP address, then sends the packet on its way. The router makes note of each packet sent and which device on the LAN sent it. When a reply is received from the destination, the router substitutes your local address into the packet and forwards it to you. It does this for every packet in and out and for every device on your LAN. If your router sees a packet from you that is destined for another device on your LAN, maybe the printer, it does not forward it to the internet as there is no need.
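
The substitution trick can be sketched as a small lookup table. This is a simplified model, not a real router: the `NatRouter` class and the packet fields are made up, the addresses 203.0.113.7 and 198.51.100.20 are placeholders, and the use of port numbers to tell replies apart is a detail this article has not covered but which is how home routers commonly keep their notes.

```python
class NatRouter:
    """Toy model of a router swapping local and external addresses."""

    def __init__(self, external_ip):
        self.external_ip = external_ip
        self.table = {}          # external port -> (local ip, local port)
        self.next_port = 40000   # arbitrary starting point for this sketch

    def outbound(self, packet):
        # Note who sent it, then replace the local source with the external one.
        ext_port = self.next_port
        self.next_port += 1
        self.table[ext_port] = (packet["src"], packet["src_port"])
        return {**packet, "src": self.external_ip, "src_port": ext_port}

    def inbound(self, packet):
        # Look up the note and put the local address back in.
        local_ip, local_port = self.table[packet["dst_port"]]
        return {**packet, "dst": local_ip, "dst_port": local_port}

router = NatRouter("203.0.113.7")

sent = router.outbound({"src": "10.0.0.55", "src_port": 5000,
                        "dst": "198.51.100.20", "dst_port": 80})
print(sent["src"])   # 203.0.113.7

reply = {"src": "198.51.100.20", "src_port": 80,
         "dst": "203.0.113.7", "dst_port": sent["src_port"]}
delivered = router.inbound(reply)
print(delivered["dst"])   # 10.0.0.55
```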

A network switch is a similar device. A switch is intelligent and directs each packet only to the device it is meant for. An older, dumb device is a network hub. It sends every packet it receives to every connected device. This can cause excess LAN traffic.
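
The difference can be shown with two tiny classes. This is only a sketch: the class names, the single-letter device names and the numbered ports are made up, and the way the switch here learns which device is on which port is one common approach rather than a description of any particular product.

```python
class Hub:
    """Repeats everything to every other port."""

    def __init__(self, ports):
        self.ports = ports

    def forward(self, in_port, src, dst):
        return [p for p in range(self.ports) if p != in_port]

class Switch:
    """Remembers which device is on which port and sends only there."""

    def __init__(self, ports):
        self.ports = ports
        self.known = {}   # device -> port it was last seen on

    def forward(self, in_port, src, dst):
        self.known[src] = in_port          # learn where the sender lives
        if dst in self.known:
            return [self.known[dst]]       # send to that one port only
        # Destination not seen yet: fall back to sending everywhere.
        return [p for p in range(self.ports) if p != in_port]

hub = Hub(4)
switch = Switch(4)

print(hub.forward(0, "A", "B"))      # [1, 2, 3]
print(switch.forward(0, "A", "B"))   # [1, 2, 3]  (B not known yet)
print(switch.forward(1, "B", "A"))   # [0]
print(switch.forward(0, "A", "B"))   # [1]
```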

Your router quite likely has four hardware ports to connect devices. You can run one cable from your router to a switch to provide more ports for devices to connect to or simply to extend the range. Maybe your router is in one room and your computers and printer are in another. Of course these days we mostly use WiFi connections, so the number of ports available on the router may be less of a consideration.

Types Of Protocols:

There are many other protocols that you will encounter or that are used transparently.

HTTP AND HTTPS (HYPERTEXT TRANSFER PROTOCOL AND HYPERTEXT TRANSFER PROTOCOL SECURE): Web browsers (Chrome, Firefox, Safari, and nothing good from Microsoft) use HTTP and HTTPS in a client/server scenario. The client is the device running a browser. It talks to a web server. The client sends HTTP request messages to the web server, which returns content such as the HTML files that make up web pages.
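
An HTTP request message is just a few lines of text. The sketch below builds the simplest kind, a GET request; the function name `build_request` is made up, `example.com` is a placeholder name reserved for examples, and a real browser sends many more header lines than this.

```python
def build_request(host, path="/"):
    """Build the text a client sends to ask a web server for a page."""
    lines = [
        f"GET {path} HTTP/1.1",   # what we want, and the protocol version
        f"Host: {host}",          # which site on that server
        "Connection: close",      # close the connection after replying
        "",                       # a blank line marks the end of the request
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

request = build_request("example.com")
print(request.decode("ascii"))
```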

802.X: These are the IEEE 802.11 WiFi protocols. WiFi has been around since 1997 and has gone through many versions, each of which brings greater bandwidth, security and other benefits. But the truth is, we rarely consider it. It’s just another one of those transparent protocols.

VOIP (VOICE OVER IP): With the NBN gradually being deployed in Australia, VoIP home telephones have become the norm. These convert your voice to a digital signal and transmit it over the internet. VoIP is common in office systems these days too.

IP (INTERNET PROTOCOL): This operates on a best-effort delivery model, in that it does not guarantee delivery, nor does it assure proper sequencing or avoidance of duplicate delivery. It depends on TCP to provide reliability.

TCP (TRANSMISSION CONTROL PROTOCOL): This provides reliable, ordered, and error-checked delivery of a stream of data between applications running on hosts communicating by an IP network. This is what makes sure that file copies, web pages and the like arrive complete and intact.

FTP (FILE TRANSFER PROTOCOL): This is sometimes used to transfer data. FTP is another early software development and was initially used before TCP/IP was developed. FTP generally offloads its overhead of error checking, etc to underlying mechanisms of TCP/IP.

NTP (NETWORK TIME PROTOCOL): This can be used to ensure your device has an accurate clock. It’s not as trivial as it seems. Can you imagine saving a file right now, but the computer says it was saved yesterday or tomorrow? Fred in the next cubicle says, “No, I can see that it was created last week”. That would be complete chaos.

TLS AND SSL: Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL) are cryptographic protocols to provide security. This is what provides the S in HTTPS. It is often used with financial transactions, and browsers (particularly Chrome) are starting to insist that web sites use the HTTPS protocol.

Email Protocols:

SMTP (SIMPLE MAIL TRANSFER PROTOCOL): The process of sending an email can be a little complicated with many intermediate mail servers being involved to relay the email to its eventual destination, say Yahoo or Gmail. These servers generally use SMTP to send and receive email messages. Email client software like Outlook, Thunderbird, etc may use SMTP to send the email off, but generally use either IMAP or POP3 to retrieve email.

POP3 (POST OFFICE PROTOCOL VERSION 3): This is an older standard and is being superseded by IMAP. POP3 generally downloads the email and keeps a copy on the local device (i.e., your PC, Mac, smartphone or tablet), which is often preferred.

IMAP (INTERNET MESSAGE ACCESS PROTOCOL): This is designed to allow multiple clients to access the email. To do this, the email messages are generally left on the email server and are accessed only over the internet. So if you have no connection to the internet you probably can’t access any email including messages that you have looked at previously.

UNCOMMON PROTOCOLS

These following protocols are a little techie and probably more obscure for the average maker, but have a look anyway.

TELNET (TELETYPE NETWORK): This provides an interactive text-based, two-way communication facility for accessing remote computers. Because it is not encrypted, SSH is recommended instead.

SSH (SECURE SHELL): This is similar to Telnet but uses cryptography to secure communications over otherwise insecure connections. It was designed to be a replacement for the insecure Telnet. Common applications include remote command-line login and remote command execution.

SNMP (SIMPLE NETWORK MANAGEMENT PROTOCOL): This is often used to manage and control routers, switches, servers etc in a large environment. You are unlikely to need or want to use this at home. SNMP can be used to determine if a device is functioning correctly.

MQTT (MESSAGE QUEUING TELEMETRY TRANSPORT): This is used to send messages to other devices where connections may have low bandwidth (i.e., they’re sloooow), and devices have limited hardware (perhaps IoT devices).

RDP (REMOTE DESKTOP PROTOCOL): This is a proprietary protocol developed by Microsoft. After authentication with the remote device (maybe your work server) you will have an interactive GUI. You can use your mouse and keyboard as usual. Applications are executed on the server, which sends screen updates to the user. The user in turn sends mouse movements and key presses to the server. There are other programs like TeamViewer and LogMeIn that provide similar facilities.

Note: Be VERY aware that malicious entities can use these tools to do nasty things to your device. If you get that infamous “Hello, we can see your computer has a virus” or similar, just hang up.

CONCLUSION

TCP/IP is a very deep pool which we have just stuck our toe into here. Working knowledge of TCP/IP used to be the mark of a computer tech’s knowledge. It’s still being actively developed, and many and varied applications are constantly being built on top of it.