Unit 4 Lab 1: Website Data, Page 1

This page is long. Consider splitting up long pages like this next year. --MF
GH Feedback 1/6/16: In the redundancy quiz, it seems like a more interesting and relevant-to-reliability question is "How many nodes can fail and still let Sender and Receiver communicate?"
Dan Feedback 1/17/16: You have the following as "integrate as needed next year":
- Bandwidth: a measure of bit rate—the amount of data (measured in bits b) that can be sent in a fixed time. Usually b/s
- Latency: the time elapsed between the transmission and the receipt of a request. Usually ms.
BUT bandwidth and latency are part of the CF so they need to occur somewhere this year...
GH Feedback 1/20/16: add more nodes
- I assume this is a reference to the animated gif, in which case, I see no point; it will only complicate the visual. Does anyone see anything else to which it could be referring? --MF
GH Feedback 1/22/16: The green box "For you to do" should be all green. It's distracting to see the white parts of the page within the green. The information about the Internet under green #1 and the brief history of the Internet should go on the top of the page.
- I disagree, but Paul is really the master at these things. Thoughts from PG? --MF

The Internet is pretty much everywhere humans are. Even astronauts use the Internet on the International Space Station.

Here is a 2012 map of global Internet usage as a percentage of population from Wikipedia:

People talk as if "the Internet" and "the World Wide Web" are the same thing, but they are not. The World Wide Web is the collection of interlinked website documents you can view with a web browser by typing URLs such as http://bjc.berkeley.edu/website/privacy.html.

GH Feedback 1/22/16: Please clarify the definition of protocol in the section labeled as "What is a URL?", bullet point number 1.
GH Feedback 1/22/16: Please re-write the sentences in the "What is a URL" box. The sentences are confusing to read.

What is a URL?

A URL or Uniform Resource Locator is like an address for accessing specific web data. URLs can be broken into three parts:
parts of a URL

1. The protocol (usually "http" or "https"), followed by a colon (:). The protocol is the set of rules your computer uses to communicate with the site.
2. The domain name (or host), which is everything after // and before the next slash. This is the name of the computer that hosts the page.

For example, the URL http://bjc.berkeley.edu/website/privacy.html tells your computer to use http to access the bjc.berkeley.edu server (the computer that serves web pages for that site).

3. Everything after the domain name. Different hosts interpret the rest of the URL in different ways, but usually additional slashes describe a path in a hierarchy of folders. Other punctuation, such as ?, =, and &, is used to provide the server with parts of a request other than the path to a page.

The URL http://bjc.berkeley.edu/website/privacy.html tells your computer to go to the "website" folder on the bjc.berkeley.edu server and open the "privacy.html" file.
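
The parts of a URL can also be pulled apart in code. The following is a minimal sketch, assuming Python and its standard urllib.parse module (neither is part of this lab). The function name url_parts and the second URL, with its query string, are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

def url_parts(url):
    """Return the protocol, the domain name, the path, and any query parameters."""
    pieces = urlsplit(url)
    return {
        "protocol": pieces.scheme,        # the part before the colon
        "domain": pieces.hostname,        # the part after // and before the next slash
        "path": pieces.path,              # the folders and file after the domain name
        "query": parse_qs(pieces.query),  # the ?, =, and & part, if there is one
    }

example = url_parts("http://bjc.berkeley.edu/website/privacy.html")
print(example["protocol"])  # http
print(example["domain"])    # bjc.berkeley.edu
print(example["path"])      # /website/privacy.html

# Hypothetical URL showing how ?, =, and & carry extra parts of a request.
search = url_parts("https://example.com/search?q=snap&page=2")
print(search["query"])      # {'q': ['snap'], 'page': ['2']}
```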

But the Internet is more general than the World Wide Web. It also supports email (which can sometimes be seen through a web browser but is transmitted with a different protocol), file transfer, mobile apps, and many other ways computers communicate behind the scenes.

The Internet is a massive network of computers that communicate primarily using a pair of protocols (standards):

This content is from Dan's PPTX files. Integrate as needed next year. --MF

The size and speed of systems affect their use.
- E.g., Netflix on dial-up? Nope.
Bandwidth
- a measure of bit rate: the amount of data (measured in bits, b) that can be sent in a fixed time. Usually b/s.
Latency
- the time elapsed between the transmission and the receipt of a request. Usually ms.
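
To see how bandwidth and latency combine, here is a rough sketch in Python. It assumes a very simple model that is not from this lab: the total time is the latency plus the size of the data divided by the bandwidth. The function name and every number below are hypothetical and chosen only for illustration. Real networks add overhead that this ignores.

```python
def transfer_time_seconds(size_bits, bandwidth_bits_per_second, latency_seconds):
    """Estimate delivery time: wait for the latency, then stream the data."""
    return latency_seconds + size_bits / bandwidth_bits_per_second

# Hypothetical 1-gigabyte movie (8 bits per byte).
movie_bits = 8 * 10**9

# Hypothetical connections: a 56 kb/s dial-up modem and a 100 Mb/s broadband link.
dial_up = transfer_time_seconds(movie_bits, 56_000, 0.150)
broadband = transfer_time_seconds(movie_bits, 100_000_000, 0.020)

print(f"dial-up: {dial_up / 3600:.1f} hours")
print(f"broadband: {broadband:.1f} seconds")
```

With these made-up numbers, bandwidth dominates the total for a large file. For a tiny request, such as a single click, the latency term matters far more.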

Make the following into a quizlet?

What has the highest bandwidth?

Wireless networks
- 802.11ac = 1.3 Gbps
Wired networks
- 10 GigE = 10 Gbps
Your hard drive and your computer
- Thunderbolt 2 = 20 Gbps
Your CPU and its scratch space
- At 4 GHz, 4 bytes / 0.25 ns = 16 GBps = 128 Gbps
A truck of MicroSD cards going next door
See also: what-if.xkcd.com/31
- xkcd's author calculates it to be 177 petabits/s = 177,000,000 Gbps
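
The arithmetic behind the CPU line can be checked with a short sketch. It is written in Python, which this lab does not use, and the function name cpu_gbps is made up. The other figures are copied from the list above so that all five can be sorted on the same scale.

```python
BITS_PER_BYTE = 8

def cpu_gbps(bytes_per_cycle, clock_ghz):
    """Bytes per cycle times billions of cycles per second gives GB/s; times 8 gives Gbps."""
    return bytes_per_cycle * clock_ghz * BITS_PER_BYTE

# Figures from the list above, all in Gbps.
links_gbps = {
    "802.11ac wireless": 1.3,
    "10 GigE wired": 10,
    "Thunderbolt 2": 20,
    "CPU and its scratch space": cpu_gbps(4, 4),  # 4 bytes per cycle at 4 GHz
    "Truck of MicroSD cards": 177_000_000,
}

# Print from lowest to highest bandwidth.
for name, gbps in sorted(links_gbps.items(), key=lambda item: item[1]):
    print(f"{name}: {gbps} Gbps")
```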

A Brief History of the Internet

The Internet began in the 1960s with the connection of the first four computers of what was then called the Arpanet. Since then, the number of people connected to the Internet has been increasing exponentially. These days, several different systems of communication run on the Internet:

Take a look at the graph of "Internet Users in the World" on http://www.internetlivestats.com/internet-users/. Notice how dramatically the number of Internet users has increased in the last 20 years.
- What does it mean for the growth of the Internet to be exponential?
- What do you expect will happen to these numbers in the next 20 years? Will the number of Internet users increase forever?

Visit the page http://www.internetlivestats.com/one-second/, and click the arrow button (shown right) to scroll through the pictographs and get a sense of how much data is moving across the Internet in just one second.

Read through the Internet History from computerhistory.org.

Network Redundancy

With such an enormous amount of data traveling around, the Internet needs to be reliable, and we have achieved this by making the Internet redundant. Wherever information is going, there is more than one way to get there. If part of the Internet fails, the rest remains connected even if the failed part is in the usual path from one place to another. This increases the Internet's fault tolerance (ability to work around problems) and helps the Internet scale (expand) to more devices and people.

In this model of a network, what is the smallest number of nodes (connection points) that must stop working before the sender and the receiver can no longer communicate?

There are no nodes that are vital to the system. Pick any node to stop working, and you can still find another path.

Correct! If the node with 6 connections goes down and also either of the two to its left, the sender and receiver can't communicate.

Try to find a smaller number of nodes that can stop working and still break communication.
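
The same question can be explored by brute force. The sketch below is in Python (not part of this lab) and uses a small hypothetical network, not the one pictured in the quiz. It tries every set of one failed node, then every set of two, and so on, until the sender and receiver are cut off. All node and function names are made up for illustration.

```python
from itertools import combinations

# Hypothetical network, not the one pictured in the quiz.
# Each key is a node; each value lists the nodes it is directly connected to.
NETWORK = {
    "sender": ["a", "b"],
    "a": ["sender", "b", "c"],
    "b": ["sender", "a", "d"],
    "c": ["a", "d", "receiver"],
    "d": ["b", "c", "receiver"],
    "receiver": ["c", "d"],
}

def can_communicate(network, start, goal, failed):
    """Search outward from start, skipping failed nodes, until goal is found."""
    to_visit = [start]
    seen = {start}
    while to_visit:
        node = to_visit.pop()
        if node == goal:
            return True
        for neighbor in network[node]:
            if neighbor not in seen and neighbor not in failed:
                seen.add(neighbor)
                to_visit.append(neighbor)
    return False

def fewest_failures_to_disconnect(network, start, goal):
    """Try sets of 1 failed node, then 2, and so on; return the first set that breaks every path."""
    others = [node for node in network if node not in (start, goal)]
    for count in range(1, len(others) + 1):
        for failed in combinations(others, count):
            if not can_communicate(network, start, goal, set(failed)):
                return set(failed)
    return None  # start and goal are directly connected, so no set of failures separates them

broken = fewest_failures_to_disconnect(NETWORK, "sender", "receiver")
print(len(broken), sorted(broken))  # 2 ['a', 'b']
```

In this made-up network, no single failure breaks communication, because there is always another path. That is the redundancy described above. Trying every combination only works for small networks.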

Domain Name Hierarchy

Before the Internet, there were smaller networks of computers (for example, about 200 on the Arpanet at its peak), and every computer on a network knew the name of all the other computers on that network. This worked pretty well back when a computer cost roughly a million dollars and so there weren't very many of them. But that's not a strategy that can be scaled up for over 3 billion computers!

That's why Internet host names are hierarchical. If you want to know where snap.berkeley.edu is, your computer only has to know where to find an edu name server. And that server only has to know where to find a berkeley.edu name server. That server at UC Berkeley directs your computer to snap.berkeley.edu.
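
The lookup described above can be sketched as a walk through nested tables, reading the host name from right to left. This is a simplified illustration in Python, not how real name servers are implemented. The table ROOT, the function look_up, and the addresses (taken from a range reserved for examples) are all made up.

```python
# Hypothetical name servers: each level only knows about the level directly below it.
ROOT = {
    "edu": {
        "berkeley": {
            "snap": "192.0.2.10",  # made-up address from a range reserved for examples
            "bjc": "192.0.2.11",   # made-up address
        },
    },
}

def look_up(host_name, root=ROOT):
    """Follow the host name from right to left, one name server at a time."""
    server = root
    steps = []
    for label in reversed(host_name.split(".")):
        if not isinstance(server, dict) or label not in server:
            raise KeyError(f"no record for {label!r} in {host_name!r}")
        steps.append(label)
        server = server[label]
    return server, steps

address, steps = look_up("snap.berkeley.edu")
print(steps)    # ['edu', 'berkeley', 'snap']
print(address)  # 192.0.2.10
```

Each table holds only the names one level down, so no single table has to list every computer on the Internet.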

There are two kinds of top-level domains: country codes and categories. Every country has an official two-letter country code (such as .us for the United States, .uk for the United Kingdom, or .th for Thailand).

Originally, there were six category domains:

.gov for government,
.edu for educational institutions,
.com for commercial businesses,
.mil for (US) military sites,
.org for general organizations, and
.net for network-related sites.

More recently, new top-level category domains have been created, e.g., .biz for businesses, as an alternative to .com. These new categories are intended to relieve crowding in the original domains. For example, let's say you start a company called Garply Web Services. You can't have garply.com, which is taken by an online telephone directory service. But (as of October 2015) garply.biz is available.

There is still competition even with these new category domains. As soon as they were introduced, people bought huge numbers of common words in order to resell them at higher prices. For example, Mr. Robert Skelton of Cheswick, Australia offers the domain name snap.biz for $7500. By contrast, you can buy garply.biz for $7.99 if you hurry!
