This page is long. Consider splitting up long pages like this next year. --MF
GH Feedback 1/6/16: In the redundancy quiz, it seems like a more interesting and relevant-to-reliability question is "How many nodes can fail and still let Sender and Receiver communicate?"
Dan Feedback 1/17/16: You have the following as "integrate as needed next year":
Bandwidth: a measure of bit rate—the amount of data (measured in bits b) that can be sent in a fixed time. Usually b/s
Latency: the time elapsed between the transmission and the receipt of a request. Usually ms.
BUT bandwidth and latency are part of the CF so they need to occur somewhere this year...
GH Feedback 1/20/16: add more nodes
I assume this is a reference to the animated gif, in which case, I see no point; it will only complicate the visual. Does anyone see anything else to which it could be referring? --MF
GH Feedback 1/22/16: The green box "For you to do" should be all green. It's distracting to see the white parts of the page within the green. The information about the Internet under green #1 and the brief history of the Internet should go at the top of the page.
I disagree, but Paul is really the master at these things. Thoughts from PG? --MF
From Observation Notes Analysis:
U4L1: "Students unsure how a URL compares to its visible version (many browsers / URLs hide the "http://" or "https://" part)."
U4L1: "There was some confusion about why URLs don't have (show) HTTP in front of them in Chrome."
The Internet is pretty much everywhere humans are. Even astronauts use the Internet on the International Space Station.
Image from NASA
Here is a 2012 map of global Internet usage as a percentage of population from Wikipedia:
People talk as if "the Internet" and "the World Wide Web" are the same thing, but they are not. The World Wide Web is the collection of interlinked website documents you can view with a web browser by typing URLs such as http://bjc.berkeley.edu/website/privacy.html.
GH Feedback 1/22/16: Please clarify the definition of protocol in the section labeled as "What is a URL?", bullet point number 1.
GH Feedback 1/22/16: Please re-write the sentences in the "What is a URL" box. The sentences are confusing to read.
What is a URL?
A URL or Uniform Resource Locator is like an address for accessing specific web data. URLs can be broken into three parts:
The protocol (usually "http" or "https") that your computer uses to access the site, followed by a colon (:).
The domain name (or host), which is everything after // and before another slash. This is the name of the computer that hosts the page.
For example, the URL http://bjc.berkeley.edu/website/privacy.html tells your computer to use http to access the bjc.berkeley.edu server (the computer that serves web pages for that site).
Everything after the domain name. Different hosts interpret the rest of the URL in different ways, but usually additional slashes describe a path in a hierarchy of folders. Other punctuation marks, such as ?, =, and &, are used to provide the server with parts of a request other than the path to a page.
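To see the three parts of a URL pulled apart, here is a minimal Python sketch that uses the standard library's urllib.parse module. The function name describe_url and the second URL (on example.com, with a made-up query) are illustrations only and are not part of this page's materials.

```python
from urllib.parse import urlsplit, parse_qs

def describe_url(url):
    """Split a URL into the parts described above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,         # before the colon, e.g. "http"
        "domain": parts.hostname,         # after // and before the next slash
        "path": parts.path,               # folders and page name
        "query": parse_qs(parts.query),   # pieces after ?, joined by = and &
    }

info = describe_url("http://bjc.berkeley.edu/website/privacy.html")
print(info["protocol"])  # http
print(info["domain"])    # bjc.berkeley.edu
print(info["path"])      # /website/privacy.html

# Hypothetical URL showing how ?, =, and & carry extra request data.
search = describe_url("https://example.com/search?q=snap&page=2")
print(search["query"])   # {'q': ['snap'], 'page': ['2']}
```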
But the Internet is more general than the World Wide Web. It also supports email (which can sometimes be seen through a web browser but is transmitted with a different protocol), file transfer, mobile apps, and many other ways computers communicate behind the scenes.
The Internet is a massive network of computers that communicate primarily using a pair of protocols (standards):
IP: routing (finding paths to distant computers) and
TCP: assuring reliable transmission of data.
You'll learn more about Internet protocols in Unit 4 Lab 3.
Watch this video from code.org:
This content is from Dan's PPTX files. Integrate as needed next year. --MF
The size and speed of systems affect their use.
E.g., Netflix on dial-up? Nope.
Bandwidth
a measure of bit rate—the amount of data (measured in bits b) that can be sent in a fixed time. Usually b/s
Latency
the time elapsed between the transmission and the receipt of a request. Usually ms.
xkcd's author calculates it to be 177 petabits/s = 177,000,000 Gbps
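Here is a rough Python sketch of how bandwidth and latency combine. It assumes a very simplified model (wait out the latency once, then send every bit at the full bit rate) and ignores protocol overhead and congestion. The photo size, connection speeds, and latencies are made-up illustrative values, and both function names are hypothetical. The last line checks the unit conversion: 177 petabits per second is the same rate as 177,000,000 gigabits per second.

```python
BITS_PER_BYTE = 8

def transfer_time_seconds(size_bits, bandwidth_bps, latency_ms=0.0):
    """Estimate delivery time: latency first, then the bits at the bit rate."""
    return latency_ms / 1000 + size_bits / bandwidth_bps

def petabits_to_gigabits(petabits):
    """Convert petabits (10**15 bits) to gigabits (10**9 bits)."""
    return petabits * (10**15 // 10**9)

# Assumed example: a 5 MB photo over two very different connections.
photo_bits = 5_000_000 * BITS_PER_BYTE
dial_up = transfer_time_seconds(photo_bits, 56_000, latency_ms=150)
broadband = transfer_time_seconds(photo_bits, 25_000_000, latency_ms=20)

print(f"dial-up: {dial_up:.1f} s, broadband: {broadband:.2f} s")
print(petabits_to_gigabits(177))  # 177000000
```

With these assumed numbers, the same photo takes minutes on dial-up and under two seconds on broadband, which is why streaming video over dial-up is not practical.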
A Brief History of the Internet
The Internet began in the 1960s with the connection of the first 4 computers of what was then called the Arpanet. Since then, the number of people connected to the Internet has been increasing exponentially. These days, several different systems of communication run on the Internet:
Email was invented in the early 1960s.
The World Wide Web wasn't invented until over 20 years later, in 1989.
Search engines for the Web were created in 1993. Before that, there was a complete list of all servers, which would not be realistic today!
The term "Web 2.0" was coined in 1999 to describe a gradual shift in the use of the Web technology from a style in which almost all content was generated by large corporations to one in which individual users could post content (blogs, photos, etc.), leading to a revival of the social networking that had been common on the Internet prior to the development of the Web.
Take a look at the graph of "Internet Users in the World" on http://www.internetlivestats.com/internet-users/. Notice how dramatically the number of Internet users has increased in the last 20 years.
What does it mean for the growth of the Internet to be exponential?
What do you expect will happen to these numbers in the next 20 years? Will the number of Internet users increase forever?
Visit the page http://www.internetlivestats.com/one-second/, and click the arrow button (shown right) to scroll through the pictographs and get a sense of how much data is moving across the Internet in just one second.
With such an enormous amount of data traveling around, the Internet needs to be reliable, and we have achieved this by making the Internet redundant. Wherever information is going, there is more than one way to get there. If part of the Internet fails, the rest remains connected even if the failed part is in the usual path from one place to another. This increases the Internet's fault tolerance (ability to work around problems) and helps the Internet scale (expand) to more devices and people.
Internet scalability is the ability of the net to keep working even as the size of the network and the amount of traffic over the network increase. The page Internet 2012 in numbers has some astonishing numbers about Internet traffic from a few years ago.
In this model of a network, what is the smallest number of nodes (connection points) that must stop working to prevent the sender and the receiver from communicating?
1
There are no nodes that are vital to the system. Pick any node to stop working, and you can still find another path.
2
Correct! If the node with 6 connections goes down and also either of the two to its left, the sender and receiver can't communicate.
3
Try to find a smaller number of nodes that can stop working and still break communication.
4
Try to find a smaller number of nodes that can stop working and still break communication.
5
Try to find a smaller number of nodes that can stop working and still break communication.
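A question like this can also be answered by brute force. The Python sketch below uses a small hypothetical network (it is not the network pictured in the quiz) and tries every possible set of failed nodes, smallest sets first, until the sender can no longer reach the receiver. The names NETWORK, can_communicate, and fewest_failures_to_disconnect are invented for this illustration.

```python
from itertools import combinations

# Hypothetical network: each node maps to the nodes it is connected to.
NETWORK = {
    "sender": ["a", "b"],
    "a": ["sender", "c", "d"],
    "b": ["sender", "c", "d"],
    "c": ["a", "b", "receiver"],
    "d": ["a", "b", "receiver"],
    "receiver": ["c", "d"],
}

def can_communicate(network, start, goal, failed=()):
    """Search for any path from start to goal that avoids failed nodes."""
    failed = set(failed)
    if start in failed or goal in failed:
        return False
    seen = {start}
    to_visit = [start]
    while to_visit:
        node = to_visit.pop()
        if node == goal:
            return True
        for neighbor in network[node]:
            if neighbor not in seen and neighbor not in failed:
                seen.add(neighbor)
                to_visit.append(neighbor)
    return False

def fewest_failures_to_disconnect(network, start, goal):
    """Try failure sets of size 1, 2, 3, ... and return the first that works."""
    middle = [n for n in network if n not in (start, goal)]
    for count in range(1, len(middle) + 1):
        for failed in combinations(middle, count):
            if not can_communicate(network, start, goal, failed):
                return count, failed
    return None  # start and goal are directly connected

print(fewest_failures_to_disconnect(NETWORK, "sender", "receiver"))
# (2, ('a', 'b'))
```

In this made-up network any single node can fail without breaking communication, because there is always another path. It takes two failures to cut the sender off from the receiver.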
Domain Name Hierarchy
Before the Internet, there were smaller networks of computers (for example, about 200 on the Arpanet at its peak), and every computer on a network knew the name of all the other computers on that network. This worked pretty well back when a computer cost roughly a million dollars and so there weren't very many of them. But that's not a strategy that can be scaled up for over 3 billion computers!
A hierarchy is an arrangement in which things are ranked one above the other according to status or inclusion. For example in biology, we use a taxonomic hierarchy: kingdom, phylum, class, order, family, genus, species.
That's why Internet host names are hierarchical. If you want to know where snap.berkeley.edu is, your computer only has to know where to find an edu name server. And that server only has to know where to find a berkeley.edu name server. That server at UC Berkeley directs your computer to snap.berkeley.edu.
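The lookup just described can be modeled with nested dictionaries. This Python sketch is only an illustration, and real name servers work differently in detail. The table below is hypothetical, and the addresses are invented placeholders taken from the 192.0.2.x range that is reserved for documentation. The function reads the host name from right to left, asking one level of the hierarchy at a time.

```python
# Hypothetical name-server tree; the addresses are placeholders, not real ones.
ROOT = {
    "edu": {
        "berkeley": {
            "snap": "192.0.2.10",
            "cs": {"york": "192.0.2.20"},
        },
    },
}

def resolve(hostname, root=ROOT):
    """Walk the hierarchy right to left: edu, then berkeley, then snap."""
    node = root
    asked = []  # the labels looked up, in order
    for label in reversed(hostname.split(".")):
        if not isinstance(node, dict) or label not in node:
            raise KeyError(f"no name server knows {label!r} in {hostname!r}")
        asked.append(label)
        node = node[label]
    if isinstance(node, dict):
        raise KeyError(f"{hostname!r} names a domain, not a single host")
    return node, asked

address, asked = resolve("snap.berkeley.edu")
print(asked)    # ['edu', 'berkeley', 'snap']
print(address)  # 192.0.2.10
```

No single table has to list every computer. Each level only needs to know about the names directly beneath it.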
In the future, I would like to do this for a URL that is more meaningful to the students in this class (e.g. snap.berkeley.org or bjc.edc.org). --MF
Here's a model of the hierarchy for york.cs.berkeley.edu:
There are two kinds of top-level domain names: country codes and categories. Every country has an official two-letter country code (such as .us for the United States, .uk for the United Kingdom, or .th for Thailand).
Originally, there were six category domains:
.gov for government,
.edu for educational institutions,
.com for commercial businesses,
.mil for (US) military sites,
.org for general organizations, and
.net for network-related sites.
More recently, new top-level category domains have been created, e.g., .biz for businesses, as an alternative to .com. These new categories are intended to relieve crowding in the original domains. For example, let's say you start a company called Garply Web Services. You can't have garply.com, which is taken by an online telephone directory service. But (as of October 2015) garply.biz is available.
There is still competition even with these new category domains. As soon as they were introduced, people bought huge numbers of common words in order to resell them at higher prices. For example, Mr. Robert Skelton of Cheswick, Australia offers the domain name snap.biz for $7500. By contrast, you can buy garply.biz for $7.99 if you hurry!