Reading the Web with Snap!

From Observation Notes Analysis:

The Internet is full of information that you can use in your programs. For example, in Lab 4 you will write an umbrella? predicate block that looks up the weather forecast for your location and tells you whether to carry an umbrella. You'll also build a current temperature block that reports the current temperature for a given location:
current temperature in (New York) reporting 34

Weather forecast sites change their pages from time to time, so we're going to start by reading a website we control.
  1. Load this project.
  2. Talk with Your Partner: Click on the http:// block. What do you think the http block is reporting?
    html text reported by http block
  3. In another browser window, open the page http://snap.berkeley.edu for reference. The page starts with two pictures (the Snap! logo and a screenshot), then some text.
    snap.berkeley.edu as seen in browser
  4. It may not look as though anything in the speech balloon corresponds to the actual page, but look at the page title (in the top bar of the browser window or tab). Can you find that text in what the http block reported?

The output of the http block is the HTML code that creates the page at the address in the input slot.
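
For readers who want to see the same idea outside Snap!, here is a minimal Python sketch of what the http block does, plus a helper for finding the page title in the reported text. The function names fetch_html and page_title are made up for this illustration and are not part of the Snap! project; the sketch also assumes the title appears between plain <title> and </title> tags.

```python
from urllib.request import urlopen


def fetch_html(address):
    """Report the HTML text of a page, roughly like the http block.

    As with the http block, `address` leaves out the "http://" prefix.
    """
    with urlopen("http://" + address, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def page_title(html):
    """Report the text between <title> and </title>, or "" if absent."""
    start = html.find("<title>")
    if start == -1:
        return ""
    start += len("<title>")
    end = html.find("</title>", start)
    if end == -1:
        return ""
    return html[start:end].strip()
```

With network access, page_title(fetch_html("snap.berkeley.edu")) should report the same title you see in the browser tab. Python is not a web page running in a browser, so it is not subject to the browser restriction that the http block runs into with other sites.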

Every web page has a head (that tells the browser how to format the page) and a body (that contains the actual information on the page). You'll see something more interesting if you skip over the head by looking for the text "<body":
substring of (http://(snap.berkeley.edu)) starting with (<body) reporting

There are two custom substring blocks included in this project. They each report only a portion of a text string:

For example, if the variable string contains "This is an example of the substring blocks.", then the two substring blocks will report:
substring of (string) starting with (example) reporting 'example of the substring blocks.'
substring of (string) up to (example) reporting 'This is an '
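
The behavior of the two substring blocks can be sketched in Python as follows. This is only an illustration, not the code inside the Snap! blocks: the function names are invented here, and what happens when the search text is missing (an empty result for "starting with", the whole text for "up to") is an assumption.

```python
def substring_starting_with(text, marker):
    """Report the part of text from the first marker to the end.

    Assumption: report "" when marker does not appear in text.
    """
    index = text.find(marker)
    return "" if index == -1 else text[index:]


def substring_up_to(text, marker):
    """Report the part of text that comes before the first marker.

    Assumption: report the whole text when marker does not appear.
    """
    index = text.find(marker)
    return text if index == -1 else text[:index]


string = "This is an example of the substring blocks."
print(substring_starting_with(string, "example"))  # example of the substring blocks.
print(repr(substring_up_to(string, "example")))    # 'This is an '
```

Skipping the head of a page works the same way: substring_starting_with(html, "<body") reports everything from the "<body" text onward.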

  1. Look at the report from the substring of (http://(snap.berkeley.edu)) starting with (<body) block. Identify the code for the two pictures and the beginning of the text that appear on the web page.
  2. Try the http block with a different URL. (Hint: it's probably not going to work. So, just try it and keep reading.)
    Be sure not to include the "http://" in the input slot. That's already included as part of the http block.

You very likely got an empty result back. This is due to a security feature of the browser (the same-origin policy): a page that comes from one web site (such as the Snap! server) is not normally allowed to read data from another site. Fortunately for us, it's not a very solid attempt at security...

  1. Try the proxied http:// block with your URL. (Hint: this time, it should work. Be sure not to include the "http://" in the input slot.)
  2. The proxied http block works by getting the other web site's data in a roundabout way. When a page such as Snap! asks another site for data directly, the browser refuses to hand over the response unless that site has explicitly allowed it, and most sites haven't. The proxied http block uses a proxy (a replacement or stand-in) server (in this case, alloworigin.com) to call the URL you enter. The proxy server performs the request itself, gets the information, and sends it to Snap! as though things went smoothly.

    In short, the proxied http block works the way we want the http block to work.

  3. Save Your Work: You will use this project as a starter for other projects.
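
To make the proxy idea more concrete, here is a rough Python sketch of how a request might be rerouted through a stand-in server. The proxy address and its "?url=" format below are hypothetical placeholders chosen for illustration; they are not the real alloworigin.com interface, and the proxied http block may build its request differently.

```python
from urllib.parse import quote

# Hypothetical proxy endpoint, used only for illustration.
PROXY_BASE = "https://proxy.example/get?url="


def proxied_url(address):
    """Build the address to ask the proxy for, given a page address.

    As with the http blocks, `address` leaves out the "http://" prefix.
    The full target URL is percent-encoded so that it can travel safely
    as a single query parameter.
    """
    target = "http://" + address
    return PROXY_BASE + quote(target, safe="")


print(proxied_url("snap.berkeley.edu"))
# https://proxy.example/get?url=http%3A%2F%2Fsnap.berkeley.edu
```

The key point is that the program only ever talks to the proxy; the proxy is the one that contacts the site you actually care about.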