Binary Sequences

  1. The factorial of a positive integer n (written "n !") is the product of all the integers from 1 to n. For example:
    5! = 1 × 2 × 3 × 4 × 5 = 120
    Load this project, and try out these inputs:
    ((5) !) reporting 120
    ((10) !) reporting 3628800
    ((20) !) reporting 2432902008176640000
    ((30) !) reporting 2.6525285981219103e+32

Huh? Apparently 30! is a decimal fraction, 2.65..., and what's that "e+32" at the end?

What you're seeing is scientific notation. The "e" means "times ten to the power," so this notation means

2.6525285981219103 × 10^32 = 265252859812191030000000000000000
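
If you're curious how the same thing looks outside of Snap!, here's a small sketch in Python (purely illustrative; Python happens to keep integers exact, but converting the result to a floating point number forces the rounded scientific-notation form you saw above):

    import math

    exact = math.factorial(30)   # exact integer arithmetic, all 33 digits
    print(exact)                 # 265252859812191058636308480000000
    print(float(exact))          # 2.6525285981219103e+32 -- the "e" notation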

Fixed Width Computer Hardware

So why did Snap! display 20! in ordinary whole number representation, just a string of digits, but 30! in scientific notation? You already know that computers use bits to store information, including numbers. The new thing for you to learn is that every computer model is designed with a certain width, the number of bits that the processor reads from memory or writes into memory at a time. A group of that many bits is called a word. As we write this in 2016, most new computers are 64 bits wide—that is, they have 64-bit words—but there are still a lot of 32-bit computers around. The first microprocessor, the Intel 4004, first sold in 1971, was just 4 bits wide!
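
If you'd like to check your own machine, here's one rough probe, sketched in Python (illustrative only; it reports the size of a memory address in the Python you're running, which on most modern machines matches the word size):

    import struct

    # "P" is the format code for a pointer (a memory address).
    # Multiply its size in bytes by 8 to get the width in bits.
    print(struct.calcsize("P") * 8)   # typically 64 on current computers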

If you got an answer in scientific notation for 20!, you're using a 32-bit computer.

Processor widths don't have to be a power of two. Some old computers—the kind you see in old movies that filled a large room—used 12-bit, 36-bit, and 60-bit words. But modern personal computers started at 8 bits and the widths have been doubling with each new generation.

On a 64-bit computer, if you want to represent both positive and negative integers, one word can fit 2^63 = 9223372036854775808 of each (2^64 numbers altogether). 2^63 is about 9 quintillion, 9×10^18. That means that the 19-digit value of 20! just fits in a word, but the 33-digit 30! doesn't.

The largest representable positive integer is actually one less than 2^63, because zero takes up one of the 2^63 non-negative values.
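
Here's a quick way to check that 20! fits in a word and 30! doesn't, sketched in Python (word_max is our own name, used only for this illustration):

    import math

    word_max = 2 ** 63 - 1                  # largest positive signed 64-bit integer

    print(math.factorial(20) <= word_max)   # True: 20! fits in one word
    print(math.factorial(30) <= word_max)   # False: 30! needs more than 64 bits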

  1. What's the first integer whose factorial doesn't fit in a word?
  2. Discuss: What's the biggest value that fits in a four-bit word? What do you think they used four-bit computers for?

Bignums

All computers have a finite amount of memory. So it's not a surprise that there's a limit to the size of integers a computer can represent exactly. But that limit is way more than 64 bits; your computer probably has 137 billion or more bits of memory. Why can't programming languages just use more than one word if necessary to represent an integer?

They can. It's just that the language that computers "natively" know adds one-word numbers with a single machine instruction, whereas the person who designs the programming language that we write in must work a little harder to make "add" work with multiple-word values. (A multiple-word integer value is called a bignum.) The best programming languages do that; remember that the whole point of computer science is abstraction, and one important kind of abstraction is protecting the user from having to know about hardware limitations. Alas, all the languages you are most likely to have heard of limit integer values to what fits in a word.
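
To give you a feel for what the language designer has to do, here's a sketch of the idea in Python (not Snap!'s actual implementation, just an illustration): a big integer is stored as a list of word-sized "limbs," and addition works limb by limb, passing carries along, the same way you add multi-digit numbers by hand.

    BASE = 2 ** 32   # pretend one machine word holds values 0 .. 2^32 - 1

    def to_limbs(n):
        """Split a nonnegative integer into base-2^32 limbs, least significant first."""
        limbs = []
        while n > 0:
            limbs.append(n % BASE)
            n //= BASE
        return limbs or [0]

    def add_limbs(a, b):
        """Add two limb lists, carrying overflow from each word into the next."""
        result, carry = [], 0
        for i in range(max(len(a), len(b))):
            total = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
            result.append(total % BASE)   # the part that fits in one word
            carry = total // BASE         # the overflow passed to the next word
        if carry:
            result.append(carry)
        return result

    def from_limbs(limbs):
        return sum(limb * BASE ** i for i, limb in enumerate(limbs))

    x = 265252859812191058636308480000000    # 30!
    print(from_limbs(add_limbs(to_limbs(x), to_limbs(x))) == 2 * x)   # True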

Tough Stuff

The paradigmatic example of a great programming language is Scheme. You can learn it from a wonderful computer science book (in our opinion, the best ever written).

This business about better and worse languages isn't just a question of taste; it can be a matter of life and death. Between 1985 and 1987, a therapeutic X-ray machine called the Therac-25 killed four patients and seriously injured two more because of several bugs in its software; one of the bugs was that a usage counter that was kept in an eight-bit-wide variable would reach its maximum value of 255 and then overflow to zero instead of counting up to 256. When the variable was zero, an important safety check was not performed. This would not have happened if the Therac software had been written in a better programming language.
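
Here's a toy illustration of that kind of wraparound in Python (not the actual Therac-25 code, which was written in assembly language): a counter squeezed into one byte silently returns to zero every 256 increments.

    counter = 0
    for step in range(1, 600):
        counter = (counter + 1) % 256   # what storing into an 8-bit variable does
        if counter == 0:
            print("after", step, "increments the one-byte counter is 0 again")
    # after 256 increments the one-byte counter is 0 again
    # after 512 increments the one-byte counter is 0 again
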
  1. Click on this block in the scripting area:
    USE BIGNUMS <true>
  2. Now try 30! again.
    ((30) !) -> 265252859812191058636308480000000
    Note that this (exactly correct) value is different from the (rounded off) floating point value above. (More about floating point in a moment.)
  3. Try 200!. The resulting speech balloon won't fit on your screen, but you can see it this way:
    1. Hold down the shift key and right-click or control-click the ! block.
    2. In the menu that appears, click on the red "script pic with result" item.
    3. You'll see a new browser tab with an unreadably small picture. Click on the picture and it should expand to readable size with a scroll bar at the bottom of the screen.
  4. How many digits are there in 200!? Hint: Don't count by hand; you have a computer. (One way to do it is sketched below.)
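
One way to let the computer do the counting, sketched in Python (you could do something analogous in Snap! with its text blocks):

    import math

    print(len(str(math.factorial(200))))   # the number of digits in 200!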

Floating Point

  1. Turn bignums off and try 200! again.
    ((200) !) -> infinity

What's up with that? We've already seen that 200!, although it's very large, isn't infinite. Computers are perfectly capable of handling it. As you might guess, this is also the result of a size limitation. The floating point representation that computers use for non-integers (and, in a non-bignum language, integers that are too big for the integer representation) can only represent numbers up to about 10^308. But 200! is about 8×10^374 (as you should know if you did problem 7). If the result of a computation is bigger than that, floating point hardware returns a special code that represents Infinity.
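
You can see both the limit and the Infinity behavior directly; here's a sketch in Python (floating point behaves the same way in essentially every language, because it's the hardware doing the work):

    import sys

    print(sys.float_info.max)       # 1.7976931348623157e+308, the largest finite float
    print(sys.float_info.max * 2)   # inf -- past the limit, the hardware reports Infinity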

Floating point is essentially a binary version of scientific notation. (Click for details.)

Most real numbers, not just especially large ones, are not exactly representable in this format. Think about the decimal representation of 1/3, 0.33333... It has infinitely many digits, so the closest you can come in floating point isn't exactly 1/3. That's why even great programming languages set a limit to the number of digits representable in floating point notation.

On the other hand, fractions such as 1/3 can be represented exactly using a pair of integers, one for the numerator and one for the denominator.
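
Here's a sketch of that idea using Python's fractions module (just an illustration of the same principle in a different language):

    from fractions import Fraction

    print(1 / 3)               # 0.3333333333333333 -- a nearby float, not exactly 1/3
    print(Fraction(1, 3))      # 1/3 -- exact, stored as a numerator and a denominator
    print(Fraction(1, 3) * 3)  # 1 -- exactly, with no rounding error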

  1. Try 1/3 with bignums off and with bignums on.

Floating point is messy. (Click for details.)

  1. Imagine a decimal floating point representation with one significand digit, and a range of exponents from 10^-2 to 10^2. The smallest positive number representable in this notation is 0.01 (1×10^-2) and the largest is 900 (9×10^2). Sketch a number line from 0 to 1000 and mark all of the positive values representable in this notation. What can you say about the spacing of values? How many fractional values are representable? How many integer values less than 1000 are not representable? What are the strengths and weaknesses of this choice of representable values? (Real floating point has many more representable values, of course, but the way they're spaced on the number line is similar to this.)
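
If you want to check your sketch afterward, here's a small Python snippet that generates all the representable positive values in this toy format (illustrative only):

    # One significand digit d (1-9) and an exponent e from -2 to 2: d × 10^e.
    values = sorted(float(f"{d}e{e}") for e in range(-2, 3) for d in range(1, 10))

    print(len(values))   # 45 positive values in all
    print(values[:9])    # 0.01, 0.02, ..., 0.09 -- packed tightly near zero
    print(values[-9:])   # 100.0, 200.0, ..., 900.0 -- spread far apart near the top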

Binary Sequences in Context

Here's the binary representation of 20! in 64-bit integer format:

0010000111000011011001110111110010000010101101000000000000000000

As an integer, that pattern of bits represents 2432902008176640000. But that same pattern of bits represents

4.85611351839403586987046017196×10^-146

when considered as a floating point representation. The same pattern of bits may also represent an instruction in the machine language of whatever computer you're using. This is so important we're going to say it in boldface: The meaning of a sequence of bits depends on the context in which it is used. That's the take-away understanding you should get from this page, after you've forgotten details such as how many bits there are in a word.
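
Here's a sketch in Python of reading one 8-byte pattern both ways (the byte order shown is "big-endian," chosen only to match the way the bits are printed above):

    import struct

    bits = 0b0010000111000011011001110111110010000010101101000000000000000000
    raw = bits.to_bytes(8, byteorder="big")   # the same 8 bytes, read two ways

    print(struct.unpack(">q", raw)[0])   # 2432902008176640000 -- as a 64-bit integer
    print(struct.unpack(">d", raw)[0])   # about 4.8561135e-146 -- as a float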

What exactly do we mean by "context"? How does a programming language know whether to interpret a bit sequence as an integer, a float, a string of characters, an instruction, or something else? Here, too, programming languages differ. There's always another bit sequence somewhere that encodes the data type of the bit sequence. In great languages, that data type code is attached to the value itself. In ordinary languages, when you make a variable, you have to say what type of value it will contain; the data type is attached to the variable, so the same variable can't both give exact answers when its values are integers and also hold non-integer values. So instead of seeing

script variables (foo)

you see things like

integer (foo)
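
To make the contrast concrete, here's a tiny sketch in Python, which (like Snap!) attaches the type tag to the value rather than to the variable:

    x = 2432902008176640000
    print(type(x))    # <class 'int'> -- the type travels with the value
    x = 1 / 3
    print(type(x))    # <class 'float'> -- same variable, different kind of value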

We're telling you all these things because Snap! has strengths that many programming languages do not, and it's very likely that your next year's computer science class will use one of those other languages. For example, if you take the AP Computer Science A course, it will use Java.

All through this page we've been talking about numbers, but don't forget that, at the lowest level of software abstraction, everything in a computer is represented as a binary sequence. A Boolean value is a single bit, 0 for false and 1 for true. A character string is a sequence of Unicode character codes, each of which is a small integer, and therefore itself a sequence of bits. Lists and procedures, too, are binary sequences, but the precise format is beyond the scope of this course.
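
For example, here's a glimpse of character codes as bit sequences, sketched in Python (ord reports a character's Unicode code point):

    for ch in "Hi!":
        print(ch, ord(ch), bin(ord(ch)))
    # H 72 0b1001000
    # i 105 0b1101001
    # ! 33 0b100001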