On this page, you will learn more about how information is represented inside the computer as sequences of bits.
So far we've been working with small chunks of data, from Boolean values (one bit) to characters (eight bits). But of course some information in your computer or smartphone is much bigger than that. For starters, characters aren't generally used one at a time; they're used in text strings such as "Welcome to the Beauty and Joy of Computing." These 43 characters occupy 43 bytes of computer memory. (The "Welcome..." string might actually occupy 44 bytes.) But the real champion users of space are media files: pictures, sounds (mostly music), and video.
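You can check the byte count yourself. Here is a minimal sketch in Python (our choice for illustration; it is not part of the Snap! project). It assumes each character is stored in one byte, which holds for plain English text like this string.

```python
# Count the characters and bytes in the example string.
# Assumption: one byte per character (true for ASCII text like this).
message = "Welcome to the Beauty and Joy of Computing."

num_chars = len(message)
num_bytes = len(message.encode("ascii"))

print(num_chars, "characters")
print(num_bytes, "bytes")
```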
If we could see inside the memory's bits, a section of the memory might look something like this:
01000001001000000111001101101101011000010111001001110100001000000111000001101000011011110110111001100101001011000010000001101100011010010110101101100101001000000111100101101111011101010111001000100000011011000110000101110000011101000110111101110000001000000110111101110010001000000110010001100101011100110110101101110100011011110111000000100000011000110110111101101101011100000111010101110100011001010111001000101100001000000110001101100001011011100
A binary sequence (also called a bitstream) is a string of ones and zeros.
That shows just 449 bits. A 16GB cell phone has 16 gigabytes (about 16 billion bytes) of storage with each byte containing 8 bits. That's 128,000,000,000 bits. Printed on paper as ones and zeros, the 16GB phone's memory would take nearly 40,000,000 pages. The information in storage—whether it is a text message, a photograph, a song, a computer program, or a list of phone numbers—all looks the same, like a sequence of bits that are either On or Off (one or zero), a binary sequence.
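The arithmetic behind those numbers can be sketched in a few lines of Python. The figure of 3,200 digits per printed page is our own assumption, picked so the result lands near the page estimate above; a different font or page size would change the answer.

```python
# Rough arithmetic for a 16 GB phone, treating 1 GB as 1 billion bytes.
BITS_PER_BYTE = 8
storage_bytes = 16 * 10**9
storage_bits = storage_bytes * BITS_PER_BYTE

# Assumption (ours): about 3,200 ones and zeros fit on one printed page.
DIGITS_PER_PAGE = 3200
pages = storage_bits // DIGITS_PER_PAGE

print(f"{storage_bits:,} bits")
print(f"{pages:,} pages")
```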
How much information fits in a gigabyte?
Here are a few rough examples of what kind of data would fit in how much memory:
name | amount | example |
---|---|---|
bit | either a 1 or a 0 | 1 |
byte | 8 bits | 11011001 |
kilobyte | 2^10 (1,024) bytes | a couple of paragraphs |
megabyte | 2^20 (1,048,576) bytes | about 1 book |
gigabyte | 2^30 (1,073,741,824) bytes | a little more than 1 CD |
terabyte | 2^40 (1,099,511,627,776) bytes | about 1,500 CDs |
petabyte | 2^50 (1,125,899,906,842,624) bytes | about 20 million filing cabinets of text |
exabyte | 2^60 (1,152,921,504,606,846,976) bytes | about 20% of all the words ever spoken by humankind |
As we write this in 2017, it's common to have a terabyte disk drive on your desk. Web services deal with petabytes or exabytes of data.
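The byte counts in the table follow one pattern: each unit is 1,024 times the one before it. A short Python sketch (the variable names are ours) reproduces them:

```python
# Each unit in the table is 2**10 (1,024) times the previous one.
units = ["kilobyte", "megabyte", "gigabyte", "terabyte", "petabyte", "exabyte"]

sizes = {}
for step, unit in enumerate(units, start=1):
    sizes[unit] = 2 ** (10 * step)

for unit, size in sizes.items():
    print(f"{unit}: {size:,} bytes")
```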
Where do these prefixes like "tera-" and "peta-" come from?
When we write big numbers, we put commas every three digits (counting from the right). Each group of three has a name: thousand, million, billion, and so on. So, the number 1,234,567,890 is pronounced "one billion, 234 million, 567 thousand, 890." Those group names ("thousand" and so on) also have prefix names used in metric measurements:
prefix | amount | amount as numeral |
---|---|---|
kilo- | thousand | 1,000 |
mega- | million | 1,000,000 |
giga- | billion | 1,000,000,000 |
tera- | trillion (a million million) | 1,000,000,000,000 |
peta- | quadrillion | 1,000,000,000,000,000 |
exa- | quintillion (a billion billion) | 1,000,000,000,000,000,000 |
Digits for groupings smaller than one (fractions) have metric prefixes too:
prefix | amount | amount as fraction |
---|---|---|
milli- | thousandth | 1/1,000 |
micro- | millionth | 1/1,000,000 |
nano- | billionth | 1/1,000,000,000 |
pico- | trillionth | 1/1,000,000,000,000 |
femto- | quadrillionth | 1/1,000,000,000,000,000 |
atto- | quintillionth | 1/1,000,000,000,000,000,000 |
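Notice that the metric prefixes are powers of 1,000, while the memory sizes in the earlier table are powers of 1,024. The two are close but not equal, which is why a gigabyte is "about" a billion bytes. This Python sketch (names are ours) shows how the gap grows with each prefix:

```python
# Compare each metric prefix (power of 1,000) with the memory unit
# of the same name (power of 1,024).
prefixes = ["kilo", "mega", "giga", "tera", "peta", "exa"]

percent_gap = {}
for step, prefix in enumerate(prefixes, start=1):
    decimal = 1000 ** step
    binary = 1024 ** step
    percent_gap[prefix] = (binary - decimal) / decimal * 100

for prefix, gap in percent_gap.items():
    print(f"{prefix}: the power of 1,024 is {gap:.1f}% bigger")
```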
"Binary sequence" is a very broad category, and often, several layers of abstraction are built on it. For example, you can include a picture in an email or text message, in which case, the message includes a picture, which is a kind of file, which is a kind of binary sequence.
Take a look at these 3 custom blocks that you will use to explore binary sequences:
Find the set (output) to... instruction and change the input text to a short text string of your choosing. The reported binary sequence will be stored in the output variable with quotes around it.
Put the output variable into the translate binary sequence to text block and run it. (It may take a moment to report.)
Put the output variable into the translate binary sequence to B&W image block and run the block. You are not likely to see anything meaningful. Why not?
Now try this binary sequence in translate binary sequence to B&W image with the second input set to 14 pixels wide:
00000110000000000001000110000000010000000000001100100110000011111111000001100111100000010010110011000111001111100000100110110000000001000000000000110000000000111000000011000100011000010000000100000110000110000000111111000000
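If you'd like to see the same ideas outside Snap!, here is a rough Python sketch of what these three blocks might do. It is not the code inside the Snap! blocks. The function names are ours, it assumes every character fits in 8 bits, and drawing 1 as "#" and 0 as "." is an arbitrary choice.

```python
def text_to_bits(text):
    """Turn each character into 8 bits (assumes character codes below 256)."""
    return "".join(format(ord(ch), "08b") for ch in text)

def bits_to_text(bits):
    """Read the bits 8 at a time and turn each byte back into a character."""
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return "".join(chr(int(chunk, 2)) for chunk in chunks)

def bits_to_rows(bits, width):
    """Split the bits into rows of `width` pixels; draw 1 as '#' and 0 as '.'."""
    rows = [bits[i:i + width] for i in range(0, len(bits), width)]
    return [row.replace("1", "#").replace("0", ".") for row in rows]

output = text_to_bits("Hi")
print(output)
print(bits_to_text(output))
for row in bits_to_rows(output, 4):
    print(row)
```

The same bits give back the text when read as characters, and only a scatter of pixels when read as an image. The bits did not change, only the way they were interpreted.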
Analog data have values that change smoothly, unlike digital data which change in discrete intervals.
Sampling means measuring values, called samples, of an analog signal at regular intervals.
The sampling rate is the number of samples measured per second.
Not all data are naturally digital. (That is, they may not be individual values that can be represented in the form of binary sequences.) Some real-world values (such as the pitch and volume of music, the colors of a painting, or the position of a sprinter during a race) change smoothly over time or position; they are analog. When analog data are encoded digitally (as bits on a computer), their values are approximated. This is an example of abstraction. The continuously changing air pressure of a sound, for example, is sampled (measured) thousands of times a second, and the samples are stored as bits.
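To make sampling concrete, here is a small Python sketch. It uses a pure sine wave as a stand-in for a smoothly changing signal and stores each sample as a whole number from 0 to 255 (one byte). The function name, the sine wave, and the one-byte range are our assumptions for illustration; real audio is usually sampled tens of thousands of times a second with more bits per sample.

```python
import math

def sample_sine(frequency_hz, sampling_rate, duration_s):
    """Measure a sine wave at regular intervals; store each sample as 0..255."""
    count = int(sampling_rate * duration_s)
    samples = []
    for n in range(count):
        t = n / sampling_rate                               # time of sample n, in seconds
        value = math.sin(2 * math.pi * frequency_hz * t)    # analog value from -1 to 1
        samples.append(round((value + 1) / 2 * 255))        # approximate it as one byte
    return samples

samples = sample_sine(frequency_hz=1, sampling_rate=8, duration_s=1)
print(samples)
```

Rounding each measurement to a whole number is where the approximation happens: the smooth curve is replaced by a short list of discrete values.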
So if pictures, music, and words all look the same in memory—all binary sequences—how can the computer tell what any chunk of memory actually is? For example, should the sequence 01000001 be interpreted as the number 65, the letter A, a rather dark shade of red, or something else?
The meaning of a sequence of bits depends on the context in which it is used. What exactly do we mean by "context"? How does a programming language know whether to interpret a bit sequence as an integer, a picture, a string of characters, an instruction, or something else? There's always another bit sequence somewhere that encodes the data type of the bit sequence.
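Here is one way to see this in Python: the same eight bits are read three different ways. Treating the value as the red part of an (R, G, B) color is our assumption about how a color might be stored.

```python
bits = "01000001"

as_number = int(bits, 2)        # read the bits as a base-2 integer
as_letter = chr(as_number)      # read the same value as a character code
as_color = (as_number, 0, 0)    # read it as the red part of an (R, G, B) color

print(as_number, as_letter, as_color)
```

Nothing about the bits says which reading is right. The program has to know from context which one to use.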
But different languages use data types differently. In high-level languages, that data type code is attached to the value itself. In lower-level languages, when you make a variable, you have to say what type of value it will contain, and the data type is attached to the variable. That means the same variable can't give exact answers for integer values and also handle non-integer values. So instead of seeing (image not available), you see things like (image not available).
Snap! has strengths that many programming languages do not, and it's very likely that your next year's computer science class will use one of those other languages. If that's the case, you'll have to make sure that the data type you declare for a variable matches what you are going to put in it.
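The difference can be sketched in Python, which attaches the type to the value. To imitate a variable with a declared type, this example uses a typed array from Python's standard `array` module, where the code "i" means "integers only". It is an illustration of the idea, not how any particular lower-level language works.

```python
from array import array

# The type travels with the value, so one variable can hold either kind.
x = 7
kind_before = type(x).__name__
x = x / 2                        # ordinary division gives a non-integer answer
kind_after = type(x).__name__

# A typed array acts more like a declared variable: "i" means integers only.
counts = array("i", [7])
try:
    counts[0] = 3.5              # a non-integer does not fit the declared type
    stored = True
except TypeError:
    stored = False

# With integer-only storage, division has to drop the fraction.
counts[0] = counts[0] // 2

print(kind_before, kind_after, x)
print(stored, counts[0])
```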
Look inside the translate text to binary sequence and translate binary sequence to text reporters. Describe how these two reporters work. There are several custom blocks inside:
pack 8-bit byte takes a binary sequence of 8 bits or fewer and adds enough zeros to the front to make a whole byte. How is this used?
translate text to Unicode list takes a text string and outputs a list of each character's Unicode value. Why is a list output helpful here?
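After you have written your own description, you can compare it with this Python sketch of one possible design. The function names mirror the block names, but the code is our guess at the idea, not a translation of the Snap! blocks, and it assumes every character's Unicode value fits in 8 bits.

```python
def pack_8_bit_byte(bits):
    """Add zeros to the front of a short bit string to make a whole byte."""
    return bits.zfill(8)

def text_to_unicode_list(text):
    """Report a list with one Unicode value per character."""
    return [ord(ch) for ch in text]

def text_to_binary_sequence(text):
    """Convert each Unicode value to binary, pad it to 8 bits, and join."""
    pieces = []
    for code in text_to_unicode_list(text):
        short_bits = format(code, "b")     # no leading zeros: 65 becomes "1000001"
        pieces.append(pack_8_bit_byte(short_bits))
    return "".join(pieces)

print(text_to_unicode_list("A!"))
print(text_to_binary_sequence("A!"))
```

Without the padding step, "A" would contribute only 7 bits and "!" only 6, and there would be no way to tell where one character ends and the next begins.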