The Software Domain: Programming Languages

On this page, you will consider why there are different programming languages and look at some of the ways that languages differ.

JavaScript, Python, Snap!, C++, Java, Scheme, Prolog... Why are there so many programming languages? Why don't we just pick the best one, or design a new best one, and stick with that?

Some languages have very narrow purposes; these are called special-purpose languages. For example, Microsoft Word has a programming language built into it called "Word macros" that's just for generating data and formatting in a document. Likewise, HTML (Hypertext Markup Language) is just for structuring web pages.

General-purpose languages don't have a narrow purpose in mind. In a sense, these languages are all the same: if an algorithm can be expressed in one language, it can be expressed in all of them. Nearly all languages include several basic features, such as arithmetic operators (+, -, ×, ÷) and Boolean operators (and, or, not). The differences among languages are mostly about levels of abstraction.

High-Level and Low-Level Languages

Figure: a diagram of common programming languages listed in order of abstraction level. A vertical double-headed arrow on the right indicates that the first row (Snap!, Scheme, Prolog, Ruby, Lisp) contains high-level languages, the second row (JavaScript, Python, Java, Alice, Scratch) falls in between, and the third row (C, C++) contains low-level languages.

A high-level language (like Snap! or Scheme) includes many built-in abstractions that make it easier to focus on the problem you want to solve rather than on how computer hardware works. A low-level language (like C) has fewer abstractions, requiring you to know a lot about your computer's architecture to write a program.

Why do programmers use high-level languages?

High-level languages can produce safer programs—ones that are less likely to have bugs—because the abstractions manage messy details that can trip up programmers.

High-level languages reduce bugs in memory use. Older, low-level languages required the programmer to manage the use of the computer's memory with instructions saying "get me a block of memory big enough to hold 100 numbers" and other instructions saying "okay, I'm finished using this block of memory; it can be allocated for some other purpose."

This is a nuisance to have to think about, and human programmers are bad at it. In low-level languages, a very common bug is for one part of a program to say "I'm done with this block of memory" while another part of the program is still using it. High-level languages take care of this for us by using a technique called garbage collection that puts the computer in charge of knowing when a block of memory is no longer in use.

High-level languages can also make programming much more convenient because they offer more abstractions. One example is higher-order functions (like map, keep, combine, and for each), which allow the programmer to write shorter, cleaner code.

  1. This code is similar to a higher-order procedure that you have learned. Talk with your partner: identify the procedure that this code imitates.
    script variables (result) (index); set result to (list); set index to (0); repeat (length of (words)){ change index by (1); add (join (item (index) of (words)) (s)) to (result)}; report (result)
    for each (word) of (words) {report ((join (word) (s)))}
    report (keep items such that (join () (s)) from (words))
    report (map (join () (s)) over (words))
    report (combine with (join () (s)) items of (words))

In C, you can do this the long way:
script variables (result) (index); set result to (list); set index to (0); repeat (length of (words)){ change index by (1); add (join (item (index) of (words)) (s)) to (result)}; report (result)
but C doesn't let you take an expression (like join () (s) or ((5) × ( )) + (7)) and stick it into a higher-order function like map:
report (map (join () (s)) over (words))
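
For comparison, here is a minimal sketch of the same two approaches in Python, a text-based language that does accept a function as an input to map. The list contents and the variable names are assumptions chosen for illustration.

```python
words = ["cat", "dog", "bird"]  # hypothetical sample input

# The long way: keep an index and build the result one item at a time.
result = []
index = 0
for _ in range(len(words)):
    index += 1
    result.append(words[index - 1] + "s")

# The higher-order way: hand the "join with s" expression to map as a function.
plurals = list(map(lambda word: word + "s", words))

print(result)
print(plurals)
```

Both versions report the same list; the second one says what to do to each item without any bookkeeping for the index or the result.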

Why do programmers use low-level languages?

The best reason to use a low-level language is to write an operating system (like Windows, Mac OS X, Android, or iOS). You'll learn more about operating systems on the Software Domain: Operating Systems page.

Why else would a programmer use a low-level language?

Application programmers don't often decide "I'm going to write this program in a low-level language." They may simply not realize that higher levels of abstraction are possible. For example, a computer's hardware limits the size of numbers that its arithmetic unit can add in a single step. Four billion (about ten digits) is a common size limit for integers. Programmers who use Java, JavaScript, C, or C++ may think that this limit is unavoidable. But programmers who use very high-level languages, such as Scheme or Common Lisp, know that they can do arithmetic on numbers with millions or billions of digits, limited only by the size of the computer's memory. As you will see later, Snap! has a library that lets it do this, too.
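
As a small illustration of arithmetic beyond the hardware limit, here is a sketch in Python, whose built-in integers grow as needed and are not tied to the size the arithmetic unit handles in one step. The particular numbers are just examples.

```python
import math

word_limit = 2 ** 32   # about four billion: a common hardware limit for integers
big = 2 ** 100         # far larger than that limit, but still exact
factorial_digits = len(str(math.factorial(100)))  # count the digits of 100!

print(word_limit)        # 4294967296
print(big)               # 1267650600228229401496703205376
print(factorial_digits)  # 158
```

The language does the extra work of spreading a large number across as much memory as it needs, so the programmer never sees the limit.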

People often say that different programming languages are good for different kinds of programs, but except for 3-D video processing (next paragraph), it's hard to imagine an application that would be harmed by things like garbage collection or higher-order functions. There are just a few cases in which people deliberately design languages with features that might not be wanted for some applications. Here's one such example: In Snap!, a text string of only digits is considered to be a number; you can do arithmetic on it. In a language for learners, requiring explicit conversion between data types just makes it harder to get started programming. But most languages that aren't meant for beginners keep the two data types separate.
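
Here is a minimal sketch of what "keeping the two data types separate" looks like in Python, used as one example of a language not designed specifically for beginners. The variable names are assumptions for illustration.

```python
digits = "12"   # a text string made only of digits

try:
    total = digits + 3      # Python does not treat the string as a number
except TypeError:
    total = None            # the addition was refused

converted = int(digits) + 3  # explicit conversion from text to number

print(total)      # None
print(converted)  # 15
```

The explicit int(...) step is exactly the kind of conversion that Snap! spares learners from writing.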

Programmers may think that abstraction is too slow. This used to be true, and programmers of 3-D video games still need all the speed they can get because their programs strain the speed of modern computers. So they often write part of their programs, the part that actually puts pictures on the screen, in machine language, just for speed. But most programmers write applications that don't strain computers at all. When you send an email or text message, the limiting factor is how fast you can type, not how fast your computer can run programs.

Legacy code. Programmers in industry hardly ever get to write a program from the beginning. Much more often, they're maintaining a program that somebody wrote years ago, and that person might not even work for the company anymore. In the long run, it might be better to rewrite the program in a more modern language, but in the short run, there's no time to do that, so they end up modifying the existing code in the existing programming language.

What is machine language?

Both high- and low-level languages are used by people to write computer programs. Computer hardware understands a sort of ultra-low-level language, called machine language. Special programs called compilers and interpreters are used to translate human programming languages into machine language to be run by the computer.

Read more about compilers and interpreters.

A compiler is a program that takes a high- or low-level language program (the source code) as input and produces a machine language program (the object code) as its output. Once produced, the machine language program can be run repeatedly without needing to be compiled again.

An interpreter is a program that takes a high- or low-level program as input and carries out machine language instructions as needed to run the program. It does not produce a stand-alone machine language program as output, so it must repeat the process every time the program is run.
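
The difference can be sketched with a toy example in Python. Everything here is an assumption made for illustration: the "source language" is a hypothetical one in which a program is either a number or a tuple (operator, left, right), and a Python function stands in for the machine language object code that a real compiler would produce.

```python
import operator

# Hypothetical toy language: a program is a number or a tuple (op, left, right).
OPS = {"+": operator.add, "*": operator.mul}

def interpret(expr):
    """Walk the source program and carry out each operation as it is reached."""
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    return OPS[op](interpret(left), interpret(right))

def compile_expr(expr):
    """Translate the source program once into a Python function (a stand-in for object code)."""
    if isinstance(expr, (int, float)):
        return lambda: expr
    op, left, right = expr
    run_left = compile_expr(left)
    run_right = compile_expr(right)
    apply_op = OPS[op]
    return lambda: apply_op(run_left(), run_right())

program = ("+", 4, ("*", 3, 5))   # the source code for 4 + 3 * 5

print(interpret(program))         # the source is examined again on every run
compiled = compile_expr(program)  # translation happens once...
print(compiled(), compiled())     # ...and the result can be run repeatedly
```

The interpreter needs the source program every time, while the compiled function can be run again and again without looking at the source.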

Does that mean compilers are better?

It would mean that, except that the process of writing a program includes debugging. During the debugging, an interpreter can help by providing information about the progress of the program, like the visual stepping feature in Snap!, and allowing small changes in the source program without having to run a compiler repeatedly. For example, in Snap! you can drag a block into a script while it's running, and a compiler couldn't allow that.

For professional programmers, the best arrangement is to have both an interpreter and a compiler for the same language. The programmer writes and debugs the program using an interpreter, and once they're sure it works, they compile it. Then, the compiler can run slowly, putting a lot of effort into optimizing the machine language code, so they get the fastest possible compiled program.

  1. These questions are similar to those you will see on the AP CSP exam.
    Which of the following statements are correct about a low-level programming language compared with a high-level programming language?
    I. Low-level language programs are generally harder for people to understand than programs written in a high-level language.
    II. A low-level language provides programmers with more abstractions than a high-level language.
    III. Low-level language programs are generally harder to debug than programs written in a high-level language.
    I only.
    I and III only.
    II and III only.
    I, II, and III.
    A program is written in a high-level programming language. Which of the following statements about the program is correct?
    The program can also be written in machine language using binary code, but then it will be less easily understood by people.
    The program can also be written in machine language using binary code, which will decrease the possibility of mistakes.
    The program cannot be written in binary code as only data can be represented by using binary notation.
    Simple parts of the program can be written in binary code, but control elements such as conditionals and loops must be expressed in a high-level programming language.

Code Readability

One of the features that Snap! gives you is that you can put title text in the middle of a block.

You built polygon in Unit 1: Graphics and Art.
polygon, sides: (30) side length: (15)
This is clearer and more readable than the style of some other languages, in which a function call has one name at the beginning followed by all the inputs:

polygon(30, 15)
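
Some text-based languages let the caller label each input, which recovers part of this readability. The sketch below uses Python keyword arguments; the body of polygon here is a hypothetical stand-in that reports a list of drawing steps rather than moving a sprite.

```python
def polygon(sides, side_length):
    """Report the drawing steps for a regular polygon as (action, amount) pairs."""
    turn = 360 / sides
    steps = []
    for _ in range(sides):
        steps.append(("move", side_length))
        steps.append(("turn", turn))
    return steps

positional = polygon(30, 15)                 # the reader must remember which input is which
labeled = polygon(sides=30, side_length=15)  # the labels travel with the inputs

print(positional == labeled)
```

Both calls do the same thing; only the second tells a reader what 30 and 15 mean.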

Also, in a text-based language, when you see something like 3 × 5 + 4, you need to have memorized that multiplication comes before addition (so the answer is 19). If you want it the other way, you have to use parentheses: 3 × (5 + 4) to get 27. In a blocks-based language, the blocks show you what was intended: 3 × (5 + 4). You've learned order of operations for +, –, ×, and ÷ in math class, but you probably haven't learned order of operations for an expression like this:
x && y << z
Which comes first, && or <<? How do you know?

See, for example, C Operator Precedence.
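
Here is a short sketch of the same issue in Python. It shows only Python's rules (other languages have their own precedence tables), and Python spells && as the word and. The ast module is used to ask Python how it grouped the expression; the names x, y, and z are never given values.

```python
import ast

print(3 * 5 + 4)     # 19: multiplication happens first
print(3 * (5 + 4))   # 27: parentheses force the addition first
print(1 + 2 << 3)    # 24: + groups before <<, so this is (1 + 2) << 3
print(1 + (2 << 3))  # 17: parentheses change the grouping

# Ask Python how it groups "x and y << z" without running it.
tree = ast.parse("x and y << z", mode="eval").body
outer = type(tree).__name__            # "BoolOp": the and is applied last
inner = type(tree.values[1]).__name__  # "BinOp": y << z was grouped first
print(outer, inner)
```

Nothing in the text of the expression shows this grouping; a programmer either has the table memorized or adds parentheses to be safe.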

Parallelism

5.2.1H A process may execute on one or several CPUs.

One reason to create new programming languages is to make it easier to write parallel programs, that is, programs that can use more than one processor at the same time. As of 2017, computers and smartphones have multicore processor chips that may include 2, 4, or 8 processors all running code at the same time. (The number of processors will increase even further over time.) Big companies such as Google use parallelism even more; they have clusters of thousands of computers, all working on the same program.

4.1.2D Different languages are better suited for expressing different algorithms.
4.1.2E Some programming languages are designed for specific domains and are better for expressing algorithms in those domains.

Functional programming languages (languages in which programmers never change the value of a variable) are particularly well suited to parallelism because there's no danger of one processor changing the value of a variable that another processor is using. We've introduced you to functional programming techniques wherever possible throughout this course, including writing reporters and using higher-order functions (map, keep, and combine).
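
A minimal sketch of this idea in Python: because the function below never changes a variable, the items of the list can be handed to several workers in any order without the workers interfering with one another. The function and the list are assumptions for illustration, and the thread pool shows only the structure of a parallel map; whether the workers actually run on separate processors depends on the Python implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    """A pure function: the result depends only on the input, and nothing is changed."""
    return n * n

numbers = [1, 2, 3, 4, 5, 6, 7, 8]

# One worker handles every item in turn.
sequential = list(map(square, numbers))

# Up to four workers share the items; results come back in the original order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(square, numbers))

print(sequential == parallel)
```

If square changed a shared variable instead of just reporting a value, the two versions could disagree, which is the danger that functional programming avoids.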

Snap! isn't a functional programming language, but it would be if the Snap! developers removed just a few procedures, including set (instead, you'd use input variables of recursive functions) and these four list commands: add, delete, insert, and replace (instead, you'd use in front of, item 1 of, and all but first of to report a new list with different values instead of changing the old list).
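
As a sketch of what that style looks like, here is a Python function that does the job of replace without changing the old list. The function name is hypothetical, positions count from 1 as in Snap!, and the index is assumed to be between 1 and the length of the list.

```python
def replace_item(index, items, value):
    """Report a new list like items, but with value at position index (counting from 1)."""
    if index == 1:
        # value in front of (all but first of items)
        return [value] + items[1:]
    # (item 1 of items) in front of (the rest, with the replacement made there)
    return [items[0]] + replace_item(index - 1, items[1:], value)

original = [10, 20, 30]
changed = replace_item(2, original, 99)

print(original)  # [10, 20, 30]
print(changed)   # [10, 99, 30]
```

The original list is untouched, so any other part of the program (or another processor) that is still using it sees the same values as before.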