On this page, you will learn how running multiple scripts in parallel can reduce the total time it takes to run an algorithm.
In Snap!, you are accustomed to seeing a bunch of scripts that all run independently, which may or may not be associated with different sprites. This is a kind of simulated parallel computing. If we had a different computer for each sprite, that would be true parallelism. As it is, there is only one computer, and it divides its attention among the processes by running a little bit of one and then a little bit of the next. Specifically, it switches at the bottom of loops (`forever`, `repeat`, etc.).
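To make the idea concrete, here is a minimal sketch in Python (Snap! itself is block-based, so this is only an analogy, and the script names are made up): each script is a generator that yields at the bottom of its loop, and a round-robin scheduler runs a little bit of each one in turn.

```python
def script(name, steps):
    """One "script": a loop that hands control back at the bottom of each pass."""
    for i in range(steps):
        print(f"{name}: step {i + 1}")
        yield  # bottom of the loop: let the scheduler run someone else

def run_round_robin(scripts):
    """One computer dividing its attention: run a little of each script in turn."""
    while scripts:
        for s in list(scripts):
            try:
                next(s)  # advance this script by one loop iteration
            except StopIteration:
                scripts.remove(s)  # this script has finished

run_round_robin([script("sprite A", 3), script("sprite B", 2)])
# Output interleaves sprite A and sprite B, one step at a time.
```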
This section covers two computational models: sequential computing, in which operations are performed one at a time in a specified order, and parallel computing, in which a program is broken into pieces, some of which can run at the same time. Key points:
- You can compare the efficiency of two different algorithmic solutions to a problem by comparing the time it takes them to perform the same task.
- The run time of a sequential algorithm is the sum of the run times of all of its steps.
- Tasks started by `when I receive` happen in parallel, not one after the other.
- `Broadcast and wait` waits until all the tasks that it started have finished (see the sketch after this list).
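The last two points can be mimicked in textual code. Here is a rough Python analogy (a sketch, not how Snap! actually works): starting threads plays the role of `broadcast`, and joining them all is the "and wait" part.

```python
import threading
import time

def task(name, seconds):
    time.sleep(seconds)  # stand-in for real work
    print(f"{name} finished after {seconds}s")

# Each "when I receive" script becomes a thread; they run in parallel.
threads = [threading.Thread(target=task, args=(f"task {i}", i)) for i in (1, 2, 3)]

for t in threads:
    t.start()  # "broadcast": launch every task

for t in threads:
    t.join()   # "and wait": block until every task has finished

print("all tasks done")  # prints after the slowest task (3 seconds), not after 1+2+3
```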
Distributed computing is a form of parallel computing that uses multiple computers (perhaps even spread out around the world).
Writing a program that does nothing but wait is, of course, unrealistic, but what is realistic is that in most problems there isn't a purely parallel solution; some part of the computation has to be done sequentially. In the previous question, the sequential part is modeled by `wait 6 secs`. Parallelization in this toy example feels trivial, but imagine you work for Google: millions of search queries and web page edits have happened today, and it's your job to process them. Without huge server farms with thousands of computers in each building, Google couldn't keep up at all. Distributed computing lets you scale to very large problems.
As a more specific example, suppose you want to know the average word length in a list of 100,000 words. You can divide the task among 26 computers (one for each starting letter). Each computer adds up the lengths of all the words assigned to it (all the "A" words, all the "B" words, etc.). Then one computer has to add the 26 partial results and divide by the total number of words to find the average. To calculate the run time of this parallel solution, you would add the run time of the longest parallel portion (the time for the letter with the most words) to the run time of the sequential portion (adding the 26 partial results and dividing the sum by the total number of words).
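Here is one way that example might look in Python, with a `multiprocessing` pool standing in for the separate computers; the sample word list and the bucket-by-first-letter scheme are illustrative assumptions.

```python
from collections import defaultdict
from multiprocessing import Pool

def total_length(bucket):
    # Parallel portion: each worker sums the lengths of its own words.
    return sum(len(word) for word in bucket)

def average_word_length(words):
    # Divide the task: one bucket per starting letter ("A" words, "B" words, ...).
    buckets = defaultdict(list)
    for word in words:
        buckets[word[0].lower()].append(word)

    # Parallel portion: the buckets are processed simultaneously by a pool
    # of workers (on real hardware, these could be 26 separate computers).
    with Pool() as pool:
        partial_sums = pool.map(total_length, buckets.values())

    # Sequential portion: one computer combines the partial results.
    return sum(partial_sums) / len(words)

if __name__ == "__main__":
    words = ["apple", "avocado", "banana", "blueberry", "cherry"]
    print(average_word_length(words))  # (5 + 7 + 6 + 9 + 6) / 5 = 6.6
```

The run time matches the analysis above: the parallel part takes as long as the largest bucket, and the final combining step stays sequential no matter how many workers you add.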
Because every computation includes a sequential portion, there is a limit to how much you can speed up a task by adding processors.
A processor is a piece of circuitry inside a computer that processes the instructions from computer programs.
(Image: a computer processor. Image credit: Wikipedia user Solipsist.)
Suppose a task requires 60 minutes of computing that can be divided evenly among any number of processors, plus 1 minute of work that must be done sequentially. The first two rows below are filled in; the remaining rows are left for you to complete.

| Number of Processors | Total Time Required | Solution Type |
|---|---|---|
| 1 | 61 minutes (1 hr + 1 min) | sequential solution |
| 2 | 31 minutes (30 min + 1 min) | parallel solution |
| 3 | | |
| 4 | | |
| 10 | | |
| 20 | | |
| 30 | | |
| 60 | | |
| 120 | | |
| 240 | | |
| 480 | | |
Programmers refer to the *speedup* of a parallel solution to describe how many times as fast the parallel solution is compared to the sequential solution:

$$\text{speedup} = \frac{\text{sequential time}}{\text{parallel time}}$$

For example, the table above gives a two-processor speedup of 61/31 ≈ 1.97, slightly less than 2 because of the one sequential minute.
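Assuming the job from the table above (60 minutes of evenly divisible work plus 1 minute that must run sequentially), a few lines of Python reproduce the table's entries and the corresponding speedups. Try filling in the table by hand before running it.

```python
PARALLEL_WORK = 60   # minutes of work that divides evenly among processors
SEQUENTIAL_WORK = 1  # minutes of work that must run on a single processor

def total_time(processors):
    return PARALLEL_WORK / processors + SEQUENTIAL_WORK

for p in (1, 2, 3, 4, 10, 20, 30, 60, 120, 240, 480):
    t = total_time(p)
    speedup = total_time(1) / t
    print(f"{p:3} processors: {t:7.3f} minutes, speedup {speedup:5.2f}")

# Even with 480 processors the job still takes 1.125 minutes, and the
# speedup can never exceed 61: the sequential minute puts a ceiling on it.
```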
Computer hardware is very reliable; it's rare for the hardware to break down while you're running a program. But in distributed computing, with tens of thousands of computers working on the same problem, it's likely that one of them will fail. (If the probability of a given computer failing during a program run is 0.0001, and you use 10,000 computers, then the probability that at least one of them fails is 1 − (1 − 0.0001)^10,000 ≈ 0.63, nearly two-thirds of the time!) So software has to compensate by checking that the machines are running and reassigning the tasks of a failed computer to a working one.
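You can verify the arithmetic in that parenthetical directly (the 0.0001 failure probability and the 10,000 computers are the figures from the text):

```python
p = 0.0001            # probability that one computer fails during a run
n = 10_000            # number of computers

p_none_fail = (1 - p) ** n        # ≈ 0.368 (about 1/e): NO computer fails
p_at_least_one = 1 - p_none_fail  # ≈ 0.632: at least one computer fails
print(p_none_fail, p_at_least_one)
```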
Also, when you are doing things in parallel, the parallel processes can interfere with each other.
If one person withdraws money from an ATM at the same time as someone at a different ATM withdraws money from the same account, you could run into a situation like this:
| ATM 1 | ATM 2 |
|---|---|
| looks up the balance ($100) | |
| | looks up the balance ($100) |
| checks if the amount requested ($40) is available | |
| adjusts the balance ($60) | |
| dispenses the money | |
| | checks if the amount requested ($20) is available |
| | adjusts the balance ($80) |
| | dispenses the money |
Because ATM 2 looked up the balance before ATM 1 adjusted it, ATM 2's adjustment was based on the old balance and overwrote ATM 1's change. The bank account ended up with a balance of $80 even though it started at $100 and a total of $60 was withdrawn.
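Here is a small Python sketch of the same race; the amounts and the `time.sleep` delay are contrived to force the unlucky interleaving, and the lock version shows the standard fix of making look-up, check, and adjust one atomic step.

```python
import threading
import time

balance = 100            # shared account balance
lock = threading.Lock()  # protects the look-up/check/adjust sequence

def withdraw_unsafe(amount):
    global balance
    available = balance               # looks up the balance
    if amount <= available:           # checks if the amount is available
        time.sleep(0.1)               # delay: the other ATM acts in between
        balance = available - amount  # adjusts the balance from a stale value!

def withdraw_safe(amount):
    global balance
    with lock:                        # look up, check, and adjust atomically
        if amount <= balance:
            balance -= amount

def run(withdraw):
    global balance
    balance = 100
    atms = [threading.Thread(target=withdraw, args=(amt,)) for amt in (40, 20)]
    for t in atms:
        t.start()
    for t in atms:
        t.join()
    print(f"{withdraw.__name__}: final balance ${balance}")

run(withdraw_unsafe)  # usually $80 or $60: one withdrawal is lost
run(withdraw_safe)    # always $40: both withdrawals are applied
```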
For these reasons and others, a parallel program is harder to write and debug than a sequential program.