5.1.3
AAP-2.P part b, AAP-2.P.2
In order to use a binary search, the data must be...
binary
All data in a computer are represented using binary (ones and zeros), but that has nothing to do with binary searches, which compare against the middle value to choose which of two halves to eliminate.
sorted
Correct! If the data are sorted, then comparing to the middle value will give you good information about which half of the data to keep.
unsorted
If the data are unsorted, you can't be sure that everything before or everything after the middle value can be eliminated.
linear
"Linear" is the name of another search algorithm, not a property of the data.
5.1.3
Which of the following questions can be answered with a binary search, assuming the data are sorted? Check all that apply:
What is my friend Rasheed's phone number?
Correct! You are searching for one phone number in the list.
Give me a list of all the Beyoncé songs.
We have to find all the Beyoncé songs, not just one.
Tell me if bread is on my shopping list.
Correct! You are searching for one item in the list.
Who in my contact list lives on Grand Avenue?
Your contact list is probably sorted by name, not by address. Also, there may be more than one person who lives on Grand Avenue.
5.1.5
The table below shows the computer time it takes to complete various tasks on the data of different sized towns.
Task |
Small Town (population 1,000) |
Mid-sized Town (population 10,000) |
Large Town (population 100,000) |
Entering Data |
2 hours |
20 hours |
200 hours |
Backing up Data |
0.5 hours |
5 hours |
50 hours |
Searching through Data |
5 hours |
15 hours |
25 hours |
Sorting Data |
0.01 hour |
1 hour |
100 hours |
Based on the information in the table, which of the following tasks is likely to take the
longest amount of time when scaled up for a city of population 1,000,000.
Entering data
As the population size is multiplied by 10, the time needed for entering data is also multiplied by 10, so for a population of 1,000,000, it should take about 10×200=2000 hours.
Backing up data
As the population size is multiplied by 10, time needed for backing up data is multiplied by 10, so for a population of 1,000,000, it should take about 10×50=500 hours.
Searching through data
Searching through the data seems to go up by about 10 hours each time the population is multiplied by 10, so for a population of 1,000,000, it should take about 35 hours.
Sorting data
Correct! As the population size is multiplied by 10, the time needed for the sorting of data is multiplied by 100. So, for a population of 1,000,000, it should take about 100×100=10,000 hours.
5.1.6
In which of the following problems is a heuristic solution appropriate?
Find the biggest item in a list.
We can find the solution to this problem in polynomial time.
Find the best combination of ingredients for spaghetti sauce.
There is no perfect (correct) solution to this problem because different people have different tastes.
Playing chess.
Correct! This is a good example because there is a solution (a way to determine the outcome of a perfectly played game), the solution can't be found in polynomial time, and an approximate solution would be helpful.
Find the combination to a lock with n numbers.
There is no possibility of a heuristic because it's not helpful to have an almost correct combination (an approximate solution).
5.1.8
CSN-2.A part b, CSN-2.A.5
How long will this
sequential program take to run?
8
A sequential solution takes as long as the sum of the run times of all of its steps.
4
A sequential solution takes as long as the sum of the run times of all of its steps.
6
A sequential solution takes as long as the sum of the run times of all of its steps.
5.1.8
CSN-2.A part b, CSN-2.A.6
How long will this
parallel program take to run?
18
The two when I receive
tasks happen in parallel, not one after the other.
8
The longest parallel time does matter, but it's not the only thing that contributes to the total time.
6
Broadcast and wait
waits until all the tasks that it started have finished.
5.1.8
\frac{14}{18}
Speedup is calculated by dividing the sequential time by the parallel time.
\frac{18}{6}
Broadcast and wait
waits until all the tasks that it started have finished.
\frac{18}{8}
The longest parallel time does matter, but it's not the only thing that contributes to the total parallel time.
5.3.2
Scientists studying birds often attach tracking tags to migrating birds. For each bird, the following data is collected regularly at frequent intervals:
- Date and time
- Latitude and Longitude
- Altitude
- Temperature
Which of the following questions about a particular bird could
not be answered using
only the data gathered from the tracking tags.
Approximately how much time does the bird spend in the air and on the ground?
This could be determined from the “Altitude” data.
Does the bird travel in groups with other tracked birds?
This could be determined from the “Latitude and Longitude” data.
Is the migration path of the bird affected by temperature patterns?
This could be determined from the “Temperature” data.
What are the effects of industrial pollution on the migration path of the bird?
Correct, there is no data collected on pollution in the bird’s environment.
5.3.2
Using computers, researchers often search large data sets to find interesting patterns in the data. Which is of the following is not an example where searching for patterns is needed to gather desired information?
An online shopping company analyzing customers purchase history to recommend new products.
This is an example of searching for patterns to gather desired information.
A high school analyzing student attendance records to determine which students should receive a disciplinary warning.
Correct, there is no need here for pattern analysis, just sorting the data to get a list of students with poor attendance records.
A credit scoring company analyzing purchase history of clients to identify cases of identity theft.
This is an example of searching for patterns to gather desired information.
A college analyzing high school students’ GPA and SAT scores to assess their potential college success.
This is an example of searching for patterns to gather desired information.
5.3.2
A car hailing company uses an app to track the travel trends of its customers. The data collected can be filtered and sorted by geographic location, time and date, miles traveled, and fare charged for the trip. Which of the following is least likely to be answerable using only the trends feature?
What time of the day is the busiest for the company at a given city.
Filtering by geographic location and sorting through time information would yield this information.
From which geographical location do the longest rides originate.
Sorting through miles traveled and noting geographic location would yield this information.
How is competition with the local cab companies affecting business in a given district.
Correct, there is no information on the competition available in the data collected.
How much money was earned by the company in a given month.
Filtering by date and summing up fares charged would yield this information.
5.3.2
An online music download company stores information about song purchases made by its customers. Every day, the following information is made publicly available on a company website database.
- The day and date of each song purchased.
- The title of the song.
- The cities where customers purchased each song.
- The number of times each song was purchased in a given city.
An example portion of the database is shown below. The database is sorted by date and song title.
Day and Date |
Song Title |
City |
Number of Times Purchased |
Mon 07/10/17 |
Despacito |
Boston, MA |
117 |
Mon 07/10/17 |
Malibu |
Chicago, IL |
53 |
Mon 07/10/17 |
Malibu |
New York, NY |
197 |
Mon 07/10/17 |
Bad Liar |
Anchorage, AK |
11 |
Tue 07/11/17 |
Despacito |
San Diego, CA |
241 |
Which of the following
cannot be determined using only the information in the database?
The song that is purchased the most in a given week.
This information can be found by summing all the purchases of every song in a given week.
The city with the fewest purchases on a particular day.
This information can be found by summing all the purchases of every city on a given day.
The total number of cities in which a certain song was purchased in a given month.
This information can be found by listing the cities for all the purchases of a given song in a given month.
The total number of songs purchased by a particular customer during the course of a given year.
Correct, there is no data publicly displayed on individual customers.
5.3.6
A new mobile phone company—unbeknownst to its customers—periodically records random snippets of their conversations and considers the recordings as data. In addition, it collects the following metadata on the conversations:
- The start and end time of the conversation
- The phone numbers of the users in the conversation
- The GPS locations of the users in the conversation
For which of the following goals would it be more useful to analyze the data instead of the metadata?
To determine if any of its users was present at the time and place of a crime.
For this purpose, GPS and time information (in the metadata) would be more useful.
To determine the time of day the app is used most frequently in a certain geographic location.
For this purpose, GPS and time information (in the metadata) would be more useful.
To determine the language most commonly used in user conversations.
Correct. For this purpose, the conversation data itself would be analyzed.
To determine the most active users of the app for a given year.
For this purpose, user phone numbers and time information (in the metadata) would be more useful.
5.3.6
Which of the following is not an example of metadata?
Data about a digital image describing the size of the image, image resolution, color depth, and when the image was created.
This is an example of data about data.
Data about a text document containing information about the length of the document, its author, the date the document was written, and a short summary of the content.
This is an example of data about data.
Data about a pack of wolves describing their habitat, hunting habits, diet, and sleep cycles.
Correct. Data about wolves is not data about data.
Data about a web page containing a description of page content and a list of key words linked to the content.
This is an example of data about data.