ST EK List:
3.1.1A Computers are used in an iterative and interactive way when processing digital information to gain insight and knowledge.
3.1.1B Digital information can be filtered and cleaned by using computers to process information.
3.1.1C Combining data sources, clustering data, and data classification are part of the process of using computers to process information.
3.1.1D Insight and knowledge can be obtained from translating and transforming digitally represented information.
3.1.1E Patterns can emerge when data is transformed using computational tools.
3.1.2A Collaboration is an important part of solving data-driven problems.
3.1.2B Collaboration facilitates solving computational problems by applying multiple perspectives, experiences, and skill sets.
3.1.2C Communication between participants working on data-driven problems gives rise to enhanced insights and knowledge.
3.1.2D Collaboration in developing hypotheses and questions, and in testing hypotheses and answering questions, about data helps participants gain insight and knowledge.
3.1.2E Collaborating face-to-face and using online collaborative tools can facilitate processing information to gain insight and knowledge.
3.1.2F Investigating large data sets collaboratively can lead to insight and knowledge not obtained when working alone.
3.2.1A Large data sets provide opportunities and challenges for extracting information and knowledge.
3.2.1B Large data sets provide opportunities for identifying trends, making connections in data, and solving problems.
3.2.1C Computing tools facilitate the discovery of connections in information within large data sets.
3.2.1D Search tools are essential for efficiently finding information.
3.2.1E Information filtering systems are important tools for finding information and recognizing patterns in the information.
3.2.1F Software tools, including spreadsheets and databases, help to efficiently organize and find trends in information.
3.2.2A Large data sets include data such as transactions, measurements, text, sound, images, and video.
3.2.2B The storing, processing, and curating of large data sets is challenging.
3.2.2C Structuring large data sets for analysis can be challenging.
3.2.2D Maintaining privacy of large data sets containing personal information can be challenging.
3.2.2E Scalability of systems is an important consideration when data sets are large.
3.2.2F The size or scale of a system that stores data affects how that data set is used.
3.2.2H Analytical techniques to store, manage, transmit, and process data sets change as the size of data sets scale.

Self-Check: Big Data

On this page, you will prepare for data questions on the AP exam.

Here are two BJC videos about data from the University of California, Berkeley.
Set Up Your Headphones or Speakers

If your connection blocks YouTube, watch the first video here and watch the second video here.


  1. These questions are similar to those you will see on the AP CSP exam.
    Scientists studying birds often attach tracking tags to migrating birds. For each bird, the following data is collected regularly at frequent intervals:
    • Date and time
    • Latitude and Longitude
    • Altitude
    • Temperature
    Which of the following questions about a particular bird could not be answered using only the data gathered from the tracking tags?
    Approximately how much time does the bird spend in the air and on the ground?
    Does the bird travel in groups with other tracked birds?
    Is the migration path of the bird affected by temperature patterns?
    What are the effects of industrial pollution on the migration path of the bird?
    Using computers, researchers often search large data sets to find interesting patterns in the data. Which of the following is not an example where searching for patterns is needed to gather desired information?
    An online shopping company analyzing customers' purchase history to recommend new products.
    A high school analyzing student attendance records to determine which students should receive a disciplinary warning.
    A credit scoring company analyzing purchase history of clients to identify cases of identity theft.
    A college analyzing high school students’ GPA and SAT scores to assess their potential college success.
    A car hailing company uses an app to track the travel trends of its customers. The data collected can be filtered and sorted by geographic location, time and date, miles travelled, and fare charged for the trip. Which of the following is least likely to be answerable using only the trends feature?
    What time of day is the busiest for the company in a given city?
    From which geographical location do the longest rides originate?
    How is competition with the local cab companies affecting business in a given district?
    How much money was earned by the company in a given month?
    An online music download company stores information about song purchases made by its customers. Every day, the following information is made publicly available on a company website database.
    • The day and date of each song purchased.
    • The title of the song.
    • The cities where customers purchased each song.
    • The number of times each song was purchased in a given city.
    An example portion of the database is shown below. The database is sorted by date and song title.
    Day and Date    Song Title    City             Number of Times Purchased
    Mon 07/10/17    Despacito     Boston, MA       117
    Mon 07/10/17    Malibu        Chicago, IL      53
    Mon 07/10/17    Malibu        New York, NY     197
    Mon 07/10/17    Bad Liar      Anchorage, AK    11
    Tue 07/11/17    Despacito     San Diego, CA    241
    Which of the following cannot be determined using only the information in the database?
    The song that is purchased the most in a given week.
    The city with the fewest purchases on a particular day.
    The total number of cities in which a certain song was purchased in a given month.
    The total number of songs purchased by a particular customer during the course of a given year.
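
To get a feel for what the database columns support, the sample rows can be aggregated in a few lines of Python. This is only a minimal sketch built from the five rows shown in the table; the names `rows`, `purchases_per_song`, `fewest_city_on`, and `cities_for_song` are made up for illustration and are not part of the company's actual database.

```python
from collections import defaultdict

# The five sample rows from the table: (day and date, song title, city, purchases).
rows = [
    ("Mon 07/10/17", "Despacito", "Boston, MA", 117),
    ("Mon 07/10/17", "Malibu", "Chicago, IL", 53),
    ("Mon 07/10/17", "Malibu", "New York, NY", 197),
    ("Mon 07/10/17", "Bad Liar", "Anchorage, AK", 11),
    ("Tue 07/11/17", "Despacito", "San Diego, CA", 241),
]


def purchases_per_song(rows):
    """Add up the purchase counts for each song title across all rows."""
    totals = defaultdict(int)
    for _date, title, _city, count in rows:
        totals[title] += count
    return dict(totals)


def fewest_city_on(rows, date):
    """Return the listed city with the fewest purchases on the given date.

    Only cities that appear in the data for that date are considered.
    """
    per_city = defaultdict(int)
    for row_date, _title, city, count in rows:
        if row_date == date:
            per_city[city] += count
    return min(per_city, key=per_city.get)


def cities_for_song(rows, title):
    """Return the set of cities in which the given song was purchased."""
    return {city for _date, row_title, city, _count in rows if row_title == title}


totals = purchases_per_song(rows)
print(max(totals, key=totals.get))             # most purchased song in the sample
print(fewest_city_on(rows, "Mon 07/10/17"))    # city with the fewest purchases that day
print(sorted(cities_for_song(rows, "Malibu"))) # cities where "Malibu" was purchased
```

Every query in this sketch works from the four published columns alone. A question that depends on a field outside those columns cannot be answered this way, no matter how the rows are filtered or summed.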