4.2 - Introduction to Using Data Sets

mg8mer

Introduction

Welcome back! Today we’re covering topic 4.2 of AP CSA!

This topic’s learning objectives as per the AP® Computer Science A Course and Exam Description 2025 are as follows:

4.2.A: Represent patterns and algorithms that involve data sets found in everyday life using written language or diagrams.

This is another “intro” article similar to the last one, where we’re gonna take a closer look at what exactly data sets are and why they matter. Doing so will allow us to transition naturally into learning about Arrays and ArrayLists!

Without further ado, let’s jump straight in!

What is a Data Set?

As per the exact words of the CED, a data set is a collection of specific pieces of information or data. Rather than working with individual, unrelated values, a data set groups related information together so that things are more organized.

When you think of a data set, you may be tempted to think only of scientific or technical data. In reality, data sets are all around us!

For example, a teacher's gradebook contains test scores for all students in a class. Each score is a piece of data, and together they form a data set!

Processing Data Sets

Last article, we learned how to collect data, particularly in an ethical manner. But once you have that data… what do you actually use it for? At face value, data sets are just boxes of information. Using them to solve problems and answer research questions is how we get somewhere!

To analyze a data set, we access its values and process them one at a time, gathering the evidence we need to answer our question!

Consider finding the highest test score in a class:

  1. Start with the first score
  2. Compare it to the second score, keeping track of the higher one
  3. Compare the higher score to the third score
  4. Continue through all scores
  5. Return the highest score found

Here, we look at each score exactly once and compare it to the highest score seen so far. After the last comparison, the value we’ve been keeping track of is the highest score in the class.
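To see how those five steps might look in Java, here is a minimal sketch. The class name HighestScore, the method name findHighest, and the sample scores are all made up for illustration, and the method assumes the data set has at least one score. It also uses an array, which we’ll cover properly in upcoming topics.

```java
public class HighestScore {
    // Returns the highest value in scores.
    // Assumption: scores contains at least one value.
    public static int findHighest(int[] scores) {
        int highest = scores[0];                  // Step 1: start with the first score
        for (int i = 1; i < scores.length; i++) { // Steps 2-4: visit every remaining score
            if (scores[i] > highest) {
                highest = scores[i];              // keep track of the higher one
            }
        }
        return highest;                           // Step 5: return the highest score found
    }

    public static void main(String[] args) {
        int[] scores = {88, 92, 79, 95, 85};      // hypothetical test scores
        System.out.println(findHighest(scores));  // prints 95
    }
}
```

Notice that the code never needs to look at two scores other than the current one and the running highest, which is exactly what the written steps describe.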

Visual Representations of Data Sets

More often than not, data is still quite abstract even with meaningful context. What makes that data more accessible is visualizing it via a chart or table; this way, the data has more structure and patterns are more readily apparent.

Common Visual Representations

1. Tables

Example: Student Test Scores

2. Lists

Example: Daily Temperatures (°F)

[72, 75, 71, 68, 70, 73, 76, 78, 74, 72]

3. Index Tables

Example: Inventory Counts

4. Bar Charts and Graphs
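As a rough preview of how a list connects to an index table, the sketch below stores the daily temperatures from the list example in a Java array and prints each value under its position number. The class name TemperatureTable and the method name indexTable are hypothetical, and the Index/Value layout is just one possible way to draw such a table.

```java
public class TemperatureTable {
    // Builds a two-row text table: positions (indexes) on top, values below.
    public static String indexTable(int[] values) {
        StringBuilder indexRow = new StringBuilder("Index:");
        StringBuilder valueRow = new StringBuilder("Value:");
        for (int i = 0; i < values.length; i++) {
            indexRow.append(String.format("%4d", i));         // position, starting at 0
            valueRow.append(String.format("%4d", values[i])); // value stored at that position
        }
        return indexRow + "\n" + valueRow;
    }

    public static void main(String[] args) {
        // Daily temperatures (°F) from the list example above
        int[] temps = {72, 75, 71, 68, 70, 73, 76, 78, 74, 72};
        System.out.println(indexTable(temps));
    }
}
```

Java counts positions starting from 0, so the first temperature (72) sits at index 0 and the last one (also 72) sits at index 9.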

Using Diagrams to Plan Algorithms

Using visuals also allows us to more effectively plan out the algorithm that will be used to manipulate said data.

For example, let’s say we’re given daily temperatures for a week (visualized below) and we’re asked to find the highest.

Day:   Mon  Tue  Wed  Thu  Fri  Sat  Sun

Temp:   72   75   71   68   70   73   76

With a visual in hand, it is much easier now to plan what we have to do step by step:

  1. Start with Monday's temperature (72) as current maximum
  2. Compare Tuesday (75) to current max (72) → 75 is higher, update max to 75
  3. Compare Wednesday (71) to current max (75) → 71 is not higher, keep 75
  4. Compare Thursday (68) to current max (75) → 68 is not higher, keep 75
  5. Compare Friday (70) to current max (75) → 70 is not higher, keep 75
  6. Compare Saturday (73) to current max (75) → 73 is not higher, keep 75
  7. Compare Sunday (76) to current max (75) → 76 is higher, update max to 76
  8. Return 76 as the maximum
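If you’d like to watch that plan run, here is one possible Java sketch that prints a line for each comparison. The class name WeeklyMax, the method name findMax, and the use of two parallel arrays (one for days, one for temperatures) are assumptions made for this example, not something the CED requires.

```java
public class WeeklyMax {
    // Prints each comparison, then returns the highest temperature.
    // Assumption: days and temps have the same length and at least one entry.
    public static int findMax(String[] days, int[] temps) {
        int max = temps[0]; // Step 1: the first day is the current maximum
        System.out.println("Start with " + days[0] + " (" + max + ") as current max");
        for (int i = 1; i < temps.length; i++) {
            if (temps[i] > max) {
                System.out.println(days[i] + " (" + temps[i] + ") is higher than "
                        + max + ", update max to " + temps[i]);
                max = temps[i];
            } else {
                System.out.println(days[i] + " (" + temps[i] + ") is not higher than "
                        + max + ", keep " + max);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        String[] days = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"};
        int[] temps = {72, 75, 71, 68, 70, 73, 76};
        System.out.println("Maximum: " + findMax(days, temps)); // Maximum: 76
    }
}
```

Each printed line corresponds to one step in the plan, which makes it easy to check the code against the diagram.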

Ok. I think you’ve seen enough content. It’s now time… to practice!

Practice Questions