These exercises use National Park visitation data from 1979–2024. For more context about the dataset, see the data essay.
Concepts covered:
- Groupby with aggregation (mean, count)
- Counting distinct values within groups
- Descriptive statistics by category
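As a quick warm-up for the first of these concepts, here is a minimal sketch of grouping with aggregation. It uses a small made-up table rather than the National Parks data, so the column names (`region`, `sales`) are hypothetical and will not match the columns you see in the exercises.

```python
import pandas as pd

# Toy data for illustration only; these column names are hypothetical
# and are not taken from the National Parks dataset.
toy = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "sales": [10, 20, 30, 40, 50],
})

# Mean of one column within each group.
# reset_index() turns the group labels back into a regular column.
avg_sales = toy.groupby("region")["sales"].mean().reset_index()

# Several aggregations at once: one output column per statistic.
summary = toy.groupby("region")["sales"].agg(["mean", "count"])

print(avg_sales)
print(summary)
```

The same pattern (pick a grouping column, pick a value column, pick a statistic) should carry over to the exercises once you have checked the real column names with something like `np_data.columns`.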
Load the data
import pandas as pd
np_data = pd.read_csv("https://raw.githubusercontent.com/melaniewalsh/responsible-datasets-in-context/main/datasets/national-parks/US-National-Parks_RecreationVisits_1979-2024.csv")
Exercise 1
What is the average number of visits for each state?
Save as avg_state_visits and then view the resulting dataframe.
Your code here
Discuss/consider: Which state has the highest average visits, and which has the lowest? What patterns or surprises do you notice?
Exercise 2
What is the average number of visits for each National Park?
Save as avg_park_visits and then view the resulting dataframe.
Your code here
Discuss/consider: Which National Park has the highest average visits, and which has the lowest? What patterns or surprises do you notice?
Exercise 3
How many National Parks are there in each state?
Save your answer as distinct_parks.
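A hint on counting distinct values: when each item appears in many rows (for example, once per year), counting rows per group is not the same as counting distinct items per group. The sketch below shows the difference on made-up data; the column names (`city`, `shop`, `year`) are hypothetical and are not the columns of the National Parks dataset.

```python
import pandas as pd

# Toy data for illustration only: one row per shop per year,
# so the same shop can appear more than once.
toy = pd.DataFrame({
    "city": ["Austin", "Austin", "Austin", "Boston", "Boston"],
    "shop": ["A", "A", "B", "C", "C"],
    "year": [2023, 2024, 2024, 2023, 2024],
})

# count() tallies rows in each group, repeats included.
rows_per_city = toy.groupby("city")["shop"].count()

# nunique() tallies distinct values in each group.
shops_per_city = toy.groupby("city")["shop"].nunique()

print(rows_per_city)
print(shops_per_city)
```

Here Austin has three rows but only two distinct shops, which is why `nunique()` is usually the better fit for "how many different things are in each group" questions.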
Your code here
Discuss/consider: Which state has the most National Parks, and which has the fewest? What patterns or surprises do you notice?