Introduction to Pandas with National Park Visitation Data (Exercise)

Published February 26, 2024

These exercises use National Park visitation data from 1979–2024. For more context about the dataset, see the data essay.

Concepts covered:

  • Selecting columns
  • Filtering rows by a condition
  • Aggregation (sum)
  • Comparing summary statistics across groups

Load National Park Visitation data

Code
import pandas as pd

np_data = pd.read_csv("https://raw.githubusercontent.com/melaniewalsh/responsible-datasets-in-context/main/datasets/national-parks/US-National-Parks_RecreationVisits_1979-2024.csv")
np_data.head()

Exercise 1

Select 2 columns from the data. Save this 2-column dataframe to the variable smaller_df.

Code
# Your code here
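
As a reminder of the general pattern (not the answer to this exercise), here is a minimal sketch of column selection on a small made-up dataframe. The names toy_df, toy_smaller, and the columns branch, city, year, and checkouts are hypothetical and do not come from the National Park data; to see the real column names, inspect np_data.columns.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not the National Park dataset)
toy_df = pd.DataFrame({
    "branch": ["Central", "Eastside", "Central", "Eastside"],
    "city": ["Springfield", "Shelbyville", "Springfield", "Shelbyville"],
    "year": [2022, 2022, 2023, 2023],
    "checkouts": [1200, 800, 1500, 950],
})

# A list of column names inside the brackets returns a new dataframe
# containing only those columns, in the order listed
toy_smaller = toy_df[["branch", "checkouts"]]
print(toy_smaller)
```

Note the double brackets: the outer pair indexes the dataframe, and the inner pair is the Python list of column names.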

How does the number of visits to Washington national parks compare to the number of visits to another state's parks?

Exercise 2

Filter the dataframe to keep only the rows for the state of Washington, and save the result to the variable wa_parks.

Code
# Your code here
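
The general filtering pattern can be sketched on a made-up dataframe as follows. Everything here (toy_df, toy_springfield, the city column and its values) is a hypothetical stand-in; in the real data you will need to check which column holds the state and how Washington is written in it.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not the National Park dataset)
toy_df = pd.DataFrame({
    "branch": ["Central", "Eastside", "Central", "Eastside"],
    "city": ["Springfield", "Shelbyville", "Springfield", "Shelbyville"],
    "year": [2022, 2022, 2023, 2023],
    "checkouts": [1200, 800, 1500, 950],
})

# The comparison produces a True/False value for every row
is_springfield = toy_df["city"] == "Springfield"

# Indexing with that True/False series keeps only the rows marked True
toy_springfield = toy_df[is_springfield]
print(toy_springfield)
```

The two steps are often written on one line, as in toy_df[toy_df["city"] == "Springfield"].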

Exercise 3

Calculate the total number of RecreationVisits to Washington by using .sum() on the RecreationVisits column of the filtered dataframe wa_parks.

Code
# Your code here
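
For reference, a minimal sketch of summing one column of a filtered dataframe, again on made-up data. The names toy_df, toy_springfield, and checkouts are hypothetical and stand in for your own dataframe and column.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not the National Park dataset)
toy_df = pd.DataFrame({
    "city": ["Springfield", "Shelbyville", "Springfield", "Shelbyville"],
    "year": [2022, 2022, 2023, 2023],
    "checkouts": [1200, 800, 1500, 950],
})

# Keep only the rows for one city
toy_springfield = toy_df[toy_df["city"] == "Springfield"]

# Select a single column (one pair of brackets gives a Series), then add it up
springfield_total = toy_springfield["checkouts"].sum()
print(springfield_total)  # 1200 + 1500 = 2700
```

Selecting the column before calling .sum() gives a single number rather than a sum for every column in the dataframe.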

Exercise 4

Filter the dataframe to keep only the rows for another state (your choice) and save the result to a variable. Then calculate the total number of RecreationVisits to this state by using .sum().

Code
# Your code here

How does the number of visits to these 2 states compare?
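
One simple way to put two totals side by side is to compute their difference and their ratio. The sketch below does this on made-up data; the dataframe, the city column, and the variable names are all hypothetical, and this is only one of several reasonable ways to make the comparison.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not the National Park dataset)
toy_df = pd.DataFrame({
    "city": ["Springfield", "Shelbyville", "Springfield", "Shelbyville"],
    "year": [2022, 2022, 2023, 2023],
    "checkouts": [1200, 800, 1500, 950],
})

# Total for each group, using the same filter-then-sum pattern twice
springfield_total = toy_df[toy_df["city"] == "Springfield"]["checkouts"].sum()
shelbyville_total = toy_df[toy_df["city"] == "Shelbyville"]["checkouts"].sum()

# Absolute gap between the two totals
difference = springfield_total - shelbyville_total

# How many times larger the first total is than the second
ratio = springfield_total / shelbyville_total

print(f"Springfield: {springfield_total}, Shelbyville: {shelbyville_total}")
print(f"Difference: {difference}, ratio: {ratio:.2f}")
```

When interpreting a comparison like this, keep in mind that the groups may differ in size (for example, the number of parks in each state), so a larger total does not by itself mean each park is busier.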