Solution
Introduction to Pandas with National Park Visitation Data
These exercises use National Park visitation data from 1979–2024. For more context about the dataset, see the data essay.
Concepts covered:
Selecting columns
Filtering rows by a condition
Aggregation (sum)
Comparing summary statistics across groups
Load National Park Visitation data
Code
import pandas as pd
np_data = pd.read_csv("https://raw.githubusercontent.com/melaniewalsh/responsible-datasets-in-context/main/datasets/national-parks/US-National-Parks_RecreationVisits_1979-2024.csv")
np_data.head()
    ParkName     Region  State  Year  RecreationVisits
0  Acadia NP  Northeast     ME  1979           2787366
1  Acadia NP  Northeast     ME  1980           2779666
2  Acadia NP  Northeast     ME  1981           2997972
3  Acadia NP  Northeast     ME  1982           3572114
4  Acadia NP  Northeast     ME  1983           4124639
Exercise 1
Select 2 columns from the data. Save this 2-column dataframe to the variable smaller_df.
Code
smaller_df = np_data[['Year', 'RecreationVisits']]
smaller_df.head()
   Year  RecreationVisits
0  1979           2787366
1  1980           2779666
2  1981           2997972
3  1982           3572114
4  1983           4124639
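The double brackets in the solution matter: a list of column names returns a dataframe, while a single name returns a Series. The sketch below illustrates this on a toy frame. `sample` is a hypothetical four-row stand-in for np_data, built from values that appear in the outputs on this page, so it runs without downloading the CSV.

```python
import pandas as pd

# Hypothetical stand-in for np_data (assumption: only three of its columns,
# four rows copied from the outputs shown on this page)
sample = pd.DataFrame({
    "State": ["ME", "ME", "WA", "WA"],
    "Year": [1979, 1980, 1979, 1980],
    "RecreationVisits": [2787366, 2779666, 1516703, 1268256],
})

# Single brackets with one name return a Series (one column)
one_col = sample["Year"]

# Double brackets (a list of names) return a DataFrame
two_cols = sample[["Year", "RecreationVisits"]]

print(type(one_col).__name__)   # Series
print(type(two_cols).__name__)  # DataFrame
```

Either form works for a single column, but only the list form keeps the result as a dataframe with named columns.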
Question: How does the number of visits to Washington's national parks compare to that of another state?
Exercise 2
Filter the dataframe for only values in the state of Washington and save to the variable wa_parks.
Code
wa_parks = np_data[np_data['State'] == 'WA']
wa_parks.head()
              ParkName        Region  State  Year  RecreationVisits
1913  Mount Rainier NP  Pacific West     WA  1979           1516703
1914  Mount Rainier NP  Pacific West     WA  1980           1268256
1915  Mount Rainier NP  Pacific West     WA  1981           1233671
1916  Mount Rainier NP  Pacific West     WA  1982           1007300
1917  Mount Rainier NP  Pacific West     WA  1983           1106306
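The filter in Exercise 2 works in two steps: the comparison produces a Series of True/False values, and indexing with that Series keeps only the True rows. A minimal sketch of those steps, again on a hypothetical toy frame `sample` (an assumption for illustration, not the full dataset):

```python
import pandas as pd

# Hypothetical stand-in for np_data (assumption: four rows copied from
# the outputs shown on this page)
sample = pd.DataFrame({
    "State": ["ME", "ME", "WA", "WA"],
    "Year": [1979, 1980, 1979, 1980],
    "RecreationVisits": [2787366, 2779666, 1516703, 1268256],
})

# Step 1: the comparison yields one boolean per row
is_wa = sample["State"] == "WA"
print(is_wa.tolist())  # [False, False, True, True]

# Step 2: indexing with the boolean Series keeps rows where it is True
wa_sample = sample[is_wa]

# The surviving rows keep their original index labels
print(wa_sample.index.tolist())  # [2, 3]
```

Because filtered rows keep their original labels, the wa_parks output starts at index 1913 rather than 0.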
Exercise 3
Calculate the sum total of RecreationVisits to Washington by using .sum() on the smaller dataframe wa_parks.
Code
wa_parks['RecreationVisits'].sum()
Exercise 4
Filter the dataframe for only values in another state (your choice) and save to a variable. Calculate the sum total of RecreationVisits to this state by using .sum().
Code
ca_parks = np_data[np_data['State'] == 'CA']
ca_parks['RecreationVisits'].sum()
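Since Exercises 2 through 4 repeat the same filter-then-sum pattern, it can be wrapped in a small function. This is only a sketch: `total_visits` is a hypothetical helper name, not part of pandas or of the exercise, and `sample` is a toy stand-in for np_data.

```python
import pandas as pd

# Hypothetical stand-in for np_data (assumption: four rows copied from
# the outputs shown on this page)
sample = pd.DataFrame({
    "State": ["ME", "ME", "WA", "WA"],
    "Year": [1979, 1980, 1979, 1980],
    "RecreationVisits": [2787366, 2779666, 1516703, 1268256],
})


def total_visits(df, state):
    """Sum RecreationVisits over the rows whose State equals `state`."""
    state_rows = df[df["State"] == state]
    return state_rows["RecreationVisits"].sum()


print(total_visits(sample, "WA"))  # 2784959
print(total_visits(sample, "ME"))  # 5567032
```

With the real data loaded, the same call would be total_visits(np_data, 'WA'). A state code with no matching rows gives a total of 0 rather than an error.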
Question: How do the visit totals for these 2 states compare to one another?
Code
# Positive: Washington had more total visits; negative: California had more
wa_parks['RecreationVisits'].sum() - ca_parks['RecreationVisits'].sum()
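An alternative to filtering each state separately, which the exercises above do not use, is to compute every state's total at once with groupby. The sketch below assumes a toy frame `sample` standing in for np_data. The grouping column and summed column match the names used in the exercises.

```python
import pandas as pd

# Hypothetical stand-in for np_data (assumption: four rows copied from
# the outputs shown on this page)
sample = pd.DataFrame({
    "State": ["ME", "ME", "WA", "WA"],
    "Year": [1979, 1980, 1979, 1980],
    "RecreationVisits": [2787366, 2779666, 1516703, 1268256],
})

# One total per state, returned as a Series indexed by state code
totals = sample.groupby("State")["RecreationVisits"].sum()

# Compare two states by looking up their totals
difference = totals["ME"] - totals["WA"]
print(difference)  # 2782073
```

On the full dataset, totals['WA'] - totals['CA'] should give the same number as the subtraction in the solution, and totals.sort_values() would rank all states at once.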