## Code

`np_data <- read.csv("https://raw.githubusercontent.com/melaniewalsh/Neat-Datasets/main/1979-2020-National-Park-Visits-By-State.csv", stringsAsFactors = FALSE)`

Categories: dplyr, exercise, solution

Published: August 1, 2024

What is the average number of visits for *each state*?

Save as `avg_state_visits` and then view the resulting dataframe.

Discuss/consider: Which state has the highest and lowest average visits? What patterns or surprises do you notice?
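One way to approach the discussion question is to sort the grouped summary so the extremes sit at the top and bottom. The sketch below is only an illustration: `toy_visits` is a made-up stand-in (hypothetical numbers) that borrows the column names `State`, `ParkName`, and `RecreationVisits` from the real data, so it runs without downloading the CSV.

```r
library(dplyr)

# Hypothetical toy data: same column names as np_data, invented values
toy_visits <- data.frame(
  State = c("CA", "CA", "UT", "UT", "ME"),
  ParkName = c("Yosemite National Park", "Yosemite National Park",
               "Zion National Park", "Zion National Park",
               "Acadia National Park"),
  RecreationVisits = c(4000, 6000, 2000, 4000, 1000),
  stringsAsFactors = FALSE
)

# Average visits per state, sorted from highest to lowest
toy_avg_state <- toy_visits %>%
  group_by(State) %>%
  summarize(avg_visits = mean(RecreationVisits)) %>%
  arrange(desc(avg_visits))

head(toy_avg_state, 1)  # state with the highest average
tail(toy_avg_state, 1)  # state with the lowest average
```

The same `arrange(desc(avg_visits))` step could presumably be appended to the `avg_state_visits` pipeline on the real data.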

What is the average number of visits for *each National Park*?

Save as `avg_park_visits` and then view the resulting dataframe.

```
`summarise()` has grouped output by 'ParkName'. You can override using the
`.groups` argument.
```
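The message above appears when `summarize()` follows a `group_by()` on more than one column: by default only the last grouping level is dropped, so the result stays grouped by the remaining column. A minimal sketch of the difference, using a made-up `toy_visits` data frame (hypothetical values, real column names) rather than the full dataset:

```r
library(dplyr)

# Hypothetical toy data: same column names as np_data, invented values
toy_visits <- data.frame(
  State = c("CA", "CA", "UT", "UT", "ME"),
  ParkName = c("Yosemite National Park", "Yosemite National Park",
               "Zion National Park", "Zion National Park",
               "Acadia National Park"),
  RecreationVisits = c(4000, 6000, 2000, 4000, 1000),
  stringsAsFactors = FALSE
)

# Default behavior: emits the message and keeps the first grouping column
toy_grouped <- toy_visits %>%
  group_by(ParkName, State) %>%
  summarize(avg_visits = mean(RecreationVisits))
group_vars(toy_grouped)    # the grouping that is still in place

# Explicitly dropping all groups avoids the message
toy_ungrouped <- toy_visits %>%
  group_by(ParkName, State) %>%
  summarize(avg_visits = mean(RecreationVisits), .groups = "drop")
group_vars(toy_ungrouped)  # no grouping columns remain
```

Whether to drop the groups is a judgment call; `.groups = "drop"` is usually the safer choice when later steps such as `arrange()` or `mutate()` should treat the table as a whole.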

Discuss/consider: Which National Park has the highest and lowest average visits? What patterns or surprises do you notice?

How many National Parks are there in *each state*?

Save your answer as `distinct_parks`.

Discuss/consider: Which state has the most and fewest National Parks? What patterns or surprises do you notice?
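Counting parks per state means counting distinct park names, not rows, assuming (as the 1979-2020 file name suggests) that each park appears in several rows, for example one per year. The sketch below contrasts `n()` with `n_distinct()` on a made-up `toy_visits` data frame; the values are hypothetical and only the column names come from the real data.

```r
library(dplyr)

# Hypothetical toy data: parks repeat across rows, as they would across years
toy_visits <- data.frame(
  State = c("CA", "CA", "CA", "UT", "UT"),
  ParkName = c("Yosemite National Park", "Yosemite National Park",
               "Joshua Tree National Park",
               "Zion National Park", "Zion National Park"),
  RecreationVisits = c(4000, 6000, 3000, 2000, 4000),
  stringsAsFactors = FALSE
)

toy_counts <- toy_visits %>%
  group_by(State) %>%
  summarize(
    num_rows  = n(),                  # counts every row, repeats included
    num_parks = n_distinct(ParkName)  # counts each park name once
  ) %>%
  arrange(desc(num_parks))

toy_counts
```

In this toy table the two columns disagree for every state, which is why `n_distinct(ParkName)` rather than `n()` is the count the exercise asks for.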

````
---
title: "DPLYR Groupby with National Park Visitation Data (Solution)"
date: "2024-08-01"
categories: [dplyr, exercise, solution]
format:
  html:
    code-links:
      - text: R Script
        href: NP-Data-Groupby-Solutions.R
        icon: file-code
    code-overflow: wrap
    code-fold: show
editor: visual
df-print: kable
R.options:
  warn: false
code-tools: true
execute:
  eval: true
---

# Load the data

```{r}
np_data <- read.csv("https://raw.githubusercontent.com/melaniewalsh/Neat-Datasets/main/1979-2020-National-Park-Visits-By-State.csv", stringsAsFactors = FALSE)
```

# Load dplyr library

```{r warning=FALSE}
library("dplyr")
```

# Exercise 1

What is the average number of visits for *each state*?
Save as `avg_state_visits` and then view the resulting dataframe.

```{r}
# Group rows by state, then average the visit counts within each group
avg_state_visits <- np_data %>%
  group_by(State) %>%
  summarize(avg_visits = mean(RecreationVisits))
```

Discuss/consider: Which state has the highest and lowest average visits? What patterns or surprises do you notice?

# Exercise 2

What is the average number of visits for *each National Park*?
Save as `avg_park_visits` and then view the resulting dataframe.

```{r}
# Group by park (keeping State alongside it), then average the visit counts
avg_park_visits <- np_data %>%
  group_by(ParkName, State) %>%
  summarize(avg_visits = mean(RecreationVisits))
```

Discuss/consider: Which National Park has the highest and lowest average visits? What patterns or surprises do you notice?

# Exercise 3

How many National Parks are there in *each state*?
Save your answer as `distinct_parks`.

```{r}
# Count each park name once per state
distinct_parks <- np_data %>%
  group_by(State) %>%
  summarize(num_parks = n_distinct(ParkName))
```

Discuss/consider: Which state has the most and fewest National Parks? What patterns or surprises do you notice?
````