These exercises explore checkout data from the Seattle Public Library for the Top 500 “Greatest” Novels — the novels most widely held in libraries according to OCLC. For more context about the dataset, see the data essay.
Concepts covered:
Groupby and aggregation (sum of checkouts)
Sorting and ranking (top N values)
Time series line plots (monthly checkouts over time)
String filtering (finding titles that contain a keyword)
Pivot tables
Correlation matrices and heatmaps — a correlation coefficient measures how closely two variables move together, ranging from -1 (perfect inverse relationship) to 1 (perfect positive relationship), with 0 meaning no relationship
Load the data
Code
import pandas as pdimport plotly.express as pxspl_df = pd.read_csv("https://responsible-datasets-in-context.s3.us-west-2.amazonaws.com/top_500_spl_df.csv")spl_df.head()
/tmp/ipykernel_8224/3933414714.py:4: DtypeWarning:
Columns (12) have mixed types. Specify dtype option on import or set low_memory=False.
Unnamed: 0
usageclass
checkouttype
materialtype
checkoutyear
checkoutmonth
checkouts
title
subjects
creator
...
gr_num_ratings
gr_num_reviews
gr_avg_rating_rank
gr_num_ratings_rank
oclc_owi
author_viaf
gr_url
wiki_url
pg_eng_url
pg_orig_url
0
0
Physical
Horizon
BOOK
2006
1
1
The merry adventures of Robin Hood / Howard Py...
Robin Hood Legendary character Legends Juvenil...
Pyle, Howard, 1853-1911
...
64,249
1,428
166
370
4.922740e+05
12428253
https://www.goodreads.com/book/show/1277190.Th...
https://en.wikipedia.org/wiki/The_Merry_Advent...
https://www.gutenberg.org/cache/epub/10148/pg1...
NaN
1
1
Physical
Horizon
BOOK
2006
1
5
King of the Wind; illus. by Wesley Dennis.
Horses Juvenile fiction, Horses Fiction
Henry, Marguerite, 1902-1997
...
26,436
920
67
446
1.025728e+08
91717300
https://www.goodreads.com/book/show/423156.Kin...
https://en.wikipedia.org/wiki/King_of_the_Wind
NA_not-pub-domain
NaN
2
2
Physical
Horizon
BOOK
2006
1
7
A painted house / a novel by John Grisham.
Boys Arkansas Fiction, Cotton growing Arkansas...
Grisham, John
...
93,602
4,927
423
331
6.890420e+05
7449742
https://www.goodreads.com/book/show/5360.A_Pai...
https://en.wikipedia.org/wiki/A_Painted_House
NA_not-pub-domain
NaN
3
3
Physical
Horizon
BOOK
2006
1
1
Don Quijote de la Mancha / Miguel de Cervantes...
Cervantes Saavedra Miguel de 1547 1616 Don Qui...
Cervantes Saavedra, Miguel de, 1547-1616
...
269,435
12,053
318
211
1.810748e+09
17220427
https://www.goodreads.com/book/show/3836.Don_Q...
https://en.wikipedia.org/wiki/Don_Quixote
https://www.gutenberg.org/cache/epub/996/pg996...
https://www.gutenberg.org/cache/epub/2000/pg20...
4
4
Physical
Horizon
BOOK
2006
1
3
Sula.
African Americans Juvenile fiction, Friendship...
Morrison, Toni
...
97,287
7,449
207
324
1.812330e+05
109406177
https://www.goodreads.com/book/show/11346.Sula...
https://en.wikipedia.org/wiki/Sula_(novel)
NA_not-pub-domain
NaN
5 rows × 46 columns
Exercise 1
Find the top 10 authors and top 10 books by total checkouts in the SPL Top 500 dataset. Display them as styled tables.