2024-06-05
Session 2 - Shake It Off: Mastering Data Tables with Taylor Swift’s Tracks © 2024 by Raymond Balise and Catalina Canizares is licensed under CC BY-NC-ND 4.0
This material is freely available under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Some sections are based on content from other presentations, which are credited at the end of this presentation.
For more information on this license, please visit: Creative Commons License
table1 to make tables
gt and gtsummary to make complex tables

Tables are useful to show the exact values of your data or estimates.
They are not the best choice when you have a lot of data to show or need to fit the data into a compact space.
They are not usually intended to give a quick, visual representation of data.











gt, kableExtra
flextable, huxtable
DT, reactable
Not all packages support all R Markdown output formats.


# A tibble: 240 × 29
album_name ep album_release track_number track_name artist featuring
<chr> <lgl> <date> <int> <chr> <chr> <chr>
1 Taylor Swift FALSE 2006-10-24 1 Tim McGraw Taylo… <NA>
2 Taylor Swift FALSE 2006-10-24 2 Picture To Bu… Taylo… <NA>
3 Taylor Swift FALSE 2006-10-24 3 Teardrops On … Taylo… <NA>
4 Taylor Swift FALSE 2006-10-24 4 A Place In Th… Taylo… <NA>
5 Taylor Swift FALSE 2006-10-24 5 Cold As You Taylo… <NA>
6 Taylor Swift FALSE 2006-10-24 6 The Outside Taylo… <NA>
7 Taylor Swift FALSE 2006-10-24 7 Tied Together… Taylo… <NA>
8 Taylor Swift FALSE 2006-10-24 8 Stay Beautiful Taylo… <NA>
9 Taylor Swift FALSE 2006-10-24 9 Should've Sai… Taylo… <NA>
10 Taylor Swift FALSE 2006-10-24 10 Mary's Song (… Taylo… <NA>
# ℹ 230 more rows
# ℹ 22 more variables: bonus_track <lgl>, promotional_release <date>,
# single_release <date>, track_release <date>, danceability <dbl>,
# energy <dbl>, key <int>, loudness <dbl>, mode <int>, speechiness <dbl>,
# acousticness <dbl>, instrumentalness <dbl>, liveness <dbl>, valence <dbl>,
# tempo <dbl>, time_signature <int>, duration_ms <int>, explicit <lgl>,
# key_name <chr>, mode_name <chr>, key_mode <chr>, lyrics <list>
| Name | taylor_album_songs |
| Number of rows | 240 |
| Number of columns | 29 |
| _______________________ | |
| Column type frequency: | |
| character | 7 |
| Date | 4 |
| list | 1 |
| logical | 3 |
| numeric | 14 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| album_name | 0 | 1.00 | 5 | 29 | 0 | 11 | 0 |
| track_name | 0 | 1.00 | 3 | 68 | 0 | 240 | 0 |
| artist | 3 | 0.99 | 12 | 12 | 0 | 1 | 0 |
| featuring | 217 | 0.10 | 4 | 33 | 0 | 20 | 0 |
| key_name | 3 | 0.99 | 1 | 2 | 0 | 12 | 0 |
| mode_name | 3 | 0.99 | 5 | 5 | 0 | 2 | 0 |
| key_mode | 3 | 0.99 | 7 | 8 | 0 | 19 | 0 |
Variable type: Date
| skim_variable | n_missing | complete_rate | min | max | median | n_unique |
|---|---|---|---|---|---|---|
| album_release | 0 | 1.00 | 2006-10-24 | 2024-04-19 | 2021-11-12 | 11 |
| promotional_release | 228 | 0.05 | 2017-10-20 | 2023-11-29 | 2021-05-31 | 12 |
| single_release | 211 | 0.12 | 2006-06-19 | 2024-04-19 | 2020-01-27 | 29 |
| track_release | 0 | 1.00 | 2006-06-19 | 2024-04-19 | 2021-11-12 | 27 |
Variable type: list
| skim_variable | n_missing | complete_rate | n_unique | min_length | max_length |
|---|---|---|---|---|---|
| lyrics | 0 | 1 | 238 | 4 | 4 |
Variable type: logical
| skim_variable | n_missing | complete_rate | mean | count |
|---|---|---|---|---|
| ep | 0 | 1.00 | 0.00 | FAL: 240 |
| bonus_track | 0 | 1.00 | 0.15 | FAL: 203, TRU: 37 |
| explicit | 3 | 0.99 | 0.14 | FAL: 204, TRU: 33 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| track_number | 0 | 1.00 | 11.98 | 7.31 | 1.00 | 6.00 | 11.00 | 17.00 | 31.00 | ▇▇▆▃▁ |
| danceability | 3 | 0.99 | 0.58 | 0.12 | 0.29 | 0.50 | 0.59 | 0.65 | 0.90 | ▂▅▇▃▁ |
| energy | 3 | 0.99 | 0.56 | 0.18 | 0.13 | 0.42 | 0.56 | 0.70 | 0.93 | ▂▆▇▇▃ |
| key | 3 | 0.99 | 4.39 | 3.54 | 0.00 | 1.00 | 4.00 | 7.00 | 11.00 | ▇▂▂▃▃ |
| loudness | 3 | 0.99 | -7.77 | 2.78 | -15.49 | -9.73 | -7.38 | -5.77 | -1.91 | ▁▃▆▇▂ |
| mode | 3 | 0.99 | 0.90 | 0.30 | 0.00 | 1.00 | 1.00 | 1.00 | 1.00 | ▁▁▁▁▇ |
| speechiness | 3 | 0.99 | 0.06 | 0.05 | 0.02 | 0.03 | 0.04 | 0.06 | 0.52 | ▇▁▁▁▁ |
| acousticness | 3 | 0.99 | 0.34 | 0.33 | 0.00 | 0.03 | 0.21 | 0.67 | 0.97 | ▇▂▂▂▃ |
| instrumentalness | 3 | 0.99 | 0.00 | 0.03 | 0.00 | 0.00 | 0.00 | 0.00 | 0.33 | ▇▁▁▁▁ |
| liveness | 3 | 0.99 | 0.14 | 0.08 | 0.04 | 0.09 | 0.12 | 0.15 | 0.61 | ▇▂▁▁▁ |
| valence | 3 | 0.99 | 0.38 | 0.19 | 0.04 | 0.25 | 0.37 | 0.51 | 0.92 | ▅▇▇▃▁ |
| tempo | 3 | 0.99 | 124.14 | 31.94 | 68.53 | 96.97 | 119.97 | 148.04 | 208.92 | ▆▇▆▅▁ |
| time_signature | 3 | 0.99 | 3.96 | 0.34 | 1.00 | 4.00 | 4.00 | 4.00 | 5.00 | ▁▁▁▇▁ |
| duration_ms | 3 | 0.99 | 237577.34 | 47151.74 | 131907.00 | 210240.00 | 233627.00 | 257773.00 | 613027.00 | ▆▇▁▁▁ |
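The preview and the summary above can presumably be reproduced by printing the data and then summarising it. A minimal sketch, assuming `taylor_album_songs` comes from the `taylor` package and the summary from `skimr::skim()`; neither package is named in the slides, although the `skim_variable` columns are what skimr prints:

```r
# Assumption: the data set ships with the `taylor` package,
# and the summary tables were produced by skimr.
library(taylor)
library(skimr)

# Printing the tibble shows the first 10 rows and the column types
taylor_album_songs

# skim() returns one summary block per column type
# (character, Date, list, logical, numeric)
song_summary <- skim(taylor_album_songs)
song_summary
```

The row count and missing-value counts depend on the installed version of the data, so they may differ from the numbers shown in the slides.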
table1
The simplest way to create a nice table (in my opinion)

| | Overall (N=240) |
|---|---|
| album_name | |
| 1989 (Taylor's Version) | 23 (9.6%) |
| evermore | 17 (7.1%) |
| Fearless (Taylor's Version) | 26 (10.8%) |
| folklore | 17 (7.1%) |
| Lover | 18 (7.5%) |
| Midnights | 26 (10.8%) |
| Red (Taylor's Version) | 30 (12.5%) |
| reputation | 15 (6.3%) |
| Speak Now (Taylor's Version) | 22 (9.2%) |
| Taylor Swift | 15 (6.3%) |
| THE TORTURED POETS DEPARTMENT | 31 (12.9%) |
| energy | |
| Mean (SD) | 0.560 (0.179) |
| Median [Min, Max] | 0.565 [0.131, 0.934] |
| Missing | 3 (1.3%) |
| danceability | |
| Mean (SD) | 0.578 (0.117) |
| Median [Min, Max] | 0.589 [0.292, 0.897] |
| Missing | 3 (1.3%) |
| explicit | |
| Yes | 33 (13.8%) |
| No | 204 (85.0%) |
| Missing | 3 (1.3%) |
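A table like the one above can be sketched with a single `table1()` call. The formula below is an assumption inferred from the rows of the table (album name, energy, danceability, explicit), and the `taylor` package is assumed as the source of the data; the original call is not shown in the slides.

```r
# Assumption: taylor_album_songs comes from the `taylor` package.
library(taylor)
library(table1)

# One-sided formula: every variable listed on the right becomes a block of rows.
# Character and logical columns are summarised as counts (%),
# numeric columns as Mean (SD) and Median [Min, Max].
overall_table <- table1(
  ~ album_name + energy + danceability + explicit,
  data = taylor_album_songs
)
overall_table
```

Missing values get their own "Missing" row per variable, which is why the three songs without audio features show up under energy, danceability, and explicit.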
table1
Let’s explore which album has the most explicit songs
We have to wrangle the data a bit:
Create the table
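The two steps above might look like the sketch below. The names `taylor_no_na` and `explicit_factor` are taken from the code later in these slides; the exact wrangling is an assumption (drop songs with a missing `explicit` value, then turn the logical into a factor), chosen because it gives `FALSE`/`TRUE` column headers and removes the "Missing" rows.

```r
# Assumption: taylor_album_songs comes from the `taylor` package.
library(taylor)
library(dplyr)
library(table1)

# Wrangle: keep songs with a known explicit flag and make a factor to stratify by
taylor_no_na <- taylor_album_songs %>%
  filter(!is.na(explicit)) %>%
  mutate(explicit_factor = factor(explicit))

# Create the table: variables left of `|` are rows, the variable on the right defines columns
explicit_table <- table1(
  ~ album_name + energy + danceability | explicit_factor,
  data = taylor_no_na
)
explicit_table
```

By default `table1()` adds an Overall column after the stratified columns, and percentages are computed within each column.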
| | FALSE (N=204) | TRUE (N=33) | Overall (N=237) |
|---|---|---|---|
| album_name | |||
| 1989 (Taylor's Version) | 22 (10.8%) | 0 (0%) | 22 (9.3%) |
| evermore | 11 (5.4%) | 6 (18.2%) | 17 (7.2%) |
| Fearless (Taylor's Version) | 26 (12.7%) | 0 (0%) | 26 (11.0%) |
| folklore | 12 (5.9%) | 5 (15.2%) | 17 (7.2%) |
| Lover | 18 (8.8%) | 0 (0%) | 18 (7.6%) |
| Midnights | 15 (7.4%) | 9 (27.3%) | 24 (10.1%) |
| Red (Taylor's Version) | 28 (13.7%) | 2 (6.1%) | 30 (12.7%) |
| reputation | 15 (7.4%) | 0 (0%) | 15 (6.3%) |
| Speak Now (Taylor's Version) | 22 (10.8%) | 0 (0%) | 22 (9.3%) |
| Taylor Swift | 15 (7.4%) | 0 (0%) | 15 (6.3%) |
| THE TORTURED POETS DEPARTMENT | 20 (9.8%) | 11 (33.3%) | 31 (13.1%) |
| energy | |||
| Mean (SD) | 0.572 (0.180) | 0.483 (0.155) | 0.560 (0.179) |
| Median [Min, Max] | 0.577 [0.131, 0.934] | 0.462 [0.240, 0.782] | 0.565 [0.131, 0.934] |
| danceability | |||
| Mean (SD) | 0.577 (0.115) | 0.580 (0.128) | 0.578 (0.117) |
| Median [Min, Max] | 0.588 [0.292, 0.897] | 0.604 [0.316, 0.867] | 0.589 [0.292, 0.897] |

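gtsummary
library(gtsummary)  # tbl_summary(), add_p(), modify_header(), modify_caption()

taylor_no_na %>%
  tbl_summary(
    by = explicit_factor,  # one column per level of explicit_factor
    include = c(album_name, energy, danceability),
    label = list(album_name = "Album Name",
                 energy = "Energy",
                 danceability = "Danceability"),
    percent = "row"  # percentages add up to 100% across each row
  ) %>%
  add_p() %>%  # add a p-value column comparing the two groups
  modify_header(label = "") %>%  # blank out the header of the label column
  modify_caption("**Taylor's explicit albums**")

| | FALSE, N = 204¹ | TRUE, N = 33¹ | p-value² |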
|---|---|---|---|
| Album Name | |||
| 1989 (Taylor's Version) | 22 (100%) | 0 (0%) | |
| evermore | 11 (65%) | 6 (35%) | |
| Fearless (Taylor's Version) | 26 (100%) | 0 (0%) | |
| folklore | 12 (71%) | 5 (29%) | |
| Lover | 18 (100%) | 0 (0%) | |
| Midnights | 15 (62%) | 9 (38%) | |
| Red (Taylor's Version) | 28 (93%) | 2 (6.7%) | |
| reputation | 15 (100%) | 0 (0%) | |
| Speak Now (Taylor's Version) | 22 (100%) | 0 (0%) | |
| Taylor Swift | 15 (100%) | 0 (0%) | |
| THE TORTURED POETS DEPARTMENT | 20 (65%) | 11 (35%) | |
| Energy | 0.58 (0.45, 0.71) | 0.46 (0.36, 0.60) | 0.006 |
| Danceability | 0.59 (0.50, 0.66) | 0.60 (0.51, 0.65) | 0.8 |
| ¹ n (%); Median (IQR) | |||
| ² Wilcoxon rank sum test | |||
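With `percent = "row"`, each percentage is taken within an album rather than within an explicit group, which is why these numbers differ from the table1 output earlier. A quick arithmetic check using the evermore counts (11 not explicit, 6 explicit) and the group totals (204 and 33) reported in the tables; the variable names are only for illustration:

```r
# Counts for evermore and the column totals, copied from the tables above
evermore     <- c(not_explicit = 11, explicit = 6)
group_totals <- c(not_explicit = 204, explicit = 33)

# Row percentages: share of evermore's 17 songs in each group (gtsummary table)
row_pct <- round(100 * evermore / sum(evermore))

# Column percentages: share of each group's songs that are on evermore (table1 table)
col_pct <- round(100 * evermore / group_totals, 1)

row_pct  # 65 and 35
col_pct  # 5.4 and 18.2
```

Row percentages answer "what share of this album is explicit?", which is the question these slides ask; column percentages answer "what share of all explicit songs are on this album?".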
gt
library(dplyr)      # mutate()
library(gtsummary)  # tbl_summary(), as_gt()
library(gt)         # tab_style(), cell_text(), cells_body()

taylor_no_na %>%
  mutate(explicit_factor = factor(explicit, labels = c("Not Explicit", "Explicit"))) %>%
  tbl_summary(
    by = explicit_factor,
    include = c(album_name),
    label = list(album_name = "Album Name"),
    percent = "row"
  ) %>%
  modify_header(label = "") %>%
  modify_caption("**Taylor's explicit albums**") %>%
  as_gt() %>%  # convert to a gt table so gt styling functions apply
  # Grey, right-aligned text for both statistic columns
  tab_style(
    style = cell_text(color = "darkgrey", align = "right"),
    locations = cells_body(columns = c(stat_1, stat_2))
  ) %>%
  # Highlight the Explicit cell in the Midnights row
  tab_style(
    style = cell_text(color = "#E5446D", weight = "bold"),
    locations = cells_body(
      columns = c(stat_2),
      rows = label == "Midnights"
    )
  )

| | Not Explicit, N = 204¹ | Explicit, N = 33¹ |
|---|---|---|
| Album Name | ||
| 1989 (Taylor's Version) | 22 (100%) | 0 (0%) |
| evermore | 11 (65%) | 6 (35%) |
| Fearless (Taylor's Version) | 26 (100%) | 0 (0%) |
| folklore | 12 (71%) | 5 (29%) |
| Lover | 18 (100%) | 0 (0%) |
| Midnights | 15 (62%) | 9 (38%) |
| Red (Taylor's Version) | 28 (93%) | 2 (6.7%) |
| reputation | 15 (100%) | 0 (0%) |
| Speak Now (Taylor's Version) | 22 (100%) | 0 (0%) |
| Taylor Swift | 15 (100%) | 0 (0%) |
| THE TORTURED POETS DEPARTMENT | 20 (65%) | 11 (35%) |
| ¹ n (%) | ||
Slides 3 to 16 are an exact copy of Dr. Ray Balise’s Tables lesson for the BST-623 class.