Loading the data

This week's Tidy Tuesday dataset (2023-09-19) contains information about CRAN packages. It records the cross-package connections between developers, as declared in each package's DESCRIPTION file. Let's begin by loading the data:

pacman::p_load(tidyverse, gt, reactable, reactablefmtr)

cran_20230905 <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-09-19/cran_20230905.csv')

Let's take a peek at the data:

cran_20230905 %>% 
  head(5)

# A tibble: 5 × 67
  Package   Version Priority Depends Imports LinkingTo Suggests Enhances License
  <chr>     <chr>   <chr>    <chr>   <chr>   <chr>     <chr>    <chr>    <chr>  
1 A3        1.0.0   <NA>     R (>= … <NA>    <NA>      randomF… <NA>     GPL (>…
2 AalenJoh… 1.0     <NA>     <NA>    <NA>    <NA>      knitr, … <NA>     GPL (>…
3 AATtools  0.0.2   <NA>     R (>= … magrit… <NA>      <NA>     <NA>     GPL-3  
4 ABACUS    1.0.0   <NA>     R (>= … ggplot… <NA>      rmarkdo… <NA>     GPL-3  
5 abaseque… 0.1.0   <NA>     <NA>    <NA>    <NA>      <NA>     <NA>     GPL-3  
# ℹ 58 more variables: License_is_FOSS <lgl>, ...

There are a lot of columns in the dataset, so we need to do some data cleaning before it is in a format we can use. We select the Imports and Package columns, split each Imports entry into one row per imported package, and strip version constraints and spaces. Finally, we create a tally column so that we can aggregate the total package dependencies:

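packages <- cran_20230905 %>%
  # Keep the imported packages (from) and the importing package (to)
  select(from = Imports, to = Package) %>%
  # Split the comma-separated Imports field into one row per import
  mutate(from = strsplit(from, ",")) %>%
  unnest(from) %>%
  # Strip version constraints such as "(>= 1.0.0)", then any spaces
  mutate(from = gsub("\\s*\\([^\\)]+\\)", "", from)) %>%
  mutate(from = str_replace_all(from, fixed(" "), "")) %>%
  # Tally column: each row counts as one dependency
  mutate(n = 1) %>%
  # Drop rows with missing values (packages with no Imports)
  drop_na()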

packages

# A tibble: 94,281 × 3
   from       to           n
   <chr>      <chr>    <dbl>
 1 magrittr   AATtools     1
 2 dplyr      AATtools     1
 3 doParallel AATtools     1
 4 foreach    AATtools     1
 5 ggplot2    ABACUS       1
 6 shiny      ABACUS       1
 7 httr       abbyyR       1
 8 XML        abbyyR       1
 9 curl       abbyyR       1
10 readr      abbyyR       1
# ℹ 94,271 more rows
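To see what the cleaning steps do to a single entry, here is a minimal base-R sketch. The `imports` string is made up for illustration and is not taken from the dataset; the regular expression is the same one used in the pipeline above.

```r
# Hypothetical Imports field, invented for this example
imports <- "dplyr (>= 1.0.0), ggplot2, rlang (>= 0.4.0)"

# Split on commas: one element per imported package
parts <- strsplit(imports, ",")[[1]]

# Drop version constraints such as " (>= 1.0.0)"
parts <- gsub("\\s*\\([^\\)]+\\)", "", parts)

# Remove any remaining spaces
parts <- gsub(" ", "", parts, fixed = TRUE)

parts
```

Each element of `parts` is now a bare package name, which is what ends up in the from column.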

We are left with three columns: from, to and n. Here, from is the imported package and to is the package that imports it. For instance, AATtools imports magrittr. Summing n for each value of from gives the number of packages that import it, which lets us pick out the top 25 packages:

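n_25 <- packages %>%
  # Count how many packages import each package
  group_by(from) %>%
  summarize(total = sum(n)) %>%
  ungroup() %>%
  # Keep the 25 most-imported packages
  arrange(desc(total)) %>%
  slice_head(n = 25) %>%
  rename(Package = from,
         `Total Cited Dependencies` = total)
n_25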

# A tibble: 25 × 2
   Package  `Total Cited Dependencies`
   <chr>                         <dbl>
 1 stats                          5028
 2 utils                          3138
 3 dplyr                          3004
 4 methods                        2951
 5 ggplot2                        2847
 6 Rcpp                           2425
 7 graphics                       2058
 8 rlang                          1843
 9 magrittr                       1725
10 stringr                        1456
# ℹ 15 more rows
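The same aggregation can be sketched in base R on a toy edge list. The `edges` data frame below is invented for illustration (the `pkgA`, `pkgB` and `pkgC` names are hypothetical); it only mirrors the shape of the from, to and n columns.

```r
# Toy edge list with made-up rows, same columns as the cleaned data
edges <- data.frame(
  from = c("dplyr", "ggplot2", "dplyr", "httr", "dplyr", "ggplot2"),
  to   = c("pkgA", "pkgA", "pkgB", "pkgB", "pkgC", "pkgC"),
  n    = 1,
  stringsAsFactors = FALSE
)

# Sum the tally column per imported package
totals <- aggregate(n ~ from, data = edges, FUN = sum)

# Sort in descending order and keep the top 2
totals <- totals[order(-totals$n), ]
top <- head(totals, 2)
top
```

Because n is always 1, summing it is the same as counting rows per from value, so `table(edges$from)` would give the same totals.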

Data Visualisation

With our data aggregated the way we want it, we can now proceed to visualise it. For this post we are going to explore interactive tables from the reactablefmtr package. Let’s start with a table that shows the totals as data bars:

reactable(
  n_25,
  pagination=FALSE,
  defaultColDef = colDef(
    cell = data_bars(n_25, 
                     round_edges = TRUE,
                     border_style = "solid",
                     border_color = "gold",
                     border_width = ".8px",
                     text_position = "above",
                     number_fmt = scales::comma)
  )
)

Alternatively, we can use a bubble grid to visualise the same pattern:

n_25 %>%
  reactable(
    defaultColDef = colDef(
      align = "center",
      cell = bubble_grid(
        data = .,
        number_fmt = scales::comma,
        min_value = -5000,
        max_value = 6000
      )
    )
  )

That’s it for this post! Just some interactive tables with Tidy Tuesday data!

CRAN Package Authors - reactablefmtr
