Exploring Energy Capacity and Prices

May 3, 2024

Loading the data

This post revisits an older Tidy Tuesday dataset on capacity and costs for solar and wind energy.

pacman::p_load(tidyverse)
capacity <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-03/capacity.csv')
wind <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-03/wind.csv')
solar <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-03/solar.csv')
average_cost <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-03/average_cost.csv')

Cleaning the data

Let’s take a look at the average cost data:

average_cost %>% 
  head(5)
    # A tibble: 5 × 4
       year gas_mwh solar_mwh wind_mwh
      <dbl>   <dbl>     <dbl>    <dbl>
    1  2009    57.6     168.      74.3
    2  2010    56.8     140.      65.5
    3  2011    46.0     111.      47.8
    4  2012    44.5      84.1     40.1
    5  2013    43.2      68.9     28.7

To make visualisation easier, we convert the data to long format using the pivot_longer function:

average_cost %>%
  pivot_longer(cols = gas_mwh:wind_mwh, names_to = "type", values_to = "cost") %>%
  # Flag solar so it can get its own line type in the plot
  mutate(flag = if_else(type == "solar_mwh", "top", "bottom")) -> average_cost # Overwrites average_cost with the long version

average_cost %>% 
  select(-flag)
    # A tibble: 39 × 3
        year type       cost
       <dbl> <chr>     <dbl>
     1  2009 gas_mwh    57.6
     2  2009 solar_mwh 168. 
     3  2009 wind_mwh   74.3
     4  2010 gas_mwh    56.8
     5  2010 solar_mwh 140. 
     6  2010 wind_mwh   65.5
     7  2011 gas_mwh    46.0
     8  2011 solar_mwh 111. 
     9  2011 wind_mwh   47.8
    10  2012 gas_mwh    44.5
    # ℹ 29 more rows

The wrangled data is now far more suitable for visualisation.
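
As a small, self-contained illustration of the reshaping step, the sketch below applies pivot_longer to the first two rows of the head() output shown earlier (values rounded as printed). The names `toy_wide` and `toy_long` are hypothetical and exist only for this example.

```r
library(tidyr)
library(tibble)

# First two rows of the head() output above, rounded as printed
toy_wide <- tribble(
  ~year, ~gas_mwh, ~solar_mwh, ~wind_mwh,
  2009,  57.6,     168,        74.3,
  2010,  56.8,     140,        65.5
)

# One row per year and energy source instead of one column per source
toy_long <- pivot_longer(
  toy_wide,
  cols = gas_mwh:wind_mwh,
  names_to = "type",
  values_to = "cost"
)

toy_long
```

Each wide row becomes three long rows, one per energy source, which is why the long version of average_cost has 39 rows.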

Visualise the data

Let’s plot the costs over time for each energy source.

average_cost %>%
  ggplot(aes(year, cost, color = type)) +
  # linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
  geom_line(aes(linetype = flag), linewidth = 1.5, alpha = 0.8) +
  scale_x_continuous(breaks = seq(2008, 2022, by = 2)) +
  scale_y_continuous(labels = scales::dollar_format(), breaks = seq(0, 175, by = 25)) +
  expand_limits(y = 0) +
  # Solar ("top") is drawn solid; gas and wind ("bottom") are dashed
  scale_linetype_manual(values = c(bottom = "dashed", top = "solid")) +
  guides(linetype = "none") +
  labs(
    x = "",
    y = "Cost",
    title = "Energy costs over the years",
    subtitle = "Solar has experienced the most drastic decrease in costs",
    caption = "Data from Tidy Tuesday: 2022-05-03") +
  tidyquant::theme_tq() +
  tidyquant::scale_color_tq() +
  theme(legend.title = element_blank()) -> plot1

plot1
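
The change in cost can also be put into numbers by comparing each source's first and last recorded cost. The following is a minimal sketch: `cost_change` is a hypothetical helper, and `toy_cost` contains made-up values that only show the shape of the calculation, not the real data.

```r
library(dplyr)
library(tibble)

# Made-up long-format costs, only to show the shape of the calculation
toy_cost <- tribble(
  ~year, ~type,       ~cost,
  2009,  "gas_mwh",   60,
  2009,  "solar_mwh", 160,
  2021,  "gas_mwh",   45,
  2021,  "solar_mwh", 40
)

# Hypothetical helper: change between the first and last recorded year per type
cost_change <- function(df) {
  df %>%
    filter(!is.na(cost)) %>%
    group_by(type) %>%
    arrange(year, .by_group = TRUE) %>%
    summarise(
      first_cost = first(cost),
      last_cost  = last(cost),
      abs_change = last_cost - first_cost,   # in $/MWh
      pct_change = abs_change / first_cost,  # as a fraction of the first cost
      .groups = "drop"
    )
}

cost_change(toy_cost)
```

Calling `cost_change(average_cost)` on the long-format data from the previous section should give one row per energy source. Absolute and relative change can rank the sources differently, so it is worth looking at both.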

We can see that solar has experienced the most drastic decrease in costs over the years covered by the data. Let’s get a sense of the spread of prices for solar:

solar %>% 
  ggplot(
    aes(solar_mwh)
  )+
  geom_histogram(color = 'white',
                 fill ='midnightblue')+
  geom_rug()+
  labs(
    title = "Distribution of price for solar",
    x = "",
    y= "Count"
  ) +
  scale_x_continuous(labels = scales::dollar_format(suffix = "/MWh")) +
  # scale_y_continuous(expand = c(0,0), limits = c(0,50))+
  tidyquant::theme_tq() -> plot2

plot2
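
A histogram gives a visual impression of the spread; a short numeric summary can complement it. The sketch below uses a hypothetical helper, `price_spread`, and a made-up vector `toy_prices`, so the numbers it prints are illustrative only.

```r
# Hypothetical helper: range, median and share of values inside [lower, upper]
price_spread <- function(x, lower, upper) {
  x <- x[!is.na(x)]
  c(
    min = min(x),
    median = median(x),
    max = max(x),
    share_in_band = mean(x >= lower & x <= upper)
  )
}

# Made-up prices, only to show the output shape
toy_prices <- c(12, 30, 45, 80, 95, 150, 310)
price_spread(toy_prices, lower = 12, upper = 100)
```

On the real data the call would be `price_spread(solar$solar_mwh, lower = 12, upper = 100)`, where the band limits are a choice made for this illustration.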

Prices for solar energy range from $12/MWh to just above $300/MWh, with the majority between $12 and $100. Let’s also try to understand the relationship between price ($/MWh) and solar capacity.

solar %>% 
  ggplot(aes(solar_capacity,solar_mwh, color=solar_capacity))+
  geom_jitter()+
  geom_smooth(color='midnightblue')+
  tidyquant::theme_tq()+
  theme(legend.position = "none")+
  scale_color_continuous(name = 'Solar Capacity') +
  scale_y_continuous(labels = scales::dollar_format(suffix = "/MWh"))+
  labs(
    x="Solar Capacity",
    y="Solar Projected Price",
    title = "Relationship between solar\nprojected price and capacity") -> plot3

plot3

The relationship is not very strong, but overall the price decreases as solar capacity increases, up to a point where it seems to stabilise.
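
One way to attach a number to the strength of this relationship is a rank correlation, which only assumes a monotonic trend rather than a straight line. The sketch below uses made-up capacity and price pairs in `toy_solar` (a hypothetical object) purely to show the call.

```r
# Made-up capacity/price pairs with a decreasing, flattening pattern
toy_solar <- data.frame(
  solar_capacity = c(10, 20, 50, 100, 200, 400),
  solar_mwh      = c(200, 150, 90, 60, 55, 50)
)

# Spearman's rho measures monotonic association between the two columns
rho <- cor(
  toy_solar$solar_capacity,
  toy_solar$solar_mwh,
  method = "spearman",
  use = "complete.obs"  # drop pairs with missing values
)
rho
```

With the real data, the same call on `solar$solar_capacity` and `solar$solar_mwh` would return a value between -1 and 1; values close to 0 would support the reading that the relationship is weak.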

Bring it all together

Let’s bring all of these visualisations together using the patchwork package:

library(patchwork)

(plot2 / plot3) | plot1
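
For readers new to patchwork, here is a minimal sketch of the same layout using placeholder plots built from the built-in mtcars data. The objects `p_a`, `p_b`, `p_c` and `combined` are hypothetical names used only for this example.

```r
library(ggplot2)
library(patchwork)

# Placeholder plots built from the built-in mtcars data
p_a <- ggplot(mtcars, aes(mpg)) + geom_histogram(bins = 10)
p_b <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p_c <- ggplot(mtcars, aes(hp, mpg)) + geom_point()

# "/" stacks plots vertically, "|" places them side by side
combined <- (p_a / p_b) | p_c

# plot_annotation() adds a shared title to the combined figure
combined + plot_annotation(title = "Two stacked plots next to a third")
```

The parentheses are not strictly required, because `/` binds more tightly than `|` in R, but they make the intended grouping explicit.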