Create Line Charts in R Using ggplot2

A Step-by-Step Tutorial

Vivekananda Das
4 min read · Dec 30, 2020

Line charts are frequently used for visualizing longitudinal data.

In this article, following a few super easy steps, we will learn how to create line charts in R using the ggplot2 package.

Data

For this tutorial, I am using Gapminder's data on the per capita GDP of countries over the years. You can download the data from the Gapminder website (linked at the end of this article).

I assume you have: 1) downloaded the data, 2) imported it into R, and 3) named the dataframe g (or whatever you prefer).

Steps

Let’s have a quick look at the dataframe:
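As a rough stand-in for that look, here is a minimal sketch. It assumes the download is a CSV file read with read.csv; the inline text and the names csv_text, g_demo and g_raw are made up for illustration, and the numbers are not Gapminder figures. It also shows where the X prefix comes from: by default, read.csv makes column names syntactically valid, which puts an X in front of names that start with a digit.

```r
# Made-up stand-in for the downloaded file; values are NOT real Gapminder data
csv_text <- "country,1800,1801
India,1050,1060
Nepal,800,805"

# Default import: column names starting with a digit get an "X" prefix
g_demo <- read.csv(text = csv_text)
colnames(g_demo)

# check.names = FALSE keeps the names exactly as they appear in the file
g_raw <- read.csv(text = csv_text, check.names = FALSE)
colnames(g_raw)

# First rows of the wide table: one row per country, one column per year
head(g_demo)
```

With your own file, you would pass its path to read.csv instead of the text argument.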

Before we can create a line chart using this dataframe, we need to make two changes to it:

  1. Remove the X from the year column names (i.e., X1800 should become 1800). This is optional for the chart itself, though the later code assumes it has been done.
  2. Reshape the dataframe so that, besides the first column with the country name, there is one column holding the years (1800, 1801, etc.) and another holding the GDP/capita for those years. We will use the melt function from the reshape2 package to make this change.
#REMEMBER: I named the dataframe "g"
library(reshape2)
library(dplyr)
library(ggplot2)
library(ggthemes)
#Remove "X" from the beginning of column names (sub is vectorized, so no loop is needed)
colnames(g) <- sub('^X', '', colnames(g))
#Change the shape of the dataframe from wide to long [melt from the reshape2 package]
#"country" stays as the identifier; the year columns become "variable" and "value"
g <- melt(g, id.vars = 'country')
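
To see what melt does to the shape of the data, here is a small sketch on a toy table. The names wide and long and all the values are made up for illustration; only the layout mirrors the Gapminder file.

```r
library(reshape2)

# Toy wide table: one row per country, one column per year (made-up values)
wide <- data.frame(
  country = c("India", "Nepal"),
  `1800` = c(1050, 800),
  `1801` = c(1060, 805),
  check.names = FALSE  # keep "1800" and "1801" as column names
)

# Long format: one row per country-year pair
long <- melt(wide, id.vars = "country")
long

# The year column is called "variable" and is stored as a factor
str(long)
```

The two year columns collapse into a single variable column, and their contents move into value. That is why the plotting code maps x to variable and y to value.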

To make the visualization easier, I will select only the eight South Asian countries.

We want to visualize how per capita GDP has changed in these countries since 1800 and how it is projected to change until 2040.

#Select 8 SAARC Countries [filter from dplyr package]
saarc <- c('Afghanistan', 'Pakistan', 'India', 'Sri Lanka', 'Bangladesh', 'Nepal', 'Bhutan', 'Maldives')
g <- g %>% filter(country %in% saarc)
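
A membership test with %in% and a chain of == comparisons joined by | select the same rows. The quick check below uses made-up data (the demo, with_or and with_in names are only for illustration), and it suggests why %in% is easier to keep tidy when the list of countries is long.

```r
library(dplyr)

# Made-up rows for illustration only
demo <- data.frame(
  country = c("India", "Nepal", "Brazil", "Bhutan"),
  value = c(1, 2, 3, 4)
)

# Same selection written two ways
with_or <- demo %>% filter(country == "India" | country == "Nepal" | country == "Bhutan")
with_in <- demo %>% filter(country %in% c("India", "Nepal", "Bhutan"))

# Compare the selected countries
identical(with_or$country, with_in$country)
```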

Amazing! Now, we are ready to make our line chart!

#Creating the initial line chart "p" [ggplot from ggplot2 package]
p<- ggplot(data=g, aes(x=variable, y=value, group=country,color=country)) +
geom_line()
p

Well, it worked, but not quite! ☹️

Notice that we have data for every year from 1800 to 2040, so the year labels on the x-axis overlap and become unreadable. To solve this issue, we will label only one year in every 25 on the x-axis (we are not deleting any year's data! No worries!).

#Making the x-axis more manageable!
#The years are factor levels (text), so the breaks are given as text too
p2<- p + scale_x_discrete(breaks = as.character(seq(1800, 2025, by = 25)))
p2
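
The discrete scale is needed because melt stores the years as a factor. As an alternative sketch (not the approach used in this article), the years can be converted to numbers so that a continuous axis handles the spacing. The toy data and the names long, year and p_num below are assumptions for illustration.

```r
library(ggplot2)

# Toy long-format data with made-up values
long <- data.frame(
  country = rep(c("India", "Nepal"), each = 3),
  variable = factor(rep(c("1800", "1825", "1850"), times = 2)),
  value = c(1000, 1010, 1020, 800, 805, 810)
)

# Go through character first: as.numeric() on a factor returns level codes, not the years
long$year <- as.numeric(as.character(long$variable))

# With a numeric x, no group aesthetic is needed; color splits the lines by country
p_num <- ggplot(long, aes(x = year, y = value, color = country)) +
  geom_line() +
  scale_x_continuous(breaks = seq(1800, 1850, by = 25))
p_num
```

With a numeric axis, ggplot2 picks readable breaks on its own if scale_x_continuous is left out.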

Now, let’s change the labels of the x and y axes and add a title!

p3<- p2 + xlab("Year") + ylab("GDP/Capita, PPP$ Inflation-Adjusted") + ggtitle("Per Capita GDP in South Asian Countries (Historical and Projected)")
p3

Our line chart is ready!

Now, I would like this line chart to look more like the charts in The Economist! The ggthemes package provides a theme for that. Also, I want to center the title of the graph (the title is left-aligned by default).

p4<- p3 + theme_economist_white(gray_bg = FALSE) + theme(plot.title = element_text(hjust = 0.5))
p4

Still, the above chart looks messy! That is because we have too many data points here. One possible solution is to restrict the timeline to the years from 1950 to 2030 and keep only one data point per decade.

#Selecting only specific years (one per decade from 1950 to 2030)
g<- g %>% filter(variable %in% as.character(seq(1950, 2030, by = 10)))
p5<- ggplot(data=g, aes(x=variable, y=value, group=country, color=country)) +
  geom_line(linewidth=1.5) + #on ggplot2 older than 3.4.0, use size=1.5 instead
  xlab("Year") + ylab("GDP/Capita, PPP$ Inflation-Adjusted") + ggtitle("Per Capita GDP in South Asian Countries (Historical and Projected)")
p5 + theme_economist_white(gray_bg = FALSE) + theme(plot.title = element_text(hjust = 0.5))

This chart is easier to read than any of the earlier ones!

If you are interested in learning how to make heatmaps in R, read my previous article:

Data Courtesy

Gapminder: https://www.gapminder.org/data/
