• Matt Baldwin

A forest plot in ggplot2

After conducting a meta-analysis, it is useful to display the effect sizes in a forest plot. I use the metafor package in R to conduct the analysis, and it has a built-in forest() function for plotting the model. Here's an example from a meta-analysis with subgroups:


Looks great, right? Here's the code:

forest() code, taken from http://www.metafor-project.org/doku.php/plots:forest_plot_with_subgroups, by Wolfgang Viechtbauer

For an R noob like myself, this looks scary. So I decided to write some simple ggplot2 code for a forest plot.


Step 1

First, you'll need to conduct the meta-analysis in R. The data should include an effect size column and a variance column. If you are conducting a subgroups meta-regression then you also need a column with your grouping variable. I also have a column with study labels for plotting later. Mine looks like this:
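To make that layout concrete, here is a minimal sketch of a data frame with those four columns. The column names match the code used later in this post, but the study names and numbers are placeholders I made up for illustration; they are not the data from my analysis.

```r
# Placeholder values, invented only to illustrate the column layout
my_data <- data.frame(
  Study      = c("Study A", "Study B", "Study C", "Study D"),  # study labels
  g          = c(-0.40, -0.15, -0.80, -0.55),                  # effect sizes
  var        = c(0.04, 0.09, 0.05, 0.06),                      # sampling variances
  Comparison = c("Similarities", "Similarities",
                 "Differences", "Differences"),                # grouping variable
  stringsAsFactors = FALSE
)

# Check the column types before running the meta-analysis
str(my_data)
```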

For the plot, you'll need the overall effect from the RE model without subgroups (all studies), and the overall effects from separate RE models for each subgroup. In my case, I will have three meta-analytic effect sizes to plot along with all of the individual effect sizes.


As a reminder, the metafor code looks like this:

library(metafor)

#g = Hedges' g effect size column from data
#var = variance of effect size column from data

#all studies
model1 <- rma(g, var, data = my_data, method="ML")
model1

#test for moderation by Comparison
model2 <- rma(g, var, mods = ~Comparison, data = my_data, method="ML")
model2

#RE model within Similarities condition
model_sim <- rma(g, var, data = subset(my_data, Comparison=="Similarities"), method="ML")
model_sim

#RE model within Differences condition
model_diff <- rma(g, var, data = subset(my_data, Comparison=="Differences"), method="ML")
model_diff
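If you would rather not copy the pooled effects by hand, something like the sketch below could pull them out of each fitted model. The helper name pooled_row is my own invention, not part of metafor. It relies on the fact that a fitted rma object stores the estimate in `b` and the CI bounds in `ci.lb` and `ci.ub`, and it assumes an intercept-only model (so not model2). To keep the example runnable without metafor, it is demonstrated on a stand-in list with made-up numbers.

```r
# Hypothetical helper: turn one fitted model into one row of plotting data.
# Assumes an intercept-only model, so the first coefficient is the pooled effect.
pooled_row <- function(model, label) {
  data.frame(
    Study      = paste0("Overall (", label, ")"),
    g          = as.numeric(model$b)[1],      # pooled effect size
    ci_l       = as.numeric(model$ci.lb)[1],  # 95% CI lower bound
    ci_u       = as.numeric(model$ci.ub)[1],  # 95% CI upper bound
    Comparison = label,
    stringsAsFactors = FALSE
  )
}

# Stand-in for a fitted model; the numbers are invented for illustration
fake_model <- list(b = matrix(-0.5), ci.lb = -0.9, ci.ub = -0.1)
pooled_row(fake_model, "All")

# With the real fits from above, the three pooled rows would be:
# overall <- rbind(pooled_row(model1, "All"),
#                  pooled_row(model_sim, "Similarities"),
#                  pooled_row(model_diff, "Differences"))
```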

Step 2

After you've calculated the overall effect sizes (three in total), it's time to add them to a new dataset for plotting. For the plot, the data file will need:

  1. An index variable for ordering effects on the plot

  2. Study label

  3. Effect size

  4. Variance (if you want to size the points by the inverse variance)

  5. 95% CI lower bound

  6. 95% CI upper bound

  7. Subgroup label

I manually added the meta-analytic effects to a new .csv file and used the effectsize package in R to calculate the 95% CIs, which I also added manually. One could choose to automate this process, for instance by creating a data frame in R with all of the relevant data, adding the effect sizes directly from the RE models, and calculating CIs automatically from the variances and sample sizes (or standard errors). However you choose to do it is up to you! Here is what my plotting data look like:
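As a rough sketch of the automated route, the code below builds the plotting data in base R. Note that it swaps in a simple normal-approximation CI (g plus or minus 1.96 standard errors, with the standard error taken as the square root of the variance) instead of the effectsize package I actually used, so the bounds may differ slightly. All of the values are placeholders invented for illustration, and the pooled row is typed in by hand where you would normally take it from the fitted RE model.

```r
# Placeholder study-level data, invented for illustration only
studies <- data.frame(
  Study      = c("Study A", "Study B", "Study C"),
  g          = c(-0.40, -0.15, -0.80),
  var        = c(0.04, 0.09, 0.05),
  Comparison = c("Similarities", "Similarities", "Differences"),
  stringsAsFactors = FALSE
)

# Normal-approximation 95% CI: g +/- z * SE, where SE = sqrt(var)
z <- qnorm(0.975)
studies$ci_l <- studies$g - z * sqrt(studies$var)
studies$ci_u <- studies$g + z * sqrt(studies$var)

# Placeholder pooled effect; in practice this comes from the fitted RE model
overall <- data.frame(
  Study = "Overall", g = -0.45, var = NA_real_, Comparison = "All",
  ci_l = -0.80, ci_u = -0.10,
  stringsAsFactors = FALSE
)

# Stack the rows and add the index used to order effects on the y axis
my_plot_data <- rbind(studies, overall)
my_plot_data$Index <- seq_len(nrow(my_plot_data))
my_plot_data
```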


Step 3

Now we're ready to plot. For now, this code plots the bare minimum: just the effect sizes and confidence intervals. There is a way to add other relevant information to the plot, and I will get to that in a separate post. But for now, here is the ggplot2 code for a forest plot:

library(ggplot2)

#just some code to format the x-axis label
xname <- expression(paste("Hedges' ", italic("g")))

#setting up the basic plot
p <- ggplot(data=my_plot_data, aes(y=Index, x=g, xmin=ci_l, xmax=ci_u))+ 

#this adds the effect sizes to the plot
geom_point()+ 

#this changes the features of the overall effects
#one could resize, reshape, or recolor these points if desired
geom_point(data=subset(my_plot_data, Comparison=="All"), color="Black", size=2)+ 

#adds the CIs
geom_errorbarh(height=.1)+

#sets the scales
#note that I reverse the y axis to correctly order the effect
#sizes based on my index variable
scale_x_continuous(limits=c(-2.5,1), breaks = seq(-2.5, 1, by = 0.5), name=xname)+
scale_y_continuous(name = "", breaks=1:19, labels = my_plot_data$Study, trans="reverse")+

#adding a vertical line at the effect = 0 mark
geom_vline(xintercept=0, color="black", linetype="dashed", alpha=.5)+

#faceting based on my subgroups
facet_grid(Comparison~., scales= "free", space="free")+

#thematic stuff
ggtitle("Target Effects")+
theme_minimal()+
theme(text=element_text(family="Times",size=18, color="black"))+
theme(panel.spacing = unit(1, "lines"))

p
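One optional variation, following up on the idea of sizing the points by the inverse variance: the sketch below maps point size to each study's inverse-variance weight. The data are placeholders I invented for illustration, and the object names demo and p_weighted are my own. Keep in mind that the weights in a random-effects model also include the between-study variance, so plain 1/var weights are a simplification.

```r
library(ggplot2)

# Placeholder data, invented for illustration only
demo <- data.frame(
  Index = 1:3,
  Study = c("Study A", "Study B", "Study C"),
  g     = c(-0.40, -0.15, -0.80),
  var   = c(0.04, 0.09, 0.05)
)
demo$ci_l <- demo$g - qnorm(0.975) * sqrt(demo$var)
demo$ci_u <- demo$g + qnorm(0.975) * sqrt(demo$var)

# Inverse-variance weights, expressed as a percentage of the total
demo$weight <- 100 * (1 / demo$var) / sum(1 / demo$var)

p_weighted <- ggplot(demo, aes(y = Index, x = g, xmin = ci_l, xmax = ci_u)) +
  # CIs first, so the points are drawn on top of them
  geom_errorbarh(height = 0.1) +
  # larger squares for more precise studies
  geom_point(aes(size = weight), shape = 15) +
  scale_size_continuous(range = c(2, 6), name = "Weight (%)") +
  scale_y_reverse(breaks = demo$Index, labels = demo$Study, name = "") +
  geom_vline(xintercept = 0, linetype = "dashed", alpha = 0.5) +
  theme_minimal()

p_weighted
```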

And that's it! There is a lot of room to modify and add to this code as well (e.g., changing colors and shapes based on groups, sizing the points based on study weights, adding more subgroups to the faceting, adding annotations for data points, etc.). You can save the plot as a high-quality .pdf with ggsave:

ggsave("myplot.pdf", plot = p, width = 10, height = 8, dpi = 300)

And the final product looks like this:


I hope this is useful, and happy plotting!



©2019 by Matt Baldwin.