

The scatter plot is one of the most common data visualization methods for understanding the relationship between two quantitative variables. When there is a strong association between two variables, you can easily see the relationship with a scatter plot. However, when the relationship is subtle, it may be tricky to see.

In this post we will see 9 tips to make a better scatter plot with ggplot2 in R to help us understand the relationship between two quantitative variables. Let us load tidyverse and set our ggplot theme to theme_bw(). Unofficially, changing ggplot2's default grey theme is the first trick to make your ggplot2 plots look nicer :-).

Let us use augmented gapminder data for our illustrations. The augmented gapminder data contains an extra variable on CO2 emission for each country. In addition to the six variables from the gapminder data set, we also have CO2 emission values.

Let us first make a simple scatter plot using ggplot2 in R. Here we will use gdpPercap on the x-axis and CO2 emission on the y-axis. The way to make a scatter plot with ggplot2 is simple. We feed the data frame to ggplot2 using the pipe operator and specify the aesthetics of the scatter plot using aes(). The basic aesthetics of a scatter plot are the variables to be plotted, i.e. the x-axis and y-axis variables. After we specify the variables, we add a geom_() layer. The geom_() function for a scatter plot is geom_point(), as we visualize the data as points.

Now we have made our first scatter plot with gdpPercap on the x-axis and CO2 emission on the y-axis. A couple of things strike us when we first look at the scatter plot. First, we do see a linear trend between the variables. However, that trend seems to be dominated by the outlier data points. Another thing to notice is that the x-axis and y-axis labels and ticks seem a bit tiny compared to the rest of the scatter plot.

Figure: Scatter plot with ggplot2 in R

Scatter Plot tip 1: Add legible labels and title

Let us add clearer axis labels and a title that briefly describes the scatter plot. To make the labels and the tick mark labels more legible, we use theme_bw() with base_size=16. The labels and title are added with labs(x="GDP per capita", y="CO2 Emission per person (in tonnes)", title="CO2 emission per person vs GDP per capita") as an additional layer.

Now the scatter plot looks definitely better than our first attempt.

Figure: Scatter plot with ggplot2: labels and title

Scatter Plot tip 2: Log scale on x-axis

Notice that the scales of the two variables are very different, and many data points are squished towards the left because of a few outlier data points. One way to improve the plot is to use a log scale. This is often one of the best tips for making a plot better and understanding the relationship between two variables. Let us first put the variable on the x-axis on a log scale. In ggplot2, we can easily put the x-axis on a log scale by adding the scale_x_log10() function as an additional layer after labs().

On the x-axis the data points are now clearly spread out. However, the plot is still dominated by the outliers in the variable on the y-axis.

Figure: Scatter plot with ggplot2: log scale

Scatter Plot tips: Log scale on x-axis and y-axis

We can see that the variable on the y-axis is squished near zero. In this plot the variable on the y-axis also needs to be on a log scale. We can put the variable on the y-axis on a log scale using scale_y_log10().

Now the scatter plot made by ggplot2 looks much better. We can clearly see the linear relationship between gdpPercap and CO2, which was not clear until now.
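
The steps described in this post (a basic scatter plot, legible labels and a title, then log scales on the x-axis and y-axis) can be sketched end to end as below. This is only a rough illustration: the augmented gapminder data is not reproduced here, so the sketch uses a small simulated data frame, and the names toy_df and co2 are assumptions rather than the actual names in the data set.

```r
library(tidyverse)

# Use the black-and-white theme with larger text for all plots
theme_set(theme_bw(base_size = 16))

# Hypothetical stand-in for the augmented gapminder data:
# simulated values, not real measurements. The column name
# `co2` is an assumption made for this sketch.
set.seed(42)
n <- 150
gdp <- exp(rnorm(n, mean = 8.5, sd = 1.2))
toy_df <- tibble(
  gdpPercap = gdp,
  co2 = exp(-8.5 + log(gdp) + rnorm(n, sd = 0.6))
)

# Basic scatter plot: pipe the data frame into ggplot(),
# map the variables with aes(), and add geom_point()
p_basic <- toy_df %>%
  ggplot(aes(x = gdpPercap, y = co2)) +
  geom_point()

# Tip 1: legible axis labels and a descriptive title
p_labeled <- p_basic +
  labs(x = "GDP per capita",
       y = "CO2 Emission per person (in tonnes)",
       title = "CO2 emission per person vs GDP per capita")

# Tip 2: log scale on the x-axis
p_logx <- p_labeled + scale_x_log10()

# Log scale on the y-axis as well
p_loglog <- p_logx + scale_y_log10()

print(p_loglog)
```

Storing each stage in its own object (p_basic, p_labeled, p_logx, p_loglog) makes it easy to print the plots one by one and compare how each added layer changes the picture.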
