Create correlation network plot in a quick way

June 8, 2018
R Visualization


Corrr package



In my previous post, I mentioned using the network plot from the ‘corrr’ package to visualize correlation matrices. This time, I’m going to show you how to quickly create your own correlation network plot with ‘corrr’, and briefly how to interpret it.






Installation



Download and install the ‘corrr’ and ‘magrittr’ R packages from CRAN. This only needs to be done once, before we use their functions for the first time.


install.packages("corrr")

install.packages("magrittr")






Network plot



library('corrr')

library('magrittr')

iris[,1:4] %>% correlate( ) %>% network_plot(min_cor = 0.3)




We need to load the ‘corrr’ package in every new R session before we can use its functions.

In this case, I select only the numeric variables (the first four columns) of the ‘iris’ dataset to plot the correlation network.

The correlate( ) function works much like cor( ), except that it returns the result as a table (a tibble) rather than a matrix.
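To make the difference concrete, here is a small sketch that runs both functions on the same columns and compares what comes back. The variable names (num_vars, cor_mat, cor_tbl) are mine, chosen only for illustration.

```r
library(corrr)

# The four numeric columns of iris
num_vars <- iris[, 1:4]

# Base R: returns a plain numeric matrix
cor_mat <- cor(num_vars)

# corrr: returns a tibble with one row per variable
cor_tbl <- correlate(num_vars)

class(cor_mat)
class(cor_tbl)

# Printing the tibble shows the correlations in table form
cor_tbl
```

Note that correlate( ) sets the diagonal to NA by default and uses pairwise complete observations, so the two results are not identical in every detail.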

The min_cor argument of network_plot( ) sets the minimum correlation value (in absolute terms) that a pair of variables must have for its path to be drawn in the network plot.
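To see which paths a threshold of 0.3 would keep, the rough sketch below applies the same idea by hand using only base R. It is not how ‘corrr’ implements it internally; the edges data frame and the plotted column are my own names, and treating a value exactly equal to the threshold as "kept" is an assumption.

```r
# Correlation matrix of the numeric iris columns
r <- cor(iris[, 1:4])

# Row/column positions of each unique pair (upper triangle only)
pairs <- which(upper.tri(r), arr.ind = TRUE)

# One row per pair of variables, with its correlation value
edges <- data.frame(
  x = rownames(r)[pairs[, "row"]],
  y = colnames(r)[pairs[, "col"]],
  r = r[pairs]
)

# Hypothetical flag: would this pair pass min_cor = 0.3?
min_cor <- 0.3
edges$plotted <- abs(edges$r) >= min_cor

print(edges)
```

For ‘iris’, the only pair that falls below the threshold is Sepal.Length and Sepal.Width (r is about -0.12), so that is the one path missing from the plot.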

‘%>%’ is the forward pipe operator from the ‘magrittr’ package. It lets us chain function calls from left to right instead of nesting them inside parentheses, ‘( )’, because nested calls can make R code difficult to read and understand. For example:



network_plot(correlate(data_name), min_cor=0.3)


Compare this with the same call written using the pipe operator, ‘%>%’:



data_name %>% correlate( ) %>% network_plot(min_cor = 0.3)
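If you want to convince yourself that the two styles really do the same thing, here is a minimal check. I use base cor( ) and round( ) so that the results are easy to compare; nested and piped are just illustrative names.

```r
library(magrittr)

num_vars <- iris[, 1:4]

# Nested form: read from the inside out
nested <- round(cor(num_vars), 2)

# Piped form: read from left to right
piped <- num_vars %>% cor() %>% round(2)

# Both forms give exactly the same matrix
identical(nested, piped)
```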





All of this can be summarised as follows:



Function          Description

correlate( )      View all correlation values among variables in a table (tibble)

network_plot( )   Plot a correlation network plot

min_cor           Minimum correlation value (in absolute terms) to be plotted in a network plot

%>%               Pass the output of one function to the next, making R code easier to read and understand






Explanation



  • Each path connecting one variable to another represents a correlation value, r.

  • A blue path represents a positive correlation between two variables.

  • A red path represents a negative correlation between two variables.

  • The width and transparency of a path reflect the magnitude of the correlation between two variables. The path is narrow and transparent when the correlation is weak, and wide and opaque when it is strong.






Example


From the diagram above, we can see that the negative correlation between ‘hp’ and ‘mpg’ is much stronger than the one between ‘hp’ and ‘drat’.



cor(mtcars$hp,mtcars$mpg)
[1] -0.7761684
# correlation coefficient for 'hp' and 'mpg'

cor(mtcars$hp, mtcars$drat)
[1] -0.4487591
# correlation coefficient for 'hp' and 'drat'
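The variables in this example come from the built-in ‘mtcars’ dataset. Assuming the diagram was produced the same way as the ‘iris’ plot earlier (the exact call is not shown here, so this is a guess), something along these lines should redraw it. I also use focus( ) from ‘corrr’ to list every correlation with ‘hp’ in one table; mtcars_cor and hp_tbl are illustrative names.

```r
library(corrr)
library(magrittr)

# Correlations among all mtcars variables, as a tibble
mtcars_cor <- correlate(mtcars)

# Keep only the 'hp' column: one row per other variable
hp_tbl <- mtcars_cor %>% focus(hp)
hp_tbl

# Redraw the network plot (min_cor = 0.3 is assumed, as in the iris example)
mtcars_cor %>% network_plot(min_cor = 0.3)
```

In the table, the rows for ‘mpg’ and ‘drat’ should match the two cor( ) results above.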






That’s all from me. Thank you.
