Lattice Package: Visualizing Multivariate Data in R

  • Lalit Salunkhe
  • Feb 18, 2021
  • R Programming
  • Updated on: Feb 18, 2021

Lattice is an R package developed by Deepayan Sarkar to improve on and extend R's default graphics. Its core strengths are sensible default settings and the ability to handle multivariate data.

 

The lattice package and its graph functions let us visualize the relationship between two or more variables in a single frame. Even when we customize a plot, the function calls stay short and simple.

 

To work with the lattice plotting system, the package must be available in the R environment. Lattice ships with standard R distributions as a recommended package, so it is usually already installed. If it is missing, the install.packages() function installs it, as shown below.


install.packages("lattice")

 


We can access the lattice package using the library() function.


library(lattice)

 


Let us explore the graphs from the lattice package using mtcars, one of the datasets built into R.

 

 

 

The Scatterplot

 

To create a scatterplot with the lattice package, we use the dedicated xyplot() function. It can draw a basic scatterplot as well as scatterplots conditioned on additional variables.

 

Let us run an example where we use the mtcars data to explore the xyplot() function and its various components.


library(lattice)

# Create factors for the gear and cyl variables; they are reused later
factor_gear <- factor(mtcars$gear, levels = c(3, 4, 5),
                      labels = c("3 gear", "4 gear", "5 gear"))

factor_cyl <- factor(mtcars$cyl, levels = c(4, 6, 8),
                     labels = c("4 cyl", "6 cyl", "8 cyl"))

# xy scatter plot
xyplot(mpg ~ wt,
       data = mtcars)

In this code, we convert the gear and cyl variables into factors and store them as separate variables so that they can be reused in later examples.

 

Moreover, we have used the xyplot() function in order to generate a scatter plot which will show us the relationship between mpg and wt variables.

 

The basic scatterplot looks as below:


Scatterplot using functions from lattice package


As you may have noticed, xyplot() has two main arguments: the formula (mpg ~ wt, read as y ~ x) and the data (mtcars, the data frame used for the scatterplot). This structure stays the same for the other graphs in the lattice package.
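Because every lattice function follows this formula-plus-data pattern, the result of a call can also be stored and inspected before it is drawn. The snippet below is a small illustrative sketch that is not part of the original example; the variable name p is arbitrary.

```r
library(lattice)

# A lattice call returns an object of class "trellis" instead of drawing right away
p <- xyplot(mpg ~ wt, data = mtcars)   # formula is y ~ x, variables are looked up in mtcars

class(p)

# Printing the object is what actually renders the plot
print(p)
```

At the console the plot appears automatically because R prints the returned object. Inside a function or a loop, an explicit print() is needed for the plot to show up.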

 

 

All lattice functions let us customize the plot: we can change the graph type, add a title, and change the axis labels. See the code and the resulting graph below:


# xy scatter plot with customization
# Note: wt in mtcars is recorded in units of 1000 lbs
xyplot(mpg ~ wt,
       data = mtcars,
       type = c("p", "r"),
       main = "Relation between wt and mpg",
       xlab = "Weight (1000 lbs)",
       ylab = "Miles/Gallon (US)")

Here, in this code - 

  • The type = argument specifies what should be drawn. "p" draws the data points and "r" adds a linear regression line. We combine both with c().

  • The main = argument sets the main title of the graph.

  • The xlab = and ylab = arguments set the axis labels for the x-axis and the y-axis respectively.

 

See the output as shown below:


Customized scatter plot with title, axes labels and regression line
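The type = argument accepts other values besides "p" and "r". As a hedged illustration that goes beyond the original example, the sketch below combines "g" (a reference grid), "p" (points), and "smooth" (a loess curve) instead of a straight regression line. The object name p_smooth and the title text are made up for this sketch.

```r
library(lattice)

# "g" draws a background grid, "p" the points, "smooth" a loess smoother
p_smooth <- xyplot(mpg ~ wt,
                   data = mtcars,
                   type = c("g", "p", "smooth"),
                   main = "mpg vs. weight with a loess smooth",
                   xlab = "Weight (1000 lbs)",
                   ylab = "Miles/Gallon (US)")

print(p_smooth)
```

A smoother like this can be a useful check on whether a straight regression line is a reasonable summary of the relationship.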


We can create a multivariate scatter plot as well (lattice is well known for its multivariate visualizations), as shown below:


# Multivariate xy scatter plot with customizations
# One panel per level of factor_cyl; groups = also colors each level differently
xyplot(mpg ~ wt | factor_cyl,
       data = mtcars,
       type = c("p", "r"),
       groups = factor_cyl,
       main = "Relation between wt and mpg over cylinders",
       xlab = "Weight (1000 lbs)",
       ylab = "Miles/Gallon (US)")

 


Here, we have slightly changed the formula. Instead of just looking at the relationship between mpg and wt, we condition on a third variable, the number of cylinders, by writing | factor_cyl after the formula.

 

 

This way, we see the relationship between mpg and wt separately for each level of factor_cyl (that is, for each number of cylinders). See the graph below, which shows one scatterplot panel per level.


Creating a multivariate visual in R Programming using lattice functions
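Conditioning and grouping can also use different variables. The following sketch is one possible variation and is not taken from the original article: it keeps one panel per number of cylinders but colors the points by number of gears, building the factors inline with factor() and adding a legend through auto.key. The name p_cond is arbitrary.

```r
library(lattice)

# One panel per cylinder count; within each panel, points are colored by gear count
p_cond <- xyplot(mpg ~ wt | factor(cyl),
                 data = mtcars,
                 groups = factor(gear),
                 auto.key = list(columns = 3),   # legend with one entry per gear level
                 xlab = "Weight (1000 lbs)",
                 ylab = "Miles/Gallon (US)")

# Number of levels of the conditioning variable, i.e. the number of panels
dim(p_cond)

print(p_cond)
```

Writing factor(cyl) directly in the formula avoids creating separate factor variables, at the cost of plainer strip labels ("4", "6", "8").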


The 3D Scatterplot

 

To create a three-dimensional scatterplot, we can use the cloud() function, which has the same structure. The formula (here of the form z ~ x * y) describes the relationship among the variables, and the data = argument specifies the data.


# Creating a 3D scatterplot
cloud(mpg ~ wt * qsec,
      data = mtcars,
      main = "3D scatterplot")

 


3D Scatterplot
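A 3D scatterplot can be hard to read from a single viewing angle. As a hedged sketch that is not in the original article, the screen argument of cloud() takes a list of rotations in degrees about the named axes, which changes the viewpoint. The particular angles and the name p_cloud are arbitrary choices for illustration.

```r
library(lattice)

# Rotate the view: 20 degrees about the z axis, then -70 degrees about the x axis
p_cloud <- cloud(mpg ~ wt * qsec,
                 data = mtcars,
                 screen = list(z = 20, x = -70),
                 main = "3D scatterplot, rotated view")

print(p_cloud)
```

Trying a few different angles often makes it easier to see how the points are spread along each axis.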


We can also create a three dimensional scatter plot based on factor levels. See the code below where we create a 3D scatterplot over the factor levels of cylinders.


# Multivariate 3D scatterplot
cloud(mpg ~ wt * qsec | factor_cyl,
      data = mtcars,
      main = "Multivariate 3D scatterplot")

 


Here, factor_cyl is the conditioning variable, so we get a separate 3D scatterplot for each number of cylinders.


3D Scatterplot : Multivariate across number of cylinders present


The Kernel Density Plot

 

We can also draw a kernel density plot (comparable to the density plot from R's base graphics) conditioned on other variables from the same data set. The densityplot() function creates it, and its structure is the same as before: a formula to specify the relationship and data to specify the data source. Because a density involves a single variable, the formula has no left-hand side (~mpg). The code below draws the kernel density of mpg for each level of the gear factor.


# Kernel density plot
densityplot(~ mpg | factor_gear,
            data = mtcars,
            main = "Kernel density plot over number of gears",
            xlab = "Miles Per Gallon (US)")

 


Here, mpg has been plotted along with factor_gear. Below is the output for the code above.


The Kernel Density Plot


 

 

We can also combine all the density curves in a single panel by setting groups = factor_gear. Adding the plot.points = FALSE argument removes the points drawn along the bottom of the density plot.

 

See the code below:


# Kernel density plot for all factor levels in one panel, without points
densityplot(~ mpg,
            groups = factor_gear,
            data = mtcars,
            plot.points = FALSE,
            main = "Kernel density plot over number of gears",
            xlab = "Miles Per Gallon (US)")

 


See the output as shown below:


The Kernel Density Plot with factors combined on the same pane


Here all the density curves overlap in a single panel, and the points have been removed.
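When several curves share one panel, a legend helps tell them apart. The sketch below is an illustrative addition rather than part of the original code: it repeats the grouped density plot with auto.key, which asks lattice to build a legend from the grouping factor. The factor is created inline so the snippet runs on its own, and the name p_dens is arbitrary.

```r
library(lattice)

# Grouped density plot with an automatically generated legend
p_dens <- densityplot(~ mpg,
                      data = mtcars,
                      groups = factor(gear, levels = c(3, 4, 5),
                                      labels = c("3 gear", "4 gear", "5 gear")),
                      plot.points = FALSE,
                      auto.key = list(columns = 3),   # one legend entry per gear level
                      main = "Kernel density plot over number of gears",
                      xlab = "Miles Per Gallon (US)")

print(p_dens)
```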

 

 

 

The Boxplot

 

To create a boxplot with the lattice package, we use the bwplot() function (short for box-and-whisker plot). Box plots are convenient in lattice because the distribution of one variable across the levels of other variables can be shown in a single frame.

 

Let us see an example of box plots with the lattice plotting system.


# Boxplot associated with multiple variables
# Note: the title argument is lowercase main; a capitalized Main sets no title
bwplot(factor_gear ~ mpg | factor_cyl,
       data = mtcars,
       xlab = "Miles per Gallon (US)",
       ylab = "No of Gears",
       main = "Mileage by no. of gears and cylinders")

 


Here, we plot boxplots of mileage by number of gears, separately for each number of cylinders.

 

Therefore, we get three panels, one per number of cylinders, and each panel shows boxplots of mpg for each number of gears.

 

See the screenshot below for your reference.


Boxplot for multiple variables using lattice package


We can also change the panel layout. In the figure above, the three panels sit side by side in a single row. The layout = argument takes the number of columns followed by the number of rows, so layout = c(1, 3) stacks the panels in a single column instead.

 

See the code and screenshot below for your reference.


# Boxplot associated with multiple variables and alternate layout
# layout = c(columns, rows): one column, three rows
bwplot(factor_gear ~ mpg | factor_cyl,
       data = mtcars,
       xlab = "Miles per Gallon (US)",
       ylab = "No of Gears",
       main = "Mileage by no. of gears and cylinders",
       layout = c(1, 3))

 


See the output of the code above, where the panels are stacked in one column rather than placed side by side.

 


Alternate layout where panels are stacked in a single column
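The layout can also be changed after a plot has been created. The following is a minimal sketch, not from the original article, that builds the boxplot once with an explicit one-row layout and then derives the one-column version with update(). It creates the factors inline so it runs on its own; the names p_row and p_col are arbitrary.

```r
library(lattice)

# Three columns, one row: panels side by side
p_row <- bwplot(factor(gear) ~ mpg | factor(cyl),
                data = mtcars,
                xlab = "Miles per Gallon (US)",
                ylab = "No of Gears",
                layout = c(3, 1))

# Same plot with one column and three rows: panels stacked
p_col <- update(p_row, layout = c(1, 3))

print(p_row)
print(p_col)
```

Using update() avoids repeating the whole call when only one setting differs between two versions of a plot.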


This article ends here. In our next article, we will cover more topics from R programming.
