Tibbles in R Programming

  • Lalit Salunkhe
  • Oct 26, 2020
  • R Programming

The R language was developed decades ago, and some design choices that were useful then are less helpful today. Changing core R is difficult because doing so would break existing code. Therefore, most of the development happens in packages, which are created and updated regularly.

 

Tibbles are a part of this evolution. A Tibble is a modified version of a data frame: it changes some older behaviours to make life easier for the programmer. Throughout this article, we will discuss Tibbles, how they work in R, some examples, and their advantages over data frames. Before that, if you are new to R Programming, please read our series of articles that starts from the basics of the language at R Programming Articles.


 

Working with Tibbles

 

Tibbles are not a part of core R; to use them, we have to install a package. Tibbles come from the package named tibble, which is one of the core packages of the tidyverse. You can install either of these packages into your environment. See the code below as an illustration:


#Installing Tibble package to access Tibbles

install.packages("tibble")



#Alternatively installing tidyverse to access the Tibbles

install.packages("tidyverse")

If you are new to packages, read our article on Packages in R Programming, which covers them in detail.
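As a minimal sketch, the installation can be guarded so that the package is downloaded only when it is missing. requireNamespace(), install.packages() and packageVersion() are base R functions; the guard itself is our own suggestion and assumes a working internet connection and CRAN mirror.

```r
# Install tibble only if it is not already available (assumed workflow)
if (!requireNamespace("tibble", quietly = TRUE)) {
  install.packages("tibble")
}

# Load the package so that tibble(), as_tibble() and tribble() are available
library(tibble)

# Report which version is installed
packageVersion("tibble")
```

Running this twice is harmless: the second time, the install step is skipped.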


 

How Can We Create Tibbles?

 

To create a Tibble from an existing data frame, we use the as_tibble() function. This function coerces a normal data frame into a Tibble and is a part of the tibble package. Since most of the time our data sets are stored as data frames, this function is a convenient way to convert them. Let’s see the example code below:


#loading the tidyverse, which attaches the tibble package

library(tidyverse)



#coercing a data frame into a Tibble

cars_tibble <- as_tibble(cars)



#printing the new Tibble

print(cars_tibble)

First of all, we load the tidyverse with library(), which makes the functions associated with Tibbles available. After that, we take the cars dataset, which is built into R as a data frame. We use the as_tibble() function to coerce that data frame into a Tibble and print it. Let us see the output of this code as shown below:


The output of the code shown above: it creates a Tibble named cars_tibble from the cars dataset.
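To make the coercion more concrete, the short sketch below checks the class of the object before and after as_tibble(). It assumes only the tibble package is loaded; the name cars_df is ours and is used purely for illustration.

```r
library(tibble)

# cars is a built-in data frame with 50 rows and 2 columns (speed, dist)
class(cars)
# "data.frame"

cars_tibble <- as_tibble(cars)

# A tibble is still a data frame, with extra classes layered on top
class(cars_tibble)
# "tbl_df" "tbl" "data.frame"

# Converting back when a plain data frame is required
cars_df <- as.data.frame(cars_tibble)
class(cars_df)
# "data.frame"
```

Because a Tibble inherits from data.frame, most functions that expect a data frame accept a Tibble as well.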


We can also create a Tibble from scratch, i.e. we don’t always need an existing data frame to generate a Tibble. For this, we use the tibble() function.

# Creating a Tibble from scratch

my_tibble <- 

  tibble(

  a = 1:5,

  b = 10,

  c = a * 2 + b

  )

print(my_tibble)

class(my_tibble)

 


Here in this example, we have created a tibble named my_tibble using the tibble() function. The input for variable b is of length one, and tibble() automatically recycles any input of length one to match the other columns. Note that only inputs of length one get recycled; any other mismatched length is an error. See the output of this code when we run it in R:


Output for creating a Tibble from scratch, without using a data frame


 

Here, the value of variable “b” is recycled to the length of variable “a”, so that column “c” can be calculated row by row from both.
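The recycling rule can be checked with a small sketch that compares tibble() against base R's data.frame(). The variable names (ok, df, result) and the message returned by the error handler are ours, chosen only for illustration.

```r
library(tibble)

# Length-one input is recycled to match the other columns
ok <- tibble(a = 1:6, b = 10)
nrow(ok)

# data.frame() recycles a length-two vector across six rows...
df <- data.frame(a = 1:6, b = 1:2)
df$b

# ...whereas tibble() refuses and raises an error, caught here
result <- tryCatch(
  tibble(a = 1:6, b = 1:2),
  error = function(e) "tibble() rejected the length-two column"
)
result
```

The stricter rule is deliberate: silent recycling of longer vectors is a common source of bugs that are hard to notice.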

 

In base R, data.frame() by default rewrites column names that are not syntactically valid, such as names that contain special characters, start with a number, or contain spaces. A Tibble keeps such names as they are (not recommended though). You need to enclose such variable names in backticks ( ` ). See an example below:


# Creating a Tibble with non-syntactic variable names

my_tib_1 <- 

  tibble(

  `1*vect` = 1:3,

  `:)` = "Smile Please!",

  `# ` = "Pound Key!"

  )

print(my_tib_1)

class(my_tib_1)


Here, in this example, we are creating a Tibble with non-syntactic variable names: one that starts with a number and contains an asterisk (‘*’), one made of punctuation, and one containing a pound sign (‘#’) and a space. These names are completely valid in a Tibble if you enclose them in backticks ( ` ). Since the last two columns are given values of length one, they are recycled to three rows. See the output of this code as shown below:


The output of the code above, which contains non-syntactic variable names
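For comparison, the sketch below builds the same column with data.frame() and with tibble(), then shows how to reference a non-syntactic name afterwards. It relies on the default check.names = TRUE behaviour of data.frame(); the object names df and tb are ours.

```r
library(tibble)

# data.frame() repairs non-syntactic names by default (check.names = TRUE)
df <- data.frame(`1*vect` = 1:3)
names(df)

# tibble() keeps the name exactly as written
tb <- tibble(`1*vect` = 1:3)
names(tb)

# Backticks are needed again whenever the column is referenced with $
tb$`1*vect`

# With [[ ]] the name is an ordinary string, so no backticks are required
tb[["1*vect"]]
```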


We can also use the tribble() function (short for transposed tibble) to create a tibble. This function is designed for data entry in code. Variable names are written in the first row as column headers (like columns in Excel), each prefixed with the tilde (‘~’) sign, and the values are entered row by row below them.

 

See the code below for a better understanding:


# Creating a tibble in a data entry fashion using tribble() function

my_tib_2 <- 

  tribble(

    ~p,  ~q,  ~r,

    #--|---|----|

     1, "a", 98.2,

     2, "b", 76.5,

  )

print(my_tib_2)

class(my_tib_2)

 


In this code, the tribble() function creates a tibble with three variables named “p”, “q”, and “r”, respectively. The comment line below the headers is there only to make it visible where each column starts and ends. It can be omitted, but it makes the layout easier to read. Below that, each line holds one row of values, separated by commas. See the output of this code below:


Creating a tibble with a data-entry approach using the tribble() function in R
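To see that tribble() is only a different way of typing the same data, the sketch below enters the two rows once row by row and once column by column, then compares the resulting columns. The object names by_row and by_col are ours.

```r
library(tibble)

# Row-by-row layout with tribble()
by_row <- tribble(
  ~p, ~q,  ~r,
   1, "a", 98.2,
   2, "b", 76.5
)

# Column-by-column layout with tibble()
by_col <- tibble(
  p = c(1, 2),
  q = c("a", "b"),
  r = c(98.2, 76.5)
)

# Compare the two tibbles column by column
all.equal(by_row$p, by_col$p)
all.equal(by_row$q, by_col$q)
all.equal(by_row$r, by_col$r)
```

tribble() tends to be easier to read for small tables typed by hand, while tibble() suits columns that already exist as vectors.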


 

Advantages of Tibbles Over Data Frames

 

There are some similarities between Tibbles and data frames but at the same time, there are some differences too. Therefore, it would be wise to make a comparison between Tibbles and data frames.

 

  1. Printing View

 

Tibbles have a more refined print method than data frames. When a tibble is printed, by default it shows only the first ten rows and as many columns as fit on your console. This works great when you are dealing with a large data set. See an example below where we are using the weather dataset from the nycflights13 package: 


The weather dataset from the nycflights13 package printed as a tibble, showing the better printing view of Tibbles


At the end of the printed output, a message reports how many more rows there are and that five more columns are not printed because they do not fit on the screen. Moreover, the type of every variable is displayed below its name, so that we can identify variable types right at the data pre-processing stage. This is borrowed from the str() function in base R.

 

It is also possible to see all the columns, or a number of rows of your choice, in a Tibble. We can use the print() function, which accepts the arguments n (number of rows) and width (display width; Inf shows all columns). See below:


# requires the nycflights13 package; %>% comes from the tidyverse loaded earlier

nycflights13::weather %>%

  print(n = 8, width = Inf)


This code prints the first eight rows with all the columns on the console screen.


Customizing Tibble output using the print() function and its options
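If the nycflights13 package is not installed, the same print options can be tried on a built-in dataset. The sketch below uses iris, which is our choice for illustration and not a dataset used elsewhere in this article.

```r
library(tibble)

# iris is a built-in data frame with 150 rows; convert it to a tibble
iris_tbl <- as_tibble(iris)

# Default print: first ten rows, columns limited to the console width
print(iris_tbl)

# Five rows, every column regardless of console width
print(iris_tbl, n = 5, width = Inf)

# Every row (n = Inf) when the full data really is needed
print(iris_tbl, n = Inf)
```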


  • Tibbles don’t change the variable names of the given data frame.

  • They don’t create row names; every row is identified simply by its row number.

  • Tibbles never convert strings into factors. Before R 4.0.0, data.frame() did this by default, so this saves time with almost every string variable in older code.
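The row-name and string behaviour listed above can be verified with a short sketch. It uses the built-in mtcars dataset and a made-up city column, both chosen by us for illustration; the rownames argument of as_tibble() is the documented way to keep row names as a regular column.

```r
library(tibble)

# mtcars stores the car models as row names
head(rownames(mtcars), 3)

# as_tibble() drops row names unless asked to keep them in a column
mt_plain <- as_tibble(mtcars)
mt_named <- as_tibble(mtcars, rownames = "model")
names(mt_named)[1]

# Strings stay as character vectors in a tibble
tb <- tibble(city = c("Pune", "Mumbai"))
class(tb$city)

# data.frame() converts them to factors only when asked to
# (this was the default behaviour before R 4.0.0)
df <- data.frame(city = c("Pune", "Mumbai"), stringsAsFactors = TRUE)
class(df$city)
```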


 

Conclusion

 

  1. Tibbles are modified versions of data frames that are more convenient to work with, especially when we are dealing with huge data sets.

  2. Tibbles are a part of the tibble package, which is one of the core packages of the tidyverse. You can install either the tidyverse or the tibble package to access Tibbles.

  3. To convert a data frame into a tibble, we need to use the as_tibble() function.

  4. We can also create a tibble directly using the tibble() or tribble() functions.

  5. Tibbles have a nice print method: by default you get the first ten rows, as many columns as fit on your console width, and the variable type below each variable name.

  6. We can customize the number of rows and columns printed using the n and width arguments of the print() function.


This article ends here, but we have several other articles ranging from general analytics and SQL to advanced fields such as NLP, Machine Learning, and Artificial Intelligence. All these articles can be accessed through Analytics Steps. In my next article, I will come up with something more interesting from the field of R Programming. Until then, stay safe and keep enhancing.
