R is several decades old, and some design decisions that made sense when it was created are less useful today. Changing base R is difficult because doing so would break existing code. As a result, most new development happens in packages, which are created and updated regularly.
Tibbles are part of this evolution. A Tibble is a modified version of a data frame that changes some older behaviours to make life easier for the programmer. In this article, we will discuss what Tibbles are, how they work in R, and walk through some examples. Before that, if you are new to R programming, please read our series of articles that starts from the basics of the language at R Programming Articles.
Tibbles are not part of base R; to use them, we need a package. They are provided by the tibble package, which is part of the popular tidyverse collection of packages. You can install either one into your environment, as shown in the code below:
# Installing the tibble package to access Tibbles
install.packages("tibble")

# Alternatively, installing the tidyverse to access Tibbles
install.packages("tidyverse")
If you are new to packages, read our article on Packages in R Programming, which covers them in detail.
To convert an existing data frame into a Tibble, we use the as_tibble() function from the tibble package. Since data sets are most often stored as data frames, this function is a convenient way to coerce them into Tibbles. Let's see the example code below:
# Loading the tidyverse (this attaches the tibble package)
library(tidyverse)

# Coercing a data frame into a Tibble
cars_tibble <- as_tibble(cars)

# Printing the new Tibble
print(cars_tibble)
First, we load the package with library() so that the functions associated with Tibbles are available. Next, we take the cars dataset, which is built into R as a data frame, coerce it into a Tibble with as_tibble(), and print it. The output of this code is shown below:
The output of the code shown above. It creates a Tibble named cars_tibble
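A quick way to confirm that the coercion worked is to compare the classes before and after. The following is a minimal sketch, assuming the tibble package is installed; cars_df is a name introduced here only for illustration.

```r
library(tibble)

# cars is a built-in data frame with 50 rows and 2 columns
class(cars)

# Coerce to a Tibble; the data stays the same, only the class changes
cars_tibble <- as_tibble(cars)
class(cars_tibble)

# is_tibble() is TRUE only for Tibbles; a Tibble is still a data frame
is_tibble(cars)
is_tibble(cars_tibble)
is.data.frame(cars_tibble)

# as.data.frame() converts a Tibble back to a plain data frame
cars_df <- as.data.frame(cars_tibble)
class(cars_df)
```

Because a Tibble still inherits from data.frame, most functions written for data frames should accept it without changes.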
We can also create a Tibble from scratch; we don't need an existing data frame. For this, we have the tibble() function.
# Creating a Tibble from scratch
my_tibble <- tibble(
  a = 1:5,
  b = 10,
  c = a * 2 + b
)

print(my_tibble)
class(my_tibble)
In this example, we have created a Tibble named my_tibble using the tibble() function. The input for variable b has length one, and tibble() automatically recycles any input of length one to match the other columns. Note that only inputs of length one are recycled. Column c is defined in terms of a and b, which works because tibble() lets a column refer to columns created earlier in the same call. The output of this code is shown below:
Output for creating the Tibbles from scratch
Here, the value of variable “b” is recycled to the length of variable “a”, so that the calculation for column “c” can be carried out row by row.
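The recycling rule is easier to see next to a case where it does not apply. The sketch below, which assumes the tibble package is installed, contrasts a length-one input with a length-two input, and then shows how base data.frame() treats the same length-two input. The names ok, msg, and df are made up for this illustration.

```r
library(tibble)

# A length-one input is recycled to match the longest column
ok <- tibble(a = 1:4, b = 10)
ok$b

# A length-two input is NOT recycled by tibble(); it raises an error,
# which we catch here so the script keeps running
msg <- tryCatch(
  {
    tibble(a = 1:4, b = 1:2)
    "no error"
  },
  error = function(e) "error: columns have incompatible sizes"
)
msg

# Base data.frame() silently recycles the same input instead
df <- data.frame(a = 1:4, b = 1:2)
df$b
```

The stricter rule means a column of the wrong length is reported immediately rather than being silently repeated.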
In base R, data.frame() by default does not keep non-syntactic column names, such as names that start with a number or contain spaces or special characters; it converts them into valid names. A Tibble lets you use such names as they are (though this is not recommended). You need to enclose such variable names in backticks ( ` ). See an example below:
# Creating a Tibble with non-syntactic variable names
my_tib_1 <- tibble(
  `1*vect` = 1:3,
  `:)` = "Smile Please!",
  `# ` = "Pound Key!"
)

print(my_tib_1)
class(my_tib_1)
In this example, we create a Tibble with non-syntactic variable names: one that starts with a number and contains an asterisk (‘*’), and others made up of special characters such as the pound sign (‘#’). These names are valid in a Tibble as long as you enclose them in backticks ( ` ). The output of this code is shown below:
The output of the code above
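To see what the backticks buy you, it helps to pass the same names to base data.frame(). This is a small sketch assuming the tibble package is installed; df and tb are illustrative names, and the exact repaired names produced by data.frame() are printed rather than stated here.

```r
library(tibble)

# data.frame() repairs non-syntactic names by default (check.names = TRUE)
df <- data.frame(`1*vect` = 1:3, `:)` = "Smile Please!")
names(df)

# tibble() keeps the names exactly as written
tb <- tibble(`1*vect` = 1:3, `:)` = "Smile Please!")
names(tb)

# Backticks are also needed when referring to such a column with $
tb$`1*vect`

# With [[ ]] the name is an ordinary string, so no backticks are needed
tb[["1*vect"]]
```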
We can also use the tribble() function (short for transposed tibble) to create a Tibble. This function is designed for data entry in code. The column names are given in the first row, each prefixed with a tilde (‘~’), and the values are entered row by row beneath them, separated by commas, so the code is laid out like the table it produces.
See the code below for a better understanding:
# Creating a tibble in a data entry fashion using tribble() function
my_tib_2 <- tribble(
  ~p, ~q, ~r,
  #--|---|----|
  1, "a", 98.2,
  2, "b", 76.5
)

print(my_tib_2)
class(my_tib_2)
In this code, tribble() creates a Tibble with three variables named “p”, “q”, and “r”. The comment line below the headers is only there to show the reader where each column starts and ends; it can be omitted, but it makes the layout easier to read. Below it, each row lists the values for the variables, separated by commas. The output of this code is shown below:
Creating a tibble using the tribble() function in R
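Since tribble() and tibble() differ only in how the data is typed in, the same table can be written both ways. The sketch below assumes the tibble package is installed and uses the illustrative names by_row and by_col to check that the two layouts give the same columns.

```r
library(tibble)

# Row-by-row entry with tribble()
by_row <- tribble(
  ~p, ~q,  ~r,
  1,  "a", 98.2,
  2,  "b", 76.5
)

# Column-by-column entry with tibble()
by_col <- tibble(
  p = c(1, 2),
  q = c("a", "b"),
  r = c(98.2, 76.5)
)

# Column types for each version
sapply(by_row, class)
sapply(by_col, class)

# Compare the two versions column by column
all(mapply(identical, as.list(by_row), as.list(by_col)))
```

The row-wise layout tends to be easier to read for small hand-typed tables, while the column-wise layout suits data that already exists as vectors.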
There are some similarities between Tibbles and data frames, but there are some differences too. It is therefore worth comparing the two.
Tibbles have a more refined print method than data frames. When a Tibble is printed, it shows only the first ten rows by default, and only as many columns as fit on your console. This works well when you are dealing with a large data set. See the example below, which uses the weather dataset from the nycflights13 package:
This image shows a better printing view of Tibbles
At the end of the printed output, a message reports how many rows were left out and notes that five more columns were not printed because they did not fit on the screen. Another benefit is that the type of every variable is displayed below its name, so we can identify variable types right at the data pre-processing stage. This feature is borrowed from the str() function in base R.
It is also possible to display all the columns, or a number of rows of your choice, when printing a Tibble. The print() function has options for this. See below:
nycflights13::weather %>% print(n = 8, width = Inf)
This code prints the first eight rows of the Tibble, with all of its columns, on the console.
Customizing Tibble output using the print() function and its options
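If nycflights13 is not installed, the same printing options can be tried on a data set that ships with R. The following is a sketch under that assumption, using the built-in iris data frame and the illustrative name iris_tibble; only the tibble package is required.

```r
library(tibble)

# iris is a built-in data frame with 150 rows and 5 columns
iris_tibble <- as_tibble(iris)

# Default printing: only the first ten rows are shown
print(iris_tibble)

# Show the first eight rows and every column, whatever the console width
print(iris_tibble, n = 8, width = Inf)

# Show every row
print(iris_tibble, n = Inf)
```

Here n controls the number of rows and width controls how wide the output may be; width = Inf asks for all columns to be printed.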
Tibbles also don’t change the variable names of the given data frame.
They don’t create row names; every row is identified simply by its row number.
One of the nicest features is that Tibbles never convert strings into factors, which saves time with string variables. (Since R 4.0.0, data.frame() no longer does this by default either.)
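The row-name and string behaviour described above can be checked with built-in data. This is a minimal sketch assuming the tibble package is installed; the column names "model" and "city" and the values in tb are made up for illustration.

```r
library(tibble)

# mtcars stores the car models as row names
head(rownames(mtcars), 3)

# as_tibble() drops row names by default
mt <- as_tibble(mtcars)
has_rownames(mt)

# The rownames argument keeps them as a regular column instead
mt_named <- as_tibble(mtcars, rownames = "model")
names(mt_named)[1]

# Strings stay character vectors in a Tibble
tb <- tibble(city = c("Pune", "Delhi"))
class(tb$city)
```

Keeping the old row names as an ordinary column means they survive filtering and sorting like any other variable.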
Tibbles are modified versions of data frames that make common tasks more convenient than plain data frames do (especially when we are dealing with huge data sets).
Tibbles are provided by the tibble package, which is part of the tidyverse. You can install either the tidyverse or the tibble package to access Tibbles.
To convert a data frame into a tibble, we need to use the as_tibble() function.
We can also create a Tibble directly using the tibble() or tribble() function.
Tibbles have a compact print preview: by default it shows the first ten rows, as many columns as fit the width of your console, and the variable type below each variable name.
We can customize the number of rows and columns that are printed using the options available in the print() function.
This article ends here, but we have several other articles on topics ranging from general analytics and SQL to advanced fields such as NLP, Machine Learning, and Artificial Intelligence. All these articles can be accessed through Analytics Steps. In my next article, I will come up with something more interesting from the field of R Programming. Until then, stay safe and keep enhancing.