“Data visualizations will often serve an integral role in helping you to uncover key patterns, trends, and anomalies in your data.” - Brent Dykes, Director, Data Strategy
Data visualization is the graphical representation of data using graphs, maps, charts, and heatmaps. Tools for this include Tableau, Kepler.gl, Excel, and FusionCharts, and there are also Python libraries such as matplotlib, plotly, seaborn, and ggplot.
Visualizing data helps you understand it by revealing insights such as the relationships between different features, patterns, and trends.
It also helps you filter out irrelevant information and keep only what is required for the purpose. It works for large as well as small datasets.
Common Process of Data Visualization (Source)
Geospatial data describes locations, e.g. latitude, longitude, zip code, address, or city. Such data can be collected from GPS, geotagging, and satellite imagery.
Kepler.gl is an open-source geo-analytics tool developed by some geniuses at Uber for visualizing geospatial data, which helps in finding trends and patterns hidden in a dataset. A dataset can contain various geospatial information including latitude, longitude, zip or postal code, address, city, country, etc.
It makes it easier to visualize a large set of geospatial data. You can visualize many kinds of geospatial data, e.g. road trips, flights, earthquakes, the population of a country or city, etc.
Visualization of NYC taxi trips using kepler.gl
In Kepler.gl, data is visualized by combining different layers, e.g. points, hexagons, arcs, paths, and grids. It offers both 2D and 3D visualization of these layers.
Layers in kepler.gl
As discussed above, there are many other ways to visualize geospatial data, but many of them are time-consuming, and some require installed software with hardware requirements such as a GPU. This is not the case with kepler.gl: it is a web-based application, so you just import the dataset in the browser, apply some filters, and visualize the data with the layers of your choice within a few minutes.
If you are new to these tools and want to explore them, you can use one of the sample datasets available in Kepler.gl. Select one of your choice and start playing with it on the map by applying different map layers and filters, in either 2D or 3D mode.
Available datasets on kepler.gl
Some of the well-known datasets available are:
NYC Taxi trips
Travel Times From Uber Movement
San Francisco Street Tree Map
2017 Unemployment Rates for U.S. Counties.
There are many others; you can choose any one of them and explore kepler.gl. Apart from this, you can also import a dataset of your own, either as a CSV or GeoJSON file or via a URL that loads the data from the cloud.
In this blog, we will use the NYC Taxi Trip data from one of the Kaggle competitions. The dataset is based on the 2016 NYC Yellow Cab trip record data, and the competition task is to predict trip duration.
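If your own data is a table of latitude/longitude rows, one way to prepare it for import is to convert it to GeoJSON first. The following is a minimal sketch using only the Python standard library; the function name `csv_to_geojson`, the column names `latitude` and `longitude`, and the sample rows are assumptions made for illustration, not part of kepler.gl.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, lat_col="latitude", lon_col="longitude"):
    """Convert CSV text with latitude/longitude columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without usable coordinates
        # Every column except the coordinate columns becomes a feature property.
        properties = {k: v for k, v in row.items() if k not in (lat_col, lon_col)}
        features.append({
            "type": "Feature",
            # GeoJSON stores positions as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical sample data; the last row has no coordinates and is skipped.
sample_csv = """name,latitude,longitude
Times Square,40.7580,-73.9855
Central Park,40.7829,-73.9654
Missing coords,,
"""

collection = csv_to_geojson(sample_csv)
print(json.dumps(collection, indent=2))
```

The printed text can be saved to a `.geojson` file and imported like any other dataset. Note the coordinate order: GeoJSON expects longitude before latitude, which is a common source of points landing in the wrong place.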
This dataset has different features, some of them are -
So, the first step is to load the dataset into kepler.gl.
Uploading data on kepler.gl
In Kepler.gl, there is an option to show the data table, which displays the dataset that has been uploaded.
NYC Taxi Trip data
Navigate to the Interactions tab in the Kepler.gl dashboard. There are different options for interacting with the data plotted on the map: tooltips, brush, and coordinates. You can enable any of them, but tooltips and brush cannot be used at the same time.
Let’s have a look at each of them one by one:
Tooltips display the information corresponding to a location while you hover over the map. You can choose which fields from the dataset are shown. Whenever you hover over a location, a pop-up shows the information as seen below,
Brush highlights the region around the cursor as you move it, while the rest of the map gets dark, so the other plotted locations are hidden.
Coordinates displays the coordinates of the point under the cursor, as seen in the image.
Kepler.gl provides different map styles to plot the data on, and you can also add one of your own. Some of the available map styles are,
Apart from the different map styles, there is another option: the map layers to show on the map. Those are,
3D layer (Source)
Now it’s time to plot the geospatial dataset on the map by using different layers and setting particular attributes. You can add multiple layers to visualize your data, giving each layer a suitable label.
Some of the available layers are as follows:
Point: Plots a point at each location in the dataset, such as pickup or dropoff points, given its latitude and longitude.
Arc: Plots an arc between two points, which helps in seeing the connection and distance between them, such as between the pickup and dropoff points of each trip.
Line: A 2D version of the arc layer.
Heatmap: Shows the intensity at geographical points using different colors.
Grid: Similar to a heatmap, but the points are aggregated into grid cells.
H3: Plots data using the H3 Hexagonal Hierarchical Spatial Index. To plot H3 on the map you need the hex ID corresponding to each location, which can be computed using the Python h3 module.
Apart from choosing layers, you can also set attributes for each layer, e.g. color, stroke, height (in the case of 3D), cluster size, etc.
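An arc on the map only hints at how far apart two points are. If you want an actual number, for example the distance between a pickup and a dropoff point, one common approach is the haversine formula, which gives the great-circle (straight-line over the sphere) distance rather than the driven route. The sketch below is an illustration with a hypothetical helper name and sample coordinates; it is not something kepler.gl computes for you.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical pickup and dropoff points in Manhattan.
pickup = (40.7580, -73.9855)
dropoff = (40.7829, -73.9654)
distance_km = haversine_km(*pickup, *dropoff)
print(f"{distance_km:.2f} km")
```

Adding such a distance as an extra column before uploading lets you color or filter arcs by trip length.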
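To get a feel for what an aggregating layer such as a grid does, it can help to bin points into cells yourself. The following is a simplified sketch: it uses square cells measured in degrees and a hypothetical `grid_counts` helper, which is an assumption made for illustration and not how kepler.gl implements its layers internally.

```python
import math
from collections import Counter


def grid_counts(points, cell_size_deg):
    """Count how many (lat, lon) points fall into each square cell of the given size."""
    counts = Counter()
    for lat, lon in points:
        # A cell is identified by the integer index of its row and column.
        cell = (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))
        counts[cell] += 1
    return counts


# Hypothetical pickup points: two close together, one further away.
points = [(40.758, -73.985), (40.759, -73.984), (40.783, -73.965)]
counts = grid_counts(points, cell_size_deg=0.01)

for cell, n in counts.most_common():
    print(cell, n)
```

The count per cell is the kind of value a grid or heatmap then maps to color, or to height in 3D mode.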
Using filters, you can limit the data displayed on the map according to features of the dataset. For example, if you only want to display the points within a certain range of trip pickup time or trip dropoff time, just add a filter on that field and set the range.
The figure below shows filtering by trip pickup time and trip dropoff time,
Filters in kepler.gl
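Conceptually, a time filter like the one above keeps only the rows whose timestamp lies inside the selected range. A minimal sketch of that idea in plain Python follows; the field name `pickup_datetime`, the helper `filter_by_time`, and the sample trips are hypothetical and used only for illustration.

```python
from datetime import datetime


def filter_by_time(rows, field, start, end):
    """Keep the rows whose `field` timestamp falls within [start, end]."""
    return [row for row in rows if start <= row[field] <= end]


# Hypothetical trips with a pickup timestamp each.
trips = [
    {"id": "a", "pickup_datetime": datetime(2016, 3, 14, 8, 15)},
    {"id": "b", "pickup_datetime": datetime(2016, 3, 14, 17, 40)},
    {"id": "c", "pickup_datetime": datetime(2016, 3, 15, 9, 5)},
]

# Select only the pickups on the morning of 14 March 2016.
morning = filter_by_time(
    trips,
    "pickup_datetime",
    start=datetime(2016, 3, 14, 6, 0),
    end=datetime(2016, 3, 14, 12, 0),
)
print([trip["id"] for trip in morning])
```

Doing this kind of pre-filtering before upload can also be useful when a dataset is too large to load comfortably in the browser.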
For filters on time fields, you can use time playback: when you press play, the map displays the points corresponding to the current position on the time scale, as seen below,
Time playback for time-field Filters
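Time playback can be thought of as a window sliding along the time axis, with only the points inside the window shown at each step. The sketch below illustrates that idea; `playback_windows`, the window and step sizes, and the timestamps are assumptions made for this example rather than kepler.gl internals.

```python
from datetime import datetime, timedelta


def playback_windows(timestamps, window, step):
    """Yield (start, end, matches) as a window of length `window` slides forward by `step`."""
    ordered = sorted(timestamps)
    if not ordered:
        return
    start, last = ordered[0], ordered[-1]
    while start <= last:
        end = start + window
        # Each frame contains the timestamps in the half-open interval [start, end).
        yield start, end, [t for t in ordered if start <= t < end]
        start += step


# Hypothetical pickup times.
pickups = [
    datetime(2016, 3, 14, 10, 0),
    datetime(2016, 3, 14, 10, 5),
    datetime(2016, 3, 14, 10, 20),
]

frames = list(playback_windows(pickups, window=timedelta(minutes=10), step=timedelta(minutes=10)))
for start, end, matches in frames:
    print(start.time(), "to", end.time(), "shows", len(matches), "points")
```

A shorter step than the window gives overlapping frames and a smoother animation; a wider window shows more points per frame.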
You can also export the map from kepler.gl in different formats: as an image, an HTML page, JSON, or a URL.
In this blog, you have seen some of the basic functionality of the Kepler.gl toolbox, but there is much more available that you can explore on your own. You can also explore the other datasets available there and try it with your own datasets too.
Data visualization makes it easier to find hidden patterns in a large dataset. There are many ways to visualize data, such as MS Excel, Tableau, and Python libraries, but each has some restrictions, e.g. the time taken or the hardware resources required. These can be avoided by using kepler.gl, the open-source web-based toolbox from Uber Engineering. (Read another blog on 5 ways ML helps in Uber Services Optimization)
It helps in visualizing geospatial data on maps by plotting latitude and longitude within minutes, without any special hardware resources, as it runs in a web browser. As covered in this blog, it offers different ways to visualize data, such as point, arc, line, heatmap, and 3D grid layers, and you can manage particular attributes for each layer. Many kinds of geospatial data can be visualized this way; in this blog we have covered the NYC Taxi Trips data.