“Data visualizations will often serve an integral role in helping you to uncover key patterns, trends, and anomalies in your data.” - Brent Dykes, Director, Data Strategy
Data visualization is the graphical representation of data using graphs, maps, charts, and heatmaps. Tools for this, some of them open source, include Tableau, Kepler.gl, Excel, and FusionCharts, and there are also Python libraries such as matplotlib, plotly, seaborn, and ggplot.
Visualizing data helps you understand it by revealing insights such as the relationships between different features, recurring patterns, and trends.
It also helps to filter out irrelevant information and keep only what is relevant to the task at hand. It works for large as well as small datasets.
Common Process of Data Visualization (Source)
Understanding Geospatial Data
Geospatial data describes locations, for example latitude and longitude, zip code, address, or city. Such data can be collected from GPS, geotagging, and satellite imagery.
Types of Geospatial Data
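To make the idea of a geospatial record concrete, here is a minimal sketch that turns plain records with latitude and longitude into GeoJSON point features. The function name `to_geojson_points`, the key names, and the sample values are illustrative assumptions, not part of any dataset used in this blog.

```python
import json


def to_geojson_points(records, lat_key="latitude", lon_key="longitude"):
    """Convert dict records into a GeoJSON FeatureCollection of points.

    Records whose coordinates fall outside the valid latitude/longitude
    ranges are skipped.
    """
    features = []
    for rec in records:
        lat, lon = float(rec[lat_key]), float(rec[lon_key])
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue  # skip records with impossible coordinates
        # Everything that is not a coordinate becomes a feature property.
        props = {k: v for k, v in rec.items() if k not in (lat_key, lon_key)}
        features.append({
            "type": "Feature",
            # GeoJSON stores positions as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": props,
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical sample records, made up for illustration.
sample_records = [
    {"city": "New York", "zip_code": "10001", "latitude": 40.7506, "longitude": -73.9972},
    {"city": "San Francisco", "zip_code": "94103", "latitude": 37.7726, "longitude": -122.4099},
    {"city": "Nowhere", "zip_code": "00000", "latitude": 123.0, "longitude": 10.0},  # invalid latitude
]

collection = to_geojson_points(sample_records)
print(json.dumps(collection, indent=2))
```

Note the coordinate order: GeoJSON puts longitude first, which is a common source of points landing in the wrong place on the map.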
What is Kepler.gl?
Kepler.gl is an open-source geo-analytics tool developed at Uber for visualizing geospatial data, which helps reveal trends and patterns hidden in a dataset. Such a dataset can contain various kinds of geospatial information, including latitude, longitude, zip or postal code, address, city, and country.
It makes it easier to visualize large geospatial datasets. You can visualize many kinds of geospatial data, e.g. road trips, flights, earthquakes, or the population of a country or city.
Visualization of NYC taxi trips using kepler.gl
In Kepler.gl, data is visualized by combining different layers, e.g. point, hexagon, arc, path, and grid layers. It offers both 2D and 3D visualization with these layers.
Layers in kepler.gl
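To get a feel for what an aggregating layer such as the grid layer shows, the sketch below bins points into square cells and counts how many fall into each one. This is only a conceptual illustration: the function `grid_counts`, the cell size in degrees, and the sample points are assumptions, and kepler.gl's own layers do this aggregation internally in their own way.

```python
import math
from collections import Counter


def grid_counts(points, cell_size_deg=0.01):
    """Count (lat, lon) points per square grid cell.

    Each cell is identified by the integer index of its row and column,
    obtained by dividing the coordinate by the cell size and rounding down.
    """
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))
        counts[cell] += 1
    return counts


# Hypothetical pickup locations around Manhattan, made up for illustration.
pickups = [
    (40.751, -73.994),
    (40.752, -73.995),
    (40.758, -73.991),
    (40.765, -73.985),
]

counts = grid_counts(pickups)
for cell, n in counts.most_common():
    print(cell, n)
```

In a 3D grid view, the count per cell would typically drive the height or the color of that cell.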
As discussed above, there are many other ways to visualize geospatial data, but many of them are time-consuming, and some have hardware requirements (e.g. a GPU) because they depend on dedicated software. This is not the case with kepler.gl: it is a web-based application, so you only need to import the dataset in the browser, apply some filters, and visualize the data with the layers of your choice within a few minutes.
If you are new to these tools and want to explore, you can use one of the sample datasets available in Kepler.gl: select one of your choice and start experimenting with it on the map by applying different map layers and filters, in either 2D or 3D mode.
Available datasets on kepler.gl
Some of the well-known datasets available:
NYC Taxi trips
Travel Times From Uber Movement
San Francisco Street Tree Map
2017 Unemployment Rates for U.S. Counties
There are many others; you can choose any of them to explore kepler.gl. You can also import a dataset of your own, either as a CSV or GeoJSON file or from a URL that points to data hosted in the cloud.
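If your own data lives in Python, a small script can write it out as a CSV ready for upload. The sketch below is one way to do that; the helper `trips_to_csv`, the column names, and the sample trips are assumptions for illustration. The `_lat` / `_lng` suffixes follow the paired-column naming that kepler.gl is documented to use for auto-detecting point layers, but treat the exact naming rules as something to verify against the current documentation.

```python
import csv
import io


def trips_to_csv(trips):
    """Serialize trip dicts to CSV text with a fixed column order."""
    fieldnames = ["pickup_datetime", "pickup_lat", "pickup_lng", "dropoff_lat", "dropoff_lng"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for trip in trips:
        # Keep only the columns listed above, in that order.
        writer.writerow({name: trip[name] for name in fieldnames})
    return buffer.getvalue()


# Hypothetical trips, made up for illustration.
sample_trips = [
    {"pickup_datetime": "2016-03-14 17:24:55", "pickup_lat": 40.7679, "pickup_lng": -73.9822,
     "dropoff_lat": 40.7656, "dropoff_lng": -73.9646},
    {"pickup_datetime": "2016-06-12 00:43:35", "pickup_lat": 40.7386, "pickup_lng": -73.9804,
     "dropoff_lat": 40.7312, "dropoff_lng": -73.9995},
]

csv_text = trips_to_csv(sample_trips)
print(csv_text)
```

To produce a file instead of a string, the same writer can be pointed at `open("trips.csv", "w", newline="")`.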
Getting Started with Kepler.gl
In this blog, we will use the NYC Taxi Trip data from one of the Kaggle competitions. The dataset is based on the 2016 NYC Yellow Cab trip record data, and participants in the competition have to predict the trip duration.
This dataset has several features. Some of them are:
So, the first step is to load the dataset into kepler.gl.
Uploading data on kepler.gl
Kepler.gl has an option to show a data table that displays the dataset you have uploaded.
NYC Taxi Trip data
Navigate to the Interactions tab in the Kepler.gl dashboard. There are different options for interacting with the data plotted on the map: tooltips, brush, and coordinates. You can enable any of them, but only one of tooltips and brush can be active at a time.
Let’s look at them one by one:
The brush highlights the region around the cursor as you move it. The rest of the map is dimmed, so the other plotted locations are effectively hidden.
Coordinates displays the coordinates of the point under the cursor, as seen in the image.
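The brush interaction can be thought of as a radius query around the cursor. The sketch below illustrates that idea with the haversine formula for great-circle distance. The function names, the radius, and the sample locations are assumptions, and this is not how kepler.gl implements the brush internally.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def brush(points, cursor, radius_km):
    """Return the points that lie within radius_km of the cursor position."""
    cursor_lat, cursor_lon = cursor
    return [
        p for p in points
        if haversine_km(cursor_lat, cursor_lon, p["lat"], p["lon"]) <= radius_km
    ]


# Approximate, illustrative locations in New York.
locations = [
    {"name": "Times Square", "lat": 40.7580, "lon": -73.9855},
    {"name": "Empire State Building", "lat": 40.7484, "lon": -73.9857},
    {"name": "JFK Airport", "lat": 40.6413, "lon": -73.7781},
]

highlighted = brush(locations, cursor=(40.7549, -73.9840), radius_km=1.5)
print([p["name"] for p in highlighted])
```

Points outside the radius are simply left out of the result, which mirrors how the rest of the map fades when the brush is active.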
Kepler.gl provides different map styles to plot the data on, and you can also add a style of your own. Some of the available map styles are:
- Muted Night
- Muted Light
Apart from map styles, there is another option: the map layers to show on the map, such as:
- Labels of cities and other localities on the map
- 3D buildings, for which you can choose the color
3D layer (Source)
Now it's time to plot the geospatial dataset on the map using different layers and setting their attributes. You can add multiple layers to visualize your data, giving each layer a suitable label.
Some of the available layers are as follows:
Apart from choosing layers, you can also set attributes for each layer, e.g. color, stroke, height (in 3D), cluster size, etc.
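Coloring a layer by a field boils down to mapping each value onto a palette. The sketch below shows one simple way to do that, splitting the value range into equal-width steps. The function `quantize_color`, the palette, and the sample durations are assumptions chosen for illustration, not kepler.gl's own implementation.

```python
def quantize_color(value, vmin, vmax, palette):
    """Map a value to a palette color using equal-width intervals.

    Values at or below vmin get the first color, values at or above vmax
    get the last one.
    """
    if vmax <= vmin:
        return palette[0]  # degenerate range: everything gets one color
    fraction = (value - vmin) / (vmax - vmin)
    index = int(fraction * len(palette))
    index = max(0, min(index, len(palette) - 1))  # clamp to valid indices
    return palette[index]


# Arbitrary light-to-dark hex colors, chosen for illustration.
palette = ["#ffffcc", "#a1dab4", "#41b6c4", "#225ea8"]

# Hypothetical trip durations in seconds.
durations = [300, 900, 1500, 2400]

colors = [quantize_color(d, min(durations), max(durations), palette) for d in durations]
print(list(zip(durations, colors)))
```

Short trips end up with the lightest color and long trips with the darkest, which is the kind of visual encoding the color attribute gives you on the map.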
Using filters, you can limit the data displayed on the map according to features of the dataset. For example, if you only want to display points for a certain range of trip pickup or dropoff times, add a filter on the corresponding field.
The figure below shows filtering by trip pickup time and trip dropoff time:
Filters in kepler.gl
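Conceptually, a filter on a time field keeps only the rows whose timestamp falls inside a chosen range. Here is a minimal sketch of that logic. The helper `filter_by_time`, the `pickup_datetime` column name, the timestamp format, and the sample rows are assumptions for illustration.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def filter_by_time(rows, field, start, end):
    """Keep rows whose timestamp in `field` lies within [start, end]."""
    selected = []
    for row in rows:
        timestamp = datetime.strptime(row[field], TIME_FORMAT)
        if start <= timestamp <= end:
            selected.append(row)
    return selected


# Hypothetical trips, made up for illustration.
rows = [
    {"id": "a", "pickup_datetime": "2016-03-14 07:45:00"},
    {"id": "b", "pickup_datetime": "2016-03-14 08:15:00"},
    {"id": "c", "pickup_datetime": "2016-03-14 09:30:00"},
]

morning = filter_by_time(
    rows,
    field="pickup_datetime",
    start=datetime(2016, 3, 14, 8, 0),
    end=datetime(2016, 3, 14, 9, 0),
)
print([row["id"] for row in morning])
```

Only the rows that pass the filter would be drawn on the map; the rest stay in the dataset but are not displayed.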
For filters on time fields, you can use time playback, which animates the map and displays the points that fall within the moving time window on the scale, as seen below:
Time playback for time-field Filters
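Time playback can be pictured as a window sliding along the time axis, with each frame showing only the points inside the window. The sketch below counts points per frame. The generator `playback_frames`, the window and step sizes, and the sample timestamps are assumptions, and kepler.gl's actual animation logic may differ.

```python
from datetime import datetime, timedelta


def playback_frames(timestamps, window, step):
    """Yield (window_start, count) for a window sliding over the timestamps.

    Each frame covers the half-open interval [window_start, window_start + window).
    """
    if not timestamps:
        return
    ordered = sorted(timestamps)
    start = ordered[0]
    while start <= ordered[-1]:
        end = start + window
        count = sum(1 for ts in ordered if start <= ts < end)
        yield start, count
        start += step


# Hypothetical pickup times, made up for illustration.
pickup_times = [
    datetime(2016, 3, 14, 8, 0),
    datetime(2016, 3, 14, 8, 10),
    datetime(2016, 3, 14, 8, 40),
    datetime(2016, 3, 14, 9, 5),
]

frames = list(playback_frames(pickup_times, window=timedelta(minutes=30), step=timedelta(minutes=30)))
for frame_start, count in frames:
    print(frame_start.strftime("%H:%M"), count)
```

Using a step smaller than the window gives overlapping frames and a smoother animation, at the cost of more frames to compute.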
You can also export the map from kepler.gl in different formats: as an image, an HTML page, JSON, or a URL.
In this blog, you have seen some of the basic functionality of the Kepler.gl toolbox. There is much more that you can explore on your own, both with the other sample datasets available there and with your own datasets.
Data visualization makes it easier to find hidden patterns in large datasets. There are many ways to visualize data, such as MS Excel, Tableau, and Python libraries, but each has some restrictions, e.g. the time taken or the hardware resources required. Many of these can be addressed with kepler.gl, an open-source, web-based toolbox from Uber Engineering. (Read another blog on 5 ways ML helps in Uber Services Optimization)
It helps visualize geospatial data on maps by plotting latitude-longitude pairs within minutes, without dedicated hardware or software installation, since it runs in a web browser. As covered in this blog, it offers different ways to visualize data, such as points, arcs, lines, heatmaps, and 3D grids, and you can adjust the attributes of each layer. Many kinds of geospatial data can be visualized this way; in this blog we covered the NYC Taxi Trips data.