Jul 13, 2021 | Vanshika Kaushik
A data lake is a system that stores huge amounts of data in its raw format. Data lakes allow organizations to merge different types of data, and they are capable of storing multi-structured data from different sources.
Airbyte has launched an AWS S3 connector and a Connector Development Kit (CDK) to allow users to copy data from different sources to Amazon’s Simple Storage Service (S3). Data plays a key role in the success of modern enterprises, but it is often scattered across different locations. Businesses therefore need to centralize their data and arrange it in a common format for analysis.
Extract, Transform, Load (ETL) is a process that allows companies to change data before it arrives in the central data warehouse. This process typically comes with expensive on-premise storage, and the transformation step in ETL is slow. A change in users’ needs means the data has to be extracted again.
Another process, Extract, Load, Transform (ELT), enables companies to modify raw data on demand, which cuts costs. Airbyte’s prime focus is on the extract and load functions. Its Connector Development Kit (CDK) allows businesses to create custom data source connectors, and it also provides prebuilt connectors.
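To make the difference between the two processes concrete, here is a minimal Python sketch. It is not Airbyte code: the sample rows, the in-memory list standing in for a warehouse, and every function name are assumptions made up for illustration.

```python
# Hypothetical raw records standing in for data pulled from a source system.
RAW_ROWS = [
    {"id": 1, "amount": "19.99", "country": "us"},
    {"id": 2, "amount": "5.00", "country": "de"},
]


def to_report_shape(row):
    """Example transformation: parse the amount and upper-case the country."""
    return {
        "id": row["id"],
        "amount": float(row["amount"]),
        "country": row["country"].upper(),
    }


def etl(rows, transform_fn):
    """ETL: transform first, then load. The warehouse keeps only the
    transformed shape, so a new requirement means extracting again."""
    return [transform_fn(row) for row in rows]


def elt_load(rows):
    """ELT: load the raw rows untouched (copied so the source is not mutated)."""
    return [dict(row) for row in rows]


def elt_transform(warehouse, transform_fn):
    """ELT: apply any transformation later, on demand, to the stored raw rows."""
    return [transform_fn(row) for row in warehouse]


# ETL fixes the shape at load time.
etl_warehouse = etl(RAW_ROWS, to_report_shape)

# ELT keeps the raw data, so a second, different view needs no re-extraction.
elt_warehouse = elt_load(RAW_ROWS)
report_view = elt_transform(elt_warehouse, to_report_shape)
ids_only = elt_transform(elt_warehouse, lambda row: {"id": row["id"]})
```

In the ELT half, both `report_view` and `ids_only` are derived from the same stored raw rows, which is the "modify raw data on demand" property described above.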
Pre-built connectors simplify the creation of data pipelines and data transportation. Data transportation sources include CRMs (Salesforce), databases (MySQL, PostgreSQL), analytics tools (Amplitude), and data warehouses (Snowflake).
The development kit includes a Python framework for writing source connectors, a built-in test suite to run against connectors, and a code generator for bootstrapping development. The CDK enables users to build full-featured connectors within a span of two hours.
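As a rough illustration of what a connector framework with an accompanying test routine can look like, consider the sketch below. It deliberately does not use the Airbyte CDK's actual classes or method signatures: `SourceConnector`, `InMemorySource`, `run_basic_checks`, and the `rows` config key are all hypothetical names invented for this example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List, Tuple


class SourceConnector(ABC):
    """Hypothetical base class: the framework defines the contract,
    the connector author fills in the source-specific parts."""

    @abstractmethod
    def check_connection(self, config: Dict[str, Any]) -> Tuple[bool, str]:
        """Return (True, "") if the source is reachable, else (False, reason)."""

    @abstractmethod
    def read_records(self, config: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
        """Yield records from the source one at a time."""


class InMemorySource(SourceConnector):
    """Toy connector that 'extracts' rows already present in its config."""

    def check_connection(self, config):
        if "rows" not in config:
            return False, "config is missing 'rows'"
        return True, ""

    def read_records(self, config):
        for row in config["rows"]:
            yield dict(row)


def run_basic_checks(source: SourceConnector, config: Dict[str, Any]) -> List[str]:
    """Tiny stand-in for a test suite: returns a list of failure messages."""
    failures = []
    ok, message = source.check_connection(config)
    if not ok:
        failures.append(f"connection check failed: {message}")
        return failures
    for record in source.read_records(config):
        if not isinstance(record, dict):
            failures.append(f"record is not a dict: {record!r}")
    return failures


good = run_basic_checks(InMemorySource(), {"rows": [{"id": 1}, {"id": 2}]})
bad = run_basic_checks(InMemorySource(), {})
```

The point of the split is that the base class and the checks are written once, while each new connector only supplies the two source-specific methods.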
App Store, SmartSheet, and Oracle DB are a few of the connectors built using the CDK. The CDK removes 75% of the code a connector would otherwise require; reducing the amount of code developers must write is its main aim. The CDK addresses connector-specific edge cases and connector-specific code.
According to VentureBeat, Airbyte’s core product is for now the free, MIT-licensed community edition, though the company eventually plans to go commercial through a hosted cloud incarnation, with an additional enterprise-grade offering in the works.