What is ETL?

ETL stands for Extraction, Transformation, and Loading.


The first part of ETL is the extraction of data from different data sources. Many data warehousing projects consolidate data from several source systems, and each system may organize its data differently. Common source formats include relational databases as well as non-relational sources such as XML and flat files.

An essential part of extraction is data validation, which checks whether the data pulled from a given source has the expected values.
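The validation step described above can be sketched roughly as follows. This is a minimal illustration, not a prescribed method: the sample data, the column names (`id`, `name`, `amount`), and the helpers `is_valid` and `extract` are all hypothetical, and a real source would be a file or a database export rather than an inline string.

```python
import csv
import io

# Hypothetical flat-file source (assumption: three columns, comma-separated).
SOURCE_CSV = """id,name,amount
1,Alice,100.50
2,Bob,not_a_number
3,,42.00
"""

def is_valid(row):
    """Return True when a row has a non-empty name and a numeric amount."""
    if not row["name"].strip():
        return False
    try:
        float(row["amount"])
    except ValueError:
        return False
    return True

def extract(text):
    """Read rows from CSV text and split them into accepted and rejected lists."""
    accepted, rejected = [], []
    for row in csv.DictReader(io.StringIO(text)):
        (accepted if is_valid(row) else rejected).append(row)
    return accepted, rejected

accepted, rejected = extract(SOURCE_CSV)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Keeping the rejected rows, instead of silently dropping them, makes it possible to report back to the owner of the source system.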


Transformation converts the source data to match the specification of the target system, so that only data of the expected kind is loaded into the target. Transformations fall into two types: active and passive. An active transformation can change the number of rows that pass through it, whereas a passive transformation cannot change the number of rows. A transformation can also be connected or unconnected. A connected transformation is linked to other transformations or directly to the target table in the mapping. An unconnected transformation is not linked to other transformations or objects in the mapping and is instead called from within another transformation.
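The difference between active and passive transformations can be illustrated with a small sketch. The function names `filter_large` and `add_tax`, the threshold, and the tax rate are assumptions made for this example only; they are not part of any ETL tool.

```python
rows = [
    {"id": 1, "amount": 120.0},
    {"id": 2, "amount": 15.0},
    {"id": 3, "amount": 300.0},
]

def filter_large(rows, threshold=100.0):
    """Active-style step: drops rows, so the output row count can differ."""
    return [r for r in rows if r["amount"] >= threshold]

def add_tax(rows, rate=0.1):
    """Passive-style step: one output row per input row, plus a derived column."""
    return [{**r, "tax": round(r["amount"] * rate, 2)} for r in rows]

filtered = filter_large(rows)
taxed = add_tax(rows)
print(len(rows), len(filtered), len(taxed))
```

Here the filter returns fewer rows than it received, while the tax step returns exactly as many rows as it was given.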

In data transformation, a series of rules is applied to the extracted data to prepare it for loading into the end target. An important function of transformation is data cleansing, which ensures that only proper data reaches the target.
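One way to picture "a series of rules" is as an ordered list of small functions, each applied to a record in turn. The sketch below assumes hypothetical rule names (`strip_whitespace`, `normalize_country`, `parse_amount`) and a made-up alias table; actual cleansing rules depend on the target system.

```python
def strip_whitespace(record):
    """Trim leading and trailing spaces from every string field."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}

def normalize_country(record):
    """Map a few known spellings to one code (alias table is illustrative)."""
    aliases = {"usa": "US", "u.s.": "US", "united states": "US"}
    country = record.get("country", "")
    return {**record, "country": aliases.get(country.lower(), country)}

def parse_amount(record):
    """Convert the amount from text to a number."""
    return {**record, "amount": float(record["amount"])}

# The rules run in this order; stripping first lets later rules match cleanly.
RULES = [strip_whitespace, normalize_country, parse_amount]

def transform(record, rules=RULES):
    for rule in rules:
        record = rule(record)
    return record

clean = transform({"name": "  Alice ", "country": "USA ", "amount": " 10.5"})
print(clean)
```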

Some common transformations are:

  • Aggregator Transformation
  • Filter Transformation
  • Joiner Transformation
  • Lookup Transformation
  • Rank Transformation
  • Router Transformation
  • Source Qualifier Transformation
  • Sorter Transformation
  • Sequence Generator Transformation


The loading phase moves the data into the final target, which may be a simple delimited flat file or a data warehouse. Some data warehouses overwrite existing information with aggregated information, and the extracted data may be loaded daily, weekly, or monthly.

Even so, data within any one-year window is entered in historical form. The timing and the scope of replacing or appending data are therefore key design choices that depend on the time available and the business needs.
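The two loading styles mentioned above, overwriting existing data and appending it in historical form, can be sketched with Python's built-in `sqlite3` module. The table names `sales_summary` and `sales_history`, the columns, and the `load` helper are assumptions for illustration; a real warehouse would have its own schema and loader.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Summary table: one row per region, overwritten on each load.
conn.execute("CREATE TABLE sales_summary (region TEXT PRIMARY KEY, total REAL)")
# History table: every load is appended, so earlier values are kept.
conn.execute("CREATE TABLE sales_history (region TEXT, total REAL, loaded_on TEXT)")

def load(conn, batch, loaded_on):
    """Load one batch: replace summary rows and append history rows."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO sales_summary (region, total) VALUES (?, ?)",
            batch,
        )
        conn.executemany(
            "INSERT INTO sales_history (region, total, loaded_on) VALUES (?, ?, ?)",
            [(region, total, loaded_on) for region, total in batch],
        )

load(conn, [("north", 100.0), ("south", 80.0)], "2024-01-01")
load(conn, [("north", 150.0)], "2024-01-02")

summary_rows = conn.execute(
    "SELECT region, total FROM sales_summary ORDER BY region"
).fetchall()
history_count = conn.execute("SELECT COUNT(*) FROM sales_history").fetchone()[0]
print(summary_rows, history_count)
```

After the second load the summary holds only the latest total for `north`, while the history table still holds all three loaded rows.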


Parallel processing is a recent development in ETL software. Parallelism in ETL software falls into three main types:

  • DATA: Splitting a single sequential file into smaller data files to provide parallel access.
  • PIPELINE: Running several components simultaneously on the same data stream.
  • COMPONENT: Running multiple processes simultaneously on different data streams in the same job.
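Data parallelism, the first type in the list above, can be sketched as splitting the input into chunks and transforming the chunks concurrently. The helpers `split` and `transform_chunk` are hypothetical. Threads are used here only to keep the sketch short; for CPU-bound work in CPython, a process pool would usually be the more suitable choice.

```python
from concurrent.futures import ThreadPoolExecutor

def split(records, parts):
    """Split a sequential list into roughly equal contiguous chunks."""
    size = max(1, -(-len(records) // parts))  # ceiling division, at least 1
    return [records[i:i + size] for i in range(0, len(records), size)]

def transform_chunk(chunk):
    """Stand-in transformation: double every value in the chunk."""
    return [value * 2 for value in chunk]

records = list(range(10))
chunks = split(records, parts=3)

# pool.map returns results in the same order as the input chunks.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(transform_chunk, chunks))

combined = [value for chunk in results for value in chunk]
print(combined)
```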


  • Defining and creating the sources and targets
  • Importing the source and target tables
  • Creating a pass-through mapping
