What is SQL Server Integration Services (SSIS)
SSIS is Microsoft newest ETL platform.
SSIS is a successor not an upgrade of DTS.
The primary purpose for SSIS is to migrate and manipulate data in support of the Business Intelligence (BI) lifecycle (SQL Server, SSIS, SQL Server Analysis Services (SSAS), SQL Server Reporting Services (SSRS).
SSIS includes a development environment within Visual Studio, runtime utilities, and a management environment which service as a platform for ETL operations.
What is ETL?
“ETL” stands for operation where data is Extracted, Transformed, and then Loaded.
Extraction of the data may have sources such as: databases, files, web services, or ftp sites.
Transformation of the source data involves:
- Standardizing the data as it’s moved from source to destination.
- Cleanse – remove errors.
- Convert – change data from its native data type to another type.
- Conform – massaging data between systems so idiosyncrasies between them match (two companies merge, one company uses a ‘0’ as a status flag, one doesn’t.
Loading data may target several types of destinations – SQL Server isn’t required.
Parts of an SSIS Package
The components that make up an SSIS package are:
- Control Flow – programmatic logic of SSIS package.
- Data Flows – flow of data between source & destination.
- Event handlers – wire actions to events.
- Logging Providers – record details of SSIS package execution to disk (file) or windows event log.
SSIS Package Lifecycle
- Develop: Import/Export Wizard or Business Intelligence Development Studio (BIDS).
- Store: File system (xml – “.dtsx”) or SQL Server (MSDB system db)- run from SSMS.
- Execute: DTExec or DTExecUI, BIDS, or SSMS.