SQL Server Integration Services (SSIS) is a platform for data integration and workflow applications used for extracting, transforming, and loading (ETL) data. SSIS packages contain control flows and data flows to organize tasks for data migration. SSIS provides tools for loading data, transforming data types, and splitting data into training and testing sets for data mining models. It includes data mining transformations in the control flow and data flow environments to prepare and analyze text data for classification, clustering, and association models.
Overview
Standard Tasks in SSIS
SSIS Packages
Data Flow
Working with SSIS in Data Mining
Data Mining Transformations
Text Mining Transformations
Summary
Overview of SSIS
SQL Server Integration Services (SSIS) is a component of the Microsoft SQL Server database software that can be used to perform a broad range of data migration tasks. SSIS is a platform for data integration and workflow applications. It features a fast and flexible data warehousing tool used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and updates to multidimensional cube data.
SSIS Packages
A package is the basic deployment and execution unit of an SSIS project. An SSIS package is the container for SSIS flows. You can create an SSIS package by right-clicking the SSIS Package folder in the Integration Services project folder and selecting the New SSIS Package menu item. An SSIS project may contain multiple packages. A package contains only one control flow, which may contain one or more data flows. In addition to control flow and data flow, a package contains SSIS connections and package variables.
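The containment rules described here (a project holds packages, a package holds exactly one control flow, and a control flow holds data flows) can be pictured with a small data model. This is only an illustrative sketch in Python, not an SSIS API: the class names `Project`, `Package`, `ControlFlow`, and `DataFlow`, and the sample connection and variable names, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataFlow:
    name: str


@dataclass
class ControlFlow:
    # A control flow may hold several data flows (other task types are omitted).
    data_flows: List[DataFlow] = field(default_factory=list)


@dataclass
class Package:
    name: str
    # Exactly one control flow per package, so this is a single field, not a list.
    control_flow: ControlFlow = field(default_factory=ControlFlow)
    connections: Dict[str, str] = field(default_factory=dict)  # name -> connection string
    variables: Dict[str, object] = field(default_factory=dict)  # package variables


@dataclass
class Project:
    name: str
    packages: List[Package] = field(default_factory=list)


# Hypothetical example: one project, one package, two data flows.
package = Package(
    name="LoadCustomers",
    control_flow=ControlFlow([DataFlow("Stage"), DataFlow("Publish")]),
    connections={"Warehouse": "Server=example;Database=DW"},
    variables={"BatchSize": 500},
)
project = Project(name="Demo", packages=[package])
```

Modeling the control flow as a single field rather than a list mirrors the one-control-flow-per-package rule.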
Task Flow and Containers
Tasks are listed in the SSIS Toolbox. You can add a task to the package by dragging it from the Toolbox and dropping it into the package designer. A package usually contains multiple tasks in a task flow. Multiple tasks are organized in sequential order with precedence constraints. Containers are SSIS objects that provide structure to a package. Each package has a container, which stores the flows of a package.
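One way to picture how precedence constraints put tasks in order is as a dependency graph that is sorted topologically. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the task names are hypothetical, and it deliberately ignores the success, failure, and completion conditions that real precedence constraints can carry.

```python
from graphlib import TopologicalSorter

# Each key is a task; its value is the set of tasks that must finish first.
# Task names are hypothetical and chosen only for illustration.
precedence = {
    "Transform": {"Extract"},
    "Load": {"Transform"},
}

# static_order() yields the tasks in an order that respects every constraint.
execution_order = list(TopologicalSorter(precedence).static_order())
print(execution_order)  # ['Extract', 'Transform', 'Load']
```

With a simple chain like this one there is only one valid order; with branching constraints several orders may be valid.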
How to Set the Properties of a Task or Container?
To set the properties of a task or container by using the Properties window:
In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
To save the updated package, click Save Selected Items on the File menu.
To set the properties of a task or container by using a task or container editor:
In Business Intelligence Development Studio, open the Integration Services project that contains the package you want.
On the design surface of the Control Flow tab, right-click the task or container, and then click Edit to open the corresponding task or container editor.
If the task or container editor has multiple nodes, click the node that contains the property that you want to set.
Optionally, click Expressions and, on the Expressions page, create property expressions to dynamically update the properties of the task or container.
To save the updated package, click Save Selected Items on the File menu.
Working with SSIS in Data Mining
SSIS is used to load data from various sources, combine these data sources, normalize column values, remove dirty records, replace missing values, split data into training and testing data sets, and so on. SSIS is more than just an ETL tool for data mining: it also provides a few built-in data mining components in the control flow and data flow environments.
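Two of the preparation steps mentioned here, replacing missing values and splitting data into training and testing sets, can be sketched in plain Python. This is a conceptual illustration only, not how SSIS implements them; the function names, the column names, and the 30% test fraction are assumptions made for the example.

```python
import random


def replace_missing(rows, column, default):
    """Return copies of the rows with a missing (None) value in `column` replaced."""
    return [
        {**row, column: default if row.get(column) is None else row[column]}
        for row in rows
    ]


def split_train_test(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows reproducibly and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical input: ten rows, some with a missing "age" value.
rows = [{"id": i, "age": None if i % 4 == 0 else 20 + i} for i in range(10)]
cleaned = replace_missing(rows, "age", default=0)
train, test = split_train_test(cleaned, test_fraction=0.3, seed=42)
print(len(train), len(test))  # 7 3
```

Fixing the seed keeps the split reproducible, so a model can be retrained and compared against the same held-out rows.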
Data Mining Transformations
The data flow components can be categorized in three large groups, depending on their position in the data flow: sources, transformations, and destinations.
Text Mining Transformations
To perform text mining with SQL Server Data Mining, you must first bring the text into a form that the algorithms can consume. The solution included in the product is to represent each piece of text as a collection of words and phrases.
After each document is represented as a collection of key phrases, you can perform data mining using one of the following model types:
Classification models that use the key words and phrases nested table as input to predict the class of a document
Association models that detect cross-correlations between key words and phrases
The process of text mining usually consists of at least the following three phases:
1. Extraction transformation: Build a dictionary of key words and phrases over a collection of representative documents.
2. Lookup transformation: Based on the dictionary, extract the list of significant key words and phrases for each document to be analyzed.
3. Train mining models on top of the transformed data.
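The first two phases can be approximated in a few lines of Python to show the shape of the data at each step. This is a rough analogue under simplifying assumptions (single lowercase words only, a frequency threshold of 2), not the algorithm the SSIS transformations use; `build_dictionary` and `lookup_terms` are hypothetical names.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_dictionary(documents, min_count=2):
    """Phase 1: keep words that occur at least `min_count` times across the corpus."""
    counts = Counter(word for doc in documents for word in tokenize(doc))
    return {word for word, n in counts.items() if n >= min_count}


def lookup_terms(document, dictionary):
    """Phase 2: count the dictionary terms that appear in one document."""
    return Counter(word for word in tokenize(document) if word in dictionary)


# Hypothetical representative documents.
corpus = ["SSIS loads data", "SSIS transforms data", "Mining models learn"]
dictionary = build_dictionary(corpus)
terms = lookup_terms("Data about SSIS data", dictionary)
print(sorted(dictionary))  # ['data', 'ssis']
print(dict(terms))         # {'data': 2, 'ssis': 1}
```

The per-document term counts play the role of the key words and phrases nested table that phase 3 would use as model input.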