Using the Data Mining Tools
Overview
Introduction to BI Dev Studio
Creating Data Mining Objects
Steps to Create and Edit the Models
Using the Models
Using SQL Server Management Studio
BI Development Studio
Business Intelligence Development Studio is the primary environment that you will use to develop business solutions that include Analysis Services, Integration Services, and Reporting Services projects. This environment is integrated into the Microsoft Visual Studio (VS) shell to provide a complete development experience for business intelligence operations.
Each project type supplies templates for creating the objects required for business intelligence solutions, and provides a variety of designers, tools, and wizards to work with the objects.
BI User Interface
Solution Explorer: This is where you manage your solution and projects. All objects are created and managed here. To add objects to your project, right-click the project name and select Add New Item.
Window tabs: These tabs allow you to quickly switch between designer windows. A tab is displayed for each object or file that is currently open.
Designer window: This is where you edit and analyze your objects. Creating a new object or double-clicking an object in Solution Explorer opens that object's specific designer, allowing you to modify and interact with the object.
Designer tabs: Many objects have different aspects that you can edit or interact with. These aspects are indicated by tabs within the Designer window.
Properties window: This is a context-sensitive window that displays properties for the currently selected item. It is a general VS concept and applies to any type of operation performed within the studio.
BI menus: The area on the main menu bar between the Debug menu and the Tools menu is where you will find context-sensitive menus specific to Analysis Services objects.
Output window: The Output window displays messages when you build and deploy projects. If there are errors in your project, this is where you will find their descriptions.
Data Mining Objects
To perform data mining, open your database or project, and then:
1. Indicate and describe your source data.
2. Create mining structures and models.
How to set up data sources?
Two objects in Analysis Services act as interfaces to your data:
The data source: essentially a connection string indicating the data location.
The data source view (DSV): an abstraction layer that enables you to modify the way you look at the data source, or even define a schema and switch the actual source at a later time.
What is a Data Source?
A data source is a rather simple object. It consists of nothing more than a connection string, plus some additional information indicating how to connect.
Two important things to note about data sources are:
Data location: When you set up your data sources, the data must be accessible not only to the client where you use the tools to build the model, but also to the server where the model will be processed. One way to ensure this is to move the data to a SQL Server database using SQL Server Integration Services (SSIS) before building your models in BI Dev Studio.
Security: The user credentials that are used to access data from Analysis Services play an essential role. Microsoft recommends always using integrated security if it is supported by the source database.
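As a rough illustration of "nothing more than a connection string", the sketch below assembles an OLE DB style connection string that uses integrated security. The server name `dbserver01` and database name `MiningSource` are hypothetical placeholders, and the helper function is not part of any Microsoft tool; BI Dev Studio builds this string for you in the Data Source Wizard.

```python
def build_connection_string(server: str, database: str, provider: str = "SQLOLEDB") -> str:
    """Assemble a connection string that relies on integrated (Windows) security.

    No user name or password is stored in the string; the caller's Windows
    identity is used instead.
    """
    parts = {
        "Provider": provider,
        "Data Source": server,          # hypothetical server name supplied by the caller
        "Initial Catalog": database,    # hypothetical database name supplied by the caller
        "Integrated Security": "SSPI",  # integrated security instead of stored credentials
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


conn_str = build_connection_string("dbserver01", "MiningSource")
print(conn_str)
```

Because the string must work from both the client and the Analysis Services server, a server name such as `localhost` would usually be a poor choice here.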
Using the Data Source View
The DSV is an abstract client-side view of your data. This is where your modeling begins. The DSV is where you select, organize, explore, and, in a sense, manipulate the data in the source.
When creating a DSV for data mining purposes, the most important table to identify is your case table. This is the table that contains the cases you want to analyze. You must also bring in any related tables that provide additional information about your cases.
The DSV Designer initially displays a diagram of the tables in your data source and the relationships between them. You can use the DSV Designer later to explore the data and alter it to the shape you need for your models.
Named Calculations in DSV
Named calculations are additional virtual columns on the tables in your DSV, which enable you to mine derived information in your data without having to change your source data. A named calculation consists of a name, a SQL expression containing the calculation, and an optional description.
The calculation can be any valid SQL expression. Commonly useful kinds include:
Arithmetic operations (+, -, *, /, and %)
Mathematical functions (ABS, LOG, SIGN, and SQRT)
Composite expressions: Sometimes the hypothesis you want to test depends on a variable that is a combination of two variables you already have. For example, it may not be interesting that a person is married or has children, but the combination of the two may provide valuable information. A composite expression for this situation could look like this: [Marital Status] + ' ' + [Has Children]
CASE expressions: CASE expressions are an extremely flexible way to create meaningful variables for data mining. A CASE expression allows you to assign results based on the evaluation of one or more conditions.
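The composite and CASE expressions described above can be tried out in a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for the source database, so string concatenation is written with `||` rather than the `+` shown in the slide. The `Customers` table, the `Age` column, and the age bands are assumptions made up for illustration.

```python
import sqlite3

# In-memory stand-in for the source database (hypothetical table and data).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Customers ([Marital Status] TEXT, [Has Children] TEXT, Age INTEGER)")
db.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [("Married", "Yes", 42), ("Single", "No", 23), ("Married", "No", 67)],
)

# Two derived columns, analogous to named calculations in a DSV:
#   FamilyStatus is a composite expression combining two existing columns.
#   AgeGroup is a CASE expression that buckets a numeric column.
query = """
SELECT
    [Marital Status] || ' ' || [Has Children] AS FamilyStatus,
    CASE
        WHEN Age < 30 THEN 'Under 30'
        WHEN Age < 60 THEN '30 to 59'
        ELSE '60 and over'
    END AS AgeGroup
FROM Customers
ORDER BY rowid
"""
derived_rows = db.execute(query).fetchall()
for row in derived_rows:
    print(row)
```

As with a named calculation, the source table itself is never modified; the derived columns exist only in the query result.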
Exploring Data
Part of any data mining project is learning about and understanding the nature of your data. By leveraging controls from Office Web Components (OWC), the DSV Designer lets you explore your data in four different views. By right-clicking a DSV table and selecting Explore Data, you can view your data as a table, a PivotTable, simple charts, or a PivotChart.
Exploring data with a PivotTable
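To give a feel for what a PivotTable view summarizes, here is a minimal sketch that cross-tabulates two columns by counting cases. It is plain Python, not the OWC control used by the DSV Designer, and the column names and sample cases are hypothetical.

```python
from collections import Counter

# Hypothetical cases, as they might appear in a case table.
cases = [
    {"MaritalStatus": "Married", "BikeBuyer": "Yes"},
    {"MaritalStatus": "Married", "BikeBuyer": "No"},
    {"MaritalStatus": "Single", "BikeBuyer": "Yes"},
    {"MaritalStatus": "Married", "BikeBuyer": "Yes"},
    {"MaritalStatus": "Single", "BikeBuyer": "No"},
]


def pivot_counts(rows, row_field, col_field):
    """Count cases for every (row value, column value) combination."""
    return Counter((r[row_field], r[col_field]) for r in rows)


pivot = pivot_counts(cases, "MaritalStatus", "BikeBuyer")

# Print the counts as a small grid: one line per row value.
col_values = sorted({c for _, c in pivot})
for row_value in sorted({r for r, _ in pivot}):
    counts = ", ".join(f"{c}={pivot[(row_value, c)]}" for c in col_values)
    print(f"{row_value}: {counts}")
```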
Steps to Create and Edit the Models
After you have organized, modified, and selected your data, and you understand the data you want to analyze, you can start to create data mining objects. This involves the following steps:
1. Running the Data Mining Wizard.
2. Refining the results in Data Mining Designer.
SQL Server Analysis Services
SQL Server Analysis Services has two major objects that deal with data mining:
Mining structures: A mining structure defines the domain of a mining problem. It contains a list of structure columns that have data and content types, bindings to the data source, and some optional flags that control how the data is modeled.
Mining models: A mining model is the application of a mining algorithm to the data in a mining structure. The definition of a mining model contains an algorithm with its associated parameters, plus a list of columns from the mining structure.
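The relationship between the two objects can be sketched as plain data classes. This is only a conceptual model, not the Analysis Services object model or its API: the class names, field names, and sample column names are invented for illustration, and the check in `__post_init__` simply mirrors the statement that a model uses columns from its structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StructureColumn:
    name: str
    data_type: str       # e.g. "Long" or "Text"
    content_type: str    # e.g. "Key", "Discrete", or "Continuous"
    source_binding: str  # hypothetical "table.column" binding into the DSV


@dataclass
class MiningStructure:
    """Defines the domain of a mining problem: the columns and their bindings."""
    name: str
    columns: List[StructureColumn]

    def column_names(self) -> List[str]:
        return [c.name for c in self.columns]


@dataclass
class MiningModel:
    """An algorithm, its parameters, and a selection of structure columns."""
    name: str
    structure: MiningStructure
    algorithm: str
    parameters: Dict[str, object] = field(default_factory=dict)
    column_usage: Dict[str, str] = field(default_factory=dict)  # column name -> usage

    def __post_init__(self) -> None:
        # A model may only reference columns that exist in its structure.
        unknown = set(self.column_usage) - set(self.structure.column_names())
        if unknown:
            raise ValueError(f"Columns not in structure: {sorted(unknown)}")


structure = MiningStructure(
    name="Customers",
    columns=[
        StructureColumn("CustomerKey", "Long", "Key", "Customers.CustomerKey"),
        StructureColumn("Age", "Long", "Continuous", "Customers.Age"),
        StructureColumn("BikeBuyer", "Text", "Discrete", "Customers.BikeBuyer"),
    ],
)
model = MiningModel(
    name="Customers DT",
    structure=structure,
    algorithm="Microsoft_Decision_Trees",
    parameters={"MINIMUM_SUPPORT": 10},
    column_usage={"CustomerKey": "Key", "Age": "Input", "BikeBuyer": "Predict"},
)
print(model.name, "uses", list(model.column_usage))
```

Several models can share one structure in this sketch, which reflects why the wizard lets you create a structure on its own and add models to it later.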
The Data Mining Wizard
The Data Mining Wizard creates the mining structure that describes the columns and training data you will use for mining, and optionally a mining model, which takes those columns, applies an algorithm, and defines the usage of each column for that algorithm.
The steps of the wizard are:
1. Select your algorithm or choose only a structure.
2. Select the source tables and specify how they are used.
3. Select the columns from those tables and specify how they are used.
4. Finally, specify holdout data and name the structure and model.
Specifying the training data using the Data Mining Wizard
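The holdout data specified in the last wizard step is a portion of the cases set aside for testing rather than training. A minimal sketch of that idea follows; the function name, the 30 percent figure, and the seed are illustrative assumptions, and the wizard's own partitioning logic is not reproduced here.

```python
import random


def split_holdout(cases, holdout_percent=30, seed=0):
    """Randomly reserve holdout_percent of the cases for testing.

    Returns a (training, testing) pair of lists. A fixed seed makes the
    split repeatable.
    """
    if not 0 <= holdout_percent <= 100:
        raise ValueError("holdout_percent must be between 0 and 100")
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_test = len(shuffled) * holdout_percent // 100
    return shuffled[n_test:], shuffled[:n_test]


training, testing = split_holdout(range(1000), holdout_percent=30, seed=42)
print(len(training), "training cases,", len(testing), "holdout cases")
```

Keeping the holdout cases out of training is what makes later accuracy comparisons between models meaningful.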
The Data Mining Designer
Data Mining Designer is where most of the work with your models will take place. It contains the following five panes for editing, browsing, querying, and comparing models:
The Mining Structure pane
The Mining Models pane
The Mining Model Viewer pane
The Mining Accuracy Chart pane
The Mining Model Prediction pane
You must use the Mining Structure editor to perform modeling operations that are not possible in the Data Mining Wizard.
The Mining Structure Editor
Data Mining Reports
SQL Server Reporting Services is used to access data mining query results and to distribute those results.
Reporting Services has options to run reports periodically and cache the results to expedite report retrieval, and you can even specify queries to control report distribution.

MS SQL SERVER: Using the data mining tools

Reporting using the Data Mining Query Designer
SQL Server Management Studio
Microsoft SQL Server Management Studio is an integrated environment for accessing, configuring, managing, administering, and developing all components of SQL Server. It combines a broad group of graphical tools with a number of rich script editors to provide access to SQL Server to developers and administrators of all skill levels.
SQL Server Management Studio combines the features of Enterprise Manager, Query Analyzer, and Analysis Manager, included in previous releases of SQL Server, into a single environment. In addition, it works with all components of SQL Server, such as Reporting Services, Integration Services, SQL Server 2005 Compact Edition, and Notification Services. Developers get a familiar experience, and database administrators get a single comprehensive utility that combines easy-to-use graphical tools with rich scripting capabilities.
Management Studio Features
Management Studio provides functions to implement the following features:
Maintain servers
Build queries using the prediction builder
Build queries using the query editor
Back up and restore databases
Visit us at www.dataminingtools.net