What is Data Mining? Data mining (or knowledge discovery) is the process of analyzing a given data set from different perspectives and scenarios in order to discover patterns in that data.
What is the SQL Server Add-in? Microsoft has introduced a data mining tool, the “Microsoft SQL Server 2005 Data Mining Add-Ins for Office 2007”, that puts data mining within reach of every desktop user. The add-in makes the data mining algorithms of Microsoft SQL Server 2005 Analysis Services available within the familiar Microsoft Office environment.
Pre-requisites
1. Microsoft .NET Framework 2.0: The add-in requires the .NET Framework to function under Excel.
2. Microsoft Excel 2007: The add-in only works in Excel 2007.
3. SQL Server Analysis Services: A connection between the add-in in Excel and Analysis Services needs to be created, as the add-in uses the analysis engine to perform the data mining.
4. The Data Mining add-in: The add-in can be downloaded for free from the Microsoft website and is essential to perform data mining under Excel.
The Add-in Select the following package during installation in order to use the add-in.
The add-in in Excel After installing the add-in you will see a new tab, “DATA MINING”, on the Excel ribbon. Click on it to expand the tab. The ribbon is divided into groups, of which this tutorial covers Data Preparation, Data Modeling, Accuracy and Validation, Model Usage, and Connection. We will see in brief how to use these options.
Data Preparation This group deals with preparing the data for mining and converting it to the proper format. Data preparation is the most important part of the data mining process, as patterns can only be found reliably in data sets that are in a proper format.
Data Preparation - Explore Data We can explore the data by drawing a histogram of a column in the data set and visualizing the distribution of its values. This improves our understanding of the data.
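To make the idea of a histogram concrete, here is a minimal Python sketch that counts values in equal-width buckets and prints a text bar chart. It only illustrates the concept and is not how the add-in is implemented; the `histogram` function and the `ages` column are hypothetical.

```python
from collections import Counter

def histogram(values, bucket_count):
    """Count how many values fall into each of bucket_count equal-width buckets."""
    low, high = min(values), max(values)
    width = (high - low) / bucket_count or 1  # fall back to 1 when all values are equal
    counts = Counter()
    for v in values:
        # The maximum value is placed in the last bucket rather than a new one.
        index = min(int((v - low) / width), bucket_count - 1)
        counts[index] += 1
    return [(low + i * width, low + (i + 1) * width, counts[i]) for i in range(bucket_count)]

ages = [23, 25, 31, 35, 38, 41, 44, 52, 58, 63]  # hypothetical column values
for start, end, count in histogram(ages, 4):
    print(f"{start:5.1f} - {end:5.1f} | {'#' * count}")
```

Changing the number of buckets changes how coarse the picture is, which is the same kind of choice one makes when exploring a column visually.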
Data Preparation - Clean Data This tool helps us find uncommon values (outliers), delete or modify values that lie above or below a chosen limit, and re-label data in a different manner.
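The two cleaning operations described above, dropping values outside a limit and re-labeling, can be sketched in a few lines of Python. This is an illustration under assumed names (`clip_outliers`, `relabel`, and the `customers` rows are all hypothetical), not the add-in's own logic.

```python
def clip_outliers(rows, column, minimum, maximum):
    """Keep only rows whose value in `column` lies within [minimum, maximum]."""
    return [row for row in rows if minimum <= row[column] <= maximum]

def relabel(rows, column, mapping):
    """Return copies of rows with values in `column` replaced according to `mapping`."""
    return [{**row, column: mapping.get(row[column], row[column])} for row in rows]

customers = [
    {"age": 34, "region": "N. America"},
    {"age": 29, "region": "North America"},
    {"age": 212, "region": "Europe"},  # implausible age, treated as an outlier
]

within_limits = clip_outliers(customers, "age", 0, 120)
cleaned = relabel(within_limits, "region", {"N. America": "North America"})
print(cleaned)
```

Both helpers return new lists, so the original rows stay available for comparison.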
Data Preparation - Partition Data This tool creates random partitions of a data set, such as a training set and a testing set, that can be used separately for mining.
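A random partition can be sketched as a shuffle followed by a cut. The function name `partition`, the 70/30 split, and the fixed seed below are assumptions chosen for illustration only.

```python
import random

def partition(rows, training_fraction, seed=0):
    """Shuffle a copy of rows and split it into (training, testing) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * training_fraction)
    return shuffled[:cut], shuffled[cut:]

rows = list(range(1, 11))  # stand-in for ten rows of an Excel table
training, testing = partition(rows, 0.7)
print(len(training), len(testing))
```

Keeping the testing rows out of model building is what later makes the accuracy tools meaningful.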
Data Modeling The actual work of data mining is done on prepared data using these tools. Internally, the tools mine the data using the mining algorithms of SQL Server Analysis Services.

Sr. no.  Tool name  Mining algorithm used
1.       Classify   Microsoft Decision Trees
2.       Estimate   Microsoft Decision Trees
3.       Cluster    Microsoft Clustering
4.       Associate  Microsoft Association Rules
5.       Forecast   Microsoft Time Series
Data Modeling - Classify The Classify tool helps us build a classification model that shows how the individual values of one column are affected by the values of other columns.
Data Modeling - Estimate This tool builds an estimation model used to predict continuous values in one column based on the values of other columns.
Data Modeling - Cluster This tool helps us create a clustering model, which can be used to detect groups of rows that share similar characteristics.
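As a rough illustration of what "grouping rows with similar characteristics" means, the sketch below runs a plain k-means loop in Python. It is a stand-in chosen for brevity and makes no claim about how the Microsoft Clustering algorithm works internally; the function `k_means` and the sample `(age, income)` rows are hypothetical.

```python
def k_means(points, k, iterations=10):
    """Group rows (tuples of numbers) into k clusters by nearest centroid."""
    centroids = list(points[:k])  # naive initialisation: the first k rows
    assignment = []
    for _ in range(iterations):
        # Assign each row to the closest centroid (squared Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of the rows assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment

# Hypothetical rows: (age, yearly income in thousands)
people = [(25, 30), (27, 32), (52, 90), (55, 95), (26, 31), (54, 88)]
clusters = k_means(people, k=2)
print(clusters)
```

Each number in the result is the cluster that the corresponding row was placed in, so rows with the same number are the ones the model considers similar.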
Data Modeling - Associate This tool creates an association model that analyzes the data to detect items that appear together in transactions.
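The core idea, counting which items occur together in the same transaction, can be sketched as follows. This is a simplified pair count rather than a full association rules algorithm, and `frequent_pairs`, `min_support`, and the `baskets` data are hypothetical names used for illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Return item pairs that appear together in at least min_support transactions."""
    pair_counts = Counter()
    for items in transactions:
        # Sorting makes ("bread", "milk") and ("milk", "bread") count as one pair.
        for pair in combinations(sorted(set(items)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}

baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]
print(frequent_pairs(baskets, min_support=3))
```

Raising `min_support` keeps only the strongest co-occurrences; lowering it surfaces more pairs at the cost of more noise.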
Data Modeling - Forecast This tool creates a forecasting model that detects patterns and uses them to predict future values of individual columns based on previous values from the data set.
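To show what predicting future values from previous ones looks like in the simplest case, here is a sketch that fits a straight-line trend by least squares and extends it. The Microsoft Time Series algorithm is more sophisticated than this; the linear trend is only a stand-in, and `linear_forecast` and `monthly_sales` are hypothetical.

```python
def linear_forecast(history, steps):
    """Fit a least-squares line through history and extend it by `steps` points."""
    n = len(history)
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
        / sum((x - mean_x) ** 2 for x in range(n))
    )
    intercept = mean_y - slope * mean_x
    return [intercept + slope * (n + i) for i in range(steps)]

monthly_sales = [100, 110, 120, 130]  # hypothetical past values of one column
forecast = linear_forecast(monthly_sales, 2)
print(forecast)
```

A straight line cannot capture seasonality, which is one reason dedicated time series algorithms are used in practice.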
Data Modeling - Advanced We can apply the desired mining algorithms to a data set in Excel to create new mining models, or add a new mining model to an existing mining structure.
Accuracy and Validation This group contains tools for testing and validating our mining models. It is important to know how well the mining models we have developed work with real-world data, and checking their accuracy is how we validate them.
Accuracy and Validation - Accuracy Chart This tool applies a previously developed mining model to a set of real-world data so that we can see how well it performs.
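One common way to chart how well a model performs is a cumulative gains curve: rank the cases by predicted probability and track what share of the true positives has been found so far. The sketch below assumes that interpretation; `cumulative_gains` and the `scored_cases` data are hypothetical, and the function assumes at least one positive case.

```python
def cumulative_gains(cases):
    """cases: list of (predicted_probability, is_positive) tuples.
    Returns, for each top-N cut, the fraction of all positives captured."""
    ranked = sorted(cases, key=lambda c: c[0], reverse=True)
    total = sum(1 for _, positive in ranked if positive)
    gains, found = [], 0
    for _, positive in ranked:
        found += int(positive)
        gains.append(found / total)
    return gains

# Hypothetical model output paired with what actually happened
scored_cases = [(0.9, True), (0.7, False), (0.6, True), (0.3, False)]
gains = cumulative_gains(scored_cases)
print(gains)
```

The faster this curve climbs towards 1.0, the better the model is at putting real positives near the top of its ranking.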
Accuracy and Validation - Classification Matrix This tool applies a previously developed mining model to a new data set and compares the values predicted by the model with the actual values.
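A classification matrix is simply a count of every (actual, predicted) combination. The following sketch builds one in Python; the function name and the sample labels are hypothetical.

```python
from collections import Counter

def classification_matrix(actual, predicted):
    """Count how often each (actual, predicted) pair of labels occurs."""
    return Counter(zip(actual, predicted))

actual    = ["yes", "yes", "no", "no",  "yes", "no"]
predicted = ["yes", "no",  "no", "yes", "yes", "no"]

matrix = classification_matrix(actual, predicted)
labels = sorted(set(actual) | set(predicted))

# One row per actual label, one column per predicted label.
for a in labels:
    print(a, [matrix[(a, p)] for p in labels])

# Accuracy is the share of cases on the diagonal (prediction matches reality).
accuracy = sum(matrix[(label, label)] for label in labels) / len(actual)
print(accuracy)
```

The off-diagonal cells show which kinds of mistakes the model makes, which a single accuracy figure hides.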
Accuracy and Validation - Profit Chart This tool applies a previously developed mining model to a new set of data and plots a graph of the profit that would be expected if the mining model were used.
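The values behind such a graph can be sketched as follows: rank the cases by predicted probability, then compute the profit of acting on the top 1, 2, ..., n cases. The cost and revenue parameters here (`fixed_cost`, `cost_per_contact`, `revenue_per_response`) are assumptions for illustration and are not taken from the add-in.

```python
def profit_curve(cases, fixed_cost, cost_per_contact, revenue_per_response):
    """cases: list of (predicted_probability, actually_responded) tuples.
    Returns the profit after contacting the top 1, 2, ..., n ranked cases."""
    ranked = sorted(cases, key=lambda c: c[0], reverse=True)
    profits, responses = [], 0
    for contacted, (_, responded) in enumerate(ranked, start=1):
        responses += int(responded)
        profits.append(
            responses * revenue_per_response - contacted * cost_per_contact - fixed_cost
        )
    return profits

# Hypothetical scored customers: (model's probability, did they respond?)
scored = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
curve = profit_curve(scored, fixed_cost=10, cost_per_contact=5, revenue_per_response=20)
best_cut = curve.index(max(curve)) + 1  # number of customers to contact for peak profit
print(curve, best_cut)
```

The peak of the curve suggests where to stop: contacting more cases beyond that point costs more than it earns.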
Model Usage
Browse: Used to browse previously created data mining models.
Query: The Query Model tool lets you use existing mining models to make predictions on the data in an Excel table by means of a prediction query.
Connection
Default Local host: Used to configure the connection from Excel to SQL Server Analysis Services.
Trace: Used to view the log of all the data sent to the SQL Server for analysis during mining model creation.
Visit more self-help tutorials Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free and self-guiding, and does not involve any additional support. Visit us at www.dataminingtools.net

Excel Data Mining Add-in Intermediate
