Trendlines and Regression Analysis
Modeling Relationships and Trends in Data
Create charts to better
understand data sets.
For cross-sectional data, use a
scatter chart.
For time series data, use a line
chart.
Common Mathematical Functions Used in Predictive Analytical Models
Linear: y = a + bx
Logarithmic: y = a + b ln(x)
Polynomial (2nd order): y = ax^2 + bx + c
Polynomial (3rd order): y = ax^3 + bx^2 + cx + d
Power: y = ax^b
Exponential: y = ab^x
(the base of natural logarithms, e = 2.71828…, is often used for the constant b)
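As an illustration beyond the slide itself, the logarithmic, power, and exponential families above can all be fitted with ordinary least squares after a transformation, which is essentially how trendline tools handle them. A minimal pure-Python sketch with hypothetical data:

```python
import math

def ols(x, y):
    """Least-squares intercept and slope for y = b0 + b1*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    return my - b1 * mx, b1  # (b0, b1)

# Hypothetical data generated from the exponential form y = a*e^(b*x)
# with a = 2 and b = 0.5.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# Exponential fit: regress ln(y) on x, then a = e^(intercept), b = slope.
b0, b1 = ols(x, [math.log(yi) for yi in y])
a, b = math.exp(b0), b1  # recovers a ≈ 2, b ≈ 0.5
```

A logarithmic fit regresses y on ln(x) instead, and a power fit regresses ln(y) on ln(x).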
Excel Trendline Tool
Right-click on the data series and choose Add Trendline from the pop-up menu.
Check the boxes Display Equation on chart and Display R-squared value on chart.
R²
R² (R-squared) is a measure of the “fit” of the line to the data.
◦ The value of R² will be between 0 and 1.
◦ A value of 1.0 indicates a perfect fit, with all data points lying on the line; the larger the value of R², the better the fit.
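To make the definition concrete, here is a small sketch (not from the text) of how R² is computed as 1 − SSE/SST:

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSE/SST: the share of variation in y explained by the fit."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual sum of squares
    sst = sum((yi - mean_y) ** 2 for yi in y)              # total sum of squares
    return 1 - sse / sst

# A perfect fit gives 1.0; predicting the mean everywhere gives 0.0.
y = [2.0, 4.0, 6.0, 8.0]
perfect = r_squared(y, y)           # 1.0
baseline = r_squared(y, [5.0] * 4)  # 0.0
```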
Example 8.1: Modeling a Price-Demand Function
Linear demand function:
Sales = 20,512 - 9.5116(price)
Example 8.2: Predicting Crude Oil Prices
Line chart of historical crude oil prices
Example 8.2 Continued
Excel’s Trendline tool is used to fit various functions to the
data.
Exponential: y = 50.49e^(0.021x), R² = 0.664
Logarithmic: y = 13.02 ln(x) + 39.60, R² = 0.382
Polynomial 2°: y = 0.13x^2 − 2.399x + 68.01, R² = 0.905
Polynomial 3°: y = 0.005x^3 − 0.111x^2 + 0.648x + 59.497, R² = 0.928 (best fit)
Power: y = 45.96x^0.0169, R² = 0.397
Example 8.2 Continued
Third order polynomial trendline fit to the data
Figure 8.11
Think-Pair-Share on Regression Analysis
Think: Consider the role of regression analysis in your field or industry. Reflect on how regression analysis is currently being used to analyze relationships between variables, make predictions, or inform decision-making processes.
Copyright © 2013 Pearson Education, Inc.
publishing as Prentice Hall 9-10
Continued
Pair: Discuss with a colleague or team member
how regression analysis has impacted your work
or industry. Share examples of specific projects or
analyses where regression analysis was used and
the outcomes it yielded. Consider any challenges
or limitations you've encountered when working
with regression analysis.
Continued
Share: Share your insights and discussion points
with the wider team or group. Discuss the
potential of regression analysis to provide
valuable insights into complex relationships and
make accurate predictions. Exchange ideas on
how regression analysis can be further leveraged
to address specific challenges or capitalize on
opportunities within your organization or industry.
Caution About Polynomials
The R2 value will continue to increase as the order
of the polynomial increases; that is, a 4th order
polynomial will provide a better fit than a 3rd order,
and so on.
Higher order polynomials will generally not be very
smooth and will be difficult to interpret visually.
◦ Thus, we don't recommend going beyond a third-
order polynomial when fitting data.
Use your eye to make a good judgment!
Regression Analysis
Regression analysis is a tool for building mathematical and statistical models that characterize relationships between a dependent (ratio) variable and one or more independent, or explanatory, variables (ratio or categorical). All variables must be numerical; categorical variables are first coded numerically.
Simple linear regression involves a single
independent variable.
Multiple regression involves two or more
independent variables.
Simple Linear Regression
Finds a linear relationship between:
one independent variable X and
one dependent variable Y
First prepare a scatter plot to verify the data has a
linear trend.
Use alternative approaches if the data is not linear.
Example 8.3: Home Market Value Data
Size of a house is
typically related to its
market value.
X = square footage
Y = market value ($)
The scatter plot of the full
data set (42 homes)
indicates a linear trend.
Finding the Best-Fitting Regression Line
Market value = a + b × square feet
Two possible lines are shown below.
Line A is clearly a better fit to the data.
We want to determine the best regression line.
Example 8.4: Using Excel to Find the
Best Regression Line
Market value = $32,673 + $35.036 × square feet
◦ The estimated market value of a home with 2,200 square feet would be: market value = $32,673 + $35.036 × 2,200 = $109,752
The regression model
explains variation in
market value due to
size of the home.
It provides better
estimates of market
value than simply
using the average.
Least-Squares Regression
Simple linear regression model: Y = β0 + β1X + ε
We estimate the parameters β0 and β1 from the sample data as b0 and b1.
Let Xi be the value of the independent variable for the ith observation. When the value of the independent variable is Xi, then Ŷi = b0 + b1Xi is the estimated value of Y for Xi.
Residuals
Residuals are the observed errors associated with estimating the value of the dependent variable using the regression line: ei = Yi − Ŷi
Least Squares Regression
The best-fitting line minimizes the sum of squares of the
residuals.
Excel functions:
◦ =INTERCEPT(known_y’s, known_x’s)
◦ =SLOPE(known_y’s, known_x’s)
Example 8.5: Using Excel Functions to
Find Least-Squares Coefficients
Slope = b1 = 35.036
=SLOPE(C4:C45, B4:B45)
Intercept = b0 = 32,673
=INTERCEPT(C4:C45, B4:B45)
Estimate Ŷ when X = 1750 square feet:
Ŷ = 32,673 + 35.036(1750) = $93,986
=TREND(C4:C45, B4:B45, 1750)
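The Excel functions above have straightforward least-squares counterparts. A pure-Python sketch, using a small hypothetical dataset constructed to lie exactly on the chapter's fitted line so the results match Example 8.5:

```python
def slope_intercept(x, y):
    """Equivalents of Excel's =SLOPE and =INTERCEPT."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    return b1, my - b1 * mx

def trend(x, y, new_x):
    """Equivalent of =TREND: predict y at new_x from the fitted line."""
    b1, b0 = slope_intercept(x, y)
    return b0 + b1 * new_x

# Hypothetical homes placed exactly on market value = 32,673 + 35.036 * sqft.
sqft = [1000.0, 1500.0, 2000.0, 2500.0]
value = [32673 + 35.036 * s for s in sqft]

b1, b0 = slope_intercept(sqft, value)  # ≈ 35.036 and ≈ 32,673
pred = trend(sqft, value, 1750)        # ≈ $93,986
```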
Simple Linear Regression With Excel
Data > Data Analysis >
Regression
Input Y Range (with
header)
Input X Range (with
header)
Check Labels
Excel outputs a table with
many useful regression
statistics.
Home Market Value Regression
Results
Regression Statistics
Multiple R - | r |, where r is the sample correlation
coefficient. The value of r varies from -1 to +1 (r is
negative if slope is negative)
R Square - coefficient of determination, R2, which
varies from 0 (no fit) to 1 (perfect fit)
Adjusted R Square - adjusts R2 for sample size
and number of X variables
Standard Error - variability between observed
and predicted Y values. This is formally called the
standard error of the estimate, SYX.
Example 8.6: Interpreting Regression
Statistics for Simple Linear Regression
53% of the variation in home market values
can be explained by home size.
The standard error of $7287 is less than
standard deviation (not shown) of $10,553.
Regression as Analysis of Variance
ANOVA conducts an F-test to determine whether
variation in Y is due to varying levels of X.
ANOVA is used to test for significance of regression:
H0: population slope coefficient = 0
H1: population slope coefficient ≠ 0
Excel reports the p-value (Significance F).
Rejecting H0 indicates that X explains variation in Y.
Example 8.7: Interpreting Significance of
Regression
p-value (Significance F) = 3.798 × 10^-8, so home size is a significant variable.
◦ Reject H0: the slope is not equal to zero. Using a linear relationship, home size is a significant variable in explaining variation in market value.
Testing Hypotheses for Regression
Coefficients
An alternate method for testing whether a slope or intercept is zero is to use a t-test: t = b1 / sb1, where sb1 is the standard error of the coefficient.
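As a sketch of what Excel computes behind the scenes (an assumed construction, on hypothetical data): the t-statistic for the slope divides b1 by its standard error, which is the standard error of the estimate divided by the square root of Sxx.

```python
import math

def slope_t_stat(x, y):
    """t = b1 / se(b1), for testing H0: slope = 0."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))  # standard error of the estimate
    se_b1 = s_yx / math.sqrt(sxx)    # standard error of the slope
    return b1 / se_b1

# Hypothetical data with a strong linear trend: |t| comes out large,
# so the slope would be judged significant.
t = slope_t_stat([1, 2, 3, 4, 5, 6], [2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
```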
Excel provides the p-values for tests on the slope and
intercept.
Example 8.8: Interpreting Hypothesis
Tests for Regression Coefficients
Use p-values to draw conclusion
Neither coefficient is statistically equal to zero.
Confidence Intervals for Regression
Coefficients
Confidence intervals (Lower 95% and Upper 95%
values in the output) provide information about the
unknown values of the true regression
coefficients, accounting for sampling error.
We may also use confidence intervals to test
hypotheses about the regression coefficients.
◦ To test the hypothesis that a coefficient equals a specified value, check whether that value falls within the confidence interval for the coefficient. If it does not, reject the null hypothesis.
Example 8.9: Interpreting Confidence
Intervals for Regression Coefficients
For the Home Market Value data, a 95% confidence
interval for the intercept is [14,823, 50,523], and for the
slope, [24.59, 45.48].
Although we estimated that a house with 1,750 square
feet has a market value of 32,673 + 35.036(1,750)
=$93,986, if the true population parameters are at the
extremes of the confidence intervals, the estimate might
be as low as 14,823 + 24.59(1,750) = $57,855 or as high
as 50,523 + 45.48(1,750) = $130,113.
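The arithmetic in Example 8.9 can be checked directly; note the low endpoint rounds to $57,855:

```python
# Point estimate and the extremes implied by the 95% confidence intervals
# for the Home Market Value regression, at X = 1,750 square feet.
point = 32673 + 35.036 * 1750  # ≈ $93,986
low   = 14823 + 24.59 * 1750   # 57,855.5, i.e. ≈ $57,855 after rounding
high  = 50523 + 45.48 * 1750   # ≈ $130,113
```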
Residual Analysis and Regression Assumptions
Residual = Actual Y value − Predicted Y value
Standard residual = residual / standard deviation of the residuals
Rule of thumb: standard residuals outside of ±2 or ±3 are potential outliers.
Excel provides a table and a plot of residuals.
(In the residual plot for this example, one point has a standard residual of 4.53, flagging it as a potential outlier.)
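A minimal sketch of the outlier rule of thumb (assumption: standardizing by the sample standard deviation of the residuals, as the slide describes; the data are hypothetical):

```python
import statistics

def standard_residuals(y, y_hat):
    """Each residual divided by the standard deviation of the residuals."""
    resid = [yi - fi for yi, fi in zip(y, y_hat)]
    s = statistics.stdev(resid)
    return [r / s for r in resid]

# Hypothetical data: the last observation sits far from its prediction.
y     = [10.0, 12.0, 11.0, 13.0, 30.0]
y_hat = [10.5, 11.5, 11.5, 12.5, 12.0]
flags = [abs(sr) > 2 for sr in standard_residuals(y, y_hat)]  # last one flagged
```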
Checking Assumptions
Linearity
examine scatter diagram (should appear linear)
examine residual plot (should appear random)
Normality of Errors
view a histogram of standard residuals
regression is robust to departures from normality
Homoscedasticity: variation about the regression line is
constant
examine the residual plot
Independence of Errors: successive observations should
not be related.
This is important when the independent variable is time.
Example 8.11: Checking Regression
Assumptions for the Home Market Value Data
Linearity - linear trend in scatterplot
- no pattern in residual plot
Example 8.11 Continued
Normality of Errors – residual histogram appears
slightly skewed but is not a serious departure
Example 8.11 Continued
Homoscedasticity – residual plot shows no serious
difference in the spread of the data for different X
values.
Example 8.11 Continued
Independence of Errors – Because the data is
cross-sectional, we can assume this assumption
holds.
Multiple Linear Regression
A linear regression model with more than one
independent variable is called a multiple linear
regression model.
Role Play Exploring Linear Regression
Analysis
Pooja: Hey Ajay, come on in! I've been diving into
the marketing data you requested. Have a seat.
Ajay: Thanks, Pooja. I'm really curious about our
marketing campaign's effectiveness, especially
regarding our online ads. Can you shed some light
on that?
Pooja: Absolutely! I've been running some
analyses, and it seems like linear regression could
provide some insights into how our online ad
spending correlates with website traffic or
conversions.
Continued
Ajay: Linear regression? Can you explain that in
simpler terms?
Pooja: Of course. Linear regression is a statistical
method used to model the relationship between
two variables, typically by fitting a straight line to
the data points. In our case, we're looking at how
changes in our online ad spending impact website
traffic or conversions.
Ajay: Ah, got it. So, how do we go about doing
this?
Continued
Pooja: With linear regression, we'll be able to
quantify the impact of our ad spending on website
traffic or conversions. For example, we might find
that for every dollar increase in ad spending,
website traffic increases by a certain number of
visits. This information can help us optimize our
marketing budget allocation for maximum impact.
Estimated Multiple Regression
Equation
We estimate the regression coefficients (called partial regression coefficients) b0, b1, b2, …, bk, then use the model:
Ŷ = b0 + b1X1 + b2X2 + … + bkXk
The partial regression coefficients represent the
expected change in the dependent variable when
the associated independent variable is increased
by one unit while the values of all other
independent variables are held constant.
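As a hedged sketch of what the Excel Regression tool computes for a multiple regression (an assumed construction via the normal equations, on hypothetical data):

```python
def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting for a small system A x = b."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def multiple_regression(X, y):
    """Least-squares coefficients b for y ≈ b0 + b1*x1 + ... + bk*xk,
    via the normal equations (X'X) b = X'y."""
    Z = [[1.0] + list(row) for row in X]  # prepend the intercept column
    k = len(Z[0])
    XtX = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    Xty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    return solve(XtX, Xty)

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
X = [[1, 1], [2, 1], [1, 2], [3, 2], [2, 3]]
y = [1 + 2 * a + 3 * c for a, c in X]
b = multiple_regression(X, y)  # recovers ≈ [1, 2, 3]
```

Because the data are generated exactly from the model, the recovered partial regression coefficients are 1, 2, and 3.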
Excel Regression Tool
The independent variables in the spreadsheet must be in
contiguous columns.
◦ So, you may have to manually move the columns of data around
before applying the tool.
Key differences:
Multiple R and R Square are called the multiple
correlation coefficient and the coefficient of multiple
determination, respectively, in the context of multiple
regression.
ANOVA tests for significance of the entire model, not just a single coefficient.
ANOVA for Multiple Regression
ANOVA tests for significance of the entire model. That is, it computes an F-statistic for testing the hypotheses:
H0: β1 = β2 = … = βk = 0
H1: at least one βj ≠ 0
The multiple linear regression output also provides
information to test hypotheses about each of the
individual regression coefficients.
◦ If we reject the null hypothesis that the slope associated with
independent variable i is 0, then the independent variable i is
significant and improves the ability of the model to better predict
the dependent variable. If we cannot reject H0, then that
independent variable is not significant and probably should not be
included in the model.
Example 8.12: Interpreting Regression Results
for the Colleges and Universities Data
Predict student graduation rates using several indicators:
Example 8.12 Continued
Regression model
The value of R2 indicates that 53% of the variation in the dependent
variable is explained by these independent variables.
All coefficients are statistically significant.
Model Building Issues
A good regression model should include only
significant independent variables.
Adding an independent variable to a regression
model will always result in R2 equal to or greater
than the R2 of the original model.
Adjusted R² reflects both the sample size and the number of independent variables. An increase in adjusted R² indicates that the model has improved.
Systematic Model Building Approach
1. Construct a model with all available independent
variables. Check for significance of the independent
variables by examining the p-values.
2. Identify the independent variable having the largest p-
value that exceeds the chosen level of significance.
3. Remove the variable identified in step 2 from the
model and evaluate adjusted R2.
(Don’t remove all variables with p-values that exceed α at the same time; remove only one at a time.)
4. Continue until all variables are significant.
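The four steps above amount to backward elimination. A schematic sketch of the loop; the `fit_p_values` function and its p-values are purely hypothetical stand-ins for re-running the regression and reading the output:

```python
ALPHA = 0.05

def fit_p_values(variables):
    """Hypothetical stand-in for refitting the regression and reading the
    p-values from the output; each variable carries a fixed illustrative value."""
    table = {"Age": 0.001, "Income": 0.020, "Education": 0.300, "HomeValue": 0.450}
    return {v: table[v] for v in variables}

def backward_eliminate(variables):
    vars_left = list(variables)
    while True:
        pvals = fit_p_values(vars_left)
        worst = max(pvals, key=pvals.get)   # largest p-value
        if pvals[worst] <= ALPHA:
            return vars_left                # all remaining variables significant
        vars_left.remove(worst)             # drop ONE variable, then refit

result = backward_eliminate(["Age", "Income", "Education", "HomeValue"])
# Drops HomeValue, then Education; Age and Income remain.
```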
Example 8.13: Identifying the Best Regression Model
Banking Data
Home value has the
largest p-value; drop
and re-run the
regression.
Example 8.13 Continued
Bank regression after removing Home Value
Adjusted R2 improves slightly.
All X variables are significant.
Alternate Criterion
Use the t-statistic.
If | t | < 1, then the standard error will decrease
and adjusted R2 will increase if the variable is
removed. If | t | > 1, then the opposite will occur.
You can follow the same systematic approach,
except using t-values instead of p-values.
Multicollinearity
Multicollinearity occurs when there are strong correlations among the independent variables, so that they can predict each other better than the dependent variable.
Correlations exceeding ±0.7 may indicate multicollinearity.
The variance inflation factor is a better indicator, but it is not computed in Excel.
Example 8.14: Identifying Potential Multicollinearity
Colleges and Universities correlation matrix; none exceed the recommended threshold of ±0.7
Banking Data correlation matrix; large correlations exist
Example 8.14 Continued
If we remove Wealth from the model, the adjusted R2 drops to
0.9201, but we discover that Education is no longer significant.
Dropping Education and leaving only Age and Income in the model
results in an adjusted R2 of 0.9202.
However, if we remove Income from the model instead of Wealth,
the Adjusted R2 drops to only 0.9345, and all remaining variables
(Age, Education, and Wealth) are significant.
Practical Issues in Trendline and
Regression Modeling
Identifying the best regression model often requires
experimentation and trial and error.
The independent variables selected should make sense in attempting to explain the dependent variable.
Additional variables increase R2 and, therefore,
help to explain a larger proportion of the variation.
Good models are as simple as possible (the
principle of parsimony).
Overfitting
Overfitting means fitting a model too closely to the sample data at the risk of not fitting it well to the population in which we are interested.
In multiple regression, if we add too many terms to the
model, then the model may not adequately predict other
values from the population.
Overfitting can be mitigated by using good logic, intuition, theory, and parsimony.
Regression with Categorical Variables
Regression analysis requires numerical data.
Categorical data can be included as independent variables, but must be coded numerically using dummy variables.
For variables with 2 categories, code as 0 and 1.
CaseLets: Regression Analysis in Marketing
What is the dependent variable in this
case study?
Identify the independent variables.
Why is regression analysis an appropriate
method for this case study?
What type of regression analysis would be
most suitable for this case study, and
why?
Example 8.15: A Model with Categorical
Variables
Employee Salaries provides data for 35 employees
Predict Salary using Age and MBA (code as
yes=1, no=0)
Example 8.15 Continued
Salary = 893.59 + 1044.15 × Age + 14,767.23 × MBA
◦ If MBA = 0: salary = 893.59 + 1044.15 × Age
◦ If MBA = 1: salary = 15,660.82 + 1044.15 × Age
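The fitted model can be written as a small function to see how the dummy variable shifts the intercept (a sketch using the coefficients from Example 8.15):

```python
def salary(age, mba):
    """Fitted model from Example 8.15; mba is a 0/1 dummy variable."""
    return 893.59 + 1044.15 * age + 14767.23 * mba

# The dummy produces two parallel lines with the same slope:
# the MBA intercept is 893.59 + 14,767.23 = 15,660.82.
gap = salary(30, 1) - salary(30, 0)  # 14,767.23 at any age
```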
Interactions
An interaction occurs when the effect of one
variable is dependent on another variable.
We can test for interactions by defining a new
variable as the product of the two variables,
X3 = X1 × X2 , and testing whether this
variable is significant, leading to an
alternative model.
Example 8.16: Incorporating Interaction
Terms in a Regression Model
Define an interaction between
Age and MBA and re-run the
regression.
The MBA indicator is not significant; drop and re-run.
Example 8.16 Continued
Adjusted R2 increased slightly, and both age and the
interaction term are significant. The final model is
salary = 3,323.11 + 984.25 × age + 425.58 × MBA × age
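With the interaction term, the dummy changes the slope rather than the intercept. A sketch using the final coefficients from Example 8.16:

```python
def salary(age, mba):
    """Final model of Example 8.16: 3,323.11 + 984.25*age + 425.58*mba*age."""
    return 3323.11 + 984.25 * age + 425.58 * mba * age

# Per additional year of age, salary rises by 984.25 without an MBA,
# and by 984.25 + 425.58 = 1,409.83 with one.
slope_no_mba = salary(31, 0) - salary(30, 0)
slope_mba    = salary(31, 1) - salary(30, 1)
```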
Categorical Variables with More Than Two
Levels
When a categorical variable
has k > 2 levels, we need to
add k - 1 additional variables to
the model.
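A sketch of the k − 1 coding (level names are hypothetical; the first level serves as the baseline):

```python
def dummy_code(values, levels):
    """Code a categorical variable with k levels as k-1 binary columns.
    The first level is the baseline and maps to all zeros."""
    return [[1 if v == lev else 0 for lev in levels[1:]] for v in values]

tools = ["A", "C", "D", "A", "B"]
cols = dummy_code(tools, ["A", "B", "C", "D"])
# "A" -> [0, 0, 0] (baseline); "B" -> [1, 0, 0]; "C" -> [0, 1, 0]; "D" -> [0, 0, 1]
```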
Example 8.17: A Regression Model with
Multiple Levels of Categorical Variables
The Excel file Surface
Finish provides
measurements of the
surface finish of 35 parts
produced on a lathe,
along with the
revolutions per minute
(RPM) of the spindle
and one of four types of
cutting tools used.
Example 8.17 Continued
Because we have k = 4 levels of tool type, we will define a regression model of the form:
Surface finish = β0 + β1 × RPM + β2 × Type B + β3 × Type C + β4 × Type D
where Type B, Type C, and Type D are 0/1 dummy variables and tool type A is the baseline.
Example 8.17 Continued
Add 3 columns to
the data, one for
each of the tool
type variables
Example 8.17 Continued
Regression results
Surface finish = 24.49 + 0.098 RPM - 13.31 type B - 20.49 type C -
26.04 type D
Regression Models with Nonlinear Terms
Curvilinear models may be appropriate when
scatter charts or residual plots show nonlinear
relationships.
A second-order polynomial might be used: Y = β0 + β1X + β2X² + ε
Here β1 represents the linear effect of X on Y and β2 represents the curvilinear effect.
This model is linear in the β parameters so we can
use linear regression methods.
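Because the model is linear in the β's, the curvilinear effect shows up as a constant change in successive differences. A quick check using the second-order fit from Example 8.2:

```python
def quad(x):
    """Second-order polynomial trendline from Example 8.2:
    y = 0.13x^2 - 2.399x + 68.01."""
    return 0.13 * x * x - 2.399 * x + 68.01

# beta1 gives the linear effect; beta2 bends the line: for unit steps in x,
# successive differences grow by a constant 2*beta2 = 0.26.
d1 = quad(11) - quad(10)
d2 = quad(12) - quad(11)
curvature = d2 - d1  # 2 * 0.13 = 0.26
```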
Debate on Regression Analysis
Proponent Argument:
Regression analysis is a powerful statistical tool
that allows us to explore relationships between
variables and make predictions based on data. It
is widely used across various fields, including
economics, social sciences, and marketing, to
understand complex phenomena and make
informed decisions.
Continued
While it's true that regression analysis has its limitations,
these challenges can often be addressed with proper
data preprocessing, model selection, and validation
techniques. For example, techniques such as robust
regression and regularization can help mitigate the
impact of outliers and overfitting, respectively. Moreover,
advancements in statistical methods, such as machine
learning algorithms, offer alternatives to traditional
regression models, allowing for more flexibility and
robustness in modeling complex relationships.
Conclusion
While regression analysis has its limitations and challenges, its benefits outweigh its drawbacks when used judiciously and in the right context. It remains a valuable tool for exploring relationships between variables, making predictions, and informing decision-making across various domains. As data analytics continues to evolve, regression analysis will continue to play a vital role in extracting actionable insights from data and driving innovation and growth.