CSC446 : Pattern Recognition Prof. Dr. Mostafa G. M. Mostafa Faculty of Computer & Information Sciences Computer Science Department AIN SHAMS UNIVERSITY Lecture Note 3: Mathematical Foundations ASU-CSC446 : Pattern Recognition. Prof. Dr. Mostafa Gadal-Haqq slide - 1 Appendix, Pattern Classification and PRML
CSC446: Pattern Recognition
Data Modeling (Regression)
Readings: Chapter 1 in Bishop's PRML
Learning: Data Modeling
• Assume we have examples of pairs (x, y) and we want to learn the mapping $F: X \to Y$ to predict y for future values of x.
• Example data model: $y(x) = \sin(2\pi x)$
Polynomial Curve Fitting
• Problem: many possible mapping functions $F: X \to Y$ exist. Which one should we choose?
• We could choose the one that minimizes the error on the examples.
Polynomial Curve Fitting
• Fitting different polynomials (models) to the data:
  – $y(x) = w_0$
  – $y(x) = w_0 + w_1 x$
  – $y(x) = w_0 + w_1 x + w_2 x^2$
  – $y(x) = w_0 + w_1 x + w_2 x^2 + \cdots + w_8 x^8$
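The fits listed above can be sketched with ordinary least squares. This is a minimal illustration, not the lecture's own code: the noisy samples of sin(2πx), the noise level, the helper names, and the choice of mean squared error as the error measure are all assumptions.

```python
import numpy as np

# Assumed toy data: 10 noisy samples of sin(2*pi*x) on [0, 1].
rng = np.random.default_rng(0)
N = 10
x = np.linspace(0.0, 1.0, N)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=N)

def fit_polynomial(x, y, order):
    """Least-squares coefficients w_0..w_order (lowest power first)."""
    A = np.vander(x, order + 1, increasing=True)  # columns 1, x, x^2, ...
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(w, x):
    """Evaluate the polynomial with coefficients w at the points x."""
    return np.vander(x, len(w), increasing=True) @ w

def mean_squared_error(w, x, y):
    """Assumed error measure: average squared residual."""
    return float(np.mean((predict(w, x) - y) ** 2))

# Training error for the polynomial orders shown on the slides.
train_mse = {m: mean_squared_error(fit_polynomial(x, y, m), x, y)
             for m in (0, 1, 2, 8)}
for m, err in train_mse.items():
    print(f"order {m}: training MSE = {err:.4f}")
```

Because each model contains the lower-order ones as special cases, the training error cannot increase as the order grows.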
Overfitting
• At M = 9, we get zero training error but the highest testing error.
Effect of Data Size
• As the number of data samples N increases, a higher-order model gets closer to the real data model.
[Figure: two panels, both with M = 9]
Performance Evaluation
• The generalization error is the true error over the whole population of examples, and it is the quantity we would like to optimize.
  – The sample mean error only approximates it.
• Two ways to assess the generalization error:
  – Theoretical: the law of large numbers gives statistical bounds on the difference between the true and sample mean errors.
  – Practical: use a separate data set with m data samples to test the model. The (mean) test error is the error averaged over these m samples.
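The practical route can be sketched as follows: fit on a training set, then average the error over a separate test set of m samples. The data generator, the squared-error measure, the polynomial orders, and the sizes N = 10 and m = 100 are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(1)

def noisy_sine(x):
    """Assumed data model: sin(2*pi*x) plus Gaussian noise."""
    return np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

# Training set (N = 10) and a separate test set (m = 100).
x_train = np.linspace(0.0, 1.0, 10)
y_train = noisy_sine(x_train)
x_test = rng.uniform(0.0, 1.0, size=100)
y_test = noisy_sine(x_test)

def mse(w, x, y):
    """Mean squared error of the polynomial with coefficients w."""
    return float(np.mean((P.polyval(x, w) - y) ** 2))

train_mse, test_mse = {}, {}
for order in (1, 3, 9):
    w = P.polyfit(x_train, y_train, order)   # coefficients, lowest power first
    train_mse[order] = mse(w, x_train, y_train)
    test_mse[order] = mse(w, x_test, y_test)
    print(f"order {order}: train {train_mse[order]:.2e}, test {test_mse[order]:.2e}")
```

With 10 training points, the order-9 polynomial has enough coefficients to pass through every point, so its training error is essentially zero while its test error is not.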
Assignment 1
1. Derive an equation for estimating the parameters w from the sample data for the cases M = 1 and M = 2.
2. Use these equations to plot the relation between w and E(w) for each M. Use the estimated values of w as the middle values of the w range.
CSC446: Pattern Recognition
Probability & Statistics
Readings: Appendix A
1- Probability Theory
• Randomness:
  – We call a phenomenon random if individual outcomes are uncertain but there is nonetheless a regular distribution of outcomes in a large number of repetitions.
• Probability:
  – The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions.
  – Probability is the long-term relative frequency.
1- Probability Theory
• Discrete random variables:
  – Let $x \in X$, where the sample space is $X = \{v_1, v_2, \ldots, v_m\}$.
  – We denote by $p_i$ the probability that $x = v_i$: $p_i = \Pr\{x = v_i\}$, $i = 1, \ldots, m$.
  – The $p_i$ must satisfy the following two conditions: $p_i \ge 0$ and $\sum_{i=1}^{m} p_i = 1$.
1- Probability Theory
• Equally likely outcomes: "Equally likely outcomes are outcomes that have the same probability of occurring."
• Examples:
  – Rolling a fair die
  – Tossing a fair coin
• P(x) is a "uniform distribution".
1- Probability Theory
• Equally likely outcomes:
  – Ten identical balls numbered from 0 to 9 are in a box. Find the probability of randomly drawing a ball with a nonzero number divisible by 3.
  – The event space (desired outcomes): A = {3, 6, 9}.
  – The sample space (possible outcomes): S = {0, 1, 2, ..., 9}.
  – Since the drawing is at random, each outcome is equally likely to occur: P(0) = P(1) = P(2) = ... = P(9) = 1/10.
  – P(A) = (number of outcomes in A) / (number of outcomes in S) = 3/10 = 0.3
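The counting argument for the ball example can be checked in a few lines. This is a small sketch using exact fractions; the variable names are illustrative, and the value 0 is excluded so that the event matches A = {3, 6, 9} as listed.

```python
from fractions import Fraction

# Sample space S: balls numbered 0..9, each equally likely.
sample_space = list(range(10))

# Event A as listed on the slide: {3, 6, 9} (0 is excluded here).
event = [n for n in sample_space if n > 0 and n % 3 == 0]

# P(A) = (number of outcomes in A) / (number of outcomes in S)
p_event = Fraction(len(event), len(sample_space))
print(event, p_event, float(p_event))
```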
1- Probability Theory
• Biased outcomes (non-uniform distribution): "Biased outcomes are outcomes that have different probabilities of occurring."
• Examples:
  – Rolling an unfair die
  – Tossing an unfair coin
• P(x) is a "non-uniform distribution".
1- Probability Theory
• Biased outcomes (non-uniform distribution):
  – A biased coin, twice as likely to come up tails as heads, is tossed twice. What is the probability that at least one head occurs?
• Solution:
  – Sample space = {HH, HT, TH, TT}
  – P(H = head) = 1/3, P(T = tail) = 2/3
  – Probabilities of the sample points:
    P(HH) = 1/3 × 1/3 = 1/9, P(HT) = 1/3 × 2/3 = 2/9
    P(TH) = 2/3 × 1/3 = 2/9, P(TT) = 2/3 × 2/3 = 4/9
  – Answer: P(HH) + P(HT) + P(TH) = 5/9 ≈ 0.56
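The biased-coin solution can be reproduced by enumerating the four sample points and summing the ones that contain a head. This is a sketch with exact fractions; the dictionary layout and names are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Single-toss probabilities: tails is twice as likely as heads.
p = {"H": Fraction(1, 3), "T": Fraction(2, 3)}

# Two independent tosses: the probability of each sample point is the product.
outcomes = {a + b: p[a] * p[b] for a, b in product("HT", repeat=2)}

# Event "at least one head": every sample point containing an H.
p_at_least_one_head = sum(pr for o, pr in outcomes.items() if "H" in o)

print(outcomes)
print(p_at_least_one_head, float(p_at_least_one_head))
```

The same result follows from the complement: 1 - P(TT) = 1 - 4/9 = 5/9.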
1- Probability Theory
• Probability and Language: what is the probability of a random word (from a random dictionary page) being a verb?
• Solution:
  – $P(\text{drawing a verb}) = \dfrac{\#\ \text{of ways to get a verb}}{\#\ \text{of all words}}$
  – All words: count all the words in the dictionary.
  – Number of ways to get a verb: the number of words that are verbs.
  – If a dictionary has 50,000 entries and 10,000 are verbs, then P(verb) = 10000/50000 = 1/5 = 0.20
1- Probability Theory
• Conditional Probability
  – A way to reason about the outcome of an experiment based on partial information:
  – In a word-guessing game, the first letter of the word is a "t". How likely is it that the second letter is an "h"?
  – How likely is it that a person has a disease, given that a medical test was negative?
  – A spot shows up on a radar screen. How likely is it that it corresponds to an aircraft?
  – I saw your friend. How likely is it that I will see you?
1- Probability Theory
• Conditional Probability
  – Let A and B be events.
  – $P(A \mid B)$ = the probability of event A occurring given that event B occurs.
  – Definition: $P(A \mid B) = \dfrac{P(A, B)}{P(B)}$, for $P(B) > 0$.
  – Note: $P(A, B) = P(A \mid B)\, P(B)$. Also: $P(A, B) = P(B, A)$.
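The definition P(A|B) = P(A,B) / P(B) can be tried on a fair die. The events chosen here (A: the roll is even, B: the roll is greater than 3) are assumptions picked for illustration and do not come from the slides.

```python
from fractions import Fraction

faces = range(1, 7)  # fair six-sided die, each face has probability 1/6

def prob(event):
    """Probability of an event given as a predicate on a face."""
    return Fraction(sum(1 for f in faces if event(f)), len(faces))

def is_even(f):          # event A
    return f % 2 == 0

def greater_than_3(f):   # event B
    return f > 3

p_b = prob(greater_than_3)                                  # P(B) = 1/2
p_a_and_b = prob(lambda f: is_even(f) and greater_than_3(f))  # P(A, B) = 1/3
p_a_given_b = p_a_and_b / p_b                               # P(A|B) = P(A,B) / P(B)

print(p_a_given_b)
```

Knowing that the roll is greater than 3 raises the probability of an even face from 1/2 to 2/3, because two of the three remaining faces (4 and 6) are even.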
1- Probability Theory
• Conditional Probability
  – One of the following 30 items is chosen at random.
  – What is P(X), the probability that it is an X?
  – What is P(X|red), the probability that it is an X given that it is red?
[Figure: the 30 items]
1- Probability Theory
• Statistically Independent events
  – Variables x and y are said to be statistically independent if and only if: $P(x, y) = P(x)\, P(y)$
  – That is, knowing the value of x does not give us any additional knowledge about the possible value of y.
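The factorization test P(x, y) = P(x) P(y) can be checked by enumeration. The sketch below applies it to pairs of events on two fair dice; the dice and the three events are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first, second) results of rolling two fair dice.
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on a (first, second) pair."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

def independent(event_a, event_b):
    """True if P(A, B) == P(A) * P(B) holds exactly."""
    joint = prob(lambda r: event_a(r) and event_b(r))
    return joint == prob(event_a) * prob(event_b)

def first_is_six(r):
    return r[0] == 6

def second_is_even(r):
    return r[1] % 2 == 0

def sum_is_twelve(r):
    return r[0] + r[1] == 12

print(independent(first_is_six, second_is_even))  # events on different dice
print(independent(first_is_six, sum_is_twelve))   # the sum depends on the first die
```

The first pair factorizes (1/12 = 1/6 × 1/2). The second does not, since knowing the first die shows a six changes the chance of a total of twelve.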
1- Probability Theory
• Marginal Probability
• Conditional Probability
• Joint Probability
1- Probability Theory
• The Rules of Probability
  – Sum rule: $p(X) = \sum_{Y} p(X, Y)$
  – Product rule: $p(X, Y) = p(Y \mid X)\, p(X) = p(X \mid Y)\, p(Y)$
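Both rules can be verified numerically on a small joint table. The numbers in the table are hypothetical and serve only to illustrate the sum rule p(X) = Σ_Y p(X, Y) and the product rule p(X, Y) = p(Y|X) p(X) = p(X|Y) p(Y).

```python
import numpy as np

# Hypothetical joint distribution p(X, Y): rows index X, columns index Y.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.30, 0.05, 0.25]])

# Sum rule: marginalize by summing over the other variable.
p_x = joint.sum(axis=1)   # p(X) = sum over Y of p(X, Y)
p_y = joint.sum(axis=0)   # p(Y) = sum over X of p(X, Y)

# Conditionals from the definition of conditional probability.
p_y_given_x = joint / p_x[:, None]   # p(Y|X), each row sums to 1
p_x_given_y = joint / p_y[None, :]   # p(X|Y), each column sums to 1

# Product rule: both factorizations rebuild the same joint table.
rebuilt_a = p_y_given_x * p_x[:, None]
rebuilt_b = p_x_given_y * p_y[None, :]

print(p_x, p_y)
print(np.allclose(rebuilt_a, joint), np.allclose(rebuilt_b, joint))
```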
1- Probability Theory
• Bayes' Theorem: $p(Y \mid X) = \dfrac{p(X \mid Y)\, p(Y)}{p(X)}$
  where $p(X) = \sum_{Y} p(X \mid Y)\, p(Y)$
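As a worked instance of Bayes' theorem in the form p(Y|X) = p(X|Y) p(Y) / p(X), consider a medical test. The prior, sensitivity, and false-positive rate below are hypothetical numbers invented for illustration, and the function name `posterior` is likewise an assumption.

```python
from fractions import Fraction

def posterior(prior, likelihood, likelihood_if_not):
    """p(Y|X) via Bayes' theorem for a binary Y.

    prior             = p(Y)
    likelihood        = p(X | Y)
    likelihood_if_not = p(X | not Y)
    """
    # Denominator p(X) by the sum rule over Y and not-Y.
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 1% of people have the disease, the test detects it
# 90% of the time, and it gives a false positive 5% of the time.
p_disease = Fraction(1, 100)
p_pos_given_disease = Fraction(9, 10)
p_pos_given_healthy = Fraction(1, 20)

p_disease_given_pos = posterior(p_disease, p_pos_given_disease, p_pos_given_healthy)
print(p_disease_given_pos, float(p_disease_given_pos))
```

Under these assumed numbers a positive result raises the probability of disease from 1/100 to 2/13, which is still well below one half because the disease is rare.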
1- Probability Theory
• Probability mass function, P(x), for a discrete variable: $P(x) \ge 0$ and $\sum_{x \in X} P(x) = 1$
• For a continuous variable with density p(x), the cumulative distribution is: $P(x \le z) = \int_{-\infty}^{z} p(x)\, dx$
2- Statistics
• Statistics is the science of collecting, organizing, and interpreting numerical facts, which we call data.
• The best way of looking at data is to draw its histogram (frequency distribution).
2- Statistics
• Univariate Gaussian/Normal Density:
  – A density that is analytically tractable
  – Continuous density
  – A lot of processes are asymptotically Gaussian
  $p(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\dfrac{1}{2}\left(\dfrac{x-\mu}{\sigma}\right)^{2}\right], \qquad \int_{-\infty}^{\infty} p(x)\, dx = 1$
  where:
  $\mu$ = mean (or expected value) of x
  $\sigma^2$ = squared deviation or variance
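The density and its normalization can be checked numerically. This is a sketch: the function name, the parameter values μ = 1 and σ = 2, and the simple Riemann sum over a finite grid are choices made here for illustration.

```python
import numpy as np

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Univariate normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * sigma)

# Approximate the integral of p(x) over the real line with a Riemann sum
# on a wide, fine grid (the tails beyond the grid carry negligible mass).
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
area = float(np.sum(gaussian_pdf(x, mu=1.0, sigma=2.0)) * dx)

print(f"peak of N(0, 1): {gaussian_pdf(0.0):.6f}")
print(f"approximate area under N(1, 4): {area:.6f}")
```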
2- Statistics
• Univariate Gaussian/Normal Density: p(u) ~ N(0, 1)
2- Statistics
• Multivariate Normal Density
  – The multivariate normal density in d dimensions is:
  $p(\mathbf{x}) = \dfrac{1}{(2\pi)^{d/2}\, |\Sigma|^{1/2}} \exp\left[-\dfrac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^{t}\, \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right]$
  where:
  $\mathbf{x} = (x_1, x_2, \ldots, x_d)^t$ = the multivariate random variable
  $\boldsymbol{\mu} = (\mu_1, \mu_2, \ldots, \mu_d)^t$ = the mean vector
  $\Sigma$ = the d × d covariance matrix; $|\Sigma|$ and $\Sigma^{-1}$ are its determinant and inverse, respectively.
2- Statistics
• Multivariate Density: Statistically Independent
  – If $x_i$ and $x_j$ are statistically independent, then $\sigma_{ij} = 0$.
  – In this case, p(x) reduces to the product of the univariate normal densities for the components of x. That is, if $p(x_i) \sim N(x_i \mid \mu_i, \sigma_i^2)$, then
  $p(\mathbf{x}) = p(x_1, x_2, \ldots, x_d) = p(x_1)\, p(x_2) \cdots p(x_d) = \prod_{i} p(x_i)$
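The factorization for independent components can be confirmed numerically: with a diagonal covariance matrix, the d-dimensional normal density equals the product of the univariate densities. The function names and the particular means, standard deviations, and evaluation point are assumptions made for this sketch.

```python
import numpy as np

def mvn_pdf(x, mu, cov):
    """Multivariate normal density in d dimensions."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(cov, diff)   # (x-mu)^t cov^{-1} (x-mu)
    norm = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(cov))
    return float(np.exp(-0.5 * quad) / norm)

def gaussian_pdf(x, mu, sigma):
    """Univariate normal density, applied elementwise."""
    z = (x - mu) / sigma
    return np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * sigma)

# Assumed example values: three independent components.
mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([0.5, 1.5, 2.0])
cov = np.diag(sigma**2)            # off-diagonal covariances are all zero
x = np.array([0.8, -1.0, 2.0])

joint_density = mvn_pdf(x, mu, cov)
product_density = float(np.prod(gaussian_pdf(x, mu, sigma)))

print(joint_density, product_density)
```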
2- Statistics
• Multivariate Normal Density
  – From the multivariate normal density, the loci of points of constant density are hyperellipsoids for which the quadratic form $(\mathbf{x}-\boldsymbol{\mu})^{t}\, \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})$ is constant.
  – The quantity $r^2 = (\mathbf{x}-\boldsymbol{\mu})^{t}\, \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})$ is sometimes called the squared Mahalanobis distance from x to µ.
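The squared Mahalanobis distance is straightforward to compute. In this sketch the covariance matrix and the test points are assumed values, chosen so that two points at different Euclidean distances from the mean lie on the same constant-density ellipse.

```python
import numpy as np

def squared_mahalanobis(x, mu, cov):
    """r^2 = (x - mu)^t cov^{-1} (x - mu)."""
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    # Solve cov * v = diff instead of forming the inverse explicitly.
    return float(diff @ np.linalg.solve(cov, diff))

# Assumed example: variance 4 along the first axis, 1 along the second.
mu = np.array([0.0, 0.0])
cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])

r2_a = squared_mahalanobis([2.0, 0.0], mu, cov)   # Euclidean distance 2
r2_b = squared_mahalanobis([0.0, 1.0], mu, cov)   # Euclidean distance 1
r2_identity = squared_mahalanobis([3.0, 4.0], mu, np.eye(2))

print(r2_a, r2_b, r2_identity)
```

Both points have r² = 1, so they sit on the same contour of the density. With the identity covariance the measure reduces to the squared Euclidean distance.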
2- Statistics
• Expected values:
  – The expected value, mean, or average of the random variable x is defined by: $E[x] = \mu = \sum_{x \in X} x\, P(x)$
  – If f(x) is any function of x, the expected value of f is defined by: $E[f(x)] = \sum_{x \in X} f(x)\, P(x)$
2- Statistics
• Expected values:
  – The second moment of x is defined by: $E[x^2] = \sum_{x \in X} x^2\, P(x)$
  – The variance of x is defined by: $\mathrm{Var}[x] = \sigma^2 = E[(x-\mu)^2] = \sum_{x \in X} (x-\mu)^2\, P(x)$
  where σ is the standard deviation of x.
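For a discrete variable, the mean, second moment, and variance are sums weighted by P(x), using the standard definitions E[f(x)] = Σ f(x) P(x). The sketch below evaluates them for a fair six-sided die, which is an example chosen here and not one taken from the slides.

```python
from fractions import Fraction

# Fair die: values 1..6, each with probability 1/6.
values = range(1, 7)
P = {v: Fraction(1, 6) for v in values}

def expectation(f):
    """E[f(x)] = sum over x of f(x) * P(x)."""
    return sum(f(v) * P[v] for v in values)

mean = expectation(lambda v: v)                     # E[x]
second_moment = expectation(lambda v: v * v)        # E[x^2]
variance = expectation(lambda v: (v - mean) ** 2)   # E[(x - mean)^2]

print(mean, second_moment, variance)
```

The results also satisfy the identity Var[x] = E[x²] − (E[x])², which gives a quick consistency check.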
3- Mathematical Notations
Next Time: Bayesian Decision Theory

CSC446: Pattern Recognition (LN3)
