Devashish Kumar Faculty-IT iNurture
Data Analysis
Data analysis, also known as analysis of data or data analytics, is a process of inspecting, cleansing, transforming, and modelling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.
Python as a Data Science Tool?
• Easy to learn
• Scalability
• Growing data analytics libraries
• Python community
Python Packages for Data Analysis
• NumPy and SciPy – fundamental scientific computing.
• Pandas – data manipulation and analysis.
• Matplotlib – plotting and visualization.
• Scikit-learn – machine learning and data mining.
• StatsModels – statistical modeling, testing, and analysis.
NumPY The NumPy (Numeric Python) package required for high performance computing and data analysis. Low level library written in C (and FORTRAN) for high level mathematical functions. Overcomes the problem of running slower algorithms on Python by using multidimensional arrays and functions that operate on arrays. Allows concise and quick computations by VECTORIZATION. To use NumPy module, we need to import it using:
NumPy – A Replacement for MATLAB
• Python in combination with NumPy, SciPy, and Matplotlib can be used as a replacement for MATLAB.
• The Matplotlib module provides MATLAB-like plotting functionality.
Operations Using NumPy
• Fast vectorized array operations for data munging and cleaning, subsetting and filtering, transformation, and any other kinds of computations
• Common array algorithms like sorting, unique, and set operations
• Efficient descriptive statistics and aggregating/summarizing data
• Data alignment and relational data manipulations for merging and joining together heterogeneous data sets
• Expressing conditional logic as array expressions instead of loops with if-elif-else branches
• Group-wise data manipulations (aggregation, transformation, function
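A few of the operations listed above can be sketched as follows. The sample data and variable names are assumptions chosen for illustration; np.where, np.sort, np.unique, and np.intersect1d are standard NumPy functions.

```python
import numpy as np

data = np.array([3, -1, 4, -1, 5, -9, 2, 6])

# Conditional logic as an array expression: replace negative values with 0.
clipped = np.where(data < 0, 0, data)

# Filtering with a boolean mask: keep only the positive values.
positives = data[data > 0]

# Common array algorithms: sorting and unique values.
ordered = np.sort(data)
distinct = np.unique(data)

# Set operation: values that appear in both arrays.
common = np.intersect1d(data, np.array([2, 3, 7]))

print(clipped)    # [3 0 4 0 5 0 2 6]
print(positives)  # [3 4 5 2 6]
print(ordered)    # [-9 -1 -1  2  3  4  5  6]
print(distinct)   # [-9 -1  2  3  4  5  6]
print(common)     # [2 3]
```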
Core Python vs NumPy
"Core Python" means Python without any special modules, in particular without NumPy.
Advantages of Core Python:
• High-level number objects: integers, floating point
• Containers: lists with cheap insertion and append methods, dictionaries with fast lookup
Advantages of using NumPy with Python:
• Array-oriented computing
• Efficiently implemented multi-dimensional arrays
Advantages of using NumPy with Python
• Array-oriented computing
• Efficiently implemented multi-dimensional arrays
• Designed for scientific computation
• Standard mathematical functions for fast operations on entire arrays of data without having to write loops
• Tools for reading/writing array data to disk and working with memory-mapped files
• Linear algebra, random number generation, and Fourier transform capabilities
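The capabilities listed above can be touched on in a short sketch. To stay self-contained it writes the array to an in-memory buffer rather than a real file, which is an assumption made for the example; np.save and np.load work the same way with a file name.

```python
import io

import numpy as np

arr = np.array([1.0, 4.0, 9.0])

# Mathematical function applied to the whole array without a loop.
roots = np.sqrt(arr)

# Writing and reading array data (an in-memory buffer stands in for a file).
buffer = io.BytesIO()
np.save(buffer, arr)
buffer.seek(0)
restored = np.load(buffer)

# Random number generation with a seeded generator.
rng = np.random.default_rng(0)
samples = rng.random(4)  # four floats in [0, 1)

# Fourier transform of a constant signal: all energy lands in the first bin.
spectrum = np.fft.fft(np.ones(4))

print(roots)     # [1. 2. 3.]
print(restored)  # [1. 4. 9.]
```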
NumPy (Array)
• A NumPy array is a grid of values.
• It is similar to a list, except that every element of an array must be of the same type.
• The conventional alias for the NumPy library is np.
• np.array() is used to convert a list into a NumPy array.
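A minimal sketch of converting lists into arrays, with illustrative variable names. It also shows what "same type" means: when a list mixes integers and floats, NumPy upcasts everything to one common type.

```python
import numpy as np

numbers = [1, 2, 3, 4]
arr = np.array(numbers)  # 1-D array built from a list
print(type(arr))         # <class 'numpy.ndarray'>

# Mixed ints and floats are upcast so that all elements share one type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)       # float64

# A list of lists becomes a 2-D grid of values.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.ndim)         # 2
```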
NumPy(Array) SHAPE Shape function gives a tuple of array dimensions and can be used to change the dimensions of an array.  Using shape to get array dimensions  Using shape to change array dimensions
NumPy(Array) RESHAPE Gives a new shape to an array without changing its data. Creates a new array and does not modify the original array itself.
NumPy(Array) TRANSPOSE Generates the transposition of an array using the function np.transpose. Does not affect the original array, but it will create a new array.
NumPy(Array) FLATTEN Flatten creates a copy of the input array flattened to one dimension.
NumPy (Array): CONCATENATE
• Two or more arrays can be concatenated using the concatenate function with a tuple of the arrays to be joined.
• If an array has more than one dimension, it is possible to specify the axis along which the arrays are concatenated. By default, it is along the first dimension.
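Both cases can be sketched with small assumed arrays: joining 1-D arrays, then joining 2-D arrays along the default first axis and along axis=1.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
joined = np.concatenate((a, b))  # [1 2 3 4 5 6]

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

# Default axis=0: m2 is stacked below m1, giving shape (4, 2).
rows = np.concatenate((m1, m2))

# axis=1: m2 is placed to the right of m1, giving shape (2, 4).
cols = np.concatenate((m1, m2), axis=1)

print(rows.shape)  # (4, 2)
print(cols)
# [[1 2 5 6]
#  [3 4 7 8]]
```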
NumPy (Array): ZEROS and ONES
• ZEROS: The zeros function returns a new array with a given shape and type, filled with 0's.
• ONES: The ones function returns a new array with a given shape and type, filled with 1's.
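A minimal sketch of both functions; the shapes are arbitrary choices for the example. The element type defaults to float and can be set with the dtype argument.

```python
import numpy as np

zeros = np.zeros((2, 3))           # 2 x 3 array of 0.0 (float by default)
ones = np.ones((2, 2), dtype=int)  # 2 x 2 array of integer 1's

print(zeros)
# [[0. 0. 0.]
#  [0. 0. 0.]]
print(ones)
# [[1 1]
#  [1 1]]
```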
NumPy(Array) IDENTITY Returns an identity array. An identity array is a square matrix with all the main diagonal elements as 1 and the rest as 0 . The default type of elements is float.
NumPy(Array) EYE  Returns a 2-D array with 1's as the diagonal and 0's elsewhere.  The diagonal can be main, upper or lower depending on the optional parameter .  Positive k is for the upper diagonal, a negative k is for the lower, and a 0k (default) is for the main diagonal.
NumPy (Linear Algebra)
• The NumPy module also comes with a number of built-in routines for linear algebra calculations.
• These can be found in the sub-module linalg.
• Some of the built-in routines are: linalg.det, linalg.eig, linalg.inv
NumPy (Linear Algebra)
• linalg.det: Computes the determinant of a square array.
• linalg.eig: Computes the eigenvalues and right eigenvectors of a square array.
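A small sketch of the three linalg routines named in these slides, applied to an assumed 2 x 2 matrix. The printed results are left out for eig because the order of the returned eigenvalues is not guaranteed; the final check verifies the defining property instead.

```python
import numpy as np

matrix = np.array([[4.0, 1.0], [2.0, 3.0]])

# Determinant: 4*3 - 1*2 = 10 (up to floating-point rounding).
det = np.linalg.det(matrix)

# Eigenvalues and right eigenvectors (eigenvectors are the columns).
eigenvalues, eigenvectors = np.linalg.eig(matrix)

# Inverse: matrix @ inverse gives the identity matrix.
inverse = np.linalg.inv(matrix)

# Each column v satisfies matrix @ v == eigenvalue * v.
check = np.allclose(matrix @ eigenvectors, eigenvectors * eigenvalues)
print(check)  # True
```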
Operations on NumPy
We can perform operations on NumPy arrays such as addition, subtraction, multiplication, and even the dot product of two or more matrices.
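These operations can be sketched with two assumed 2 x 2 matrices. Note that the * operator multiplies element by element, while np.dot performs the matrix (dot) product.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

added = a + b          # element-wise addition
subtracted = a - b     # element-wise subtraction
multiplied = a * b     # element-wise multiplication
dotted = np.dot(a, b)  # matrix product

print(multiplied)
# [[ 5 12]
#  [21 32]]
print(dotted)
# [[19 22]
#  [43 50]]
```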
Operations on NumPy: TRANSPOSE
• To transpose a matrix, use matrix_name.T.
• To find the shape of the transposed matrix, use matrix_name.T.shape.
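As a brief sketch with an assumed 3 x 4 matrix, the .T attribute swaps the two dimensions:

```python
import numpy as np

matrix = np.arange(12).reshape(3, 4)  # 3 rows, 4 columns

print(matrix.shape)    # (3, 4)
print(matrix.T.shape)  # (4, 3)
print(matrix.T[0])     # [0 4 8]  first column of the original
```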
Operations on NumPy
• We can find the sum of the elements of a matrix using the sum() operation.
• We can find the maximum number in the matrix using the max() operation.
• We can find the position of the element holding the maximum or minimum value using argmax() or argmin().
• We can find the mean of a matrix using the mean() operation.
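The aggregations above can be sketched on an assumed 2 x 3 matrix. argmax and argmin return a position in the flattened array; np.unravel_index converts it back to a (row, column) pair.

```python
import numpy as np

matrix = np.array([[1, 8, 3], [4, 5, 6]])

total = matrix.sum()    # 27
largest = matrix.max()  # 8
average = matrix.mean() # 4.5

# Position of the maximum in the flattened array, then as (row, column).
flat_pos = matrix.argmax()                         # 1
row, col = np.unravel_index(flat_pos, matrix.shape)  # (0, 1)

min_pos = matrix.argmin()                          # 0

# Aggregations can also run along one axis: here, one sum per column.
column_sums = matrix.sum(axis=0)                   # [ 5 13  9]
```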
NumPy (Indexing/Slicing)
• Fetches elements from the 2nd to the 7th position of a single-dimensional array.
• Fetches the last 2 elements of a single-dimensional array.
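The code for these two slices is not preserved in the slide text, so the following is a sketch under the assumption that "2nd to 7th position" counts positions from 1 and includes both ends. The sample array is also an assumption.

```python
import numpy as np

arr = np.arange(10, 20)  # [10 11 12 13 14 15 16 17 18 19]

# 2nd to 7th position: indices 1 up to (but not including) 7.
second_to_seventh = arr[1:7]
print(second_to_seventh)  # [11 12 13 14 15 16]

# Last 2 elements: a negative start counts from the end.
last_two = arr[-2:]
print(last_two)           # [18 19]
```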
CONTRIBUTORS

Data Analysis in Python-NumPy
