The document provides an overview of data analysis, emphasizing its process of inspecting, cleansing, transforming, and modeling data to derive useful insights. It highlights Python as a powerful tool for data science, detailing key libraries such as NumPy, Pandas, and Matplotlib, alongside their functionalities for data manipulation and analysis. Additionally, the text discusses operational capabilities of NumPy, including array creation, reshaping, linear algebra functions, and indexing techniques.
Data Analysis
Data analysis, also known as analysis of data or data analytics, is a process of inspecting, cleansing, transforming, and modelling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.
3.
Python as a Data Science Tool?
• Easy to learn
• Scalability
• Growing data analytics libraries
• Python community
4.
Python Packages for Data Analysis
• NumPy and SciPy – fundamental scientific computing.
• Pandas – data manipulation and analysis.
• Matplotlib – plotting and visualization.
• Scikit-learn – machine learning and data mining.
• StatsModels – statistical modeling, testing, and analysis.
5.
NumPy
• The NumPy (Numeric Python) package is required for high-performance computing and data analysis.
• It is a low-level library written in C (and FORTRAN) that provides high-level mathematical functions.
• It overcomes the problem of slow algorithms in plain Python by using multidimensional arrays and functions that operate on whole arrays.
• It allows concise and quick computations through vectorization.
• To use the NumPy module, we need to import it, conventionally under the alias np: import numpy as np
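As a rough illustration of what vectorization means here, the sketch below squares a handful of made-up values twice: once with an ordinary Python loop and once with a single array expression. The variable names are chosen only for this example.

```python
import numpy as np

# Plain-Python version: square each value one element at a time.
values = [1, 2, 3, 4, 5]
squares_loop = [v * v for v in values]

# Vectorized NumPy version: one expression operates on the whole array.
arr = np.array(values)
squares_vec = arr * arr

print(squares_loop)  # [1, 4, 9, 16, 25]
print(squares_vec)   # [ 1  4  9 16 25]
```

Both versions give the same numbers; the array expression simply pushes the loop down into NumPy's compiled code.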
6.
NumPy – A Replacement for MATLAB
• Python in combination with NumPy, SciPy, and Matplotlib can be used as a replacement for MATLAB.
• The Matplotlib module provides MATLAB-like plotting functionality.
7.
Operations Using NumPy
• Fast vectorized array operations for data munging and cleaning, subsetting and filtering, transformation, and any other kinds of computations
• Common array algorithms like sorting, unique, and set operations
• Efficient descriptive statistics and aggregating/summarizing data
• Data alignment and relational data manipulations for merging and joining together heterogeneous data sets
• Expressing conditional logic as array expressions instead of loops with if-elif-else branches
• Group-wise data manipulations (aggregation, transformation, function
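A few of the operations listed above can be sketched on a small array. The readings and the threshold of 20 are invented for illustration; np.where, boolean masks, np.sort, and np.unique are standard NumPy tools.

```python
import numpy as np

temps = np.array([12, 30, 25, 8, 30, 18])  # made-up sample readings

# Conditional logic as an array expression instead of an if/else loop.
labels = np.where(temps >= 20, "warm", "cold")

# Subsetting and filtering with a boolean mask.
warm_only = temps[temps >= 20]

# Common array algorithms: sorting and unique values.
sorted_temps = np.sort(temps)
unique_temps = np.unique(temps)

print(labels)
print(warm_only)     # [30 25 30]
print(sorted_temps)  # [ 8 12 18 25 30 30]
print(unique_temps)  # [ 8 12 18 25 30]
```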
8.
Core Python vs NumPy
"Core Python" means Python without any special modules, i.e. especially without NumPy.
Advantages of Core Python:
• High-level number objects: integers, floating point
• Containers: lists with cheap insertion and append methods, dictionaries with fast lookup
Advantages of using NumPy with Python:
• Array-oriented computing
• Efficiently implemented multi-dimensional arrays
9.
Advantages of using NumPy with Python
• Array-oriented computing
• Efficiently implemented multi-dimensional arrays
• Designed for scientific computation
• Standard mathematical functions for fast operations on entire arrays of data without having to write loops
• Tools for reading / writing array data to disk and working with memory-mapped files
• Linear algebra, random number generation, and Fourier transform capabilities
10.
NumPy (Array)
• A NumPy array is a grid of values.
• It is similar to a list, except that every element of an array must be of the same type.
• The conventional alias for the NumPy library is np.
• np.array() is used to convert a list into a NumPy array.
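A minimal sketch of converting lists to arrays is shown below. The sample values are arbitrary; the point is that NumPy picks one element type for the whole array.

```python
import numpy as np

# A flat list of integers becomes a 1-D integer array.
a = np.array([1, 2, 3])

# Mixing ints and floats: every element is upcast to float.
b = np.array([1, 2.5, 3])

# A list of lists becomes a 2-D array (a grid of values).
matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(a.dtype)      # an integer type (exact width depends on the platform)
print(b.dtype)      # float64
print(matrix.ndim)  # 2
```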
11.
NumPy (Array) – SHAPE
The shape attribute gives a tuple of array dimensions. Assigning a new tuple to it changes the dimensions of the array in place.
• Using shape to get array dimensions
• Using shape to change array dimensions
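Both uses of shape can be sketched on a small made-up array: reading it returns the dimensions, and assigning to it rearranges the same six elements in place.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Using shape to get array dimensions.
original_shape = arr.shape
print(original_shape)  # (2, 3)

# Using shape to change array dimensions (modifies arr itself).
arr.shape = (3, 2)
print(arr)
# [[1 2]
#  [3 4]
#  [5 6]]
```

The new shape has to hold the same number of elements; otherwise NumPy raises an error.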
12.
NumPy (Array) – RESHAPE
Gives a new shape to an array without changing its data. It returns a new array object (a view of the same data where possible) and does not modify the shape of the original array.
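The difference from assigning to shape can be sketched as follows. The array contents are arbitrary; np.shares_memory is used only to show that, in this simple contiguous case, the reshaped result is a view of the same data.

```python
import numpy as np

original = np.arange(6)            # [0 1 2 3 4 5]
reshaped = original.reshape(2, 3)  # returns a new array object

print(original.shape)  # (6,)  -> the original keeps its shape
print(reshaped.shape)  # (2, 3)

# For a simple contiguous array like this one, reshape returns a view,
# so both objects refer to the same underlying data.
print(np.shares_memory(original, reshaped))  # True
```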
NumPy (Array) – CONCATENATE
Two or more arrays can be concatenated using the concatenate function, which takes a tuple of the arrays to be joined. If the arrays have more than one dimension, it is possible to specify the axis along which they are concatenated. By default, this is the first dimension.
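A short sketch with two made-up 2x2 arrays shows the default behaviour and the effect of the axis argument.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Default: join along the first dimension (axis=0), stacking rows.
rows = np.concatenate((a, b))
print(rows.shape)  # (4, 2)

# axis=1: join along the second dimension, placing arrays side by side.
cols = np.concatenate((a, b), axis=1)
print(cols)
# [[1 2 5 6]
#  [3 4 7 8]]
```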
16.
NumPy (Array)
• ZEROS: The zeros tool returns a new array with a given shape and type filled with 0's.
• ONES: The ones tool returns a new array with a given shape and type filled with 1's.
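As a small sketch, the shape is passed as a tuple and the element type through the optional dtype argument; the shapes chosen here are arbitrary.

```python
import numpy as np

# 2x3 array of zeros; the default element type is float.
z = np.zeros((2, 3))

# 2x3 array of ones with an explicit integer type.
o = np.ones((2, 3), dtype=int)

print(z)
# [[0. 0. 0.]
#  [0. 0. 0.]]
print(o)
# [[1 1 1]
#  [1 1 1]]
```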
17.
NumPy (Array) – IDENTITY
Returns an identity array. An identity array is a square matrix with all the main diagonal elements as 1 and the rest as 0. The default type of the elements is float.
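For instance, a 3x3 identity array can be created as in this minimal sketch:

```python
import numpy as np

# np.identity takes the number of rows (and columns) of the square result.
i3 = np.identity(3)

print(i3)
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]]
print(i3.dtype)  # float64
```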
18.
NumPy (Array) – EYE
Returns a 2-D array with 1's on a diagonal and 0's elsewhere. The diagonal can be the main, an upper, or a lower one depending on the optional parameter k. A positive k selects an upper diagonal, a negative k selects a lower one, and k = 0 (the default) selects the main diagonal.
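The effect of k can be sketched on 3x3 arrays. The last call also illustrates that, unlike identity, eye accepts a separate column count.

```python
import numpy as np

main = np.eye(3)          # k=0 (default): main diagonal
upper = np.eye(3, k=1)    # positive k: diagonal above the main one
lower = np.eye(3, k=-1)   # negative k: diagonal below the main one
rect = np.eye(2, 3)       # non-square: 2 rows, 3 columns

print(upper)
# [[0. 1. 0.]
#  [0. 0. 1.]
#  [0. 0. 0.]]
print(lower)
# [[0. 0. 0.]
#  [1. 0. 0.]
#  [0. 1. 0.]]
```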
19.
NumPy (Linear Algebra)
The NumPy module also comes with a number of built-in routines for linear algebra calculations. These can be found in the sub-module linalg. Some of the built-in routines are:
• linalg.det
• linalg.eig
• linalg.inv
20.
NumPy (Linear Algebra)
• linalg.det: Computes the determinant of an array.
• linalg.eig: Computes the eigenvalues and right eigenvectors of a square array.
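The three linalg routines named in these slides can be sketched on a small diagonal matrix, chosen here only because its determinant, eigenvalues, and inverse are easy to check by hand.

```python
import numpy as np

m = np.array([[2.0, 0.0],
              [0.0, 3.0]])

# Determinant: for a diagonal matrix, the product of the diagonal (about 6.0).
det = np.linalg.det(m)

# Eigenvalues and right eigenvectors (one eigenvector per column).
eigenvalues, eigenvectors = np.linalg.eig(m)

# Inverse: multiplying it by m should give the identity matrix.
inverse = np.linalg.inv(m)

print(det)
print(eigenvalues)
print(m @ inverse)
```

Results are floating-point values, so comparisons are best done with a tolerance (for example np.isclose or np.allclose) rather than exact equality.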
21.
Operations On NumPy
We can perform operations on NumPy arrays such as addition, subtraction, multiplication, and even the dot product of two or more matrices.
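A brief sketch with two made-up 2x2 matrices follows. Note that the * operator multiplies element by element, while dot performs matrix multiplication.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

added = a + b          # element-wise addition
subtracted = a - b     # element-wise subtraction
multiplied = a * b     # element-wise multiplication
dot_product = a.dot(b) # matrix (dot) product

print(added)        # [[ 6  8] [10 12]]
print(multiplied)   # [[ 5 12] [21 32]]
print(dot_product)  # [[19 22] [43 50]]
```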
22.
Operations On NumPy – TRANSPOSE
To transpose a matrix, use matrix_name.T. To find the shape of the transposed matrix, use matrix_name.T.shape.
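As a small example with an arbitrary 2x3 matrix, transposing swaps rows and columns, so the shape becomes 3x2.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

transposed = m.T
print(transposed)
# [[1 4]
#  [2 5]
#  [3 6]]
print(m.T.shape)  # (3, 2)
```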
23.
Operations On NumPy
• We can find the sum of a matrix's elements with the sum() operation.
• We can find the maximum number in the matrix with the max() operation.
• We can find the position of the maximum or minimum value in the matrix with argmax() or argmin().
• We can find the mean of a matrix with the mean() operation.
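These aggregations can be sketched on a made-up 2x3 matrix. Since argmax() and argmin() return a position in the flattened array, np.unravel_index is used here (as one possible approach) to convert it to a row and column.

```python
import numpy as np

m = np.array([[4, 9, 2],
              [7, 1, 8]])

total = m.sum()          # 31
largest = m.max()        # 9
average = m.mean()       # 31 / 6

# Positions in the flattened array, converted to (row, column).
max_pos = np.unravel_index(m.argmax(), m.shape)  # (0, 1)
min_pos = np.unravel_index(m.argmin(), m.shape)  # (1, 1)

# Passing axis=0 aggregates down each column instead of over everything.
column_sums = m.sum(axis=0)  # [11 10 10]

print(total, largest, average)
print(max_pos, min_pos)
print(column_sums)
```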