Fundamentals of NumPy

Over the course of this article, we will explore the main features and functions of the Python library NumPy.

Learn the basics of NumPy in under 5 minutes

Photo by Emile Perron on Unsplash

NumPy is a Python library that supports large, multi-dimensional arrays and matrices, along with a large collection of mathematical functions that operate on them. It provides the foundation for many other data science and data visualization libraries.

In this article we will create, index and manipulate arrays using NumPy. (You will need prior knowledge of how tuples and lists work in Python.)

First, we’ll look at NumPy arrays. A NumPy array is a grid of values that can be indexed with slices, and also with boolean arrays (masks) or integer arrays. The shape of an array is a tuple of integers giving the size of the array along each dimension. For a standard 2D array, the shape gives the number of rows followed by the number of columns.

Before we start, make sure NumPy is installed on your system.

Using conda:

conda install numpy

Using pip:

pip install numpy 

Creating arrays using NumPy

>>> import numpy as np
>>> a = np.array([1,22,333])
>>> print(a)
[  1  22 333]

Let’s play around with arrays a little before we get into more of NumPy’s features and functions. We can check the size of our array using its ‘.shape’ attribute.

>>> print(a.shape)
(3,)

Here we see that the shape is (3,), which means we have a 1-dimensional array of size 3. To look at specific elements within the array, we use the same notation as we do for Python lists.

>>> print(a[0])
1
>>> print(a[1])
22
>>> print(a[2])
333
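
Because the notation matches Python lists, negative indices and slices work on 1D arrays as well. The short sketch below reuses the array from above; the names last and first_two are illustrative only.

```python
import numpy as np

a = np.array([1, 22, 333])

last = a[-1]        # negative indices count from the end, as with lists
first_two = a[:2]   # a slice returns the elements at positions 0 and 1

print(last)         # 333
print(first_two)    # [ 1 22]
```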

Change a value at a specific index:

>>> a[0] = 11
>>> print(a)
[ 11  22 333]

As you can see, the element at the first position has now changed from 1 to 11.

Create a 2D array using NumPy

>>> b = np.array([[4,5,6],[7,8,9]])
>>> print(b)
[[4 5 6]
 [7 8 9]]

Print the shape

>>> b.shape
(2, 3)

This shows us that we have 2 rows and 3 columns.

To look at an individual element in the 2D array, we pass a row index followed by a column index:

>>> print(b[0,0])
4
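
The same bracket notation can also pull out a whole row or column. A minimal sketch, using the array b from above (first_row and second_col are illustrative names):

```python
import numpy as np

b = np.array([[4, 5, 6], [7, 8, 9]])

first_row = b[0]         # a single index selects a whole row
second_col = b[:, 1]     # ':' keeps every row, 1 picks the second column

print(first_row)         # [4 5 6]
print(second_col)        # [5 8]
print(b[1, 2])           # 9 (second row, third column)
```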

NumPy arrays can hold other types of values, not just integers:

>>> c = np.array([9.0, 8.7, 6.5])
>>> print(c)
[9.  8.7 6.5]
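
Every array stores values of a single type, which can be inspected through the dtype attribute. The sketch below is illustrative; the exact integer type NumPy picks depends on the platform, so no output is shown for it.

```python
import numpy as np

a = np.array([1, 22, 333])      # integers only
c = np.array([9.0, 8.7, 6.5])   # floating-point values

# The exact integer type (for example int64 or int32) depends on the platform.
print(a.dtype)
print(c.dtype)                  # float64

# Mixing integers and floats upcasts every element to float.
mixed = np.array([1, 2.5])
print(mixed.dtype)              # float64
```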

Other NumPy functions to construct arrays

NumPy provides several functions for creating arrays. For example, to create an array filled with zeros, we can use NumPy’s ‘np.zeros’ function, passing the desired shape as a tuple:

>>> d = np.zeros((5,6))
>>> print(d)
[[0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]]

Similarly, we can make an array filled with ones using the ‘np.ones’ function:

>>> e = np.ones((5,6))
>>> print(e)
[[1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1.]]

Furthermore, you can create an array filled with a constant value using the ‘np.full’ function. The first argument is the shape of the array and the second argument is the constant value. In the example below we create a 3×3 array filled with the value 7.8.

>>> f = np.full((3,3), 7.8)
>>> print(f)
[[7.8 7.8 7.8]
 [7.8 7.8 7.8]
 [7.8 7.8 7.8]]
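
The trailing dots in the zeros and ones output (0. and 1.) appear because these functions create floating-point arrays by default. If integers are wanted, a dtype argument can be passed. A minimal sketch (d_int is an illustrative name):

```python
import numpy as np

d = np.zeros((2, 3))                  # default dtype is float64, hence "0."
d_int = np.zeros((2, 3), dtype=int)   # request integers instead

print(d.dtype)      # float64
print(d_int)        # a 2x3 array of integer zeros, printed without dots
```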

NumPy also provides the ‘np.random.random’ function. Using it, we can create an array of random values that are greater than or equal to 0 and less than 1. Your values will differ from the ones shown here.

>>> g = np.random.random((4,4))
>>> print(g)
[[0.21243056 0.4998238  0.46474266 0.24573327]
 [0.80314845 0.94159578 0.65609858 0.0559475 ]
 [0.80367609 0.35230391 0.91716958 0.03513166]
 [0.37717325 0.00882003 0.82166044 0.7435783 ]]
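
If the same random values are needed on every run, one option is a seeded generator. This sketch uses np.random.default_rng, which is not part of the original tutorial and assumes NumPy 1.17 or later; the seed 42 is an arbitrary choice.

```python
import numpy as np

# default_rng creates a Generator; the seed value 42 is an arbitrary choice.
rng = np.random.default_rng(42)
g = rng.random((4, 4))        # uniform values in the half-open interval [0, 1)

# Re-creating the generator with the same seed reproduces the same array.
g_again = np.random.default_rng(42).random((4, 4))

print(g.shape)                      # (4, 4)
print(np.array_equal(g, g_again))   # True
```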

Indexing using NumPy

NumPy offers several ways to index arrays. Similar to Python lists, NumPy arrays can be sliced. Since arrays can be multi-dimensional, you need to specify a slice for each dimension of the array.

Let’s create a two-dimensional array with some arbitrary values.

>>> i = np.array([[12,23,34], [45,56,67], [78,89,90]])
>>> print(i)
[[12 23 34]
 [45 56 67]
 [78 89 90]]

For this example, let’s use slicing to pull out the part of the array that consists of the first 2 rows and columns 1 and 2:

>>> j = i[:2, 1:3]
>>> print(j)
[[23 34]
 [56 67]]

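A slice of an array (as shown above) is a view into the same data, so modifying the slice also modifies the original array. Before any change, the element at row 0, column 1 of i is:

>>> print(i[0,1])
23

Now change the value at position [0,0] of the slice j, which refers to the same element as i[0,1]:

>>> j[0,0] = 75
>>> print(i[0,1])
75
>>> print(i)
[[12 75 34]
 [45 56 67]
 [78 89 90]]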
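
If you want to modify a slice without touching the original array, one option is to call .copy() on the slice first. A minimal sketch, using the same values as above (j_copy is an illustrative name):

```python
import numpy as np

i = np.array([[12, 23, 34], [45, 56, 67], [78, 89, 90]])

j_copy = i[:2, 1:3].copy()   # independent copy of the slice, not a view
j_copy[0, 0] = 75            # changes only the copy

print(i[0, 1])               # 23, the original is untouched
print(j_copy[0, 0])          # 75
```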

Boolean array indexing in NumPy

Boolean array indexing is used to select the elements of an array that satisfy some condition. Let’s create an array of arbitrary values, apply a condition and see how boolean array indexing works.

>>> k = np.array([[26,78], [51,42], [30,89]])
>>> print(k)
[[26 78]
 [51 42]
 [30 89]]
>>> print(k>50)
[[False  True]
 [ True False]
 [False  True]]

We can also use boolean array indexing to create a new one-dimensional array consisting of all the elements that are greater than 50:

>>> print(k[k>50])
[78 51 89]
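
As mentioned at the start of the article, arrays can also be indexed with integer arrays. The sketch below selects the same three elements by listing their row and column positions, and then uses a boolean mask on the left-hand side of an assignment. The names rows, cols, picked and capped are illustrative only.

```python
import numpy as np

k = np.array([[26, 78], [51, 42], [30, 89]])

# Pick the elements at (row 0, col 1), (row 1, col 0) and (row 2, col 1).
rows = np.array([0, 1, 2])
cols = np.array([1, 0, 1])
picked = k[rows, cols]
print(picked)            # [78 51 89]

# A boolean mask can also select the elements to overwrite.
capped = k.copy()        # work on a copy so k itself stays unchanged
capped[capped > 50] = 50
print(capped)            # every value above 50 is replaced by 50
```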

NumPy Math Functions

Basic math functions operate element-wise on arrays, which means that each element in one array is combined with the element at the same position in the other array. You can use either the math operators or the equivalent functions provided by NumPy; both give the same output. Let’s perform a few math operations on arrays.

  1. Addition
>>> l = np.array([[1,2], [3,4]])
>>> m = np.array([[5,6], [7,8]])
>>> print(l+m)
[[ 6  8]
 [10 12]]

This adds the elements of the two arrays position by position. We get the same result using the NumPy function ‘np.add’.

>>> print(np.add(l,m))
[[ 6  8]
 [10 12]]

In addition to this, NumPy has functions for subtraction, multiplication and division, as well as for computing the sum of an array.

  2. Subtraction
# Subtraction using math operators
>>> print(l-m)
[[-4 -4]
 [-4 -4]]
# Subtraction using NumPy functions
>>> print(np.subtract(l,m))
[[-4 -4]
 [-4 -4]]
  3. Multiplication
# Multiplication using math operators
>>> print(l*m)
[[ 5 12]
 [21 32]]
# Multiplication using NumPy functions
>>> print(np.multiply(l,m))
[[ 5 12]
 [21 32]]
  4. Division
# Division using math operators
>>> print(l/m)
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
# Division using NumPy functions
>>> print(np.divide(l,m))
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
  5. Sum
>>> print(l)
[[1 2]
 [3 4]]
# Sum of all the elements
>>> print(np.sum(l))
10
# Sum of each column
>>> print(np.sum(l, axis=0))
[4 6]
# Sum of each row
>>> print(np.sum(l, axis=1))
[3 7]
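
The claim that operators and NumPy functions give the same output can be checked directly with np.array_equal. The sketch below also contrasts element-wise multiplication with the matrix product computed by the @ operator, which is not covered in this article and is shown only to make the meaning of "element-wise" concrete. The names elementwise and matrix are illustrative.

```python
import numpy as np

l = np.array([[1, 2], [3, 4]])
m = np.array([[5, 6], [7, 8]])

# Each operator and its NumPy function produce identical arrays.
print(np.array_equal(l + m, np.add(l, m)))        # True
print(np.array_equal(l * m, np.multiply(l, m)))   # True

# Element-wise multiplication pairs up values at the same position.
elementwise = l * m
# The matrix product combines rows of l with columns of m instead.
matrix = l @ m

print(elementwise)   # rows: [5 12] and [21 32]
print(matrix)        # rows: [19 22] and [43 50]
```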

You can find the code for this tutorial here.

I hope you enjoyed this article. Thank you for giving it a read!


