DEV Community

HoangNg
Python Optimization with NumPy (Vectorization)

Methods

I wrote four methods that simulate the same data and compared how their run time changes as the sample size increases.

Method 1: Unvectorized method using a Python list
Method 2: Unvectorized method using a NumPy array
Method 3: Partially vectorized method (i.e., it calls np.dot for each sample but still relies on a Python list and an explicit loop)
Method 4: Fully vectorized method (i.e., it uses only NumPy arrays and NumPy's vectorized operations)

See the code below

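import numpy as np


# In all four methods, x has shape (number of features, number of samples),
# vector_w holds one weight per feature, b is a scalar intercept, and
# error_term holds one value per sample.

def make_dummy_y_unvectorized1(x, vector_w, b, error_term):
    # Method 1: nested Python loops, results collected in a Python list
    y = []
    m = x.shape[1]  # number of samples
    for i in range(m):
        y_i = 0
        for j in range(len(vector_w)):
            y_i += vector_w[j] * x[j, i]
        y_i = (y_i + b) * np.exp(error_term[i])
        y.append(y_i)
    y = np.array(y)
    return y


def make_dummy_y_unvectorized2(x, vector_w, b, error_term):
    # Method 2: nested Python loops writing into a preallocated NumPy array
    m, n = x.shape  # here m is the number of features, n the number of samples
    y = np.zeros(n)
    for i in range(n):
        for j in range(m):
            y[i] += vector_w[j] * x[j, i]
    y = (y + b) * np.exp(error_term)
    return y


def make_dummy_y_vectorized1(x, vector_w, b, error_term):
    # Method 3: np.dot replaces the inner loop; the loop over samples remains
    y = []
    for i in range(x.shape[1]):
        y.append((np.dot(vector_w, x[:, i]) + b) * np.exp(error_term[i]))
    y = np.array(y)
    return y


def make_dummy_y_vectorized2(x, vector_w, b, error_term):
    # Method 4: a single np.dot over the whole matrix, no Python loop
    y = (np.dot(vector_w, x) + b) * np.exp(error_term)
    return y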
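To make the computation concrete, here is a small worked example of what the fully vectorized expression returns. It is only a sketch: the input values are made up for illustration, and the error term is set to zero so that the result is easy to verify by hand.

```python
import numpy as np

# Hypothetical toy inputs (not from the original experiment):
# 2 features (rows) and 3 samples (columns).
x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
vector_w = np.array([0.5, 2.0])   # one weight per feature
b = 1.0                           # scalar intercept
error_term = np.zeros(3)          # exp(0) == 1, so the error has no effect here

# Fully vectorized: np.dot of shape (2,) with shape (2, 3) gives shape (3,),
# i.e. one weighted sum per sample. Adding b and multiplying by
# np.exp(error_term) are element-wise operations on that result.
y_vec = (np.dot(vector_w, x) + b) * np.exp(error_term)

# The same computation written as explicit loops, for comparison.
y_loop = np.zeros(x.shape[1])
for i in range(x.shape[1]):
    for j in range(x.shape[0]):
        y_loop[i] += vector_w[j] * x[j, i]
    y_loop[i] = (y_loop[i] + b) * np.exp(error_term[i])

print(y_vec)                       # values: 9.5, 12.0, 14.5
print(np.allclose(y_vec, y_loop))  # True
```

For the first sample, the weighted sum is 0.5 * 1 + 2.0 * 4 = 8.5, and adding b = 1.0 gives 9.5, which matches the first entry of the result.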

Comparison results between unvectorized and vectorized methods

In the comparison chart, the run time of methods 1 and 2 rises sharply as the sample size grows, which makes them poorly suited to large inputs. Method 3 improves on this: replacing the inner loop with np.dot lets it handle more data before it slows down, although it still loops over the samples in Python. Method 4, the fully vectorized method, is the clear winner. Its run time stays low and nearly flat across the sample sizes in the chart, because all of the looping happens inside NumPy's compiled code rather than in the Python interpreter.
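A comparison of this kind could be run with a small timing harness like the one below. This is a minimal sketch, not the code behind the chart: the helper time_call, the shortened function names, the sample sizes, and the number of features are all assumptions chosen for illustration, and the measured times will vary from machine to machine.

```python
import time
import numpy as np


def double_loop(x, vector_w, b, error_term):
    # Unvectorized: explicit loops over samples and features
    n_features, n_samples = x.shape
    y = np.zeros(n_samples)
    for i in range(n_samples):
        for j in range(n_features):
            y[i] += vector_w[j] * x[j, i]
    return (y + b) * np.exp(error_term)


def per_sample_dot(x, vector_w, b, error_term):
    # Partially vectorized: np.dot per sample, Python loop over samples
    y = [(np.dot(vector_w, x[:, i]) + b) * np.exp(error_term[i])
         for i in range(x.shape[1])]
    return np.array(y)


def full_dot(x, vector_w, b, error_term):
    # Fully vectorized: one np.dot over the whole matrix
    return (np.dot(vector_w, x) + b) * np.exp(error_term)


def time_call(func, *args, repeats=3):
    # Hypothetical helper: best wall-clock time in seconds over a few runs
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


rng = np.random.default_rng(0)
n_features = 5                      # assumed value, for illustration only
vector_w = rng.normal(size=n_features)
b = 1.0

timings = []
for n_samples in (1_000, 10_000, 100_000):
    x = rng.normal(size=(n_features, n_samples))
    error_term = rng.normal(scale=0.1, size=n_samples)
    row = [n_samples]
    for func in (double_loop, per_sample_dot, full_dot):
        row.append(time_call(func, x, vector_w, b, error_term))
    timings.append(row)
    print(f"n={row[0]:>7}: double loop {row[1]:.4f}s, "
          f"per-sample dot {row[2]:.4f}s, full dot {row[3]:.6f}s")
```

Taking the best of several runs reduces noise from other processes. For more careful measurements, the standard library's timeit module is a common alternative.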

Source code

Have a nice day
Hoang
