**Vectorization** is basically the art of getting rid of explicit for-loops in your code. **In the deep learning era**, in practice you often find yourself training on relatively large data sets, because that's when deep learning algorithms tend to shine. **And so**, it's important that your code runs quickly, because otherwise, if it's running on a big data set, your code might take a long time to run and you just find yourself waiting a very long time to get the result. **So in the deep learning era**, I think the ability to perform vectorization has become a key skill.

**In logistic regression** you need to compute \( z = w^Tx + b \), where \( w \) is a column vector and \( x \) is also a vector. **Maybe** these are very large vectors if you have a lot of features. So, \( w \) and \( x \) are both \( n_x \)-dimensional vectors, \( w, x \in \mathbb{R}^{n_x} \), as shown in the figure below. **So, to compute \( w^Tx \)**, a non-vectorized implementation would do something like this:

```python
z = 0
for i in range(n_x):
    z += w[i] * x[i]
z += b
```

**So**, that's a non-vectorized implementation, and you'll find that it's really slow. **In contrast**, a vectorized implementation would just compute \( w^Tx \) directly. **In Python with NumPy**, the command you use for that is `z = np.dot(w, x) + b`, which computes \( w^Tx \) and adds \( b \). **And when you** are implementing deep learning algorithms, you can really get results back faster; it will be much faster if you vectorize your code. **Some of you might** have heard that a lot of scalable deep learning implementations are done on a GPU, or graphics processing unit, but all the demos I did just now in the Jupyter notebook were actually on the CPU. **And it turns out** that both GPUs and CPUs have parallelization instructions. They're sometimes called SIMD instructions, which stands for single instruction, multiple data. **But what** this basically means is that if you use built-in functions such as `np.dot`, or other functions that don't require you to explicitly implement a for-loop, **it enables Python** to take much better advantage of parallelism to do your computations much faster. **And this** is true both for computations on CPUs and computations on GPUs. It's just that GPUs are remarkably good at these SIMD calculations, but CPUs are actually not too bad at them either, **maybe** just not as good as GPUs. You're seeing how vectorization can significantly speed up your code. **The rule of thumb** to remember is: whenever possible, avoid using explicit for-loops.
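As a quick sketch (the vector size and random data here are arbitrary, chosen just for illustration), you can time the two implementations side by side and confirm they agree:

```python
import time
import numpy as np

n = 1_000_000                # arbitrary large size for the comparison
w = np.random.rand(n)
x = np.random.rand(n)
b = 0.0

# Vectorized: one call to np.dot computes w^T x.
tic = time.time()
z_vec = np.dot(w, x) + b
toc = time.time()
print(f"Vectorized: {1000 * (toc - tic):.2f} ms")

# Non-vectorized: explicit for-loop over every element.
tic = time.time()
z_loop = 0.0
for i in range(n):
    z_loop += w[i] * x[i]
z_loop += b
toc = time.time()
print(f"For-loop:   {1000 * (toc - tic):.2f} ms")

# Both give the same result, up to floating-point rounding.
print(np.isclose(z_vec, z_loop))
```

On a typical machine the vectorized line is dramatically faster, because NumPy dispatches the dot product to optimized, SIMD-friendly compiled code instead of the Python interpreter loop.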

**The rule of thumb** to keep in mind is, when you're programming your neural networks, or when you're programming just logistic regression, whenever possible avoid explicit for-loops. **And it's not always** possible to never use a for-loop, but when you can use a built-in function, or find some other way to compute whatever you need, you'll often go faster than if you have an explicit for-loop. **If ever you want** to compute a vector \( u \) as the product of a matrix \( A \) and another vector \( v \), a non-vectorized implementation uses the code given below:

```python
u = np.zeros((n, 1))
for i in range(n):
    for j in range(n):
        u[i] += A[i][j] * v[j]
```

**So**, that's the non-vectorized version; the vectorized implementation is simply `u = np.dot(A, v)`. **The vectorized version** eliminates two different for-loops, and it's going to be way faster.
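To make this concrete, here is a small sketch (the matrix and vector values are made up for illustration) checking that the double for-loop and `np.dot` produce the same result:

```python
import numpy as np

# Made-up 2x2 matrix and length-2 vector.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([5.0, 6.0])
n = A.shape[0]

# Non-vectorized: double for-loop over rows and columns.
u_loop = np.zeros(n)
for i in range(n):
    for j in range(A.shape[1]):
        u_loop[i] += A[i][j] * v[j]

# Vectorized: a single call to np.dot.
u_vec = np.dot(A, v)

print(u_loop)  # [17. 39.]
print(u_vec)   # [17. 39.]
```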

**Let's say you already have a vector**, \( v \), in memory and you want to apply the exponential operation to every element of this vector \( v \). **In the non-vectorized implementation**, you first initialize \( u \) to a vector of zeros, and then you have a for-loop that computes the elements one at a time:

```python
import math

u = np.zeros((n, 1))
for i in range(n):
    u[i] = math.exp(v[i])
```

**But it turns out** that Python and NumPy have many built-in functions that allow you to compute these vectors with just a single call to a single function. So what I would do to implement this is:

```python
import numpy as np

u = np.exp(v)
```

**And so**, notice that whereas previously you had that explicit for-loop, with just one line of code here, with \( v \) as the input vector and \( u \) as the output vector, you've gotten rid of the explicit for-loop, and the implementation will be much faster than the one needing an explicit for-loop. **In fact**, the NumPy library has many of these vector-valued functions:

- `np.log(v)` computes the element-wise log.
- `np.abs(v)` computes the element-wise absolute value.
- `np.maximum(v, 0)` computes the element-wise maximum, taking the max of every element of \( v \) with 0.
- `v ** 2` takes the element-wise square of each element of \( v \).
- `1 / v` takes the element-wise inverse.

**So**, whenever you are tempted to write a for-loop, take a look and see if there's a way to call a NumPy built-in function to do it without that for-loop.
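As a quick sketch with a made-up example vector, each of these operations applies element-wise in a single call, with no for-loop:

```python
import numpy as np

# A made-up example vector for illustration.
v = np.array([1.0, -2.0, 4.0])

u = np.exp(v)             # element-wise e^v
logs = np.log(np.abs(v))  # element-wise log (of |v| here, since log needs positives)
absv = np.abs(v)          # [1. 2. 4.]
relu = np.maximum(v, 0)   # [1. 0. 4.]
sq = v ** 2               # [1. 4. 16.]
inv = 1 / v               # [1. -0.5  0.25]

print(relu, sq, inv)
```

Note that `np.maximum(v, 0)` is exactly the ReLU activation used later in neural networks, which is one reason these element-wise built-ins come up so often.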