• The matrix below shows the number of calories from carbohydrates, proteins, and fats in 100 grams of four different foods.


  • broadcasting-example


  • So, 100 grams of apples has 56 calories from carbs and far fewer from proteins and fats, whereas 100 grams of beef has 104 calories from protein and 135 calories from fat.


  • Now, let's say your goal is to calculate the percentage of calories from carbs, proteins and fats for each of the four foods.


  • So, if you look at the apple column and add up the numbers in it, you get that 100 grams of apples has 56 + 1.2 + 1.8 = 59 calories. So the percentage of calories from carbohydrates in an apple is 56/59, which is about 94.9%.


  • So most of the calories in an apple come from carbs, whereas in contrast, most of the calories of beef come from protein and fat and so on.


  • So the calculation you want is really to sum up each of the four columns of this matrix to get the total number of calories in 100 grams of apples, beef, eggs, and potatoes.


  • And then to divide throughout the matrix, so as to get the percentage of calories from carbs, proteins and fats for each of the four foods. The objective is to do this without an explicit for-loop.


  • To do this, with one line of Python code we're going to sum down the columns. That gives four numbers, corresponding to the total number of calories in 100 grams of each of the four foods.


  • Then we use a second line of Python code to divide each of the four columns by their corresponding sum.


  • First, we're going to compute the sum. cal = A.sum(axis = 0)


  • Here axis = 0 means to sum vertically. Then we compute the percentages: percentage = 100*A/(cal.reshape(1,4))


  • Since we want percentages, we multiply by 100 here.
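

  • The two lines above can be tried end to end with a sketch like the following. The apple column (56, 1.2, 1.8) and the beef protein and fat values (104, 135) come from the text; every other entry in A is an illustrative assumption, since the original matrix is not reproduced here.

```python
import numpy as np

# Rows: carbs, proteins, fats. Columns: apples, beef, eggs, potatoes.
# Apple column and beef protein/fat are from the text above;
# all remaining entries are illustrative assumptions.
A = np.array([[56.0,   0.0,  4.4, 68.0],
              [ 1.2, 104.0, 52.0,  8.0],
              [ 1.8, 135.0, 99.0,  0.9]])

cal = A.sum(axis=0)                        # sum down each column, shape (4,)
percentage = 100 * A / cal.reshape(1, 4)   # (3,4) divided by (1,4) broadcasts to (3,4)

print(cal)
print(np.round(percentage, 1))
```

  • Each column of percentage should add up to 100, which is a quick way to check that the division was applied column by column.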


  • To add a bit of detail, the parameter axis = 0 means that you want Python to sum vertically, down each column, whereas the horizontal axis is axis = 1.


  • So axis = 1 sums horizontally, across each row, instead of vertically.
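

  • A small sketch of the two axis settings, using a made-up (2,3) matrix M that is not part of the original example:

```python
import numpy as np

# Hypothetical (2,3) matrix used only to illustrate the axis argument.
M = np.array([[1, 2, 3],
              [4, 5, 6]])

col_sums = M.sum(axis=0)   # sum vertically, one value per column: [5 7 9]
row_sums = M.sum(axis=1)   # sum horizontally, one value per row: [ 6 15]

# keepdims=True keeps the result two-dimensional, here shape (2,1),
# which is convenient when the sums are later used in broadcasting.
row_sums_2d = M.sum(axis=1, keepdims=True)

print(col_sums, row_sums, row_sums_2d.shape)
```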






  • If you take a (4,1) vector and add it to a number, Python will take this number and auto-expand it into a (4,1) vector as follows.


  • So the vector [1, 2, 3, 4] plus the number 100 gives the vector [101, 102, 103, 104].


  • You're adding 100 to every element. In fact, we used this form of broadcasting earlier, where that constant was the parameter 'b'. This type of broadcasting works with both column vectors and row vectors.
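

  • A minimal sketch of adding a number to a vector, for both the column and the row case (the variable names are chosen here for illustration):

```python
import numpy as np

v = np.array([[1], [2], [3], [4]])   # (4,1) column vector

col_result = v + 100     # 100 is expanded to match the (4,1) shape
row_result = v.T + 100   # the same works for the (1,4) row vector

print(col_result.ravel())   # [101 102 103 104]
print(row_result.shape)     # (1, 4)
```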


  • Let's say you have a (2,3) matrix and you add it to a (1,3) matrix. In the general case, you have some (m,n) matrix and you add it to a (1,n) matrix. What Python will do is copy the (1,n) matrix m times to turn it into an (m,n) matrix.


  • One last example is where you have an (m,n) matrix and you add it to an (m,1) vector, that is, an (m,1) matrix.


  • Then Python copies it n times horizontally, so you end up with an (m,n) matrix. In this example, you copy it horizontally three times and then add the two matrices. The result is that 100 has been added to the first row and 200 to the second row.
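

  • Both cases can be sketched as follows. The (2,3) matrix X and the values 100, 200, 300 in the row vector are assumptions chosen to mirror the description, since the figure itself is not reproduced here.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6]])             # (2,3) matrix

row = np.array([[100, 200, 300]])     # (1,3): copied 2 times vertically
col = np.array([[100], [200]])        # (2,1): copied 3 times horizontally

plus_row = X + row   # [[101 202 303], [104 205 306]]
plus_col = X + col   # [[101 102 103], [204 205 206]]

print(plus_row)
print(plus_col)
```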


  • broadcasting-example



  • If you have an (m,n) matrix and you add, subtract, multiply or divide by a (1,n) matrix, then Python will copy the (1,n) matrix m times into an (m,n) matrix, and then apply the addition, subtraction, multiplication or division element-wise.


  • If, conversely, you take the (m,n) matrix and add, subtract, multiply or divide by an (m,1) matrix, then Python will copy the (m,1) matrix n times, turn it into an (m,n) matrix, and then apply the operation element-wise.


  • There is one more form of broadcasting: you have an (m,1) matrix, which is really a column vector like [1,2,3], and you add, subtract, multiply or divide by a real number.


  • You can think of the real number as a (1,1) matrix. For the column vector plus 100, for example, you end up copying this real number m times until you get another (m,1) matrix.


  • And then you perform the operation, such as the addition in this example, element-wise. Something similar also works for row vectors.
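

  • One way to check the "copy, then apply element-wise" picture is to build the copies explicitly and compare. The sketch below uses np.tile for the explicit copies; this is only a way to verify the rule on made-up arrays. As far as I know, NumPy does not actually materialize these copies when it broadcasts.

```python
import numpy as np

X = np.arange(6).reshape(2, 3)     # (m,n) = (2,3), values 0..5
row = np.array([[10, 20, 30]])     # (1,n)
col = np.array([[1], [2]])         # (m,1)

explicit_row = np.tile(row, (2, 1))   # copy the row m = 2 times vertically
explicit_col = np.tile(col, (1, 3))   # copy the column n = 3 times horizontally

same_for_row = np.array_equal(X * row, X * explicit_row)
same_for_col = np.array_equal(X - col, X - explicit_col)
same_for_scalar = np.array_equal(col + 100, col + np.full((2, 1), 100))

print(same_for_row, same_for_col, same_for_scalar)
```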


  • broadcasting-example






  • The ability of Python to let you use broadcasting operations and, more generally, the great flexibility of the Python-NumPy programming language is both a strength and a weakness of the language.


  • The great flexibility of the language lets you get a lot done even with just a single line of code.


  • But it is also a weakness, because with broadcasting and this amount of flexibility you can sometimes introduce very subtle or very strange-looking bugs if you're not familiar with the intricacies of how features like broadcasting work.


  • For example, if you take a column vector and add it to a row vector, you might expect it to raise a dimension mismatch or type error or something.


  • But you might actually get back a matrix as a sum of a row vector and a column vector. So there is an internal logic to these strange effects of Python.
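

  • A short sketch of this effect, with made-up vectors: the column is copied across and the row is copied down, so the sum is a full matrix rather than an error.

```python
import numpy as np

col = np.array([[1], [2], [3]])   # (3,1) column vector
row = np.array([[10, 20]])        # (1,2) row vector

result = col + row                # no error: both are broadcast to (3,2)

print(result.shape)   # (3, 2)
print(result)         # [[11 21], [12 22], [13 23]]
```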


  • With these tips and tricks, you'll also be able to much more easily write bug-free Python and NumPy code.


  • If you set a = np.random.randn(5), this creates five random Gaussian variables stored in array a.


  • And so let's print(a). It turns out that the shape of 'a' when you do this is (5,). This is called a rank 1 array in Python, and it's neither a row vector nor a column vector, which leads to some slightly non-intuitive effects.


  • So for example, if I print 'a' transpose, it ends up looking the same as a.


  • And if I print the inner product between 'a' and 'a' transpose, you might think that 'a' times 'a' transpose is the outer product and should give you a matrix. But if I do that, you instead get back a number.
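

  • These rank 1 effects can be reproduced with a few lines. The values are random, so the sketch only inspects shapes:

```python
import numpy as np

a = np.random.randn(5)        # rank 1 array

shape_a = a.shape             # (5,)
shape_at = a.T.shape          # also (5,): transposing changes nothing
product = np.dot(a, a.T)      # a single number, not a (5,5) matrix

print(shape_a, shape_at, np.ndim(product))
```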


  • Instead, if you set 'a' to be (5,1), this commits 'a' to be a (5,1) column vector [a = np.random.randn(5,1)]. Whereas previously 'a' and 'a' transpose looked the same, now 'a' transpose is a row vector.


  • So what I'm going to recommend is that when you're doing your programming exercises, or in fact when you're implementing logistic regression or neural networks, whenever you create an array you commit to making it either a column vector, such as a (5,1) vector, or a row vector. Then the behavior of your vectors may be easier to understand.
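

  • A sketch of this recommendation in practice. The shape check and the reshape call are one possible habit for enforcing it and are an addition here, not something prescribed in the notes above.

```python
import numpy as np

a = np.random.randn(5, 1)      # committed (5,1) column vector
a_t = a.T                      # (1,5) row vector
outer = np.dot(a, a_t)         # now a genuine (5,5) outer product

# Optional habit (an assumption, not from the notes): check the shape
# you expect, and reshape any rank 1 array you receive from elsewhere.
assert a.shape == (5, 1)
b = np.random.randn(5)         # rank 1 array, shape (5,)
b = b.reshape(5, 1)            # convert it to a (5,1) column vector

print(a_t.shape, outer.shape, b.shape)
```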