**A new version of linear regression** that's more powerful is one that works with multiple variables, or with multiple features. **In the original version of linear regression** that we developed in previous chapters, we had a single feature x, the size of the house, and we wanted to use that to predict y, the price of the house; and that was the form of our hypothesis. **But now imagine, what if we had not only the size of the house as a feature, or as a variable with which to try to predict the price**, but we also knew the number of bedrooms, the number of floors, and the age of the home in years? It seems like this would give us a lot more information with which to predict the price. **To introduce a little bit of notation** (we started to talk about this earlier), I'm going to use the variables \( x_1, x_2 \) and so on to denote my, in this case, four features, and I'm going to continue to use y to denote the output variable, the price, that we're trying to predict. **Now that we have four features**, we are going to use lowercase "n" to denote the number of features. So in this example we have n = 4, because we have four features. **And "n" is different from our earlier notation**, where we were using "m" to denote the number of training examples. So if you have 47 rows, m is the number of rows in this table, or the number of training examples. **I'm also going to use \( x^{(i)} \) to denote the input features of the \( i^{th} \) training example.** **As a concrete example**, \( x^{(2)} \) is going to be the vector of the features for my second training example. So \( x^{(2)} \) here is going to be the vector (1416, 3, 2, 40), since those are the four features I have to try to predict the price of the second house. **With this notation, \( x^{(2)} \) is a four-dimensional vector. In fact, more generally, this is an n-dimensional feature vector.**
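The indexing notation above can be sketched with a small NumPy example. Only the second row (1416, 3, 2, 40) comes from the text; the other rows and the prices are made-up placeholder values:

```python
import numpy as np

# Hypothetical housing training set: columns are the n = 4 features
# (size in sq ft, #bedrooms, #floors, age in years).
X = np.array([
    [2104, 5, 1, 45],   # assumed values for the first training example
    [1416, 3, 2, 40],   # second training example, as given in the text
    [1534, 3, 2, 30],   # assumed
    [852,  2, 1, 36],   # assumed
])
y = np.array([460, 232, 315, 178])  # assumed prices

m, n = X.shape   # m = number of training examples, n = number of features

x_2 = X[1]       # x^(2): the feature vector of the 2nd training example
x_3_2 = X[1, 2]  # x_3^(2): value of feature 3 in the 2nd example, i.e. 2
```

Note the off-by-one between the math notation (examples and features numbered from 1) and NumPy's zero-based indexing: \( x^{(2)} \) is row index 1, and \( x_3^{(2)} \) is `X[1, 2]`.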
With this notation, \( x^{(2)} \) is now a vector, and so I'm also going to use \( x_j^{(i)} \) to denote the value of feature number j in the \( i^{th} \) training example. So \( x_3^{(2)} \) is going to be equal to 2, as seen in the figure above. **Now that we have multiple features**, let's talk about what the form of our hypothesis should be. Previously this was the form of our hypothesis, where x was our single feature; but now that we have multiple features, we aren't going to use that simple representation any more. **Instead, the form of the hypothesis in linear regression is going to be** \( h_{\theta}(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + ... + \theta_n x_n \). And if we have n features, then rather than summing up over our four features, we would have a sum over our n features. **For convenience of notation**, let me define \( x_0 \) to be equal to one. Concretely, this means that for every example i, I have a feature vector \( x^{(i)} \), and \( x_0^{(i)} \) is going to be equal to 1. You can think of this as defining an additional zeroth feature. So now my feature vector x becomes an (n+1)-dimensional vector. **Using the definition** of matrix multiplication, our multivariable hypothesis function can be concisely represented as: **\begin{align*}h_\theta(x) =\begin{bmatrix}\theta_0 \hspace{2em} \theta_1 \hspace{2em} ... \hspace{2em} \theta_n\end{bmatrix}\begin{bmatrix}x_0 \newline x_1 \newline \vdots \newline x_n\end{bmatrix}= \theta^T x\end{align*}** **Which gives us** \( \theta^T x = h_{\theta}(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + ... + \theta_n x_n \)
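The inner product \( \theta^T x \) above can be checked numerically. This is a minimal sketch: the parameter values in `theta` are made-up, and the feature vector is the second training example from the text with \( x_0 = 1 \) prepended:

```python
import numpy as np

# theta has n + 1 entries; theta_0 pairs with the constant feature x_0 = 1.
theta = np.array([80.0, 0.1, 10.0, 3.0, -2.0])  # assumed parameter values

x = np.array([1416.0, 3.0, 2.0, 40.0])  # raw features x_1 .. x_n
x = np.insert(x, 0, 1.0)                # prepend x_0 = 1 -> (n+1)-vector

# theta^T x = theta_0 + theta_1 x_1 + ... + theta_n x_n
h = theta @ x
# 80 + 0.1*1416 + 10*3 + 3*2 + (-2)*40 = 177.6
```

Defining \( x_0 = 1 \) is what lets the intercept \( \theta_0 \) ride along inside the same inner product instead of being a separate additive term.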

**Linear regression** with multiple variables is also known as "multivariate linear regression". **We now introduce notation** for equations where we can have any number of input variables.**\begin{align*}x_j^{(i)} &= \text{value of feature } j \text{ in the }i^{th}\text{ training example} \newline x^{(i)}& = \text{the input (features) of the }i^{th}\text{ training example} \newline m &= \text{the number of training examples} \newline n &= \text{the number of features} \end{align*}****The multivariable form** of the hypothesis function accommodating these multiple features is as follows:**\( h_{\theta}(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + ... + \theta_n x_n\)****In order to develop intuition about this function**, we can think of \( \theta_0 \) as the basic price of a house, \( \theta_1 \) as the price per square meter, \( \theta_2 \) as the price per floor, etc. Then \( x_1 \) will be the number of square meters in the house, \( x_2 \) the number of floors, etc. **Using the definition** of matrix multiplication, our multivariable hypothesis function can be concisely represented as:**\begin{align*}h_\theta(x) =\begin{bmatrix}\theta_0 \hspace{2em} \theta_1 \hspace{2em} ... \hspace{2em} \theta_n\end{bmatrix}\begin{bmatrix}x_0 \newline x_1 \newline \vdots \newline x_n\end{bmatrix}= \theta^T x\end{align*}****This is a vectorization** of our hypothesis function for one training example; see the lessons on vectorization to learn more. **Remark:** For convenience, in this course we assume \( x_{0}^{(i)} = 1 \text{ for } i \in \{1,\dots,m\} \). This allows us to do matrix operations with theta and x, making the two vectors \( \theta \) and \( x^{(i)} \) match each other element-wise (that is, have the same number of elements: n+1).
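Since every \( x^{(i)} \) gets the same \( x_0 = 1 \) entry, the hypothesis can be evaluated for all m training examples at once by prepending a column of ones to the feature matrix. A minimal sketch, with assumed feature and parameter values:

```python
import numpy as np

# Two training examples, n = 4 raw features each (assumed values).
X = np.array([
    [1416.0, 3.0, 2.0, 40.0],
    [1534.0, 3.0, 2.0, 30.0],
])
m = X.shape[0]
X = np.hstack([np.ones((m, 1)), X])  # prepend the x_0 = 1 column -> m x (n+1)

theta = np.array([80.0, 0.1, 10.0, 3.0, -2.0])  # assumed parameters

# Row i of X @ theta is theta^T x^(i) = h_theta(x^(i)).
predictions = X @ theta
```

This matrix-vector product is exactly the per-example inner product \( \theta^T x^{(i)} \) applied to every row, which is why matching the lengths of \( \theta \) and \( x^{(i)} \) at n+1 matters.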