• Multi-task learning refers to having one neural network perform several tasks simultaneously.


  • When to use multi-task learning


    1. Training on a set of tasks that could benefit from having shared lower-level features


    2. Usually: the amount of data you have for each task is quite similar


    3. Can train a big enough neural network to do well on all tasks


  • Example: Simplified autonomous vehicle


  • The vehicle has to detect several things simultaneously: pedestrians, cars, road signs, traffic lights, cyclists, etc. In this simplified example, we focus on four of these detection tasks.


  • We could have trained four separate neural networks instead of training one network to do the four tasks.




  • However, in this case, the system performs better when one neural network is trained to do all four tasks than when four separate networks are trained, because the earlier layers of the network learn features that can be shared across the different types of objects.


  • The input \( x^{(i)} \) is an image, which can have multiple labels.


  • The output \( y^{(i)} \) is a vector of 4 labels, one per object type; unlike in softmax classification, several entries can be 1 at the same time. For example: \( y^{(i)} = \begin{bmatrix} 0\\ 0\\ 1\\ 0 \end{bmatrix} \)
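  • For instance, an image that contains both a car and a traffic light would have two entries set to 1 at once; a hypothetical illustration (the label order below is an assumption):

```python
import numpy as np

# Assumed label order: [pedestrian, car, road sign, traffic light]
# An image containing a car and a traffic light has two labels set to 1.
y_i = np.array([[0], [1], [0], [1]])   # shape (4, 1): one column per example
```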




  • Neural Network architecture


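  • A minimal sketch of such an architecture, assuming a single shared hidden layer feeding a 4-unit sigmoid output layer (layer sizes and parameter names below are illustrative, not from the notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, params):
    """Forward pass of a small multi-task network (illustrative sketch).

    X      : inputs of shape (n_x, m)
    params : dict with shared weights W1 (n_h, n_x), b1 (n_h, 1)
             and output weights W2 (4, n_h), b2 (4, 1)
    Returns Y_hat of shape (4, m): one sigmoid output per task.
    """
    # Shared lower-level features, reused by all four detection tasks
    A1 = np.tanh(params["W1"] @ X + params["b1"])
    # Four independent sigmoid units: pedestrian, car, road sign, traffic light
    return sigmoid(params["W2"] @ A1 + params["b2"])

# Example usage with random parameters (hypothetical sizes)
rng = np.random.default_rng(0)
n_x, n_h, m = 100, 16, 32
params = {"W1": rng.normal(size=(n_h, n_x)) * 0.01, "b1": np.zeros((n_h, 1)),
          "W2": rng.normal(size=(4, n_h)) * 0.01, "b2": np.zeros((4, 1))}
Y_hat = forward(rng.normal(size=(n_x, m)), params)   # shape (4, m)
```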


  • To train this neural network, the loss function is defined as follows:


  • \( J(w,b) = \frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{4} L(\hat{y}_j^{(i)}, y_j^{(i)}) = - \frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{4} \left[ y_j^{(i)} \log\hat{y}_j^{(i)} + (1 - y_j^{(i)}) \log(1 - \hat{y}_j^{(i)}) \right] \)
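  • A rough sketch of this cost in code, assuming the labels and predictions are NumPy arrays of shape (4, m) and every entry is labeled (function and variable names are illustrative):

```python
import numpy as np

def multitask_cost(Y_hat, Y, eps=1e-12):
    """Sum of the 4 per-task binary cross-entropy losses, averaged over m examples.

    Y_hat, Y : arrays of shape (4, m) -- 4 tasks (rows), m examples (columns).
    eps      : small constant that keeps log() away from log(0).
    """
    m = Y.shape[1]
    losses = -(Y * np.log(Y_hat + eps) + (1 - Y) * np.log(1 - Y_hat + eps))
    return losses.sum() / m   # sum over the 4 tasks and all examples, divide by m
```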


  • The cost can also be computed so that it is not affected by entries that are not labeled: for each example, the inner sum over \( j \) only includes the entries for which \( y_j^{(i)} \) actually has a label (0 or 1); a sketch of this is given after the example below.


  • Example of a label matrix with missing entries (each '?' marks a missing label):
    \( \begin{bmatrix} 1 & 0 & ? & ? \\ 0 & 1 & ? & 0\\ 0 & 1 & ? & 1\\ ? & 0 & 1 & 0 \end{bmatrix} \)
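  • One way this could be implemented (a sketch, assuming the '?' entries are stored as NaN) is to mask out the unlabeled entries before summing:

```python
import numpy as np

def multitask_cost_missing(Y_hat, Y, eps=1e-12):
    """Same cost as before, but entries where Y is NaN (the '?' entries) are skipped."""
    m = Y.shape[1]
    labeled = ~np.isnan(Y)                      # True where a 0/1 label exists
    Y0 = np.where(labeled, Y, 0.0)              # placeholder for the NaN slots
    losses = -(Y0 * np.log(Y_hat + eps) + (1 - Y0) * np.log(1 - Y_hat + eps))
    return losses[labeled].sum() / m            # only labeled entries contribute

# The label matrix from the example above, with '?' stored as NaN
Y = np.array([[1, 0, np.nan, np.nan],
              [0, 1, np.nan, 0],
              [0, 1, np.nan, 1],
              [np.nan, 0, 1, 0]], dtype=float)
Y_hat = np.full_like(Y, 0.5)                    # dummy predictions, just to run the sketch
print(multitask_cost_missing(Y_hat, Y))
```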