3 - General Architecture of the learning algorithm

It’s time to design a simple algorithm to distinguish cat images from non-cat images.

You will build a Logistic Regression, using a Neural Network mindset. The following Figure explains why Logistic Regression is actually a very simple Neural Network!


Key steps
In this exercise, you will carry out the following steps: 
- Initialize the parameters of the model 
- Learn the parameters for the model by minimizing the cost 
- Use the learned parameters to make predictions (on the test set) 
- Analyse the results and conclude

4 - Building the parts of our algorithm

The main steps for building a Neural Network are: 
1. Define the model structure (such as number of input features) 
2. Initialize the model’s parameters
3. Loop: 
    - Calculate current loss (forward propagation)
    - Calculate current gradient (backward propagation)
    - Update parameters (gradient descent)

You often build steps 1-3 separately and then integrate them into one function, which we call model().

4.1 - Helper functions

Exercise: Using your code from “Python Basics”, implement sigmoid()
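
As a point of comparison for your own answer, here is a minimal sketch of sigmoid(), assuming the input z is a scalar or a NumPy array and that the function should be applied element-wise. It is one possible implementation, not the reference solution.

```python
import numpy as np

def sigmoid(z):
    """Return 1 / (1 + e^(-z)), applied element-wise to a scalar or array."""
    return 1.0 / (1.0 + np.exp(-z))

# Usage: the sigmoid of 0 is exactly one half.
print(sigmoid(0.0))  # prints 0.5
```

Because np.exp() broadcasts over arrays, the same function works later on the whole row vector wᵀX + b without a loop.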

4.2 - Initializing parameters

Exercise: Implement parameter initialization in the cell below. You have to initialize w as a vector of zeros. If you don’t know what numpy function to use, look up np.zeros() in the Numpy library’s documentation.
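
One way to approach this is sketched below. The function name initialize_with_zeros, the column-vector shape (dim, 1) for w, and the choice of b = 0.0 are assumptions made for illustration; the exercise text only requires that w be a vector of zeros.

```python
import numpy as np

def initialize_with_zeros(dim):
    """Create a (dim, 1) column vector of zeros for w and a scalar 0.0 for b.

    The name and the return shape are illustrative assumptions.
    """
    w = np.zeros((dim, 1))  # np.zeros takes the shape as a tuple
    b = 0.0
    return w, b

# Usage: one weight per input feature.
w, b = initialize_with_zeros(3)
print(w.shape)  # prints (3, 1)
```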

4.3 - Forward and Backward propagation

Now that your parameters are initialized, you can do the “forward” and “backward” propagation steps for learning the parameters.

Exercise: Implement a function propagate() that computes the cost function and its gradient.
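
The sketch below shows one way propagate() could look. It assumes the usual cross-entropy cost for logistic regression, J = −(1/m) Σ [y log(a) + (1 − y) log(1 − a)], and it assumes X has shape (number of features, m) and Y has shape (1, m). The returned dictionary keys "dw" and "db" are naming choices made here, not something fixed by the exercise text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    """Compute the cost and its gradient for logistic regression.

    Assumed shapes: w (n, 1), b scalar, X (n, m), Y (1, m).
    """
    m = X.shape[1]

    # Forward propagation: activations and cross-entropy cost (assumed form).
    A = sigmoid(w.T @ X + b)  # shape (1, m)
    cost = float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))

    # Backward propagation: gradients of the cost with respect to w and b.
    dw = X @ (A - Y).T / m    # shape (n, 1), same as w
    db = float(np.sum(A - Y) / m)

    return {"dw": dw, "db": db}, cost
```

With w and b at zero every activation is 0.5, so the cost equals log(2) regardless of the data, which makes a convenient sanity check.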

4.4 - Optimization

- You have initialized your parameters.
- You are also able to compute a cost function and its gradient.
- Now, you want to update the parameters using gradient descent.

Exercise: Write down the optimization function. The goal is to learn w and b by minimizing the cost function J. For a parameter θ, the update rule is θ = θ − α dθ, where α is the learning rate.
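
A minimal sketch of such an optimization loop follows. It applies the update rule θ = θ − α dθ to both w and b on every iteration. The signature, the return order (params, grads, costs), and the decision to record the cost at every iteration are assumptions for illustration; the cross-entropy cost inside propagate() is likewise assumed.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    # Cross-entropy cost and gradients (assumed form, see lead-in).
    m = X.shape[1]
    A = sigmoid(w.T @ X + b)
    cost = float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))
    return {"dw": X @ (A - Y).T / m, "db": float(np.sum(A - Y) / m)}, cost

def optimize(w, b, X, Y, num_iterations, learning_rate):
    """Run plain gradient descent and return learned parameters.

    Returns (params, grads, costs); this ordering is an illustrative choice.
    """
    grads, costs = None, []
    for _ in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        # Update rule: theta = theta - alpha * d(theta)
        w = w - learning_rate * grads["dw"]
        b = b - learning_rate * grads["db"]
        costs.append(cost)
    return {"w": w, "b": b}, grads, costs

# Usage on a tiny, hypothetical 1-feature dataset.
X = np.array([[-2.0, -1.0, 1.0, 2.0]])
Y = np.array([[0.0, 0.0, 1.0, 1.0]])
params, grads, costs = optimize(np.zeros((1, 1)), 0.0, X, Y, 100, 0.1)
print(costs[0] > costs[-1])  # prints True
```

Writing w = w - ... rather than w -= ... creates a new array on each step, so the caller's initial w is left unmodified.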



Exercise: The previous function will output the learned w and b. We are able to use w and b to predict the labels for a dataset X. Implement the predict() function. There are two steps to computing predictions:

1.    Calculate Ŷ = A = σ(wᵀX + b)

2.    Convert the entries of A into 0 (if activation <= 0.5) or 1 (if activation > 0.5), and store the predictions in a vector Y_prediction. If you wish, you can use an if/else statement in a for loop (though there is also a way to vectorize this).
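
The two steps above can be sketched in vectorized form as follows. The comparison A > 0.5 produces a Boolean array, which is then cast to floats, so no explicit loop is needed. The shapes (w as (n, 1), X as (n, m)) and the float output type are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, X):
    """Return a (1, m) array of 0.0/1.0 predictions for the columns of X."""
    A = sigmoid(w.T @ X + b)                # step 1: activations, shape (1, m)
    Y_prediction = (A > 0.5).astype(float)  # step 2: 1 if > 0.5, else 0
    return Y_prediction

# Usage: an activation of exactly 0.5 (middle example) maps to 0.
w = np.array([[1.0]])
X = np.array([[-1.0, 0.0, 2.0]])
print(predict(w, 0.0, X))  # prints [[0. 0. 1.]]
```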

What to remember: 
You’ve implemented several functions that: 

- Initialize (w,b) 
- Optimize the loss iteratively to learn parameters (w,b):
    - computing the cost and its gradient
    - updating the parameters using gradient descent
- Use the learned (w,b) to predict the labels for a given set of examples

5 - Merge all functions into a model

You will now see how the overall model is structured by putting together all the building blocks (the functions implemented in the previous parts), in the right order.

Exercise: Implement the model function. Use the following notation: 
- Y_prediction for your predictions on the test set 
- Y_prediction_train for your predictions on the train set 
- w, costs, grads for the outputs of optimize()
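
To show how the pieces might fit together, here is a compact, self-contained sketch of model(). The helper functions are condensed versions written for this example, and their signatures, the default hyperparameters, the accuracy formula, and the keys of the returned dictionary are all assumptions rather than a prescribed solution. The cross-entropy cost is assumed as well.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    # Forward pass (assumed cross-entropy cost) and backward pass (gradients).
    m = X.shape[1]
    A = sigmoid(w.T @ X + b)
    cost = float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))
    return {"dw": X @ (A - Y).T / m, "db": float(np.sum(A - Y) / m)}, cost

def optimize(w, b, X, Y, num_iterations, learning_rate):
    # Plain gradient descent; records the cost at every iteration.
    grads, costs = None, []
    for _ in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        w = w - learning_rate * grads["dw"]
        b = b - learning_rate * grads["db"]
        costs.append(cost)
    return {"w": w, "b": b}, grads, costs

def predict(w, b, X):
    # Threshold the activations at 0.5.
    return (sigmoid(w.T @ X + b) > 0.5).astype(float)

def model(X_train, Y_train, X_test, Y_test,
          num_iterations=2000, learning_rate=0.5):
    """Initialize, optimize, predict, and report accuracy (illustrative)."""
    # 1. Initialize parameters with zeros, one weight per feature.
    w, b = np.zeros((X_train.shape[0], 1)), 0.0

    # 2. Learn w and b by gradient descent on the training set.
    params, grads, costs = optimize(w, b, X_train, Y_train,
                                    num_iterations, learning_rate)
    w, b = params["w"], params["b"]

    # 3. Predict on both sets using the learned parameters.
    Y_prediction_train = predict(w, b, X_train)
    Y_prediction_test = predict(w, b, X_test)

    # 4. Accuracy in percent: 100 minus the mean absolute error times 100.
    train_acc = 100 - float(np.mean(np.abs(Y_prediction_train - Y_train))) * 100
    test_acc = 100 - float(np.mean(np.abs(Y_prediction_test - Y_test))) * 100

    return {"costs": costs, "w": w, "b": b,
            "Y_prediction_train": Y_prediction_train,
            "Y_prediction_test": Y_prediction_test,
            "train_accuracy": train_acc, "test_accuracy": test_acc}
```

For the cat dataset, X_train and X_test would be the flattened, normalized image matrices with one column per example; comparing train_accuracy with test_accuracy is a quick way to spot overfitting.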