## 3 – General Architecture of the learning algorithm

It’s time to design a simple algorithm to distinguish cat images from non-cat images.

You will build a logistic regression classifier, using a neural network mindset. The following figure explains why logistic regression is actually a very simple neural network!

Key steps
In this exercise, you will carry out the following steps:
– Initialize the parameters of the model
– Learn the parameters for the model by minimizing the cost
– Use the learned parameters to make predictions (on the test set)
– Analyse the results and conclude

## 4 – Building the parts of our algorithm

The main steps for building a Neural Network are:
1. Define the model structure (such as number of input features)
2. Initialize the model’s parameters
3. Loop:
– Calculate current loss (forward propagation)
– Calculate current gradient (backward propagation)
– Update parameters (gradient descent)

You often build steps 1–3 separately and then integrate them into one function called model().

### 4.1 – Helper functions

Exercise: Using your code from “Python Basics”, implement sigmoid()
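As a point of reference, here is a minimal sketch of what sigmoid() could look like, assuming the input is a scalar or a NumPy array. It is one possible implementation, not necessarily the one expected by the exercise.

```python
import numpy as np

def sigmoid(z):
    """Return 1 / (1 + exp(-z)), applied element-wise to a scalar or array."""
    return 1.0 / (1.0 + np.exp(-z))
```

Because np.exp works element-wise, the same function handles a single number and a whole matrix of activations.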

### 4.2 – Initializing parameters

Exercise: Implement parameter initialization in the cell below. You have to initialize w as a vector of zeros. If you don’t know what numpy function to use, look up np.zeros() in the Numpy library’s documentation.
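A possible sketch of the initialization step is shown below. The function name initialize_with_zeros and the choice of a column vector of shape (dim, 1) are assumptions made for illustration; the text only requires that w be a vector of zeros.

```python
import numpy as np

def initialize_with_zeros(dim):
    """Create a zero weight vector of shape (dim, 1) and a zero bias.

    The name and the column-vector shape are illustrative assumptions.
    """
    w = np.zeros((dim, 1))
    b = 0.0
    return w, b
```

Keeping w as a (dim, 1) column vector means that w.T @ X yields one activation per example when X stores examples in columns.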

### 4.3 – Forward and Backward propagation

Now that your parameters are initialized, you can do the “forward” and “backward” propagation steps for learning the parameters.

Exercise: Implement a function propagate() that computes the cost function and its gradient.
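The sketch below shows one way propagate() might be written. It assumes the standard cross-entropy cost for logistic regression, that X has shape (number of features, number of examples), and that Y has shape (1, number of examples); none of these details are spelled out above, so treat them as assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    """Forward and backward pass for logistic regression (assumed formulation).

    w: weights, shape (n_features, 1)
    b: bias, a scalar
    X: data, shape (n_features, m)
    Y: labels in {0, 1}, shape (1, m)
    """
    m = X.shape[1]

    # Forward propagation: activations and cross-entropy cost
    A = sigmoid(w.T @ X + b)
    cost = float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))

    # Backward propagation: gradients of the cost w.r.t. w and b
    dw = (X @ (A - Y).T) / m
    db = float(np.sum(A - Y)) / m

    grads = {"dw": dw, "db": db}
    return grads, cost
```

With all-zero parameters every activation is 0.5, so the cost should come out as log(2), which is a handy sanity check.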

### 4.4 – Optimization

– You have initialized your parameters.
– You are also able to compute a cost function and its gradient.
– Now, you want to update the parameters using gradient descent.

Exercise: Write down the optimization function. The goal is to learn w and b by minimizing the cost function J. For a parameter $\theta$, the update rule is $\theta = \theta - \alpha \, d\theta$, where $\alpha$ is the learning rate.
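The gradient descent loop could be sketched as follows. The signature of optimize(), the returned dictionaries, and the compact propagate() helper (standard cross-entropy cost, examples stored in columns) are assumptions for illustration rather than the exercise's reference solution.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    # Assumed formulation: cross-entropy cost and its gradients
    m = X.shape[1]
    A = sigmoid(w.T @ X + b)
    cost = float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))
    grads = {"dw": (X @ (A - Y).T) / m, "db": float(np.sum(A - Y)) / m}
    return grads, cost

def optimize(w, b, X, Y, num_iterations, learning_rate):
    """Run gradient descent; assumes num_iterations >= 1."""
    costs = []
    for _ in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        # Update rule: theta = theta - alpha * d(theta)
        w = w - learning_rate * grads["dw"]
        b = b - learning_rate * grads["db"]
        costs.append(cost)
    params = {"w": w, "b": b}
    return params, grads, costs
```

Recording the cost at every iteration makes it easy to plot a learning curve afterwards; for long runs you might record it only every 100 iterations.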

Exercise: The previous function outputs the learned w and b. You can use w and b to predict the labels for a dataset X. Implement the predict() function. There are two steps to computing predictions:

1. Calculate $\hat{Y} = A = \sigma(w^T X + b)$

2. Convert the entries of A into 0 (if activation <= 0.5) or 1 (if activation > 0.5), and store the predictions in a vector Y_prediction. If you wish, you can use an if/else statement in a for loop (though there is also a way to vectorize this).
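The two steps above can be sketched in vectorized form as shown below. The shapes (w as a column vector, X with one example per column) are assumptions carried over from the usual setup for this kind of exercise.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, X):
    """Return a (1, m) array of 0/1 predictions for the m columns of X."""
    # Step 1: compute the activations A = sigma(w^T X + b)
    A = sigmoid(w.T @ X + b)
    # Step 2: threshold at 0.5 (activations of exactly 0.5 map to 0)
    Y_prediction = (A > 0.5).astype(float)
    return Y_prediction
```

The boolean comparison replaces the if/else inside a for loop: it is evaluated for every entry of A at once.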

What to remember:
You’ve implemented several functions that:

– Initialize (w,b)
– Optimize the loss iteratively to learn parameters (w,b):
    – computing the cost and its gradient
    – updating the parameters using gradient descent
– Use the learned (w,b) to predict the labels for a given set of examples

## 5 – Merge all functions into a model

You will now see how the overall model is structured by putting together all the building blocks (the functions implemented in the previous parts) in the right order.

Exercise: Implement the model function. Use the following notation:
– Y_prediction for your predictions on the test set
– Y_prediction_train for your predictions on the train set
– w, costs, grads for the outputs of optimize()
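To show how the pieces might fit together, here is a self-contained sketch of model(). The helper bodies, argument names, default hyperparameters, accuracy calculation, and the returned dictionary are all assumptions for illustration; only the names Y_prediction and Y_prediction_train come from the notation above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    # Assumed formulation: cross-entropy cost and its gradients
    m = X.shape[1]
    A = sigmoid(w.T @ X + b)
    cost = float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))
    grads = {"dw": (X @ (A - Y).T) / m, "db": float(np.sum(A - Y)) / m}
    return grads, cost

def optimize(w, b, X, Y, num_iterations, learning_rate):
    # Plain gradient descent; assumes num_iterations >= 1
    costs = []
    for _ in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        w = w - learning_rate * grads["dw"]
        b = b - learning_rate * grads["db"]
        costs.append(cost)
    return {"w": w, "b": b}, grads, costs

def predict(w, b, X):
    return (sigmoid(w.T @ X + b) > 0.5).astype(float)

def model(X_train, Y_train, X_test, Y_test,
          num_iterations=2000, learning_rate=0.5):
    """Train on the train set, then predict on both sets."""
    # 1. Initialize the parameters with zeros
    w, b = np.zeros((X_train.shape[0], 1)), 0.0

    # 2. Learn the parameters by gradient descent
    params, grads, costs = optimize(
        w, b, X_train, Y_train, num_iterations, learning_rate)
    w, b = params["w"], params["b"]

    # 3. Predict on the test and train sets
    Y_prediction = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)

    # Accuracy in percent (hypothetical reporting, not required by the text)
    train_accuracy = 100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100
    test_accuracy = 100 - np.mean(np.abs(Y_prediction - Y_test)) * 100

    return {"costs": costs, "w": w, "b": b,
            "Y_prediction": Y_prediction,
            "Y_prediction_train": Y_prediction_train,
            "train_accuracy": train_accuracy,
            "test_accuracy": test_accuracy}
```

Returning everything in one dictionary is a convenience choice: it keeps the learned parameters, the cost history, and both sets of predictions available for the analysis step.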