Planar data classification with one hidden layer

1 - Packages

Let’s first import all the packages that you will need during this assignment.
- numpy is the fundamental package for scientific computing with Python.
- sklearn provides simple and efficient tools for data mining and data analysis.
- matplotlib is a library for plotting graphs in Python.
- testCases_v2 provides some test examples to assess the correctness of your functions.
- planar_utils provides various useful functions used in this assignment.

2 - Dataset

First, let’s get the dataset you will work on. The following code will load a “flower” 2-class dataset into variables X and Y.

X, Y = load_planar_dataset()

Visualize the dataset using matplotlib. The data looks like a “flower” with some red (label y=0) and some blue (y=1) points. Your goal is to build a model to fit this data.

Exercise: How many training examples do you have? In addition, what is the shape of the variables X and Y?
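A minimal sketch of how the shapes could be inspected. Since load_planar_dataset() lives in planar_utils and is not reproduced here, the arrays below are random stand-ins. They assume the layout this exercise appears to use: one column per training example, with X holding two features and Y holding one label. The example count of 10 is arbitrary.

```python
import numpy as np

# Stand-in data (assumption): 2 features per example, one column per example.
rng = np.random.default_rng(0)
m_demo = 10  # arbitrary demo size, not the size of the real dataset
X = rng.standard_normal((2, m_demo))
Y = (rng.random((1, m_demo)) > 0.5).astype(int)

shape_X = X.shape
shape_Y = Y.shape
m = X.shape[1]  # number of training examples = number of columns

print("The shape of X is:", shape_X)
print("The shape of Y is:", shape_Y)
print("Number of training examples:", m)
```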


3 - Simple Logistic Regression

Before building a full neural network, let’s first see how logistic regression performs on this problem. You can use sklearn’s built-in functions to do that. Run the code below to train a logistic regression classifier on the dataset.
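A sketch of such a baseline, assuming sklearn’s LogisticRegression class and stand-in data rather than the flower dataset. The stand-in labels (1 inside the unit circle, 0 outside) are an assumption chosen only because, like the flower, they cannot be separated by a straight line.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in data (assumption): label 1 inside the unit circle, 0 outside.
rng = np.random.default_rng(1)
X = rng.standard_normal((2, 200))
Y = (np.sum(X ** 2, axis=0, keepdims=True) < 1.0).astype(int)

clf = LogisticRegression()
# sklearn expects samples as rows and a 1-D label vector, hence the transposes.
clf.fit(X.T, Y.ravel())

predictions = clf.predict(X.T)
accuracy = float(np.mean(predictions == Y.ravel()) * 100)
print("Accuracy of logistic regression: %.1f%%" % accuracy)
```

A linear decision boundary cannot follow a curved class border, which is the motivation for adding a hidden layer in the next section.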


4 - Neural Network model

Logistic regression did not work well on the “flower dataset”. You are going to train a Neural Network with a single hidden layer.

Reminder: The general methodology to build a Neural Network is to: 
1. Define the neural network structure (# of input units, # of hidden units, etc.).
2. Initialize the model’s parameters.
3. Loop:
- Implement forward propagation
- Compute loss
- Implement backward propagation to get the gradients
- Update parameters (gradient descent)

You often build helper functions to compute steps 1-3 and then merge them into one function we call nn_model(). Once you’ve built nn_model() and learnt the right parameters, you can make predictions on new data.

4.1 - Defining the neural network structure

Exercise: Define three variables: 
- n_x: the size of the input layer 
- n_h: the size of the hidden layer (set this to 4) 
- n_y: the size of the output layer 

Hint: Use shapes of X and Y to find n_x and n_y. Also, hard code the hidden layer size to be 4.
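One way the hint could be turned into code. The function name layer_sizes() and the demo arrays are illustrative assumptions; the sketch also assumes that features and labels are stored along axis 0, with one column per example.

```python
import numpy as np

def layer_sizes(X, Y):
    """Return (n_x, n_h, n_y) for a network with one hidden layer."""
    n_x = X.shape[0]  # input layer size = number of features (rows of X)
    n_h = 4           # hidden layer size, hard-coded as the exercise asks
    n_y = Y.shape[0]  # output layer size = number of rows of Y
    return n_x, n_h, n_y

# Demo arrays with arbitrary shapes, only to show which axis is read.
X_demo = np.zeros((5, 3))
Y_demo = np.zeros((2, 3))
n_x, n_h, n_y = layer_sizes(X_demo, Y_demo)
print("n_x =", n_x, "n_h =", n_h, "n_y =", n_y)
```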

4.2 - Initialize the model’s parameters

Exercise: Implement the function initialize_parameters().

- Make sure your parameters’ sizes are right. Refer to the neural network figure above if needed.
- You will initialize the weight matrices with random values.
- Use: np.random.randn(a,b) * 0.01 to randomly initialize a matrix of shape (a,b).
- You will initialize the bias vectors as zeros.
- Use: np.zeros((a,b)) to initialize a matrix of shape (a,b) with zeros.
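A possible sketch of initialize_parameters() that follows the bullets above. The shapes (W1 of shape (n_h, n_x), W2 of shape (n_y, n_h), biases as column vectors) are an assumption, because the network figure is not reproduced here. The seed argument is a hypothetical addition for reproducibility.

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y, seed=2):
    """Small random weights and zero biases for a one-hidden-layer network."""
    np.random.seed(seed)  # hypothetical: fixed seed so runs are repeatable

    W1 = np.random.randn(n_h, n_x) * 0.01  # hidden-layer weights
    b1 = np.zeros((n_h, 1))                # hidden-layer biases
    W2 = np.random.randn(n_y, n_h) * 0.01  # output-layer weights
    b2 = np.zeros((n_y, 1))                # output-layer biases

    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}

parameters = initialize_parameters(n_x=2, n_h=4, n_y=1)
for name, value in parameters.items():
    print(name, value.shape)
```

Scaling the random weights by 0.01 keeps the tanh units near their linear region at the start, while still breaking the symmetry between hidden units.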


4.3 - The Loop

Question: Implement forward_propagation().

- Look above at the mathematical representation of your classifier.
- You can use the function sigmoid(). It is built-in (imported) in the notebook.
- You can use the function np.tanh(). It is part of the numpy library.
- The steps you have to implement are:
1. Retrieve each parameter from the dictionary “parameters” (which is the output of initialize_parameters()) by using parameters[".."].
2. Implement Forward Propagation. Compute $Z^{[1]}$, $A^{[1]}$, $Z^{[2]}$ and $A^{[2]}$ (the vector of all your predictions on all the examples in the training set).
- Values needed in the backpropagation are stored in “cache”. The cache will be given as an input to the backpropagation function.
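The steps above can be sketched as follows. The notebook imports its own sigmoid(); the local definition below is a stand-in for it. The sketch assumes a tanh hidden layer and a sigmoid output layer, as the hints about np.tanh() and sigmoid() suggest. The demo parameters and inputs are random placeholders.

```python
import numpy as np

def sigmoid(z):
    """Stand-in for the notebook's imported sigmoid()."""
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagation(X, parameters):
    # Step 1: retrieve each parameter from the dictionary.
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]

    # Step 2: forward pass (tanh hidden layer, sigmoid output layer).
    Z1 = W1 @ X + b1
    A1 = np.tanh(Z1)
    Z2 = W2 @ A1 + b2
    A2 = sigmoid(Z2)

    # Keep the intermediate values that backpropagation will need.
    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache

# Demo with placeholder parameters and three random examples.
rng = np.random.default_rng(0)
parameters = {
    "W1": rng.standard_normal((4, 2)) * 0.01,
    "b1": np.zeros((4, 1)),
    "W2": rng.standard_normal((1, 4)) * 0.01,
    "b2": np.zeros((1, 1)),
}
X_demo = rng.standard_normal((2, 3))
A2, cache = forward_propagation(X_demo, parameters)
print("A2 shape:", A2.shape)
```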


Exercise: Implement compute_cost() to compute the value of the cost J.

- There are many ways to implement the cross-entropy loss. To help you, we give you how we would have implemented 
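One possible implementation, given as a sketch and not necessarily the one the assignment had in mind. It assumes the standard binary cross-entropy cost, $J = -\frac{1}{m} \sum_{i} \left( y^{(i)} \log a^{[2](i)} + (1 - y^{(i)}) \log(1 - a^{[2](i)}) \right)$, with A2 and Y both of shape (1, m). The demo values are made up.

```python
import numpy as np

def compute_cost(A2, Y):
    """Binary cross-entropy averaged over the m examples (columns)."""
    m = Y.shape[1]
    logprobs = Y * np.log(A2) + (1 - Y) * np.log(1 - A2)
    cost = -np.sum(logprobs) / m
    return float(cost)  # plain Python float instead of a 0-d array

# Made-up activations: a model that always outputs 0.5.
A2_demo = np.array([[0.5, 0.5, 0.5]])
Y_demo = np.array([[1, 0, 1]])
cost = compute_cost(A2_demo, Y_demo)
print("cost =", cost)
```

A model that outputs 0.5 for every example has a cost of log 2 regardless of the labels, which makes this a convenient sanity check.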




Question: Implement the update rule. Use gradient descent. You have to use (dW1, db1, dW2, db2) in order to update (W1, b1, W2, b2).

General gradient descent rule: $\theta = \theta - \alpha \frac{\partial J}{\partial \theta}$, where $\alpha$ is the learning rate and $\theta$ represents a parameter.
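A minimal sketch of this rule applied to the four parameters. It assumes the gradients arrive in a dictionary keyed "dW1", "db1", "dW2", "db2". The default learning rate of 0.1 and the demo values are arbitrary illustrative choices.

```python
import numpy as np

def update_parameters(parameters, grads, learning_rate=0.1):
    """One gradient descent step: theta = theta - alpha * dtheta."""
    return {
        name: parameters[name] - learning_rate * grads["d" + name]
        for name in ("W1", "b1", "W2", "b2")
    }

# Made-up parameters (all ones) and gradients (all twos).
shapes = {"W1": (4, 2), "b1": (4, 1), "W2": (1, 4), "b2": (1, 1)}
parameters = {name: np.ones(shape) for name, shape in shapes.items()}
grads = {"d" + name: 2.0 * np.ones(shape) for name, shape in shapes.items()}

updated = update_parameters(parameters, grads, learning_rate=0.1)
print(updated["W1"])
```

Each entry moves from 1.0 to 1.0 - 0.1 * 2.0 = 0.8. Returning a new dictionary leaves the input parameters unchanged, which makes the step easy to test.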

Illustration: The gradient descent algorithm with a good learning rate (converging) and a bad learning rate (diverging). Images courtesy of Adam Harley.(具有良好学习速率(收敛)和不良学习速率(发散)的梯度下降算法。)


4.4 - Integrate parts 4.1, 4.2 and 4.3 in nn_model()

Question: Build your neural network model in nn_model().

Instructions: The neural network model has to use the previous functions in the right order.
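A compact sketch of how the pieces could fit together. To stay self-contained it inlines the steps instead of calling the helper functions. It assumes a tanh hidden layer, a sigmoid output, and the cross-entropy cost. The backward pass uses the standard gradients for that setup, which are not spelled out in the text above. The hyperparameter defaults and the circle-shaped stand-in data are arbitrary assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_model(X, Y, n_h=4, num_iterations=2000, learning_rate=0.5, seed=3):
    """Train a one-hidden-layer network with batch gradient descent."""
    rng = np.random.default_rng(seed)
    n_x, n_y, m = X.shape[0], Y.shape[0], X.shape[1]

    # Initialize parameters: small random weights, zero biases.
    W1 = rng.standard_normal((n_h, n_x)) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = rng.standard_normal((n_y, n_h)) * 0.01
    b2 = np.zeros((n_y, 1))

    costs = []
    for _ in range(num_iterations):
        # Forward propagation.
        A1 = np.tanh(W1 @ X + b1)
        A2 = sigmoid(W2 @ A1 + b2)

        # Cross-entropy cost.
        cost = -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m
        costs.append(float(cost))

        # Backward propagation (tanh derivative is 1 - A1**2).
        dZ2 = A2 - Y
        dW2 = dZ2 @ A1.T / m
        db2 = np.sum(dZ2, axis=1, keepdims=True) / m
        dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)
        dW1 = dZ1 @ X.T / m
        db1 = np.sum(dZ1, axis=1, keepdims=True) / m

        # Gradient descent update.
        W1 -= learning_rate * dW1
        b1 -= learning_rate * db1
        W2 -= learning_rate * dW2
        b2 -= learning_rate * db2

    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}, costs

# Stand-in data (assumption): label 1 inside the unit circle, 0 outside.
rng = np.random.default_rng(1)
X = rng.standard_normal((2, 200))
Y = (np.sum(X ** 2, axis=0, keepdims=True) < 1.0).astype(int)

parameters, costs = nn_model(X, Y)
print("first cost: %.4f, last cost: %.4f" % (costs[0], costs[-1]))
```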


4.5 - Predictions

Question: Use your model to predict by building predict(). Use forward propagation to predict results.

Reminder: predictions = $y_{prediction} = \mathbb{1}\{activation > 0.5\} = \begin{cases} 1 & \text{if } activation > 0.5 \\ 0 & \text{otherwise} \end{cases}$

As an example, if you would like to set the entries of a matrix X to 0 and 1 based on a threshold, you would do: X_new = (X > threshold)
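A short sketch of the thresholding step on its own. The activations below are made-up values standing in for the output $A^{[2]}$ of forward propagation.

```python
import numpy as np

# Made-up output activations for four examples.
A2_demo = np.array([[0.1, 0.5, 0.51, 0.9]])

# The comparison gives a boolean array; astype(int) turns it into 0/1 labels.
predictions = (A2_demo > 0.5).astype(int)
print(predictions)  # [[0 0 1 1]]
```

The comparison is strict, so an activation of exactly 0.5 is mapped to class 0.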



Accuracy: 88%