Planar data classification with one hidden layer

1 – Packages

Let’s first import all the packages that you will need during this assignment. 
– numpy is the fundamental package for scientific computing with Python. 
– sklearn provides simple and efficient tools for data mining and data analysis. 
– matplotlib is a library for plotting graphs in Python. 
– testCases_v2 provides some test examples to assess the correctness of your functions. 
– planar_utils provides various useful functions used in this assignment.

2 – Dataset

First, let’s get the dataset you will work on. The following code will load a “flower” 2-class dataset into variables X and Y.

X, Y = load_planar_dataset()

Visualize the dataset using matplotlib. The data looks like a “flower” with some red (label y=0) and some blue (y=1) points. Your goal is to build a model to fit this data.
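A minimal sketch of such a visualization, using randomly generated stand-in data (the real X and Y come from load_planar_dataset(); the stand-in shapes below are an assumption):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headlessly
import matplotlib.pyplot as plt

# Hypothetical stand-in for the flower data: X has shape (2, m), Y has shape (1, m)
rng = np.random.default_rng(1)
m = 400
X = rng.standard_normal((2, m))
Y = (X[0] * X[1] > 0).astype(int).reshape(1, m)

# Color each point by its label, red/blue as in the assignment's plot
plt.scatter(X[0, :], X[1, :], c=Y.ravel(), s=40, cmap=plt.cm.Spectral)
plt.savefig("planar_data.png")
```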

Exercise: How many training examples do you have? In addition, what are the shapes of the variables X and Y?
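A sketch of how the shapes could be inspected, using zero arrays as stand-ins for the loaded data (the shapes below are an assumption about the dataset):

```python
import numpy as np

# Stand-ins with the assumed shapes of the flower data:
# X holds coordinates (one column per example), Y holds the labels.
X = np.zeros((2, 400))
Y = np.zeros((1, 400))

shape_X = X.shape      # (number of features, number of examples)
shape_Y = Y.shape
m = X.shape[1]         # number of training examples

print(shape_X, shape_Y, m)  # (2, 400) (1, 400) 400
```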


3 – Simple Logistic Regression

Before building a full neural network, let’s first see how logistic regression performs on this problem. You can use sklearn’s built-in functions to do that. Run the code below to train a logistic regression classifier on the dataset. 
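A hedged sketch of such a classifier on toy, linearly separable stand-in data (the notebook supplies the real training code and data; the data below is an assumption):

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

# Toy stand-in data: X is (2, m) with one column per example, Y is (1, m)
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 200))
Y = (X[0] + X[1] > 0).astype(int).reshape(1, -1)

# sklearn expects one sample per row, so transpose X and flatten Y
clf = LogisticRegressionCV()
clf.fit(X.T, Y.ravel())

accuracy = clf.score(X.T, Y.ravel())
print(f"train accuracy: {accuracy:.0%}")
```

On the flower dataset itself, a linear classifier like this cannot separate the petals, which motivates the hidden layer in the next section.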


4 – Neural Network model

Logistic regression did not work well on the “flower dataset”. You are going to train a Neural Network with a single hidden layer.

Reminder: The general methodology to build a Neural Network is to: 
1. Define the neural network structure (# of input units, # of hidden units, etc.).
2. Initialize the model’s parameters.
3. Loop: 
– Implement forward propagation
– Compute loss
– Implement backward propagation to get the gradients
– Update parameters (gradient descent)

You often build helper functions to compute steps 1-3 and then merge them into one function we call nn_model(). Once you’ve built nn_model() and learnt the right parameters, you can make predictions on new data.

4.1 – Defining the neural network structure

Exercise: Define three variables: 
– n_x: the size of the input layer 
– n_h: the size of the hidden layer (set this to 4) 
– n_y: the size of the output layer 

Hint: Use shapes of X and Y to find n_x and n_y. Also, hard code the hidden layer size to be 4.
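A minimal sketch of layer_sizes() under the shape conventions above (one column per example):

```python
import numpy as np

def layer_sizes(X, Y):
    """Return (n_x, n_h, n_y) given X of shape (n_x, m) and Y of shape (n_y, m)."""
    n_x = X.shape[0]   # size of the input layer
    n_h = 4            # hidden layer size, hard-coded to 4 per the instructions
    n_y = Y.shape[0]   # size of the output layer
    return n_x, n_h, n_y

X = np.zeros((2, 400))
Y = np.zeros((1, 400))
print(layer_sizes(X, Y))  # (2, 4, 1)
```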

4.2 – Initialize the model’s parameters

Exercise: Implement the function initialize_parameters().

– Make sure your parameters’ sizes are right. Refer to the neural network figure above if needed.
– You will initialize the weight matrices with random values.
– Use: np.random.randn(a,b) * 0.01 to randomly initialize a matrix of shape (a,b).
– You will initialize the bias vectors as zeros.
– Use: np.zeros((a,b)) to initialize a matrix of shape (a,b) with zeros.
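Putting those two hints together, initialize_parameters() might look like this sketch:

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    """Small random weights, zero biases, with the shapes the network expects."""
    W1 = np.random.randn(n_h, n_x) * 0.01   # (n_h, n_x)
    b1 = np.zeros((n_h, 1))                 # (n_h, 1)
    W2 = np.random.randn(n_y, n_h) * 0.01   # (n_y, n_h)
    b2 = np.zeros((n_y, 1))                 # (n_y, 1)
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}

params = initialize_parameters(2, 4, 1)
print(params["W1"].shape, params["b2"].shape)  # (4, 2) (1, 1)
```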


4.3 – The Loop

Question: Implement forward_propagation().

– Look above at the mathematical representation of your classifier.
– You can use the function sigmoid(). It is built in (imported) in the notebook.
– You can use the function np.tanh(). It is part of the numpy library.
– The steps you have to implement are: 
1. Retrieve each parameter from the dictionary “parameters” (which is the output of initialize_parameters()) by using parameters[".."].
2. Implement forward propagation. Compute Z[1], A[1], Z[2] and A[2] (the vector of all your predictions on all the examples in the training set).
– Values needed in the backpropagation are stored in “cache”. The cache will be given as an input to the backpropagation function.
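The steps above can be sketched as follows (sigmoid() is redefined here so the snippet is self-contained; in the notebook it is already imported, and the toy parameters below are stand-ins):

```python
import numpy as np

def sigmoid(z):
    """Logistic function, used for the output layer."""
    return 1 / (1 + np.exp(-z))

def forward_propagation(X, parameters):
    """Compute Z1, A1, Z2, A2 and cache the intermediate values for backprop."""
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]
    Z1 = W1 @ X + b1
    A1 = np.tanh(Z1)      # hidden layer uses tanh
    Z2 = W2 @ A1 + b2
    A2 = sigmoid(Z2)      # output layer uses sigmoid
    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache

rng = np.random.default_rng(2)
parameters = {
    "W1": rng.standard_normal((4, 2)) * 0.01, "b1": np.zeros((4, 1)),
    "W2": rng.standard_normal((1, 4)) * 0.01, "b2": np.zeros((1, 1)),
}
A2, cache = forward_propagation(rng.standard_normal((2, 3)), parameters)
print(A2.shape)  # (1, 3) — one prediction per example
```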


Exercise: Implement compute_cost() to compute the value of the cost J.

– There are many ways to implement the cross-entropy loss. For example, the sum of y(i) log(a[2](i)) terms can be computed with np.multiply() followed by np.sum(), or with np.dot().
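A sketch of compute_cost() along those lines (averaging the cross-entropy over the m examples, which is the usual convention):

```python
import numpy as np

def compute_cost(A2, Y):
    """Cross-entropy cost: J = -(1/m) * sum(y*log(a2) + (1-y)*log(1-a2))."""
    m = Y.shape[1]
    logprobs = np.multiply(Y, np.log(A2)) + np.multiply(1 - Y, np.log(1 - A2))
    return float(-np.sum(logprobs) / m)

# With uninformative predictions of 0.5, the cost is log(2) ≈ 0.6931
A2 = np.array([[0.5, 0.5]])
Y = np.array([[1, 0]])
print(round(compute_cost(A2, Y), 4))  # 0.6931
```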




Question: Implement the update rule. Use gradient descent. You have to use (dW1, db1, dW2, db2) in order to update (W1, b1, W2, b2).

General gradient descent rule: θ = θ − α (∂J/∂θ), where α is the learning rate and θ represents a parameter.
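Applied to this network's four parameters, the rule can be sketched as (the learning rate and toy values below are assumptions for illustration):

```python
import numpy as np

def update_parameters(parameters, grads, learning_rate=1.2):
    """One gradient descent step: theta = theta - alpha * dJ/dtheta."""
    return {key: parameters[key] - learning_rate * grads["d" + key]
            for key in parameters}

parameters = {"W1": np.ones((4, 2)), "b1": np.zeros((4, 1)),
              "W2": np.ones((1, 4)), "b2": np.zeros((1, 1))}
grads = {"dW1": np.full((4, 2), 0.5), "db1": np.zeros((4, 1)),
         "dW2": np.full((1, 4), 0.5), "db2": np.zeros((1, 1))}
new = update_parameters(parameters, grads)
print(new["W1"][0, 0])  # 1 - 1.2 * 0.5 = 0.4
```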

Illustration: The gradient descent algorithm with a good learning rate (converging) and a bad learning rate (diverging). Images courtesy of Adam Harley.


4.4 – Integrate parts 4.1, 4.2 and 4.3 in nn_model()

Question: Build your neural network model in nn_model().

Instructions: The neural network model has to use the previous functions in the right order.
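One self-contained sketch of nn_model(), with forward propagation, cost, backpropagation, and the update step inlined; the learning rate of 1.2, the epsilon in the log, and the toy quadrant data are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def nn_model(X, Y, n_h=4, num_iterations=1000, learning_rate=1.2):
    """Train a 1-hidden-layer net; assumes X is (n_x, m) and Y is (1, m)."""
    n_x, m = X.shape
    n_y = Y.shape[0]
    rng = np.random.default_rng(3)
    # Steps 1-2: structure and small random initialization
    W1 = rng.standard_normal((n_h, n_x)) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = rng.standard_normal((n_y, n_h)) * 0.01
    b2 = np.zeros((n_y, 1))
    costs = []
    for _ in range(num_iterations):
        # Forward propagation
        A1 = np.tanh(W1 @ X + b1)
        A2 = sigmoid(W2 @ A1 + b2)
        # Cross-entropy cost (small epsilon for numerical stability)
        eps = 1e-12
        costs.append(float(-np.mean(Y * np.log(A2 + eps)
                                    + (1 - Y) * np.log(1 - A2 + eps))))
        # Backward propagation
        dZ2 = A2 - Y
        dW2 = dZ2 @ A1.T / m
        db2 = dZ2.sum(axis=1, keepdims=True) / m
        dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)   # tanh'(z) = 1 - tanh(z)^2
        dW1 = dZ1 @ X.T / m
        db1 = dZ1.sum(axis=1, keepdims=True) / m
        # Gradient descent update
        W1 -= learning_rate * dW1
        b1 -= learning_rate * db1
        W2 -= learning_rate * dW2
        b2 -= learning_rate * db2
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}, costs

# Toy quadrant data that a linear classifier cannot separate
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 200))
Y = (X[0] * X[1] > 0).astype(float).reshape(1, -1)
parameters, costs = nn_model(X, Y)
print(f"cost: {costs[0]:.4f} -> {costs[-1]:.4f}")
```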


4.5 – Predictions

Question: Use your model to predict by building predict(). Use forward propagation to predict results.

Reminder: predictions = y_prediction = 1{activation > 0.5}, i.e. 1 if activation > 0.5 and 0 otherwise.

As an example, if you would like to set the entries of a matrix X to 0 and 1 based on a threshold you would do: X_new = (X > threshold)
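A sketch of predict() using that thresholding trick; the toy parameters below are hypothetical, chosen so the expected labels are easy to verify by hand:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict(parameters, X):
    """Forward-propagate, then threshold the output activation at 0.5."""
    A1 = np.tanh(parameters["W1"] @ X + parameters["b1"])
    A2 = sigmoid(parameters["W2"] @ A1 + parameters["b2"])
    return (A2 > 0.5).astype(int)

# Hypothetical 2-hidden-unit parameters: output is large and positive when
# both inputs are positive, large and negative when both are negative.
parameters = {"W1": np.eye(2), "b1": np.zeros((2, 1)),
              "W2": np.array([[5.0, 5.0]]), "b2": np.zeros((1, 1))}
X = np.array([[1.0, -1.0],
              [1.0, -1.0]])
print(predict(parameters, X))  # [[1 0]]
```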



Accuracy: 88%