Planar data classification with one hidden layer

1 – Packages

Let’s first import all the packages that you will need during this assignment. 
– numpy is the fundamental package for scientific computing with Python. 
– sklearn provides simple and efficient tools for data mining and data analysis. 
– matplotlib is a library for plotting graphs in Python. 
– testCases_v2 provides some test examples to assess the correctness of your functions. 
– planar_utils provides various useful functions used in this assignment.

2 – Dataset

First, let’s get the dataset you will work on. The following code will load a “flower” 2-class dataset into variables X and Y.

X, Y = load_planar_dataset()

Visualize the dataset using matplotlib. The data looks like a “flower” with some red (label y=0) and some blue (y=1) points. Your goal is to build a model to fit this data.

Exercise: How many training examples do you have? In addition, what are the shapes of the variables X and Y?
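As a sketch of this exercise, the shapes can be read directly off the arrays. Since load_planar_dataset() from planar_utils is not shown here, the snippet below uses a hypothetical stand-in with the flower dataset's shapes: X of shape (2, 400) and Y of shape (1, 400).

```python
import numpy as np

# Hypothetical stand-in for load_planar_dataset(): the real flower dataset
# has 2 input features and 400 examples, stored one example per column.
X = np.random.randn(2, 400)
Y = (np.random.rand(1, 400) > 0.5).astype(int)

shape_X = X.shape   # (2, 400): 2 features, one column per example
shape_Y = Y.shape   # (1, 400): one label per example
m = X.shape[1]      # number of training examples

print("The shape of X is:", shape_X)
print("The shape of Y is:", shape_Y)
print("I have m = %d training examples!" % m)
```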

 

3 – Simple Logistic Regression

Before building a full neural network, let’s first see how logistic regression performs on this problem. You can use sklearn’s built-in functions to do that. Run the code below to train a logistic regression classifier on the dataset.
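The training code itself is not reproduced in this write-up, so here is a sketch of how it is typically done with sklearn's LogisticRegressionCV. The toy data below is a hypothetical, linearly separable stand-in for the flower dataset; note the transposes, since sklearn expects one example per row while the assignment stores one example per column.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

# Hypothetical toy data in the assignment's (features, examples) layout
rng = np.random.default_rng(1)
X = rng.standard_normal((2, 200))
Y = (X[0, :] + X[1, :] > 0).astype(int).reshape(1, -1)

# sklearn expects rows to be examples, hence X.T and Y.ravel()
clf = LogisticRegressionCV()
clf.fit(X.T, Y.ravel())

LR_predictions = clf.predict(X.T)
accuracy = float(np.mean(LR_predictions == Y.ravel()) * 100)
print("Accuracy of logistic regression: %.0f%%" % accuracy)
```

On linearly separable data like this, logistic regression does well; on the flower dataset it cannot, because the classes are not linearly separable.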

[Figure: 1.png]

4 – Neural Network model

Logistic regression did not work well on the “flower dataset”. You are going to train a Neural Network with a single hidden layer.

Reminder: The general methodology to build a Neural Network is to: 
1. Define the neural network structure (# of input units, # of hidden units, etc.). 
2. Initialize the model’s parameters. 
3. Loop: 
– Implement forward propagation 
– Compute the loss 
– Implement backward propagation to get the gradients 
– Update the parameters (gradient descent)

You often build helper functions to compute steps 1-3 and then merge them into one function we call nn_model(). Once you’ve built nn_model() and learnt the right parameters, you can make predictions on new data.

4.1 – Defining the neural network structure

Exercise: Define three variables: 
– n_x: the size of the input layer 
– n_h: the size of the hidden layer (set this to 4) 
– n_y: the size of the output layer

Hint: Use shapes of X and Y to find n_x and n_y. Also, hard code the hidden layer size to be 4.
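Following the hint, a minimal sketch of this exercise (the function name layer_sizes matches the usual assignment convention, but is an assumption here):

```python
import numpy as np

def layer_sizes(X, Y):
    """Sizes of the input, hidden, and output layers.
    n_h is hard-coded to 4, as the exercise asks."""
    n_x = X.shape[0]  # size of the input layer
    n_h = 4           # size of the hidden layer (hard-coded)
    n_y = Y.shape[0]  # size of the output layer
    return (n_x, n_h, n_y)

# Toy arrays with the flower dataset's shapes
X_assess = np.zeros((2, 400))
Y_assess = np.zeros((1, 400))
print(layer_sizes(X_assess, Y_assess))  # (2, 4, 1)
```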

4.2 – Initialize the model’s parameters

Exercise: Implement the function initialize_parameters().

Instructions: 
– Make sure your parameters’ sizes are right. Refer to the neural network figure above if needed. 
– You will initialize the weight matrices with random values. Use np.random.randn(a,b) * 0.01 to randomly initialize a matrix of shape (a,b). 
– You will initialize the bias vectors as zeros. Use np.zeros((a,b)) to initialize a matrix of shape (a,b) with zeros.
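Putting those instructions together, a sketch of initialize_parameters() (random small weights break the symmetry between hidden units; zero biases are fine because the weights are already random):

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    """Small random weights, zero biases, as the instructions describe."""
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}

parameters = initialize_parameters(2, 4, 1)
for name, value in parameters.items():
    print(name, value.shape)
```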

 

4.3 – The Loop

Question: Implement forward_propagation().

Instructions: 
– Look above at the mathematical representation of your classifier. 
– You can use the function sigmoid(). It is built in (imported) in the notebook. 
– You can use the function np.tanh(). It is part of the numpy library. 
– The steps you have to implement are: 
1. Retrieve each parameter from the dictionary “parameters” (the output of initialize_parameters()) by using parameters[".."]. 
2. Implement forward propagation. Compute Z[1], A[1], Z[2] and A[2] (the vector of all your predictions on all the examples in the training set). 
– Values needed in the backpropagation are stored in “cache”. The cache will be given as an input to the backpropagation function.
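The steps above can be sketched as follows. sigmoid() is re-implemented here so the snippet stands alone; in the notebook it is already imported.

```python
import numpy as np

def sigmoid(z):
    # Stand-in for the sigmoid() helper imported in the notebook
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagation(X, parameters):
    """Compute Z1, A1, Z2, A2 and cache the intermediate values."""
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]
    Z1 = W1 @ X + b1
    A1 = np.tanh(Z1)   # tanh activation in the hidden layer
    Z2 = W2 @ A1 + b2
    A2 = sigmoid(Z2)   # sigmoid output for binary classification
    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache

# Quick shape check on toy data
np.random.seed(1)
params = {"W1": np.random.randn(4, 2) * 0.01, "b1": np.zeros((4, 1)),
          "W2": np.random.randn(1, 4) * 0.01, "b2": np.zeros((1, 1))}
A2, cache = forward_propagation(np.random.randn(2, 3), params)
print(A2.shape)  # (1, 3): one prediction per example
```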

 

Exercise: Implement compute_cost() to compute the value of the cost J.

Instructions: 
– There are many ways to implement the cross-entropy loss. To help you, we give you how we would have implemented −∑_{i=0}^{m} y^{(i)} log(a^{[2](i)}):
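A sketch of compute_cost() using np.multiply and np.sum, assuming the standard binary cross-entropy cost J = −(1/m) ∑ [y log(a) + (1 − y) log(1 − a)]:

```python
import numpy as np

def compute_cost(A2, Y):
    """Cross-entropy cost J = -(1/m) * sum(Y*log(A2) + (1-Y)*log(1-A2))."""
    m = Y.shape[1]
    logprobs = np.multiply(np.log(A2), Y) + np.multiply(np.log(1 - A2), 1 - Y)
    cost = -np.sum(logprobs) / m
    return float(np.squeeze(cost))  # turn [[cost]] into a plain float

# With uninformative predictions A2 = 0.5 the cost is log(2) ~ 0.693
A2 = np.full((1, 4), 0.5)
Y = np.array([[0, 1, 1, 0]])
print(compute_cost(A2, Y))
```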

Gradient descent algorithm:

[Figure: 1517492239122133.png]
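Between computing the cost and updating the parameters you also need the gradients. A sketch of backward_propagation(), assuming the tanh-hidden / sigmoid-output network above (so tanh'(Z1) = 1 − A1²):

```python
import numpy as np

def backward_propagation(parameters, cache, X, Y):
    """Gradients for a tanh-hidden / sigmoid-output network."""
    m = X.shape[1]
    W2 = parameters["W2"]
    A1, A2 = cache["A1"], cache["A2"]
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)  # (1 - A1^2) = tanh'(Z1)
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}

# Shape check on toy data
np.random.seed(2)
X = np.random.randn(2, 3)
Y = np.array([[1, 0, 1]])
params = {"W1": np.random.randn(4, 2) * 0.01, "b1": np.zeros((4, 1)),
          "W2": np.random.randn(1, 4) * 0.01, "b2": np.zeros((1, 1))}
A1 = np.tanh(params["W1"] @ X + params["b1"])
A2 = 1 / (1 + np.exp(-(params["W2"] @ A1 + params["b2"])))
grads = backward_propagation(params, {"A1": A1, "A2": A2}, X, Y)
print({k: v.shape for k, v in grads.items()})
```

Each gradient has the same shape as the parameter it updates, which is what the update rule below relies on.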

 

Question: Implement the update rule. Use gradient descent. You have to use (dW1, db1, dW2, db2) in order to update (W1, b1, W2, b2).

General gradient descent rule: θ = θ − α (∂J/∂θ), where α is the learning rate and θ represents a parameter.

Illustration: The gradient descent algorithm with a good learning rate (converging) and a bad learning rate (diverging). Images courtesy of Adam Harley.
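A minimal sketch of update_parameters() applying θ = θ − α(∂J/∂θ) to each parameter (the default learning rate of 1.2 is the one commonly used in this assignment, but is an assumption here):

```python
import numpy as np

def update_parameters(parameters, grads, learning_rate=1.2):
    """theta = theta - alpha * dJ/dtheta for each of W1, b1, W2, b2."""
    return {
        "W1": parameters["W1"] - learning_rate * grads["dW1"],
        "b1": parameters["b1"] - learning_rate * grads["db1"],
        "W2": parameters["W2"] - learning_rate * grads["dW2"],
        "b2": parameters["b2"] - learning_rate * grads["db2"],
    }

# One step on toy 1x1 parameters: 1 - 0.5 * 1 = 0.5
params = {"W1": np.ones((1, 1)), "b1": np.zeros((1, 1)),
          "W2": np.ones((1, 1)), "b2": np.zeros((1, 1))}
grads = {"dW1": np.ones((1, 1)), "db1": np.ones((1, 1)),
         "dW2": np.ones((1, 1)), "db2": np.ones((1, 1))}
new = update_parameters(params, grads, learning_rate=0.5)
print(new["W1"])  # [[0.5]]
```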

 

4.4 – Integrate parts 4.1, 4.2 and 4.3 in nn_model()

Question: Build your neural network model in nn_model().

Instructions: The neural network model has to use the previous functions in the right order.
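A self-contained sketch of nn_model() with the helper steps inlined, trained on hypothetical linearly separable toy data (the notebook trains on the flower dataset and prints the cost every 1000 iterations; both are omitted here):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_model(X, Y, n_h=4, num_iterations=1000, learning_rate=1.2):
    """Initialize, then loop: forward -> backward -> update."""
    np.random.seed(3)
    n_x, n_y, m = X.shape[0], Y.shape[0], X.shape[1]
    W1 = np.random.randn(n_h, n_x) * 0.01; b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01; b2 = np.zeros((n_y, 1))
    for _ in range(num_iterations):
        # Forward propagation
        A1 = np.tanh(W1 @ X + b1)
        A2 = sigmoid(W2 @ A1 + b2)
        # Backward propagation
        dZ2 = A2 - Y
        dW2 = dZ2 @ A1.T / m; db2 = dZ2.sum(axis=1, keepdims=True) / m
        dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)
        dW1 = dZ1 @ X.T / m; db1 = dZ1.sum(axis=1, keepdims=True) / m
        # Gradient descent update
        W1 -= learning_rate * dW1; b1 -= learning_rate * db1
        W2 -= learning_rate * dW2; b2 -= learning_rate * db2
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}

# Train on hypothetical linearly separable toy data
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 200))
Y = (X[0, :] - X[1, :] > 0).astype(int).reshape(1, -1)
p = nn_model(X, Y, n_h=4, num_iterations=1000)
A2 = sigmoid(p["W2"] @ np.tanh(p["W1"] @ X + p["b1"]) + p["b2"])
accuracy = float(np.mean((A2 > 0.5).astype(int) == Y))
print("training accuracy: %.2f" % accuracy)
```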

 

4.5 – Predictions

Question: Use your model to predict by building predict(). Use forward propagation to predict results.

Reminder: predictions = y_prediction = 𝟙{activation > 0.5} = 1 if activation > 0.5, 0 otherwise.

As an example, if you would like to set the entries of a matrix X to 0 and 1 based on a threshold you would do: X_new = (X > threshold)
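Using that thresholding trick, a sketch of predict() (forward propagation, then compare the output activation against 0.5); the toy parameters below are hypothetical and chosen so the prediction just follows the sign of the first feature:

```python
import numpy as np

def predict(parameters, X):
    """Forward-propagate, then threshold the activation at 0.5."""
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]
    A1 = np.tanh(W1 @ X + b1)
    A2 = 1.0 / (1.0 + np.exp(-(W2 @ A1 + b2)))
    return (A2 > 0.5).astype(int)  # same trick as X_new = (X > threshold)

# Hypothetical 1-hidden-unit parameters: output follows sign of feature 0
params = {"W1": np.array([[10.0, 0.0]]), "b1": np.zeros((1, 1)),
          "W2": np.array([[10.0]]), "b2": np.zeros((1, 1))}
X = np.array([[1.0, -1.0, 2.0], [0.0, 0.0, 0.0]])
print(predict(params, X))  # [[1 0 1]]
```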

[Figure: 3.png]

Accuracy: 88%