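Welcome to this week's programming assignment. So far you have used numpy extensively to build neural networks. Now we will work through a deep learning framework that lets you build deep networks more easily. Frameworks such as TensorFlow, PaddlePaddle, Torch, Caffe, and Keras can significantly speed up your machine learning development. All of these frameworks come with extensive documentation that you can read for free. In this assignment, you will learn to do the following in TensorFlow:
(1) Initialize variables
(2) Start your own session
(3) Train algorithms
(4) Implement a neural network
Programming frameworks not only shorten your coding time, but can sometimes also perform optimizations that speed up your code.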
1 – Exploring the TensorFlow library
To start, import the libraries:
import math
import numpy as np
import h5py
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.python.framework import ops
from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot, predict

%matplotlib inline
np.random.seed(1)
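Writing and running a program in TensorFlow involves the following steps:
(1) Create tensors (variables) that are not yet executed or evaluated
(2) Write operations between those tensors
(3) Initialize the tensors
(4) Create a session
(5) Run the session, which executes the operations you wrote above.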
y_hat = tf.constant(36, name='y_hat')  # Define y_hat constant. Set to 36.
y = tf.constant(39, name='y')          # Define y. Set to 39.

# Create a variable for the loss
loss = tf.Variable((y - y_hat)**2, name='loss')

# When init is run later (session.run(init)),
# the loss variable will be initialized and ready to be computed
init = tf.global_variables_initializer()

# Create a session and print the output
with tf.Session() as session:
    session.run(init)         # Initializes the variables
    print(session.run(loss))  # Prints the loss
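So when we created a variable for the loss, we only defined the loss as a function of other quantities; we did not evaluate its value. To evaluate it, we had to create init = tf.global_variables_initializer() and run it inside the session. That initialized the loss variable, and in the last line we were finally able to evaluate the loss and print its value.
Now let's look at this code: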
a = tf.constant(2)
b = tf.constant(10)
c = tf.multiply(a, b)
print(c)
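As you can see, this does not print the value we wanted. Instead you get a Tensor object whose description reports an empty shape (a scalar) and type int32. All you have done is add the computation to the graph; you have not actually run it. To actually multiply the two numbers, you need to create a session and run it.
Now run: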
sess = tf.Session()
print(sess.run(c))
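This prints the result, 20.
To summarize: remember to initialize your variables, then create a session and run the operations you need inside that session.
Next, we introduce placeholders. A placeholder is an object whose value you can specify later. To give a placeholder a concrete value, you pass it in through a "feed dictionary" (the feed_dict argument). Below, we create a placeholder named x. This lets us pass in a number when we run the session.
Run the following code: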
# Change the value of x in the feed_dict
x = tf.placeholder(tf.int64, name='x')
print(sess.run(2 * x, feed_dict={x: 3}))
sess.close()
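The result is 6.
What happened here? When you specify the operations needed for a computation, you are telling TensorFlow how to construct a computation graph. The graph can contain placeholders whose values are specified only later. Finally, when you run the session, you tell TensorFlow to execute the computation graph.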
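The split between building a graph and running it can be illustrated without TensorFlow. The sketch below is a toy model only: the classes `Placeholder`, `Constant`, and `Multiply` and the function `run` are hypothetical names invented for this illustration and say nothing about how TensorFlow is implemented internally.

```python
# Toy illustration only: these classes are hypothetical and are NOT TensorFlow's API.

class Placeholder:
    """A node whose value is supplied later through feed_dict."""
    def __init__(self, name):
        self.name = name

class Constant:
    """A node that always evaluates to a fixed value."""
    def __init__(self, value):
        self.value = value

class Multiply:
    """A node that records a multiplication without performing it."""
    def __init__(self, left, right):
        self.left = left
        self.right = right

def run(node, feed_dict=None):
    """Evaluate a node recursively, looking up placeholders in feed_dict."""
    feed_dict = feed_dict or {}
    if isinstance(node, Constant):
        return node.value
    if isinstance(node, Placeholder):
        return feed_dict[node]
    if isinstance(node, Multiply):
        return run(node.left, feed_dict) * run(node.right, feed_dict)
    raise TypeError("unknown node type")

x = Placeholder("x")
double_x = Multiply(Constant(2), x)       # builds the graph; nothing is computed yet
result = run(double_x, feed_dict={x: 3})  # the value of x is supplied only now
print(result)
```

Constructing `double_x` only records what should be computed; the multiplication happens inside `run`, which plays the role that `sess.run` plays in the TensorFlow code.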
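1.1 – Linear function
This part computes Y = WX + b, where W and X are random matrices and b is a random vector.
Exercise: Compute WX + b, where W, X, and b are drawn from a random normal distribution. W has shape (4,3), X has shape (3,1), and b has shape (4,1). As an example, here is how you would define a constant X with shape (3,1):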
X=tf.constant(np.random.randn(3,1),name="X")
You may find the following functions helpful:
•tf.matmul(…, …) matrix multiplication
•tf.add(…, …) addition
•np.random.randn(…) random initialization
The code is as follows:
# GRADED FUNCTION: linear_function

def linear_function():
    """
    Implements a linear function:
        Initializes W to be a random tensor of shape (4,3)
        Initializes X to be a random tensor of shape (3,1)
        Initializes b to be a random tensor of shape (4,1)
    Returns:
    result -- runs the session for Y = WX + b
    """

    np.random.seed(1)

    ### START CODE HERE ### (4 lines of code)
    X = tf.constant(np.random.randn(3, 1), name="X")
    W = tf.constant(np.random.randn(4, 3), name="W")
    b = tf.constant(np.random.randn(4, 1), name="b")
    Y = tf.add(tf.matmul(W, X), b)
    ### END CODE HERE ###

    # Create the session using tf.Session() and run it with
    # sess.run(...) on the variable you want to calculate
    ### START CODE HERE ###
    sess = tf.Session()
    result = sess.run(Y)
    ### END CODE HERE ###

    # Close the session
    sess.close()

    return result

print("result = " + str(linear_function()))
Run it and check the result:
result = [[-2.15657382]
 [ 2.95891446]
 [-1.08926781]
 [-0.84538042]]
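As a quick sanity check on the shapes involved, the same computation can be sketched in plain NumPy. This is an illustrative cross-check rather than part of the graded function; it assumes NumPy is installed and draws the arrays in the same order (X, W, b) as the code above.

```python
import numpy as np

np.random.seed(1)

# Same shapes as in the exercise: W is (4,3), X is (3,1), b is (4,1)
X = np.random.randn(3, 1)
W = np.random.randn(4, 3)
b = np.random.randn(4, 1)

# (4,3) @ (3,1) gives (4,1), which matches the shape of b
Y = W @ X + b
print("Y.shape =", Y.shape)
```

The inner dimensions of W and X must agree (3 and 3), and the product has the same shape as b, so the addition needs no broadcasting.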
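1.2 – Computing the sigmoid
TensorFlow provides a variety of commonly used neural network functions such as tf.sigmoid and tf.softmax.
Exercise: Implement the sigmoid function below. You may use the following: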
•tf.placeholder(tf.float32, name = "…")
•tf.sigmoid(…)
•sess.run(…, feed_dict = {x: z})
Method 1:
sess = tf.Session()
# Run the variables initialization (if needed), run the operations
result = sess.run(..., feed_dict = {...})
sess.close()  # Close the session
Method 2:
with tf.Session() as sess:
    # Run the variables initialization (if needed), run the operations
    result = sess.run(..., feed_dict = {...})
    # This takes care of closing the session for you :)
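The difference between the two methods comes down to Python's context-manager protocol: a `with` block calls the object's `__exit__` method on the way out, even if an exception is raised. The sketch below shows this with a hypothetical `ToySession` class written for this illustration; it is not TensorFlow's `tf.Session`, and its `run` method simply calls an ordinary Python function.

```python
class ToySession:
    """Hypothetical stand-in for a session; not part of TensorFlow."""

    def __init__(self):
        self.closed = False

    def run(self, fn, feed_dict):
        """Call fn with the fed values, refusing to run once closed."""
        if self.closed:
            raise RuntimeError("session is closed")
        return fn(**feed_dict)

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called automatically when the with-block ends
        self.close()
        return False  # do not swallow exceptions

# Method 1: close the session explicitly
sess = ToySession()
result_1 = sess.run(lambda x: 2 * x, {"x": 3})
sess.close()

# Method 2: the with-block closes the session for you
with ToySession() as sess2:
    result_2 = sess2.run(lambda x: 2 * x, {"x": 3})

print(result_1, result_2, sess.closed, sess2.closed)
```

With Method 1, forgetting the `close()` call (or raising an exception before reaching it) leaves the session open; Method 2 avoids that by construction.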
The code is as follows:
# GRADED FUNCTION: sigmoid

def sigmoid(z):
    """
    Computes the sigmoid of z

    Arguments:
    z -- input value, scalar or vector

    Returns:
    results -- the sigmoid of z
    """

    ### START CODE HERE ###
    # Create a placeholder for x. Name it 'x'.
    x = tf.placeholder(tf.float32, name="x")

    # Compute sigmoid(x)
    sigmoid = tf.sigmoid(x)

    # Create a session, and run it.
    # Please use the method 2 explained above.
    # You should use a feed_dict to pass z's value to x.
    with tf.Session() as sess:
        result = sess.run(sigmoid, feed_dict={x: z})
    ### END CODE HERE ###

    return result

print("sigmoid(0) = " + str(sigmoid(0)))
print("sigmoid(12) = " + str(sigmoid(12)))
Running it prints:
sigmoid(0) = 0.5
sigmoid(12) = 0.9999938
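These two values can be checked by hand against the definition sigmoid(z) = 1 / (1 + e^(-z)). The snippet below is a plain-Python reference written for this check; `sigmoid_reference` is a hypothetical helper name, and it works in double precision, so it agrees with the float32 TensorFlow output only to about six or seven digits.

```python
import math

def sigmoid_reference(z):
    """Plain-Python sigmoid for a scalar z (hypothetical helper for checking)."""
    return 1.0 / (1.0 + math.exp(-z))

s0 = sigmoid_reference(0)
s12 = sigmoid_reference(12)
print("sigmoid_reference(0) =", s0)
print("sigmoid_reference(12) =", s12)
```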
1.3 – Computing the cost
In TensorFlow you can compute the cost with a single line of code.
Exercise: Implement the function below. You will use:
•tf.nn.sigmoid_cross_entropy_with_logits(logits = …, labels = …)
Your code should take z as input, compute the sigmoid, and then compute the cost J. All of this can be done with a single call to the function above.
Code:
# GRADED FUNCTION: cost

def cost(logits, labels):
    """
    Computes the cost using the sigmoid cross entropy

    Arguments:
    logits -- vector containing z, output of the last linear unit
              (before the final sigmoid activation)
    labels -- vector of labels y (1 or 0)

    Note: What we've been calling "z" and "y" in this class are respectively
    called "logits" and "labels" in the TensorFlow documentation.
    So logits will feed into z, and labels into y.

    Returns:
    cost -- runs the session of the cost (formula (2))
    """

    ### START CODE HERE ###
    # Create the placeholders for "logits" (z) and "labels" (y) (approx. 2 lines)
    z = tf.placeholder(tf.float32, shape=logits.shape, name="logits")
    y = tf.placeholder(tf.float32, shape=labels.shape, name="label")

    # Use the loss function (approx. 1 line)
    cost = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=z)

    # Create a session (approx. 1 line). See method 1 above.
    sess = tf.Session()

    # Run the session (approx. 1 line).
    cost = sess.run(cost, feed_dict={z: logits, y: labels})

    # Close the session (approx. 1 line). See method 1 above.
    sess.close()
    ### END CODE HERE ###

    return cost

logits = sigmoid(np.array([0.2, 0.4, 0.7, 0.9]))
cost = cost(logits, np.array([0, 0, 1, 1]))
print("cost = " + str(cost))
Result:
cost = [1.0053872 1.0366409 0.4138543 0.39956614]
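To see where these numbers come from, the per-example loss can be recomputed in plain Python from the cross-entropy definition, -y·log(σ(z)) - (1-y)·log(1-σ(z)), written here in an algebraically equivalent, numerically stable form. The helper names below are hypothetical and exist only for this check. As in the code above, the values passed in as "logits" are themselves sigmoid outputs of [0.2, 0.4, 0.7, 0.9].

```python
import math

def sigmoid_reference(z):
    """Plain-Python sigmoid for a scalar z (hypothetical helper)."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_cross_entropy_reference(logit, label):
    """Equals -y*log(sigmoid(z)) - (1-y)*log(1-sigmoid(z)), in a stable form."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

# Mirror the inputs used above: sigmoid outputs are fed in as the logits
logits = [sigmoid_reference(z) for z in [0.2, 0.4, 0.7, 0.9]]
labels = [0, 0, 1, 1]

losses = [sigmoid_cross_entropy_reference(z, y) for z, y in zip(logits, labels)]
print("losses =", losses)
```

Note that the result is a vector with one loss per example, not a single scalar; averaging or summing it would be a separate step.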
1.4 – Using one hot encodings
Many times in deep learning you will have a y vector with numbers ranging from 0 to C-1, where C is the number of classes. If C is for example 4, then you might have the following y vector which you will need to convert as follows:
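This is called a one hot encoding, because in the converted representation exactly one element of each column is "hot" (meaning set to 1). Doing this conversion in NumPy might take a few lines of code. In TensorFlow, you can do it in one line: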
tf.one_hot(labels, depth, axis)
Exercise:
# GRADED FUNCTION: one_hot_matrix

def one_hot_matrix(labels, C):
    """
    Arguments:
    labels -- vector containing the labels
    C -- number of classes, the depth of the one hot dimension

    Returns:
    one_hot -- one hot matrix
    """

    ### START CODE HERE ###
    # Create a tf.constant equal to C (depth), name it 'C'. (approx. 1 line)
    C = tf.constant(C, name="C")

    # Use tf.one_hot, be careful with the axis
    one_hot_matrix = tf.one_hot(labels, C, axis=0, name="one_hot")

    # Create the session (approx. 1 line)
    sess = tf.Session()

    # Run the session (approx. 1 line)
    one_hot = sess.run(one_hot_matrix)

    # Close the session (approx. 1 line).
    sess.close()
    ### END CODE HERE ###

    return one_hot

labels = np.array([1, 2, 3, 0, 2, 1])
one_hot = one_hot_matrix(labels, C=4)
print("one_hot = " + str(one_hot))
Output:
one_hot = [[0. 0. 0. 1. 0. 0.]
 [1. 0. 0. 0. 0. 1.]
 [0. 1. 0. 0. 1. 0.]
 [0. 0. 1. 0. 0. 0.]]
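The layout of this matrix follows from axis=0: each row corresponds to a class and each column to one label. A plain-Python sketch of the same layout is given below; `one_hot_reference` is a hypothetical helper written for illustration and is not how tf.one_hot is implemented.

```python
def one_hot_reference(labels, num_classes):
    """Build a (num_classes x len(labels)) one hot matrix as nested lists.

    Row c holds 1.0 in every column whose label equals c, matching the
    axis=0 layout used above (hypothetical helper for illustration).
    """
    return [[1.0 if label == c else 0.0 for label in labels]
            for c in range(num_classes)]

labels_list = [1, 2, 3, 0, 2, 1]
one_hot_rows = one_hot_reference(labels_list, num_classes=4)
for row in one_hot_rows:
    print(row)
```

Each column contains exactly one 1.0, in the row given by that column's label.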
1.5 – Initializing with zeros and ones
Now you will learn how to initialize a vector of zeros or ones. The function you will be calling is tf.ones(). To initialize with zeros you could use tf.zeros() instead. These functions take a shape and return an array of that shape filled with ones or zeros, respectively.
Exercise: Implement the function below to take in a shape and return an array of ones with that shape.
tf.ones(shape)
Code:
# GRADED FUNCTION: ones

def ones(shape):
    """
    Creates an array of ones of dimension shape

    Arguments:
    shape -- shape of the array you want to create

    Returns:
    ones -- array containing only ones
    """

    ### START CODE HERE ###
    # Create "ones" tensor using tf.ones(...).
    ones = tf.ones(shape)

    # Create the session (approx. 1 line)
    sess = tf.Session()

    # Run the session to compute 'ones'
    ones = sess.run(ones)

    # Close the session (approx. 1 line).
    sess.close()
    ### END CODE HERE ###

    return ones

print("ones = " + str(ones([3])))
Result: ones = [1. 1. 1.]
Part 2 will be posted later.