小奥的学习笔记


Andrew Ng's Deep Learning Course (DeepLearning.ai) Programming Assignment (1-2), Part 1

January 29, 2018

Part 1: Python Basics with Numpy (optional assignment)

1 - Building basic functions with numpy

Numpy is the main package for scientific computing in Python. It is maintained by a large community (www.numpy.org). In this exercise you will learn several key numpy functions such as np.exp, np.log, and np.reshape. You will need to know how to use these functions for future assignments.

1.1 - sigmoid function, np.exp()

Exercise: Build a function that returns the sigmoid of x using numpy, where x can be a real number or a numpy array.

import numpy as np
def sigmoid(x):
    """
    Compute the sigmoid of x.
    x: a scalar or numpy array of any size
    Returns s = 1 / (1 + exp(-x)), applied element-wise
    """
    s = 1.0 / (1 + np.exp(-x))
    return s  # the sigmoid function value
# main
m = np.array([1,2,3])
print(sigmoid(m))
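
To see why np.exp is used here rather than math.exp, the following is a small sketch (the helper names basic_sigmoid and sigmoid_np are made up for this illustration and are not part of the assignment). It assumes only the standard math module and numpy: math.exp accepts a single number, while np.exp is applied to every element of an array.

```python
import math
import numpy as np

def basic_sigmoid(x):
    """Sigmoid built on math.exp; only works for a single real number."""
    return 1.0 / (1 + math.exp(-x))

def sigmoid_np(x):
    """Sigmoid built on np.exp; works for scalars and arrays alike."""
    return 1.0 / (1 + np.exp(-x))

print(basic_sigmoid(3))            # a single float

values = np.array([1, 2, 3])
try:
    basic_sigmoid(values)          # math.exp cannot convert an array to a float
    scalar_only = False
except TypeError:
    scalar_only = True
print("math.exp rejects arrays:", scalar_only)

print(sigmoid_np(values))          # element-wise result with the same shape as the input
```

This is the reason the rest of these notes use np.exp: inputs in deep learning are almost always vectors or matrices.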

1.2 - Sigmoid gradient

Exercise: Implement the function sigmoid_derivative() to compute the gradient of the sigmoid function with respect to its input x.

You often code this function in two steps: 
1. Set s to be the sigmoid of x. You might find your sigmoid(x) function useful. 
2. Compute σ′(x)=s(1−s)

import numpy as np
def sigmoid_derivative(x):
    s = 1.0 / (1 + 1 / np.exp(x))
    ds = s * (1-s)
    return ds
x = np.array([1, 2, 3])
print ("sigmoid_derivative(x) = " + str(sigmoid_derivative(x)))
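
One way to gain confidence in the formula σ′(x)=s(1−s) is to compare it with a numerical derivative. The sketch below is only an illustration: the step size eps and the tolerance are arbitrary choices made for this example, and the central-difference check is an added sanity test, not part of the assignment.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1 - s)

x = np.array([1.0, 2.0, 3.0])
eps = 1e-6  # arbitrary small step chosen for this illustration

# Central difference approximation of the derivative at each point
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
analytic = sigmoid_derivative(x)

print("analytic:", analytic)
print("numeric: ", numeric)
print("agree:", np.allclose(numeric, analytic, atol=1e-8))
```

If the two results disagree noticeably, the analytic gradient most likely contains a mistake.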


1.3 - Reshaping arrays

Two common numpy functions used in deep learning are np.shape and np.reshape(). 
- X.shape is used to get the shape (dimension) of a matrix/vector X. 
- X.reshape(…) is used to reshape X into some other dimension.

Exercise: Implement image2vector() that takes an input of shape (length, height, 3) and returns a vector of shape (length*height*3, 1). For example, if you would like to reshape an array v of shape (a, b, c) into a vector of shape (a*b,c) you would do:

v = v.reshape((v.shape[0]*v.shape[1], v.shape[2])) # v.shape[0]=a;v.shape[1]=b;v.shape[2]=c
  • Please don’t hardcode the dimensions of image as a constant. Instead look up the quantities you need with image.shape[0], etc.

#Reshaping arrays
def image2vector(image):
    """
    Takes an input of shape (length, height, 3) and returns a vector of shape (length*height*3, 1)
    """
    # Use the argument (not the global img) and read the sizes from image.shape
    v = image.reshape((image.shape[0] * image.shape[1] * image.shape[2], 1))
    return v
img = np.array([[[ 0.67826139,  0.29380381],
        [ 0.90714982,  0.52835647],
        [ 0.4215251 ,  0.45017551]],
 
       [[ 0.92814219,  0.96677647],
        [ 0.85304703,  0.52351845],
        [ 0.19981397,  0.27417313]],
 
       [[ 0.60659855,  0.00533165],
        [ 0.10820313,  0.49978937],
        [ 0.34144279,  0.94630077]]])
m = image2vector(img)
 
print ("image2vector(image) = " + str(m))
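
As a side note on avoiding hardcoded dimensions, here is a minimal sketch using a made-up (3, 3, 2) array. It compares the explicit product of the three dimensions with reshape(-1, 1), where numpy infers the first dimension from the total number of elements. Both are standard numpy behaviour; the variable names are chosen just for this example.

```python
import numpy as np

# Made-up array standing in for an image of shape (3, 3, 2)
image = np.arange(3 * 3 * 2, dtype=float).reshape(3, 3, 2)

# Explicit product of the three dimensions
v_explicit = image.reshape((image.shape[0] * image.shape[1] * image.shape[2], 1))

# -1 asks numpy to infer that dimension from the total number of elements
v_inferred = image.reshape(-1, 1)

print(v_explicit.shape)
print(np.array_equal(v_explicit, v_inferred))
```

Either form keeps the code independent of the actual image size.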

 

1.4 - Normalizing rows

Another common technique we use in Machine Learning and Deep Learning is to normalize our data. It often leads to a better performance because gradient descent converges faster after normalization. Here, by normalization we mean changing x to x / ∥x∥ (dividing each row vector of x by its norm).

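# GRADED FUNCTION: normalizeRows
def normalizeRows(x):
    # ord (default: 2-norm) selects which norm to compute; np.inf gives the infinity norm.
    # axis=1 computes the norm along each row.
    # keepdims=True keeps the reduced axis as a dimension of size 1,
    # so x_norm has shape (n, 1) and broadcasts against x.
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / x_norm  # broadcasting divides each row by its own norm
    return x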
 
x = np.array([[0, 3, 4],[1, 6, 4]])
print("normalizeRows(x) = " + str(normalizeRows(x)))
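
The role of keepdims=True and of broadcasting can be checked with a short sketch like the one below. It reuses the example matrix from above; the names flat_norm and needs_keepdims are introduced only for this illustration.

```python
import numpy as np

x = np.array([[0, 3, 4], [1, 6, 4]])

# With keepdims=True the norms have shape (2, 1), which broadcasts against (2, 3)
x_norm = np.linalg.norm(x, axis=1, keepdims=True)
print(x.shape, x_norm.shape)

normalized = x / x_norm
row_norms = np.linalg.norm(normalized, axis=1)
print(row_norms)                   # each row should now have norm 1

# Without keepdims the norms have shape (2,), which cannot broadcast against (2, 3)
flat_norm = np.linalg.norm(x, axis=1)
try:
    x / flat_norm
    needs_keepdims = False
except ValueError:
    needs_keepdims = True
print("division fails without keepdims:", needs_keepdims)
```

The first row [0, 3, 4] has norm 5, so it becomes [0, 0.6, 0.8] after normalization.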

What you need to remember: 
- np.exp(x) works for any np.array x and applies the exponential function to every coordinate 
- the sigmoid function and its gradient 
- image2vector is commonly used in deep learning 
- np.reshape is widely used. In the future, you'll see that keeping your matrix/vector dimensions straight will go a long way toward eliminating a lot of bugs.
- numpy has efficient built-in functions 
- broadcasting is extremely useful

2 - Vectorization

In deep learning, you deal with very large datasets. Hence, a non-computationally-optimal function can become a huge bottleneck in your algorithm and can result in a model that takes ages to run. To make sure that your code is computationally efficient, you will use vectorization. 

Note that np.dot() performs a matrix-matrix or matrix-vector multiplication. This is different from np.multiply() and the * operator (which is equivalent to .* in Matlab/Octave), which performs an element-wise multiplication.
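
The difference between np.dot() and element-wise multiplication is easy to see on a small example. The matrices and vectors below are made up for illustration. The explicit loop is shown only to make clear what the vectorized np.dot call computes.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

matrix_product = np.dot(A, B)      # rows of A times columns of B
elementwise = A * B                # same result as np.multiply(A, B)

print(matrix_product)              # [[19 22], [43 50]]
print(elementwise)                 # [[ 5 12], [21 32]]

# For 1-D vectors, np.dot is the sum of element-wise products
v = np.array([1, 2, 3])
w = np.array([4, 5, 6])

dot_loop = 0
for a, b in zip(v, w):             # non-vectorized version
    dot_loop += a * b

dot_vectorized = np.dot(v, w)      # vectorized version
print(dot_loop, dot_vectorized)    # both are 32
```

On large arrays the vectorized call is typically much faster than the Python loop, which is the motivation for this section.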

2.1 - Implement the L1 and L2 loss functions

Exercise: Implement the numpy vectorized version of the L1 loss. You may find the function abs(x) (absolute value of x) useful.

Reminder: 
- The loss is used to evaluate the performance of your model. The bigger your loss is, the more different your predictions (ŷ) are from the true values (y). In deep learning, you use optimization algorithms like Gradient Descent to train your model and to minimize the cost.

# GRADED FUNCTION: L1
import numpy as np
def L1(yhat, y):
    loss = np.sum(np.abs( y - yhat))
    return loss
 
yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L1 = " + str(L1(yhat,y)))

Exercise: Implement the numpy vectorized version of the L2 loss. There are several ways of implementing the L2 loss, but you may find the function np.dot() useful. As a reminder, if x = [x1, x2, ..., xn], then np.dot(x,x) = ∑_{j=1}^{n} x_j^2.

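#GRADED FUNCTION: L2
def L2(yhat,y):
    # Equivalent alternative: loss = np.sum(np.power(y - yhat, 2))
    # For 1-D arrays np.dot already returns the scalar sum of squares,
    # so no extra np.sum is needed.
    diff = y - yhat
    loss = np.dot(diff, diff)
    return loss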
 
yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L2 = " + str(L2(yhat,y)))
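
To make the connection between the loop form and the vectorized form explicit, here is a small sketch. The functions L1_loop and L2_loop are hypothetical helpers written for this comparison only. They compute the sum of absolute differences and the sum of squared differences one element at a time.

```python
import numpy as np

def L1_loop(yhat, y):
    """Sum of |y_i - yhat_i| computed with an explicit loop."""
    total = 0.0
    for i in range(len(y)):
        total += abs(y[i] - yhat[i])
    return total

def L2_loop(yhat, y):
    """Sum of (y_i - yhat_i)^2 computed with an explicit loop."""
    total = 0.0
    for i in range(len(y)):
        total += (y[i] - yhat[i]) ** 2
    return total

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])

l1_vec = np.sum(np.abs(y - yhat))      # vectorized L1
l2_vec = np.dot(y - yhat, y - yhat)    # vectorized L2

print(np.isclose(L1_loop(yhat, y), l1_vec))
print(np.isclose(L2_loop(yhat, y), l2_vec))
```

For these inputs the differences are 0.1, 0.2, 0.1, 0.6 and 0.1, so L1 is about 1.1 and L2 is about 0.43.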

What to remember: 
- Vectorization is very important in deep learning. It provides computational efficiency and clarity. 
- You have reviewed the L1 and L2 loss. 
- You are familiar with many numpy functions such as np.sum, np.dot, np.multiply, np.maximum, etc…

Part 2: Logistic Regression with a Neural Network mindset

You will learn to: 
- Build the general architecture of a learning algorithm, including:
  - Initializing parameters
  - Calculating the cost function and its gradient
  - Using an optimization algorithm (gradient descent)
- Gather all three functions above into a main model function, in the right order.

1 - Packages

First, let’s run the cell below to import all the packages that you will need during this assignment. 
- numpy is the fundamental package for scientific computing with Python.
- h5py is a common package to interact with a dataset that is stored on an H5 file.
- matplotlib is a famous library to plot graphs in Python.
- PIL and scipy are used here to test your model with your own picture at the end.

2 - Overview of the Problem set

Problem Statement: You are given a dataset (“data.h5”) containing: 
- a training set of m_train images labeled as cat (y=1) or non-cat (y=0) 
- a test set of m_test images labeled as cat or non-cat 
- each image is of shape (num_px, num_px, 3) where 3 is for the 3 channels (RGB). Thus, each image is square (height = num_px) and (width = num_px).

You will build a simple image-recognition algorithm that can correctly classify pictures as cat or non-cat.

Let’s get more familiar with the dataset. Load the data by running the following code.

train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()

We added “_orig” at the end of image datasets (train and test) because we are going to preprocess them. After preprocessing, we will end up with train_set_x and test_set_x (the labels train_set_y and test_set_y don’t need any preprocessing).

Each line of your train_set_x_orig and test_set_x_orig is an array representing an image. You can visualize an example by running the following code. Feel free also to change the index value and re-run to see other images.

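# Example of a picture
import matplotlib.pyplot as plt  # needed for plt.imshow

index = 19
plt.imshow(train_set_x_orig[index])
# Print the label and the decoded class name of the selected image
print("y=" + str(train_set_y[:, index]) + ", '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") + "' picture.")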

Many software bugs in deep learning come from having matrix/vector dimensions that don’t fit. If you can keep your matrix/vector dimensions straight you will go a long way toward eliminating many bugs.

Exercise: Find the values for: 
- m_train (number of training examples) 
- m_test (number of test examples) 
- num_px (= height = width of a training image) 
Remember that train_set_x_orig is a numpy-array of shape (m_train, num_px, num_px, 3). For instance, you can access m_train by writing train_set_x_orig.shape[0].

#access m_train,m_test,num_px
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
 
print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))
print ("Height/Width of each image: num_px = " + str(num_px))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_set_x shape: " + str(train_set_x_orig.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x shape: " + str(test_set_x_orig.shape))
print ("test_set_y shape: " + str(test_set_y.shape))

For convenience, you should now reshape images of shape (num_px, num_px, 3) into a numpy-array of shape (num_px ∗ num_px ∗ 3, 1). After this, our training (and test) dataset is a numpy-array where each column represents a flattened image. There should be m_train (respectively m_test) columns.

Exercise: Reshape the training and test data sets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px ∗ num_px ∗ 3, 1).

#reshape the training and test examples
train_set_x_flatten = train_set_x_orig.reshape(m_train, -1).T
test_set_x_flatten = test_set_x_orig.reshape(m_test, -1).T
print("====================divider======================")
print ("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))
print ("test_set_y shape: " + str(test_set_y.shape))
print ("sanity check after reshaping: " + str(train_set_x_flatten[0:5,0]))
train_set_x = train_set_x_flatten/255
test_set_x = test_set_x_flatten/255
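
Why reshape(m, -1).T rather than reshaping directly to the target shape? The sketch below uses a tiny made-up dataset (4 "images" of 2 x 2 pixels, filled with consecutive integers) instead of the real data. The variable names are chosen for this illustration only. It checks that each column of the result is one flattened image, and shows that a direct reshape yields the same shape but scrambles the pixels.

```python
import numpy as np

m, num_px = 4, 2   # made-up sizes for illustration
images = np.arange(m * num_px * num_px * 3).reshape(m, num_px, num_px, 3)

# Flatten each image into a row, then transpose so that each image is a column
flat = images.reshape(m, -1).T
print(flat.shape)                  # (num_px * num_px * 3, m)

# Column i of flat holds exactly the pixels of image i
columns_are_images = all(
    np.array_equal(flat[:, i], images[i].ravel()) for i in range(m)
)
print(columns_are_images)

# Pitfall: reshaping straight to the target shape mixes pixels of different images
wrong = images.reshape(num_px * num_px * 3, m)
print(wrong.shape == flat.shape, np.array_equal(wrong, flat))
```

The shapes match in both cases, so only a content check like the one above reveals the difference.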

To represent color images, the red, green and blue channels (RGB) must be specified for each pixel, and so the pixel value is actually a vector of three numbers ranging from 0 to 255.

One common preprocessing step in machine learning is to center and standardize your dataset, meaning that you subtract the mean of the whole numpy array from each example, and then divide each example by the standard deviation of the whole numpy array. But for picture datasets, it is simpler and more convenient, and works almost as well, to just divide every row of the dataset by 255 (the maximum value of a pixel channel).

Let’s standardize our dataset.

train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.
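
For comparison, the two preprocessing options described above can be sketched on a handful of made-up pixel values. This is an illustration only, not the assignment's data: dividing by 255 maps the values into the range [0, 1], while centering and standardizing produces values with mean 0 and standard deviation 1.

```python
import numpy as np

# Made-up pixel values stored as 8-bit integers, like image data
pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Option used in this assignment: divide by the maximum pixel value
scaled = pixels / 255.
print(scaled.min(), scaled.max())

# General alternative: subtract the mean and divide by the standard deviation
standardized = (pixels - pixels.mean()) / pixels.std()
print(standardized.mean(), standardized.std())
```

Note that the division converts the integer array to floating point, so no precision is lost.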

What you need to remember:

Common steps for pre-processing a new dataset are: 
- Figure out the dimensions and shapes of the problem (m_train, m_test, num_px, …) 
- Reshape the datasets such that each example is now a vector of size (num_px * num_px * 3, 1) 
- “Standardize” the data

To be continued

This work is licensed under a Creative Commons Attribution 4.0 International License
Tags: Python, deep learning, neural networks
Last updated: January 29, 2018

davidcheung
