
Convolutional Neural Network (CNN): Principles and Implementation

2017.04.13

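This post takes one application of deep learning, the Convolutional Neural Network, and walks through some basic usage. It follows LeCun's Document 0.1 with a few extensions and shows the results (in Python).

It is divided into the following parts:

1. Convolution

2. Pooling (downsampling)

3. CNN structure

4. Running the experiment

Each is covered below.

PS: This blog post is reference material for the ese machine learning short course (20140516 session). It only sketches the most naive, simplest version of the ideas and focuses on practice; the theory is covered in detail in class.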

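1. Convolution

Much like Gaussian convolution, every image in the image batch is convolved. For a single image, one filter combines all of its feature maps into a single output feature map. In the code below, an image batch (two images) is processed: each image starts with 3 feature maps (R, G, B) and is convolved with two 9*9 filters, so each image ends up with two feature maps.

The convolution is implemented with theano's conv.conv2d; here the parameters W and b are random. The result looks a bit like an edge detector, doesn't it?

Code: (see the comments for details)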


# -*- coding: utf-8 -*-
"""
Created on Sat May 10 18:55:26 2014
@author: rachel

Function: convolution operation on two pictures of the same size (width, height)
input: 3 feature maps (3 channels <RGB> of a picture)
convolution: two 9*9 convolutional filters
"""
from theano.tensor.nnet import conv
import theano.tensor as T
import numpy, theano

rng = numpy.random.RandomState(23455)

# symbolic variable
input = T.tensor4(name = 'input')

# initial weights
w_shape = (2,3,9,9) # 2 convolutional filters, 3 channels, filter shape: 9*9
w_bound = numpy.sqrt(3*9*9)
W = theano.shared(numpy.asarray(rng.uniform(low = -1.0/w_bound, high = 1.0/w_bound, size = w_shape),
                                dtype = input.dtype), name = 'W')

b_shape = (2,)
b = theano.shared(numpy.asarray(rng.uniform(low = -.5, high = .5, size = b_shape),
                                dtype = input.dtype), name = 'b')

conv_out = conv.conv2d(input,W)

# T.TensorVariable.dimshuffle() can reshape or broadcast (add dimension)
# dimshuffle(self,*pattern)
# >>>b1 = b.dimshuffle('x',0,'x','x')
# >>>b1.shape.eval()
# array([1,2,1,1])
output = T.nnet.sigmoid(conv_out + b.dimshuffle('x',0,'x','x'))
f = theano.function([input],output)

# demo
import pylab
from PIL import Image

#minibatch_img = T.tensor4(name = 'minibatch_img')

#-------------img1---------------
# pass the path directly so that PIL opens the file in binary mode
img1 = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel.jpg')
width1,height1 = img1.size
img1 = numpy.asarray(img1, dtype = 'float32')/256. # (height, width, 3)

# put image in 4D tensor of shape (1,3,height,width)
img1_rgb = img1.swapaxes(0,2).swapaxes(1,2).reshape(1,3,height1,width1)

#-------------img2---------------
img2 = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel1.jpg')
width2,height2 = img2.size
img2 = numpy.asarray(img2, dtype = 'float32')/256. # (height, width, 3)
img2_rgb = img2.swapaxes(0,2).swapaxes(1,2).reshape(1,3,height2,width2)

# stack the two images into one minibatch of shape (2,3,height,width);
# this only works if both images have the same size
#minibatch_img = T.join(0,img1_rgb,img2_rgb)
minibatch_img = numpy.concatenate((img1_rgb,img2_rgb),axis = 0)
filtered_img = f(minibatch_img)

# plot the original images and the two convolved results for each
pylab.subplot(2,3,1); pylab.axis('off')
pylab.imshow(img1)

pylab.subplot(2,3,4); pylab.axis('off')
pylab.imshow(img2)

pylab.gray()

pylab.subplot(2,3,2); pylab.axis('off')
pylab.imshow(filtered_img[0,0,:,:]) # 0: minibatch index; 0: 1st filter

pylab.subplot(2,3,3); pylab.axis('off')
pylab.imshow(filtered_img[0,1,:,:]) # 0: minibatch index; 1: 2nd filter

pylab.subplot(2,3,5); pylab.axis('off')
pylab.imshow(filtered_img[1,0,:,:]) # 1: minibatch index; 0: 1st filter

pylab.subplot(2,3,6); pylab.axis('off')
pylab.imshow(filtered_img[1,1,:,:]) # 1: minibatch index; 1: 2nd filter

pylab.show()
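To make the shape bookkeeping above concrete without Theano, here is a minimal NumPy sketch of the same step. It is an illustration under stated assumptions, not the code used in this post: the helper names conv2d_valid and sigmoid are made up for the example, random arrays of size 32*40 stand in for the two photos, and the kernels are flipped so that the loop computes a textbook "valid" convolution.

```python
import numpy as np

def conv2d_valid(images, filters):
    """Naive 'valid' 2D convolution (hypothetical helper, for illustration only).

    images:  (batch, in_channels, height, width)
    filters: (n_filters, in_channels, fh, fw)
    returns: (batch, n_filters, height - fh + 1, width - fw + 1)
    """
    n, c, h, w = images.shape
    k, c2, fh, fw = filters.shape
    assert c == c2, "channel counts must match"
    flipped = filters[:, :, ::-1, ::-1]  # flip kernels: convolution, not correlation
    out = np.zeros((n, k, h - fh + 1, w - fw + 1))
    for i in range(h - fh + 1):
        for j in range(w - fw + 1):
            patch = images[:, :, i:i + fh, j:j + fw]  # (n, c, fh, fw)
            # sum over channels and the filter window for every (image, filter) pair
            out[:, :, i, j] = np.tensordot(patch, flipped,
                                           axes=([1, 2, 3], [1, 2, 3]))
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.RandomState(23455)
images = rng.uniform(size=(2, 3, 32, 40))  # assumed stand-in for two RGB images
w_bound = np.sqrt(3 * 9 * 9)
W = rng.uniform(-1.0 / w_bound, 1.0 / w_bound, size=(2, 3, 9, 9))
b = rng.uniform(-0.5, 0.5, size=(2,))

# b.reshape(1, 2, 1, 1) plays the role of b.dimshuffle('x', 0, 'x', 'x')
feature_maps = sigmoid(conv2d_valid(images, W) + b.reshape(1, 2, 1, 1))
print(feature_maps.shape)  # (2, 2, 24, 32)
```

Each of the two images goes from 3 input feature maps to 2 output feature maps, and every side shrinks by 9 - 1 = 8 pixels because no padding is used.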

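2. Pooling (downsampling)

Max pooling is the most commonly used form. It addresses two problems:

1. It reduces the amount of computation

2. It gives a degree of invariance to small rotations and shifts (working out why is left to the reader)

PS: On rotation invariance, recall SIFT and LBP, which use a dominant orientation, and HOG, which uses templates at different orientations.

Max pooling with a 2*2 window halves both the height and the width of each feature map. (This is not visible in the result figure below, because Python automatically stretched the plots to the same size, but the number of pixels along each side really is halved.)

Code: (see the comments for details)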


# -*- coding: utf-8 -*-
"""
Created on Sat May 10 18:55:26 2014
@author: rachel

Function: convolution followed by max pooling
input: 3 feature maps (3 channels <RGB> of a picture)
convolution: two 9*9 convolutional filters
"""
from theano.tensor.nnet import conv
import theano.tensor as T
import numpy, theano

rng = numpy.random.RandomState(23455)

# symbolic variable
input = T.tensor4(name = 'input')

# initial weights
w_shape = (2,3,9,9) # 2 convolutional filters, 3 channels, filter shape: 9*9
w_bound = numpy.sqrt(3*9*9)
W = theano.shared(numpy.asarray(rng.uniform(low = -1.0/w_bound, high = 1.0/w_bound, size = w_shape),
                                dtype = input.dtype), name = 'W')

b_shape = (2,)
b = theano.shared(numpy.asarray(rng.uniform(low = -.5, high = .5, size = b_shape),
                                dtype = input.dtype), name = 'b')

conv_out = conv.conv2d(input,W)

# T.TensorVariable.dimshuffle() can reshape or broadcast (add dimension)
# dimshuffle(self,*pattern)
# >>>b1 = b.dimshuffle('x',0,'x','x')
# >>>b1.shape.eval()
# array([1,2,1,1])
output = T.nnet.sigmoid(conv_out + b.dimshuffle('x',0,'x','x'))
f = theano.function([input],output)

# demo
import pylab
from PIL import Image

# open the image; pass the path directly so that PIL opens the file in binary mode
img = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel.jpg')
width,height = img.size
img = numpy.asarray(img, dtype = 'float32')/256. # (height, width, 3)

# put image in 4D tensor of shape (1,3,height,width)
img_rgb = img.swapaxes(0,2).swapaxes(1,2) # (3,height,width)
minibatch_img = img_rgb.reshape(1,3,height,width)
filtered_img = f(minibatch_img)

# plot the original image and the two convolved results
pylab.figure(1)
pylab.subplot(1,3,1); pylab.axis('off')
pylab.imshow(img)
pylab.title('origin image')

pylab.gray()

pylab.subplot(2,3,2); pylab.axis('off')
pylab.imshow(filtered_img[0,0,:,:]) # 0: minibatch index; 0: 1st filter
pylab.title('convolution 1')

pylab.subplot(2,3,3); pylab.axis('off')
pylab.imshow(filtered_img[0,1,:,:]) # 0: minibatch index; 1: 2nd filter
pylab.title('convolution 2')

#pylab.show()

# maxpooling
from theano.tensor.signal import downsample

input = T.tensor4('input')
maxpool_shape = (2,2)
pooled_img = downsample.max_pool_2d(input,maxpool_shape,ignore_border = False)

maxpool = theano.function(inputs = [input],
                          outputs = [pooled_img])

# the function returns a one-element list; squeeze drops the list and batch axes
pooled_res = numpy.squeeze(maxpool(filtered_img)) # (2, pooled_height, pooled_width)

#pylab.figure(2)
pylab.subplot(235); pylab.axis('off')
pylab.imshow(pooled_res[0,:,:])
pylab.title('down sampled 1')

pylab.subplot(236); pylab.axis('off')
pylab.imshow(pooled_res[1,:,:])
pylab.title('down sampled 2')

pylab.show()
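Since the plot hides the change in size, the halving can be checked directly on a tiny array. The following is a small NumPy sketch of 2*2 max pooling, not the Theano implementation: the helper name max_pool_2x2 is made up for the example, and it assumes even height and width, so it does not reproduce the border handling of ignore_border = False.

```python
import numpy as np

def max_pool_2x2(feature_maps):
    """2x2 max pooling with stride 2 over the last two axes.

    Hypothetical helper: assumes height and width are even.
    feature_maps: (batch, channels, height, width)
    """
    n, c, h, w = feature_maps.shape
    assert h % 2 == 0 and w % 2 == 0, "this sketch does not handle odd borders"
    # split each spatial axis into (blocks, 2) and take the max inside each block
    blocks = feature_maps.reshape(n, c, h // 2, 2, w // 2, 2)
    return blocks.max(axis=(3, 5))

x = np.arange(16, dtype='float32').reshape(1, 1, 4, 4)
pooled = max_pool_2x2(x)
print(x.shape, '->', pooled.shape)  # (1, 1, 4, 4) -> (1, 1, 2, 2)
print(pooled[0, 0])                 # rows [5, 7] and [13, 15]
```

A 4*4 map becomes 2*2: each side is halved, so the total number of pixels drops to a quarter.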

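3. CNN structure

CNN diagrams are all over the place if you google them, so here is one I drew back when I was learning CNNs; with the accompanying explanation, I think it is fairly easy to follow.

Without further ado, here is the LeNet structure diagram (read it from bottom to top along the arrows; the bottom is the original input at the lowest layer):

4. CNN code

Download it from my resources page, where I have uploaded it (in Python).

Only a small part of the code is shown here, just the part that builds the network: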


# Fragment of the full code: x, batch_size, LeNetConvPoolLayer, HiddenLayer
# and LogisticRegression are defined there. The shape comments below assume
# batch_size = 20.
import numpy
import theano.tensor as T

rng = numpy.random.RandomState(23455)

# transform x from (batch_size, 28*28) to (batch_size, feature, 28, 28)
# I_shape = (28,28), F_shape = (5,5)
N_filters_0 = 20
D_features_0 = 1
layer0_input = x.reshape((batch_size,D_features_0,28,28))
layer0 = LeNetConvPoolLayer(rng, input = layer0_input, filter_shape = (N_filters_0,D_features_0,5,5),
                            image_shape = (batch_size,1,28,28))
# layer0.output: (batch_size, N_filters_0, (28-5+1)/2, (28-5+1)/2) -> (20,20,12,12)

N_filters_1 = 50
D_features_1 = N_filters_0
layer1 = LeNetConvPoolLayer(rng, input = layer0.output, filter_shape = (N_filters_1,D_features_1,5,5),
                            image_shape = (batch_size,N_filters_0,12,12))
# layer1.output: (batch_size, N_filters_1, (12-5+1)/2, (12-5+1)/2) -> (20,50,4,4)

layer2_input = layer1.output.flatten(2) # (20,50,4,4) -> (20,50*4*4)
layer2 = HiddenLayer(rng, layer2_input, n_in = 50*4*4, n_out = 500, activation = T.tanh)

layer3 = LogisticRegression(input = layer2.output, n_in = 500, n_out = 10)

layer0, layer1: each is a convolution followed by downsampling

layer2 + layer3: together they form an MLP (ANN)
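The sizes in the comments above (28 -> 12 -> 4, then 50*4*4 inputs to the hidden layer) follow from simple arithmetic, sketched below. The helper name conv_pool_output_size is made up for this example; it assumes an unpadded convolution followed by non-overlapping 2*2 pooling.

```python
def conv_pool_output_size(size, filter_size, pool_size=2):
    """Side length after a 'valid' convolution followed by pooling.

    Hypothetical helper: assumes no padding, stride 1 for the convolution,
    and non-overlapping pooling windows.
    """
    conv_size = size - filter_size + 1  # e.g. 28 - 5 + 1 = 24
    return conv_size // pool_size       # e.g. 24 // 2 = 12

side0 = conv_pool_output_size(28, 5)     # output side of layer0
side1 = conv_pool_output_size(side0, 5)  # output side of layer1
n_in = 50 * side1 * side1                # inputs to the hidden layer

print(side0, side1, n_in)  # 12 4 800
```

This is why layer2 is built with n_in = 50*4*4: each sample leaves layer1 as 50 feature maps of size 4*4, which flatten(2) turns into a vector of 800 values.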

Training the model:


# Fragment of the full code: y, learning_rate, minibatch_index,
# train_set_x and train_set_y are defined there.
cost = layer3.negative_log_likelihood(y)
params = layer3.params + layer2.params + layer1.params + layer0.params
gparams = T.grad(cost,params)

# plain SGD: one (parameter, new value) pair per parameter
updates = []
for par,gpar in zip(params,gparams):
    updates.append((par, par - learning_rate * gpar))

train_model = theano.function(inputs = [minibatch_index],
                              outputs = [cost],
                              updates = updates,
                              givens = {x: train_set_x[minibatch_index * batch_size : (minibatch_index+1) * batch_size],
                                        y: train_set_y[minibatch_index * batch_size : (minibatch_index+1) * batch_size]})

The parameters of all layers are trained against cost (the NLL output by the topmost MLP layer).
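The update rule par - learning_rate * gpar is ordinary minibatch gradient descent. As a rough illustration of the same loop without Theano, the sketch below trains a softmax regression (only the top layer, not the full CNN) on made-up toy data. Everything in it is an assumption for the example: the helper name nll_and_grads, the toy data, the learning rate, and the gradient, which is written out by hand where the original code uses T.grad.

```python
import numpy as np

def nll_and_grads(W, b, x, y):
    """Mean negative log-likelihood of softmax regression and its gradients.

    Hypothetical helper: the gradient is derived by hand here,
    whereas the Theano code obtains it from T.grad.
    """
    logits = x @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    n = x.shape[0]
    cost = -np.log(probs[np.arange(n), y]).mean()
    delta = probs.copy()
    delta[np.arange(n), y] -= 1.0  # d(cost)/d(logits) is (probs - one_hot(y)) / n
    delta /= n
    return cost, x.T @ delta, delta.sum(axis=0)

rng = np.random.RandomState(0)
x_train = rng.normal(size=(200, 5))        # toy data, not MNIST
y_train = (x_train[:, 0] > 0).astype(int)  # toy 2-class labels
W = np.zeros((5, 2))
b = np.zeros(2)
learning_rate, batch_size = 0.1, 20

costs = []
for epoch in range(20):
    for minibatch_index in range(len(x_train) // batch_size):
        rows = slice(minibatch_index * batch_size,
                     (minibatch_index + 1) * batch_size)
        cost, gW, gb = nll_and_grads(W, b, x_train[rows], y_train[rows])
        # same rule as in the updates list: par <- par - learning_rate * gpar
        for par, gpar in zip([W, b], [gW, gb]):
            par -= learning_rate * gpar
        costs.append(cost)

final_cost, _, _ = nll_and_grads(W, b, x_train, y_train)
print(costs[0], final_cost)  # the cost starts at log(2) and decreases
```

In the CNN the same loop runs over the parameters of all four layers at once, because T.grad propagates the cost back through the hidden layer and both convolution and pooling layers.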

For the rest, see the code and its comments.

PS: The data is the full MNIST dataset.

Final result:

Optimization complete. Best validation score of 0.990000 % obtained at iteration 122500, with test performance 0.950000 %

Original link: http://blog.csdn.net/abcjennifer/article/details/25912675
