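Main Content
Introduction to Portrait Segmentation
Portrait segmentation has a wide range of applications. For example, it makes background replacement possible, which enables all kinds of striking effects. If we extend the training data to full-body segmentation, we can apply beautification effects to the person and different effects to the background at the same time, which makes the whole picture more interesting and more flattering. Here, beautification and color grading are applied to the human foreground, and the following effects are applied to the background:
Examples:
Example sources: https://blog.csdn.net/Trent1985/article/details/80578841
https://zhuanlan.zhihu.com/p/48080465 (background graying)
Before any of these effects can be produced, one step is always required: cutting the person out of the image. The main topic today is how to use UNet for portrait segmentation.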
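To make the role of the mask concrete, here is a minimal sketch of the background-graying effect mentioned above. It is only an illustration, not code from the linked articles: it assumes an RGB image and a binary person mask held as NumPy arrays, and the function name `gray_background` is made up for this example.

```python
import numpy as np

def gray_background(image, mask):
    """Keep the person in color and turn the background gray.

    image: H x W x 3 uint8 RGB array
    mask:  H x W array, 1 where the person is, 0 elsewhere (assumed layout)
    """
    img = image.astype(np.float32)
    # Per-pixel gray value using the common BT.601 luma weights.
    gray = img @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray_rgb = np.repeat(gray[..., None], 3, axis=2)
    # Broadcast the mask over the color channels and blend.
    m = mask.astype(np.float32)[..., None]
    out = m * img + (1.0 - m) * gray_rgb
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Tiny 1x2 "image": the left pixel is the person, the right pixel is background.
image = np.array([[[200, 100, 50], [10, 20, 30]]], dtype=np.uint8)
mask = np.array([[1, 0]])
result = gray_background(image, mask)
```

The person pixel is returned unchanged, while the background pixel ends up with three equal channel values. Background replacement works the same way, with a second image taking the place of `gray_rgb`.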
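Introduction to UNet
UNet has a very simple structure and is widely used in medical image segmentation. It was published at MICCAI in 2015 and, at the time of writing, has 8,894 citations on Google Scholar, which speaks to its influence.
UNet's structure has two defining features: the U-shaped architecture and skip connections (see the figure below).
The UNet network resembles the letter U. It first downsamples with Conv (twice) + Pooling; it then upsamples with a deconvolution (transposed convolution; some implementations use resize + linear interpolation instead), crops the earlier low-level feature map, and fuses the two; it then upsamples again. This process repeats until the network produces a 388x388x2 feature map, and a final softmax yields the output segmentation map. Overall, the approach is very similar to FCN.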
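The 388x388 figure can be checked with simple arithmetic. The sketch below assumes the configuration of the original U-Net paper, which is not spelled out above: a 572x572 input, unpadded 3x3 convolutions (each trimming 2 pixels), 2x2 pooling, and four down/up levels. The function name `unet_output_size` is made up for illustration.

```python
def unet_output_size(size, depth=4):
    """Trace the spatial size through a U-Net built from unpadded 3x3 convs."""
    def double_conv(s):
        return s - 4  # two 3x3 convs without padding, each trims 2 pixels

    # Contracting path: double conv, then 2x2 max pooling.
    for _ in range(depth):
        size = double_conv(size)
        if size % 2:
            raise ValueError("feature map size must be even before pooling")
        size //= 2
    size = double_conv(size)  # bottom of the U
    # Expanding path: 2x upsampling, then double conv.
    for _ in range(depth):
        size = double_conv(size * 2)
    return size

print(unet_output_size(572))  # 388
```

The PyTorch code later in this post uses `padding=1` in its convolutions, so there the output keeps the spatial size of the input and no cropping is needed.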
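Where U-Net differs from FCN is in how features are fused: FCN adds the corresponding feature maps element-wise, whereas U-Net concatenates them along the channel dimension.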
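The difference between the two fusion styles shows up directly in the tensor shapes. The following is a small sketch with made-up feature maps (the sizes and variable names are assumptions chosen for illustration), not code from either network.

```python
import torch

# Hypothetical feature maps: batch of 1, 64 channels, 32x32 pixels.
encoder_feat = torch.randn(1, 64, 32, 32)   # from the contracting path
decoder_feat = torch.randn(1, 64, 32, 32)   # upsampled, from the expanding path

# FCN-style fusion: element-wise sum, channel count unchanged.
fused_sum = encoder_feat + decoder_feat

# U-Net-style fusion: concatenate along the channel dimension (dim=1).
fused_cat = torch.cat([encoder_feat, decoder_feat], dim=1)

print(tuple(fused_sum.shape))  # (1, 64, 32, 32)
print(tuple(fused_cat.shape))  # (1, 128, 32, 32)
```

Concatenation keeps both sets of features intact and leaves it to the following convolutions to decide how to combine them, at the cost of doubling the channel count.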
References: https://zhuanlan.zhihu.com/p/57437131
https://www.zhihu.com/question/269914775/answer/586501606
https://www.zhihu.com/people/george-zhang-84/posts
Portrait Segmentation with UNet
This project is a modification of https://github.com/milesial/Pytorch-UNet (2.6k stars, car segmentation) and comes with a portrait segmentation dataset (1.15 GB).
Portrait segmentation project link: https://github.com/leijue222/portrait-matting-unet-flask
Official dataset download link: http://www.cse.cuhk.edu.hk/leojia/projects/automatting/index.html
Or:
Baidu Netdisk: http://pan.baidu.com/s/1dE14537
Password: ndg8
The project already ships a pretrained model. If you do not want to retrain, clone the repository and follow the steps below one by one.
Environment Setup
Python 3.6
PyTorch >= 1.1.0
Torchvision >= 0.3.0
Flask 1.1.1
future 0.18.2
matplotlib 3.1.3
numpy 1.16.0
Pillow 6.2.0
protobuf 3.11.3
tensorboard 1.14.0
tqdm 4.42.1
# Clone the project
git clone https://github.com/leijue222/portrait-matting-unet-flask.git
# Enter the project folder
cd portrait-matting-unet-flask/
# Prepare a portrait image to segment, then run the command below to generate and save the mask
python predict.py -i image.jpg -o output.jpg
Test demo provided by the author
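How a network's raw output becomes a saved mask can be sketched as follows. This is not taken from the project's `predict.py`; it assumes a single-channel output that is passed through a sigmoid and thresholded at 0.5, and `logits_to_mask` is a hypothetical helper.

```python
import numpy as np

def logits_to_mask(logits, threshold=0.5):
    """Convert an H x W array of raw scores into a 0/255 uint8 mask."""
    probs = 1.0 / (1.0 + np.exp(-logits))   # sigmoid maps scores to (0, 1)
    return (probs > threshold).astype(np.uint8) * 255

# Made-up 2x2 scores: positive values lean "person", negative lean "background".
logits = np.array([[-3.0, 0.2], [4.0, -0.1]])
mask = logits_to_mask(logits)
print(mask.tolist())  # [[0, 255], [255, 0]]
```

Raising the threshold makes the mask more conservative, keeping only pixels the model is more confident about.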
Retraining is also straightforward: using the dataset provided above, place the original images in data/imgs and the masks in data/masks.
Then run the following command:
python train.py -e 200 -b 1 -l 0.1 -s 0.5 -v 15.0
Meaning of each parameter:
-e: number of epochs
-b: batch size
-l: learning rate
-s: scale (the downscaling factor applied to the images)
-v: percentage of the data used as the validation set
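As an illustration of how such flags are typically consumed, here is a minimal `argparse` sketch. It is not the project's `train.py`: the long option names, the defaults, and the dataset size are assumptions made for this example; only the short flags and their meanings come from the list above.

```python
import argparse

def get_args(argv=None):
    parser = argparse.ArgumentParser(description="Train a UNet on images and masks")
    # Long option names and defaults are hypothetical.
    parser.add_argument("-e", "--epochs", type=int, default=5)
    parser.add_argument("-b", "--batch-size", type=int, default=1)
    parser.add_argument("-l", "--learning-rate", type=float, default=0.1)
    parser.add_argument("-s", "--scale", type=float, default=0.5)
    parser.add_argument("-v", "--validation", type=float, default=10.0,
                        help="percentage of the data used for validation (0-100)")
    return parser.parse_args(argv)

# Same flags as the training command above.
args = get_args("-e 200 -b 1 -l 0.1 -s 0.5 -v 15.0".split())

n_images = 2000  # hypothetical dataset size
n_val = int(n_images * args.validation / 100)
n_train = n_images - n_val
print(args.epochs, args.batch_size, n_train, n_val)  # 200 1 1700 300
```

With `-v 15.0`, 15% of the images are held out for validation and the remaining 85% are used for training.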
Finally, let's look at the core code of the UNet network.
Defining the main building blocks that UNet needs:
import torch
import torch.nn as nn
import torch.nn.functional as F


class DoubleConv(nn.Module):
    '''(convolution => [BN] => ReLU) * 2'''

    def __init__(self, in_channels, out_channels):
        super().__init__()
        # padding=1 keeps the spatial size unchanged for 3x3 kernels.
        self.double_conv = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        return self.double_conv(x)

class Down(nn.Module):
    '''Downscaling with maxpool then double conv'''

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.maxpool_conv = nn.Sequential(
            nn.MaxPool2d(2),
            DoubleConv(in_channels, out_channels)
        )

    def forward(self, x):
        return self.maxpool_conv(x)

class Up(nn.Module):
    '''Upscaling then double conv'''

    def __init__(self, in_channels, out_channels, bilinear=True):
        super().__init__()
        # If bilinear, upsample by interpolation and let the following
        # convolutions reduce the number of channels.
        if bilinear:
            self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
        else:
            # The transposed convolution acts on x1 alone, which carries
            # in_channels // 2 channels before concatenation.
            self.up = nn.ConvTranspose2d(in_channels // 2, in_channels // 2, kernel_size=2, stride=2)
        self.conv = DoubleConv(in_channels, out_channels)

    def forward(self, x1, x2):
        x1 = self.up(x1)
        # Inputs are NCHW; pad x1 so that it matches the spatial size of x2.
        # Plain ints are used because F.pad expects integer padding sizes.
        diffY = x2.size(2) - x1.size(2)
        diffX = x2.size(3) - x1.size(3)
        x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2,
                        diffY // 2, diffY - diffY // 2])
        # Concatenate the skip connection and the upsampled map along channels.
        x = torch.cat([x2, x1], dim=1)
        return self.conv(x)

class OutConv(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(OutConv, self).__init__()
        # 1x1 convolution that maps the features to per-class scores.
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.conv(x)
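The padding step in `Up.forward` matters when an input side is odd: pooling rounds down, so the upsampled map comes out one pixel short of the skip connection. The sketch below reproduces only that arithmetic in plain Python; `skip_padding` is a hypothetical helper, and the 75x101 input size is an assumption chosen to trigger the mismatch.

```python
def skip_padding(up_hw, skip_hw):
    """Return (left, right, top, bottom) padding that grows up_hw to skip_hw."""
    diff_y = skip_hw[0] - up_hw[0]
    diff_x = skip_hw[1] - up_hw[1]
    # Same split as in Up.forward: any odd remainder goes to the right/bottom.
    return (diff_x // 2, diff_x - diff_x // 2,
            diff_y // 2, diff_y - diff_y // 2)

# A 75x101 map pools down to 37x50; upsampling by 2 then gives 74x100.
print(skip_padding((74, 100), (75, 101)))  # (0, 1, 0, 1)
# Sizes that are already equal need no padding.
print(skip_padding((64, 64), (64, 64)))    # (0, 0, 0, 0)
```

The tuple order matches what `F.pad` expects for the last two dimensions of a tensor: width padding first, then height padding.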
With the modules defined above, the UNet network is easy to implement:
class UNet(nn.Module):
    def __init__(self, n_channels, n_classes, bilinear=True):
        super(UNet, self).__init__()
        self.n_channels = n_channels
        self.n_classes = n_classes
        self.bilinear = bilinear

        # Contracting path
        self.inc = DoubleConv(n_channels, 64)
        self.down1 = Down(64, 128)
        self.down2 = Down(128, 256)
        self.down3 = Down(256, 512)
        self.down4 = Down(512, 512)
        # Expanding path; in_channels counts the concatenated skip connection
        self.up1 = Up(1024, 256, bilinear)
        self.up2 = Up(512, 128, bilinear)
        self.up3 = Up(256, 64, bilinear)
        self.up4 = Up(128, 64, bilinear)
        self.outc = OutConv(64, n_classes)

    def forward(self, x):
        x1 = self.inc(x)
        x2 = self.down1(x1)
        x3 = self.down2(x2)
        x4 = self.down3(x3)
        x5 = self.down4(x4)
        # Each up block fuses the upsampled map with the matching encoder map.
        x = self.up1(x5, x4)
        x = self.up2(x, x3)
        x = self.up3(x, x2)
        x = self.up4(x, x1)
        logits = self.outc(x)
        return logits
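The input channel counts passed to the `Up` blocks (1024, 512, 256, 128) follow from concatenation: each one is the sum of the skip connection's channels and the channels coming up from below. The bookkeeping can be sketched in plain Python; `up_block_channels` is a made-up helper whose default values are copied from the constructor above.

```python
def up_block_channels(encoder_channels=(64, 128, 256, 512), bottleneck=512,
                      up_out=(256, 128, 64, 64)):
    """Return (concatenated_in, out) channel counts for up1..up4."""
    x = bottleneck                      # channels of x5
    result = []
    # Skips are consumed deepest first: x4, x3, x2, x1.
    for skip, out in zip(reversed(encoder_channels), up_out):
        result.append((skip + x, out))  # torch.cat along dim=1 adds channel counts
        x = out
    return result

print(up_block_channels())
# [(1024, 256), (512, 128), (256, 64), (128, 64)]
```

Each pair matches the arguments of `Up(1024, 256)`, `Up(512, 128)`, `Up(256, 64)` and `Up(128, 64)` in the class definition.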
Resource Summary
Portrait segmentation project link: https://github.com/leijue222/portrait-matting-unet-flask
Dataset download
Baidu Netdisk: http://pan.baidu.com/s/1dE14537
Password: ndg8
Official download link: http://www.cse.cuhk.edu.hk/leojia/projects/automatting/index.html
UNet background references:
https://zhuanlan.zhihu.com/p/57437131
https://www.zhihu.com/question/269914775/answer/586501606
https://www.zhihu.com/people/george-zhang-84/posts