CNN-Based Image Defect Classification

Source: 博客園 (cnblogs). Original post: https://www.cnblogs.com/BellaVita/p/10142266.html
1. Introduction
In industrial product defect inspection, defect classification based on traditional hand-crafted image features does not reach the accuracy required by actual production, so this post uses a CNN for defect classification instead.
The traditional defect classification pipeline:
1. Defect image extraction: use fairly involved image processing to separate the defect regions from the captured images.
2. Feature vector construction: analyze the characteristics of each defect type and define the n features to extract (for example defect length, width, contrast, texture features, entropy, gradient), which together form a feature vector describing the defect. Building this vector requires deep analysis of the concrete problem and solid image processing expertise; it is the hardest part of traditional classification.
3. Feature normalization: the feature dimensions live on very different scales (for example a defect length of 50 pixels versus a contrast of 0.03), so feature scaling and normalization are required.
4. Manual labeling: store the defect images in folders according to hand-assigned classes.
5. Classification: train an SVM on the feature vectors; classification accuracy is around 85%. (A minimal sketch of this pipeline follows the list.)
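The steps above map naturally onto scikit-learn. The sketch below is only an illustration of the traditional pipeline, assuming a hypothetical feature matrix X (one row of hand-crafted features per defect) and manually assigned integer labels y; the feature count and the data are placeholders, not from the original post.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# placeholders: n defects x 6 hand-crafted features
# (e.g., length, width, contrast, texture, entropy, gradient)
X = np.random.rand(200, 6)
y = np.random.randint(0, 3, 200)  # manual labels (step 4)

(X_train, X_test, y_train, y_test) = train_test_split(
    X, y, test_size=0.25, random_state=42)

# step 3 (feature scaling) and step 5 (SVM) combined in one pipeline
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))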
2. Building the CNN
After defect extraction and manual labeling, a CNN model is built. Since industrial inspection has strict real-time requirements, a relatively simple network structure is chosen to speed up both training and inference.
Network construction: following the basic ideas of the LeNet architecture, this post builds a simple network.


Figure 1: Network model summary output by TensorFlow
3. Model Training and Testing
3.1 Testing the Initial Model
At first I expected the model to overfit, but the accuracy and loss curves show no overfitting. Instead, the model fell into a locally cyclic state during the early iterations, possibly because no particularly good features were learned, because the randomly selected training subset was not fully shuffled, or simply because there were too few training epochs. Accuracy on the training set is somewhat low, so a better model is needed; but how should the model be changed? Although a CNN learns its filters by itself, it is still hard to see clearly what the image looks like after each filtering stage (Figures 2 and 3), which is rather unsatisfying for someone who has long worked on low-level image algorithms.


Figure 2: First convolutional layer

Figure 3: ReLU activation layer
Analyzing Figure 2, the overall filtering works reasonably well: the defect regions all show up clearly. However, the input defects are concave (dented downward), while many of the filtered defects appear convex (bulging upward), which does not match reality.
Analyzing Figure 3, after the ReLU activation only the clearly concave defect feature maps survive, but there are too few effective feature maps (FeatureMaps): only 2. (A sketch for counting effective maps follows Figure 5.)

Figure 4: Convex (upward-bulging) image data

Figure 5: Concave (downward-dented) image data
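The count of "effective" feature maps can be made concrete. The sketch below is an assumed check, reusing the activations list produced by the activation_model at the end of the code in Section 4 and assuming index 1 is the first ReLU layer; a channel whose activation is essentially all zero contributes no defect information.

import numpy as np

# total activation energy per channel of the ReLU layer (index 1 assumed)
relu_maps = activations[1]              # shape: (1, H, W, channels)
energy = relu_maps.sum(axis=(0, 1, 2))  # ReLU outputs are >= 0
effective = int(np.count_nonzero(energy > 1e-6))
print("effective feature maps:", effective, "of", relu_maps.shape[-1])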
To obtain more feature maps that match the actual defects, the defect edges need to be emphasized more strongly so that they are not swamped by the large surrounding background region; the kernel size is therefore reduced from the original 5x5 to 3x3.
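In the LeNet class of Section 4, this is a one-argument change to each Conv2D layer; a sketch of the edit:

# before: 5x5 kernels, as in the classic LeNet design
model.add(Conv2D(20, (5, 5), padding="same", input_shape=inputShape))
# after: 3x3 kernels to sharpen the response to defect edges
model.add(Conv2D(20, (3, 3), padding="same", input_shape=inputShape))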
3.2 After Reducing the Kernel Size
The model's overall accuracy rises noticeably, and the number of effective FeatureMaps after the ReLU increases. One open question: the accuracy on the validation set is 5 to 8 points higher than on the training set.
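A plausible explanation (my addition, not from the original post): the augmentation generator distorts only the training batches, so the training accuracy reported by Keras is measured on a harder, perturbed version of the data than the clean validation set. A quick check, reusing the variables from the script in Section 4, is to evaluate the trained model on the un-augmented training data:

# if the gap disappears on clean training data, the augmentation
# (not the model) explains the lower reported training accuracy
train_loss, train_acc = model.evaluate(trainX, trainY, verbose=0)
val_loss, val_acc = model.evaluate(testX, testY, verbose=0)
print("clean train acc: %.3f, val acc: %.3f" % (train_acc, val_acc))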



4. Code

# -*- coding: utf-8 -*-
# @Time    : 18-7-25 2:33 PM
# @Author  : DuanBin
# @Email   : [email protected]
# @File    : catl_train.py
# @Software: PyCharm

# USAGE
# python catl_train.py --dataset data --model catl.model
# (the paths are hardcoded below; the argparse options are not wired up)

# set the matplotlib backend before importing pyplot so figures
# can be saved in the background
import matplotlib
matplotlib.use("Agg")

# import the necessary packages
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing.image import img_to_array
from keras.optimizers import Adam
from keras.utils import to_categorical
from keras.models import Model
from sklearn.model_selection import train_test_split
from lenet import LeNet
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import random
import cv2
import os

dataPath = "data"
modelPath = "catl_5_5.model"
plotPath = "catl_plot_5_5_blog.png"

# initialize the number of epochs to train for, initial learning rate,
# batch size, number of classes, and image depth (grayscale)
EPOCHS = 50
INIT_LR = 0.001
BS = 3
classNumber = 3
imageDepth = 1

# initialize the data and labels
print("[INFO] loading images...")
data = []
labels = []

# grab the image paths and randomly shuffle them
imagePaths = sorted(list(paths.list_images(dataPath)))
random.seed(42)
random.shuffle(imagePaths)

# loop over the input images
for imagePath in imagePaths:
    # load the image as grayscale, resize it to the 28x28 network
    # input, and store it in the data list
    image = cv2.imread(imagePath, 0)
    image = cv2.resize(image, (28, 28))
    image = img_to_array(image)
    data.append(image)

    # extract the class label from the folder name in the image path
    label = imagePath.split(os.path.sep)[-2]
    if label == "dity":
        label = 0
    elif label == "tan":
        label = 1
    elif label == "valley":
        label = 2
    labels.append(label)

# scale the raw pixel intensities to the range [0, 1]
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)

# partition the data into training and testing splits using 70% of
# the data for training and the remaining 30% for testing
(trainX, testX, trainY, testY) = train_test_split(data, labels, test_size=0.3, random_state=42)
print(trainX.shape)

# convert the labels from integers to one-hot vectors
trainY = to_categorical(trainY, num_classes=classNumber)
testY = to_categorical(testY, num_classes=classNumber)
print(trainY.shape)
print(testX.shape)

# construct the image generator for data augmentation
aug = ImageDataGenerator(rotation_range=30, width_shift_range=0.1,
                         height_shift_range=0.1, shear_range=0.2, zoom_range=0.2,
                         horizontal_flip=True, fill_mode="nearest")

# initialize the model
print("[INFO] compiling model...")
model = LeNet.build(width=28, height=28, depth=imageDepth, classes=classNumber)
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="categorical_crossentropy", optimizer=opt,
              metrics=["accuracy"])
model.summary()

# train the network on augmented batches
print("[INFO] training network...")
H = model.fit_generator(aug.flow(trainX, trainY, batch_size=BS),
                        validation_data=(testX, testY),
                        steps_per_epoch=len(trainX) // BS,
                        epochs=EPOCHS, verbose=1)

# save the model and its weights to disk
print("[INFO] serializing network...")
model.save(modelPath)
model.save_weights("catl_5_5_wight.h5")

# plot the training loss and accuracy
plt.style.use("ggplot")
plt.figure()
N = EPOCHS
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, N), H.history["acc"], label="train_acc")
plt.plot(np.arange(0, N), H.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.savefig(plotPath)

# build a model that outputs every layer's activation, then run one
# test image through it to visualize the intermediate feature maps
layer_outputs = [layer.output for layer in model.layers]
activation_model = Model(inputs=model.input, outputs=layer_outputs)
activations = activation_model.predict(testX[0].reshape(1, 28, 28, 1))


def display_activation(activations, col_size, row_size, act_index):
    # plot the feature maps of layer `act_index` on a row_size x col_size grid
    activation = activations[act_index]
    activation_index = 0
    fig, ax = plt.subplots(row_size, col_size, figsize=(row_size * 2.5, col_size * 1.5))
    for row in range(0, row_size):
        for col in range(0, col_size):
            ax[row][col].imshow(activation[0, :, :, activation_index], cmap='gray')
            activation_index += 1
    plt.show()  # no-op under the "Agg" backend; use plt.savefig to keep the figure


# visualize the 20 feature maps of the first ReLU layer (index 1)
display_activation(activations, 4, 5, 1)
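For completeness, a minimal inference sketch (assumed usage, not part of the original post; the image path is hypothetical) that loads the saved model and classifies a single grayscale defect image with the same preprocessing as in training:

from keras.models import load_model
import cv2
import numpy as np

net = load_model("catl_5_5.model")
img = cv2.imread("some_defect.png", 0)  # hypothetical test image path
img = cv2.resize(img, (28, 28)).astype("float") / 255.0
probs = net.predict(img.reshape(1, 28, 28, 1))[0]
print("predicted class:", ["dity", "tan", "valley"][int(np.argmax(probs))])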


# import the necessary packages
from keras.models import Sequential
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Flatten
from keras.layers.core import Dense
from keras import backend as K


class LeNet:
    @staticmethod
    def build(width, height, depth, classes):
        # initialize the model with a channels-last input shape
        model = Sequential()
        inputShape = (height, width, depth)

        # if we are using "channels first", update the input shape
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)

        # first set of CONV => RELU => POOL layers
        model.add(Conv2D(20, (3, 3), padding="same", input_shape=inputShape))
        model.add(Activation("relu"))
        model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

        # second set of CONV => RELU => POOL layers
        model.add(Conv2D(50, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

        # first (and only) set of FC => RELU layers
        model.add(Flatten())
        model.add(Dense(500))
        model.add(Activation("relu"))

        # softmax classifier
        model.add(Dense(classes))
        model.add(Activation("softmax"))

        # return the constructed network architecture
        return model
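Example usage of the class, matching the call in the training script:

model = LeNet.build(width=28, height=28, depth=1, classes=3)
model.summary()  # two CONV => RELU => POOL blocks, a 500-unit FC layer, softmax over 3 classes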


