全連接神經(jīng)網(wǎng)絡(luò)的原理及Python實(shí)現(xiàn)
點(diǎn)擊上方“小白學(xué)視覺(jué)”,選擇加"星標(biāo)"或“置頂”
重磅干貨,第一時(shí)間送達(dá)
Author: Li Xiaowen, who has worked in data analysis and data mining. His primary development language is Python, and he is currently an algorithm engineer at a small internet company.
Github: https://github.com/tushushu
1. Theory
Let's explain how a fully connected neural network works in plain language, rather than with walls of math.
1.1 Network Structure
A rough sketch of the network structure, drawn in PowerPoint:

1.2 Simoid函數(shù)
Sigmoid函數(shù)的表達(dá)式是:

不難得出:

所以,Sigmoid函數(shù)的值域是(0, 1),導(dǎo)數(shù)為y * (1 - y)
1.3 鏈?zhǔn)角髮?dǎo)
z = f(y)
y = g(x)
dz / dy = f'(y)
dy / dx = g'(x)
dz / dz = dz / dy * dy / dx = f'(y) * g'(x)
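A tiny numeric example may help. Taking z = y^2 and y = 3x (my own toy choices), the chain-rule product f'(y) * g'(x) should match a finite-difference estimate of dz/dx:

```python
def g(x):
    return 3.0 * x     # y = g(x), so g'(x) = 3

def f(y):
    return y ** 2      # z = f(y), so f'(y) = 2y

x = 2.0
y = g(x)
analytic = (2 * y) * 3.0   # f'(y) * g'(x), i.e. dz/dx

# finite-difference check of dz/dx at x = 2
eps = 1e-6
numeric = (f(g(x + eps)) - f(g(x - eps))) / (2 * eps)
```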
1.4 向前傳播
將當(dāng)前節(jié)點(diǎn)的所有輸入執(zhí)行當(dāng)前節(jié)點(diǎn)的計(jì)算,作為當(dāng)前節(jié)點(diǎn)的輸出節(jié)點(diǎn)的輸入。
1.5 反向傳播
將當(dāng)前節(jié)點(diǎn)的輸出節(jié)點(diǎn)對(duì)當(dāng)前節(jié)點(diǎn)的梯度損失,乘以當(dāng)前節(jié)點(diǎn)對(duì)輸入節(jié)點(diǎn)的偏導(dǎo)數(shù),作為當(dāng)前節(jié)點(diǎn)的輸入節(jié)點(diǎn)的梯度損失。
1.6 拓?fù)渑判?/strong>
假設(shè)我們的神經(jīng)網(wǎng)絡(luò)中有k個(gè)節(jié)點(diǎn),任意一個(gè)節(jié)點(diǎn)都有可能有多個(gè)輸入,需要考慮節(jié)點(diǎn)執(zhí)行的先后順序,原則就是當(dāng)前節(jié)點(diǎn)的輸入節(jié)點(diǎn)全部執(zhí)行之后,才可以執(zhí)行當(dāng)前節(jié)點(diǎn)。
2. 實(shí)現(xiàn)篇
本人用全宇宙最簡(jiǎn)單的編程語(yǔ)言——Python實(shí)現(xiàn)了全連接神經(jīng)網(wǎng)絡(luò),便于學(xué)習(xí)和使用。簡(jiǎn)單說(shuō)明一下實(shí)現(xiàn)過(guò)程,更詳細(xì)的注釋請(qǐng)參考本人github上的代碼。
2.1 創(chuàng)建BaseNode抽象類
將BaseNode作為各種類型Node的父類。包括如下屬性:
name -- 節(jié)點(diǎn)名稱
value -- 節(jié)點(diǎn)數(shù)據(jù)
inbound_nodes -- 輸入節(jié)點(diǎn)
outbound_nodes -- 輸出節(jié)點(diǎn)
gradients -- 對(duì)于輸入節(jié)點(diǎn)的梯度
from abc import ABC, abstractmethod
from copy import copy
from typing import Union, Tuple

import numpy as np
from numpy import ndarray
from numpy.random import choice

class BaseNode(ABC):
    def __init__(self, *inbound_nodes, name=None):
        self.name = name
        self._value = None
        self.inbound_nodes = list(inbound_nodes)
        self.outbound_nodes = []
        self.gradients = dict()
        for node in self.inbound_nodes:
            node.outbound_nodes.append(self)

    def __str__(self):
        size = str(self.value.shape) if self.value is not None else "null"
        return "<Node name: %s, Node size: %s>" % (self.name, size)

    @property
    def value(self) -> ndarray:
        return self._value

    @value.setter
    def value(self, value):
        err_msg = "'value' has to be a number or a numpy array!"
        assert isinstance(value, (ndarray, int, float)), err_msg
        self._value = value

    @abstractmethod
    def forward(self):
        return

    @abstractmethod
    def backward(self):
        return
2.2 創(chuàng)建InputNode類
用于存儲(chǔ)訓(xùn)練、測(cè)試數(shù)據(jù)。其中indexes屬性用來(lái)存儲(chǔ)每個(gè)Batch中的數(shù)據(jù)下標(biāo)。
class InputNode(BaseNode):
    def __init__(self, value: ndarray, name=None):
        BaseNode.__init__(self, name=name)
        self.value = value
        self.indexes = None

    @property
    def value(self):
        err_msg = "Indexes is None!"
        assert self.indexes is not None, err_msg
        return self._value[self.indexes]

    @value.setter
    def value(self, value: ndarray):
        BaseNode.value.fset(self, value)

    def forward(self):
        return

    def backward(self):
        self.gradients = {self: 0}
        for node in self.outbound_nodes:
            self.gradients[self] += node.gradients[self]
2.3 創(chuàng)建LinearNode類
用于執(zhí)行線性運(yùn)算。
Y = WX + Bias
dY / dX = W
dY / dW = X
dY / dBias = 1
class LinearNode(BaseNode):
    def __init__(self, data: BaseNode, weights: WeightNode, bias: WeightNode, name=None):
        BaseNode.__init__(self, data, weights, bias, name=name)

    def forward(self):
        data, weights, bias = self.inbound_nodes
        self.value = np.dot(data.value, weights.value) + bias.value

    def backward(self):
        data, weights, bias = self.inbound_nodes
        self.gradients = {node: np.zeros_like(node.value) for node in self.inbound_nodes}
        for node in self.outbound_nodes:
            grad_cost = node.gradients[self]
            self.gradients[data] += np.dot(grad_cost, weights.value.T)
            self.gradients[weights] += np.dot(data.value.T, grad_cost)
            self.gradients[bias] += np.sum(grad_cost, axis=0, keepdims=False)
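As a hedge against sign or transpose mistakes, the three gradient rules above can be checked numerically. The sketch below (variable names are my own, and it assumes an upstream gradient of all ones, i.e. the cost is simply the sum of Y) compares dCost/dW against a finite difference:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
b = rng.normal(size=2)

def cost(X, W, b):
    # a scalar "loss": the sum of all entries of Y = XW + b
    return np.sum(np.dot(X, W) + b)

# with dCost/dY = ones, the backward rules above give:
grad_cost = np.ones((4, 2))
grad_X = np.dot(grad_cost, W.T)
grad_W = np.dot(X.T, grad_cost)
grad_b = np.sum(grad_cost, axis=0)

# finite-difference check on one entry of W
eps = 1e-6
W_pert = W.copy()
W_pert[0, 0] += eps
numeric = (cost(X, W_pert, b) - cost(X, W, b)) / eps
```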
2.4 創(chuàng)建MseNode類
用于計(jì)算預(yù)測(cè)值與實(shí)際值的差異。
MSE = (label - prediction) ^ 2 / n_label
dMSE / dLabel = 2 * (label - prediction) / n_label
dMSE / dPrediction = -2 * (label - prediction) / n_label
class MseNode(BaseNode):
    def __init__(self, label: InputNode, pred: LinearNode, name=None):
        BaseNode.__init__(self, label, pred, name=name)
        self.n_label = None
        self.diff = None

    def forward(self):
        label, pred = self.inbound_nodes
        self.n_label = label.value.shape[0]
        self.diff = (label.value - pred.value).reshape(-1, 1)
        self.value = np.mean(self.diff ** 2)

    def backward(self):
        label, pred = self.inbound_nodes
        self.gradients[label] = (2 / self.n_label) * self.diff
        self.gradients[pred] = -self.gradients[label]
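The MSE gradients can likewise be checked numerically. A small sketch with made-up numbers:

```python
import numpy as np

label = np.array([1.0, 2.0, 3.0])
pred = np.array([1.5, 1.5, 2.0])
n = label.shape[0]
diff = label - pred

mse = np.mean(diff ** 2)
grad_pred = (-2.0 / n) * diff   # dMSE / dPrediction, as derived above

# finite-difference check on the first prediction
eps = 1e-6
pred2 = pred.copy()
pred2[0] += eps
numeric = (np.mean((label - pred2) ** 2) - mse) / eps
```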
2.5 創(chuàng)建SigmoidNode類
用于計(jì)算Sigmoid值。
Y = 1 / (1 + e^(-X))
dY / dX = Y * (1 - Y)
class SigmoidNode(BaseNode):
    def __init__(self, input_node: LinearNode, name=None):
        BaseNode.__init__(self, input_node, name=name)

    @staticmethod
    def _sigmoid(arr: ndarray) -> ndarray:
        return 1. / (1. + np.exp(-arr))

    @staticmethod
    def _derivative(arr: ndarray) -> ndarray:
        return arr * (1 - arr)

    def forward(self):
        input_node = self.inbound_nodes[0]
        self.value = self._sigmoid(input_node.value)

    def backward(self):
        input_node = self.inbound_nodes[0]
        self.gradients = {input_node: np.zeros_like(input_node.value)}
        for output_node in self.outbound_nodes:
            grad_cost = output_node.gradients[self]
            self.gradients[input_node] += self._derivative(self.value) * grad_cost
2.6 創(chuàng)建WeightNode類
用于存儲(chǔ)、更新權(quán)重。
class WeightNode(BaseNode):
    def __init__(self, shape: Union[Tuple[int, int], int], name=None, learning_rate=None):
        BaseNode.__init__(self, name=name)
        if isinstance(shape, int):
            self.value = np.zeros(shape)
        if isinstance(shape, tuple):
            self.value = np.random.randn(*shape)
        self.learning_rate = learning_rate

    def forward(self):
        pass

    def backward(self):
        self.gradients = {self: 0}
        for node in self.outbound_nodes:
            self.gradients[self] += node.gradients[self]
        partial = self.gradients[self]
        self.value -= partial * self.learning_rate
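The update in backward, self.value -= partial * self.learning_rate, is plain gradient descent. A toy illustration (my own example, minimizing (w - 3)^2) shows the rule pulling a parameter toward the minimizer:

```python
# Gradient descent: w <- w - learning_rate * dLoss/dw,
# applied to the one-dimensional loss (w - 3)^2.
w = 0.0
learning_rate = 0.1
for _ in range(100):
    grad = 2.0 * (w - 3.0)   # derivative of (w - 3)^2
    w -= learning_rate * grad
# w converges toward the minimizer 3.0
```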
2.7 創(chuàng)建全連接神經(jīng)網(wǎng)絡(luò)類
class MLP:
    def __init__(self):
        self.nodes_sorted = []
        self._learning_rate = None
        self.data = None
        self.prediction = None
        self.label = None
2.8 網(wǎng)絡(luò)結(jié)構(gòu)
def __str__(self):
    if not self.nodes_sorted:
        return "Network has not been trained yet!"
    ret = ["Network information:\n", "learning rate:", str(self._learning_rate), "\n"]
    for node in self.nodes_sorted:
        ret.append(node.name)
        ret.append(str(node.value.shape))
        ret.append("\n")
    return " ".join(ret)
2.9 學(xué)習(xí)率
存儲(chǔ)學(xué)習(xí)率,并賦值給所有權(quán)重節(jié)點(diǎn)。
@property
def learning_rate(self) -> float:
    return self._learning_rate

@learning_rate.setter
def learning_rate(self, learning_rate):
    self._learning_rate = learning_rate
    for node in self.nodes_sorted:
        if isinstance(node, WeightNode):
            node.learning_rate = learning_rate
2.10 拓?fù)渑判?/strong>
實(shí)現(xiàn)拓?fù)渑判颍瑢⒐?jié)點(diǎn)按照更新順序排列。
def topological_sort(self, input_nodes):
    nodes_sorted = []
    que = copy(input_nodes)
    unique = set()
    while que:
        node = que.pop(0)
        nodes_sorted.append(node)
        unique.add(node)
        for outbound_node in node.outbound_nodes:
            if all(x in unique for x in outbound_node.inbound_nodes):
                que.append(outbound_node)
    self.nodes_sorted = nodes_sorted
2.11 前向傳播和反向傳播
def forward(self):
    assert self.nodes_sorted, "nodes_sorted is empty!"
    for node in self.nodes_sorted:
        node.forward()

def backward(self):
    assert self.nodes_sorted, "nodes_sorted is empty!"
    for node in self.nodes_sorted[::-1]:
        node.backward()

def forward_and_backward(self):
    self.forward()
    self.backward()
2.12 Building the Network
def build_network(self, data: ndarray, label: ndarray, n_hidden: int, n_feature: int):
    weight_node1 = WeightNode(shape=(n_feature, n_hidden), name="W1")
    bias_node1 = WeightNode(shape=n_hidden, name="b1")
    weight_node2 = WeightNode(shape=(n_hidden, 1), name="W2")
    bias_node2 = WeightNode(shape=1, name="b2")
    self.data = InputNode(data, name="X")
    self.label = InputNode(label, name="y")
    linear_node1 = LinearNode(
        self.data, weight_node1, bias_node1, name="l1")
    sigmoid_node1 = SigmoidNode(linear_node1, name="s1")
    self.prediction = LinearNode(
        sigmoid_node1, weight_node2, bias_node2, name="prediction")
    MseNode(self.label, self.prediction, name="mse")
    input_nodes = [weight_node1, bias_node1,
                   weight_node2, bias_node2, self.data, self.label]
    self.topological_sort(input_nodes)
2.13 訓(xùn)練模型
使用隨機(jī)梯度下降訓(xùn)練模型。
def train_network(self, epochs: int, n_sample: int, batch_size: int):
    steps_per_epoch = n_sample // batch_size
    for i in range(epochs):
        loss = 0
        for _ in range(steps_per_epoch):
            indexes = choice(n_sample, batch_size, replace=True)
            self.data.indexes = indexes
            self.label.indexes = indexes
            self.forward_and_backward()
            loss += self.nodes_sorted[-1].value
        print("Epoch: {}, Loss: {:.3f}".format(i + 1, loss / steps_per_epoch))
    print()
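The sampling step above relies on numpy's choice to draw batch indices with replacement. A standalone sketch of the idea (using the newer numpy Generator API rather than the module-level choice imported in the original code):

```python
import numpy as np

n_sample, batch_size = 10, 4
rng = np.random.default_rng(0)
# draw batch_size row indices, with replacement, from [0, n_sample)
indexes = rng.choice(n_sample, batch_size, replace=True)
# fancy indexing then selects the corresponding rows of the data,
# just as InputNode.value returns self._value[self.indexes]
data = np.arange(n_sample * 2).reshape(n_sample, 2)
batch = data[indexes]
```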
2.14 Removing Unused Nodes
After training, the mse and label nodes are removed from the network.
def pop_unused_nodes(self):
    for _ in range(len(self.nodes_sorted)):
        node = self.nodes_sorted.pop(0)
        if node.name in ("mse", "y"):
            continue
        self.nodes_sorted.append(node)
2.15 訓(xùn)練模型
def fit(self, data: ndarray, label: ndarray, n_hidden: int, epochs: int,
        batch_size: int, learning_rate: float):
    label = label.reshape(-1, 1)
    n_sample, n_feature = data.shape
    self.build_network(data, label, n_hidden, n_feature)
    self.learning_rate = learning_rate
    print("Total number of samples = {}".format(n_sample))
    self.train_network(epochs, n_sample, batch_size)
    self.pop_unused_nodes()

def predict(self, data: ndarray) -> ndarray:
    self.data.value = data
    self.data.indexes = range(data.shape[0])
    self.forward()
    return self.prediction.value.flatten()
3 效果評(píng)估
3.1 main函數(shù)
使用著名的波士頓房?jī)r(jià)數(shù)據(jù)集,按照7:3的比例拆分為訓(xùn)練集和測(cè)試集,訓(xùn)練模型,并統(tǒng)計(jì)準(zhǔn)確度。
@run_time
def main():
    print("Testing the performance of MLP....")
    data, label = load_boston_house_prices()
    data = min_max_scale(data)
    data_train, data_test, label_train, label_test = train_test_split(
        data, label, random_state=20)
    reg = MLP()
    reg.fit(data=data_train, label=label_train, n_hidden=8,
            epochs=1000, batch_size=8, learning_rate=0.0008)
    get_r2(reg, data_test, label_test)
    print(reg)
3.2 Results
The goodness of fit is 0.803, and the run takes 6.9 seconds.
Not bad at all!

3.3 工具函數(shù)
本人自定義了一些工具函數(shù),可以在github上查看
https://github.com/tushushu/imylu/tree/master/imylu/utils
1、run_time - 測(cè)試函數(shù)運(yùn)行時(shí)間
2、load_boston_house_prices - 加載波士頓房?jī)r(jià)數(shù)據(jù)
3、train_test_split - 拆分訓(xùn)練集、測(cè)試集
4、get_r2 - 計(jì)算擬合優(yōu)度
總結(jié)
矩陣乘法
鏈?zhǔn)角髮?dǎo)
拓?fù)渑判?br style="box-sizing: border-box;font-size: inherit;color: inherit;line-height: inherit;">梯度下降