Author | 小新

Source | https://lhyxx.top

Editor | 極市平臺

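Introduction

This article goes through the commonly used loss functions from both the theoretical and the practical side (so as to stop being the half-full bottle that sloshes the loudest...). Either the theory is perfect but you cannot use it when coding, or you can call the library when coding but do not understand how the result is computed. This article covers both.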

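We break down each loss function from two angles: the theory and its implementation in PyTorch.

In addition, here is the difference between the loss functions that are torch.nn.Module classes and those in torch.nn.functional (commonly called F).

The Module-style losses, such as CrossEntropyLoss and NLLLoss, are wrapped loss classes. Because each one is a class, its settings (for example weight and reduction) are stored on the object and maintained automatically. They are usually thin wrappers around the functions in F, whereas the losses in F are plain functions.

Of course, we can also build our own loss objects. When the loss does not need to be complicated, there is no need to wrap it in a class; calling the function in F directly works just as well. Which one to use depends on what the implementation needs.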
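To make the class-versus-function distinction concrete, here is a minimal sketch. The tensors `scores`, `labels`, and `class_weight` are hypothetical values chosen for illustration; the only library calls used are `nn.CrossEntropyLoss` and `F.cross_entropy`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical data: 3 samples, 4 classes (raw scores) and class-index targets.
scores = torch.tensor([[1.0, 2.0, 0.5, 0.1],
                       [0.3, 0.2, 2.5, 1.0],
                       [1.5, 0.4, 0.3, 2.0]])
labels = torch.tensor([1, 2, 3])
class_weight = torch.tensor([1.0, 2.0, 1.0, 0.5])

# Class form: the weight is stored on the object once and reused on every call.
criterion = nn.CrossEntropyLoss(weight=class_weight)
loss_module = criterion(scores, labels)

# Functional form: a plain function, so the weight is passed on every call.
loss_functional = F.cross_entropy(scores, labels, weight=class_weight)

print(loss_module, loss_functional)  # the two values should match
```

The class form is convenient when the same configuration is reused throughout a training loop; the functional form is handy for a one-off computation.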

CrossEntropyLoss

Cross-entropy loss is the most commonly used loss function in classification tasks.

Theory

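Here is the formula directly:

$$L(y, \hat{y}) = -\sum_{i} y_i \log \hat{y}_i$$

where $y$ is the true label and $\hat{y}$ is the predicted class distribution (usually obtained by applying softmax to the model output). In other words, the elements of $y$ and $\hat{y}$ are the probabilities of the corresponding classes.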

An example makes this clear:

# Assume the sample belongs to the second class
# Because it is a distribution, the probabilities over all classes sum to 1
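As a worked instance of the cross-entropy formula, the sketch below uses hypothetical values: a one-hot label for the second of three classes and an assumed predicted distribution. These numbers are illustrative only and are not taken from the article.

```python
import math

# Hypothetical values chosen for illustration; not the article's original numbers.
y_true = [0.0, 1.0, 0.0]   # one-hot: the sample belongs to the second class
y_pred = [0.1, 0.7, 0.2]   # predicted distribution, sums to 1

# Cross entropy: -sum_i y_i * log(y_hat_i)
loss = -sum(t * math.log(p) for t, p in zip(y_true, y_pred))
print(round(loss, 4))  # 0.3567, i.e. -log(0.7)
```

With a one-hot label, only the term for the true class survives, so the loss is simply the negative log of the probability assigned to the correct class.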

PyTorch implementation

from torch.nn import CrossEntropyLoss

Example:

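In practice, keep a few points in mind:

The target passed to torch.nn.CrossEntropyLoss is not in one-hot form; it is the class index. For example, target = [1, 3, 2] means the 3 samples belong to class 1, class 3, and class 2 respectively, where class indices start from 0.
The input passed to torch.nn.CrossEntropyLoss is the unnormalized score of each class, not the distribution after softmax.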

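For example, the inputs look roughly like the following:

Then pass them to a CrossEntropyLoss instance to get the loss. Note that CrossEntropyLoss is a class, so it has to be instantiated before it is called:

criterion = CrossEntropyLoss()
loss = criterion(input, target)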

Looking at the implementation inside CrossEntropyLoss, it is as follows:

def forward(self, input, target):
    return F.cross_entropy(input, target, weight=self.weight,
                           ignore_index=self.ignore_index, reduction=self.reduction)

It calls the cross_entropy() function in torch.nn.functional (commonly called F).

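Parameters

input: the predictions, shape (batch, dim), where dim is the total number of classes.
target: the ground truth, shape (batch). Why is it 1-dimensional? Because the ground truth is not one-hot encoded; the class id is passed directly.
weight: per-class weights, shape (dim), optional. A weight can be assigned to each class, which is typically used to rebalance the loss when the number of training samples differs a lot between classes.
ignore_index: an int that specifies a target value to ignore; samples with this target value are manually excluded from the loss.
reduction: a string, one of 'none', 'mean', 'sum'. 'none' applies no reduction and returns a loss with the same shape as target; 'mean' averages the loss over the batch; 'sum' sums the loss over the batch.

The weight, ignore_index, and reduction parameters are specified when instantiating the CrossEntropyLoss object, for example:

loss = torch.nn.CrossEntropyLoss(reduction='none')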
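The parameters described above can be sketched as follows. The tensors `scores`, `labels`, and `w` are hypothetical values for illustration; the inputs are raw scores of shape (batch, dim) and the targets are class ids of shape (batch,).

```python
import torch
import torch.nn as nn

# Hypothetical batch: 3 samples, 4 classes. Inputs are raw scores, not probabilities.
scores = torch.tensor([[1.0, 2.0, 0.5, 0.1],
                       [0.3, 0.2, 2.5, 1.0],
                       [1.5, 0.4, 0.3, 2.0]])
labels = torch.tensor([1, 2, 3])  # class ids, shape (batch,)

# reduction='none' keeps one loss value per sample.
per_sample = nn.CrossEntropyLoss(reduction='none')(scores, labels)
print(per_sample.shape)  # torch.Size([3]), same shape as labels

# 'mean' and 'sum' reduce the per-sample losses to a scalar.
mean_loss = nn.CrossEntropyLoss(reduction='mean')(scores, labels)
sum_loss = nn.CrossEntropyLoss(reduction='sum')(scores, labels)

# ignore_index: samples whose label equals this value contribute zero loss.
ignored = nn.CrossEntropyLoss(reduction='none', ignore_index=2)(scores, labels)

# weight: one weight per class, shape (dim,); each sample's loss is scaled
# by the weight of its true class.
w = torch.tensor([1.0, 2.0, 1.0, 0.5])
weighted = nn.CrossEntropyLoss(reduction='none', weight=w)(scores, labels)
```

Using reduction='none' first is a convenient way to inspect what weight and ignore_index do to individual samples before any averaging hides it.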

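Now let's look at the implementation of cross_entropy in F (as of the PyTorch release this article was written against; newer releases dispatch to a native kernel, but the computation is equivalent):

return nll_loss(log_softmax(input, dim=1), target, weight, None, ignore_index, None, reduction)

As you can see, it first calls log_softmax and then calls nll_loss.

log_softmax is simply softmax followed by log:

$$\text{log\_softmax}(x_i) = \log\left(\frac{\exp(x_i)}{\sum_j \exp(x_j)}\right)$$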
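The relationship between softmax and log_softmax can be checked with a few lines of plain Python. This is a sketch of the math only, not PyTorch's implementation; the helper names `softmax` and `log_softmax` are defined here for illustration, and the max-subtraction trick is a common way to avoid overflow rather than something stated in the article.

```python
import math

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def log_softmax(xs):
    # Equivalent to log(softmax(xs)), written as x_i - log(sum_j exp(x_j)).
    # Subtracting the maximum first avoids overflow for large scores.
    m = max(xs)
    log_sum = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_sum for x in xs]

scores = [2.0, 1.0]
direct = [math.log(p) for p in softmax(scores)]
stable = log_softmax(scores)
print([round(v, 4) for v in stable])  # [-0.3133, -1.3133]
```

Both routes give the same values, and exponentiating the result recovers a distribution that sums to 1.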

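nll_loss is the negative log likelihood loss.

It is described in detail under torch.nn.NLLLoss below; the formula is:

$$\text{nll\_loss}(x, \text{class}) = -x[\text{class}]$$

For example, given a score vector $x$ and a true class index $\text{class}$, the loss is the entry of $x$ at the true class, negated.

The source code gives a usage example: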

import torch
import torch.nn.functional as F

# input is of size N x C = 3 x 5
input = torch.randn(3, 5, requires_grad=True)
# each element in target has to have 0 <= value < C
target = torch.tensor([1, 0, 4])
# dim=1 normalizes over the class dimension
output = F.nll_loss(F.log_softmax(input, dim=1), target)
output.backward()

So the CrossEntropyLoss loss is really softmax + log + nll_loss combined.

CrossEntropyLoss(input, target) = nll_loss(log_softmax(input, dim=1), target)

The target in CrossEntropyLoss must be of type LongTensor when it holds class indices.

The experiment is as follows:

import torch
import torch.nn as nn
import torch.nn.functional as F

pred = torch.FloatTensor([[2, 1], [1, 2]])
target = torch.LongTensor([1, 0])

loss_fun = nn.CrossEntropyLoss()

loss = loss_fun(pred, target)
print(loss)  # prints tensor(1.3133)
loss2 = F.nll_loss(F.log_softmax(pred, dim=1), target)
print(loss2)  # prints tensor(1.3133)

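In mathematical form:

$$\text{loss}(x, \text{class}) = -\log\left(\frac{\exp(x[\text{class}])}{\sum_j \exp(x[j])}\right) = -x[\text{class}] + \log\left(\sum_j \exp(x[j])\right)$$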
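The value 1.3133 from the experiment above can be reproduced by hand from the closed form -x[class] + log(sum_j exp(x[j])). The sketch below uses only the standard library; the helper `cross_entropy` is defined here for illustration and averages over the batch to mirror the default 'mean' reduction.

```python
import math

def cross_entropy(scores, target):
    # loss = -x[class] + log(sum_j exp(x[j])) for one sample
    log_sum = math.log(sum(math.exp(s) for s in scores))
    return -scores[target] + log_sum

pred = [[2.0, 1.0], [1.0, 2.0]]
target = [1, 0]

losses = [cross_entropy(row, t) for row, t in zip(pred, target)]
mean_loss = sum(losses) / len(losses)  # mirrors reduction='mean'
print(round(mean_loss, 4))  # 1.3133
```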

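torch.nn.BCELoss

Theory

The CrossEntropy loss applies to classification with N classes in total. When N = 2, that is, a binary classification task where we only need to decide yes or no, we can use the binary cross-entropy loss, BCELoss. Here is the formula (y is the true label, x is the predicted value):

$$\text{loss}(x, y) = -\left[y \log x + (1 - y) \log(1 - x)\right]$$

This function is just the special case of CrossEntropyLoss with N = 2 classes. Because there are two classes, if the probability of belonging to the first class is y, the probability of belonging to the second class is naturally (1 - y). So, following the same computation as the CrossEntropy loss, we multiply each label by the log of the corresponding predicted value, sum the terms, and negate the result to obtain the final loss.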
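The claim that BCE is the two-class special case of cross entropy can be checked numerically. This is a plain-Python sketch of the formulas; the helper names `bce` and `cross_entropy` and the sample values x = 0.4, y = 0.2 are chosen here for illustration.

```python
import math

def bce(x, y):
    # Binary cross entropy for one element: x is the predicted probability, y the label.
    return -(y * math.log(x) + (1 - y) * math.log(1 - x))

def cross_entropy(p_true, p_pred):
    # General cross entropy between two discrete distributions.
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))

x, y = 0.4, 0.2  # predicted and true probability of the positive class
two_class = cross_entropy([y, 1 - y], [x, 1 - x])
print(round(bce(x, y), 4), round(two_class, 4))  # 0.5919 0.5919
```

Writing the two-class distributions out explicitly as [y, 1 - y] and [x, 1 - x] shows that BCE only needs the probability of one class, since the other is determined by it.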

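Practice

torch.nn.BCELoss()(x, y)

x has shape (batch, *), and y has the same shape as x.

Each element of x and y is the probability that the corresponding entry belongs (or does not belong) to the class.

In addition, BCELoss in PyTorch accepts a weight, which is typically useful when the ratio of positive to negative examples in the training data is heavily skewed. The weight tensor rescales the per-element loss and must be broadcastable to the shape of x; for x of shape (batch, C), a 1-D weight with C elements gives one weight per class.

An actual example, computing BCELoss(pred, target):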

import numpy as np
import torch
import torch.nn as nn

pred = torch.FloatTensor([0.4, 0.1])  # predicted probability of "yes": 0.4 for the first element, 0.1 for the second
target = torch.FloatTensor([0.2, 0.8])  # true probability of "yes": 0.2 for the first element, 0.8 for the second
loss_fun = nn.BCELoss(reduction='mean')  # reduction options: 'none', 'mean', 'sum'
loss = loss_fun(pred, target)
print(loss)  # tensor(1.2275)

# The same value computed by hand from the formula
a = -(0.2 * np.log(0.4) + 0.8 * np.log(0.6) + 0.8 * np.log(0.1) + 0.2 * np.log(0.9))/2
print(a)  # 1.2275294114572126

As you can see, computing BCELoss(pred, target) matches the formula in the theory section above.

Internal implementation

The torch.nn.BCELoss class in PyTorch simply calls F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)

torch.nn.BCEWithLogitsLoss

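Theory

This loss is effectively the same as BCELoss. The only difference is that the input x of BCELoss must first be mapped into the interval (0, 1) by manually applying the sigmoid activation, whereas this loss combines sigmoid and BCELoss into one.

In other words, the input first goes through the sigmoid function, and then the BCE loss is computed.

Practice

torch.nn.BCEWithLogitsLoss()(x, y)

The shape requirements for x and y are the same as for BCELoss.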

import torch
import torch.nn as nn

pred = torch.FloatTensor([0.4, 0.1])
target = torch.FloatTensor([0.2, 0.8])
loss_fun = nn.BCEWithLogitsLoss(reduction='mean')  # reduction options: 'none', 'mean', 'sum'
loss = loss_fun(pred, target)
print(loss)  # tensor(0.7487)

# The process above gives the same result as the process below
loss_fun = nn.BCELoss(reduction='mean')  # reduction options: 'none', 'mean', 'sum'
loss = loss_fun(torch.sigmoid(pred), target)  # apply sigmoid first, then compute BCELoss against target
print(loss)  # tensor(0.7487)

As you can see, applying sigmoid to the input pred and then calling BCELoss gives the same result as calling BCEWithLogitsLoss directly.
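The same equivalence can be reproduced in plain Python. The sketch below computes the loss in two ways: sigmoid followed by BCE, and a fused closed form, max(z, 0) - z*y + log(1 + exp(-|z|)), which is a standard algebraic rearrangement of the same quantity. The helper names are defined here for illustration, and no claim is made that PyTorch uses exactly this expression internally.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(x, y):
    return -(y * math.log(x) + (1 - y) * math.log(1 - x))

def bce_with_logits(z, y):
    # Algebraically equal to bce(sigmoid(z), y), but never takes the log of a value near 0.
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

pred = [0.4, 0.1]    # raw scores (logits)
target = [0.2, 0.8]

two_step = sum(bce(sigmoid(z), y) for z, y in zip(pred, target)) / len(pred)
fused = sum(bce_with_logits(z, y) for z, y in zip(pred, target)) / len(pred)
print(round(two_step, 4), round(fused, 4))  # 0.7487 0.7487
```

Combining the two steps into one expression is what makes a fused loss attractive: for very large or very small scores the two-step route can hit log(0), while the fused form stays finite.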

torch.nn.L1Loss

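Theory

The L1 loss is very simple. The formula (with the default mean reduction over n elements) is:

$$\text{loss}(x, y) = \frac{1}{n}\sum_{i=1}^{n} |x_i - y_i|$$

x is the predicted value and y is the true value.

Practice

torch.nn.L1Loss()(x, y)

x shape: any shape

y shape: same shape as the input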

import torch
import torch.nn as nn

pred = torch.FloatTensor([[3, 1], [1, 0]])
target = torch.FloatTensor([[1, 0], [1, 0]])
loss_fun = nn.L1Loss()
loss = loss_fun(pred, target)
print(loss)  # tensor(0.7500)

The internal implementation of L1Loss is:

def forward(self, input, target):
    return F.l1_loss(input, target, reduction=self.reduction)

As we can see, it is again just a wrapper around F.l1_loss.
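To see where 0.75 comes from, the L1 computation can be written out by hand. This plain-Python sketch mirrors the three reduction modes ('none', 'sum', 'mean'); the variable names are chosen here for illustration.

```python
pred = [[3.0, 1.0], [1.0, 0.0]]
target = [[1.0, 0.0], [1.0, 0.0]]

# reduction='none': element-wise absolute differences, same shape as the input
none = [[abs(p - t) for p, t in zip(pr, tr)] for pr, tr in zip(pred, target)]
flat = [v for row in none for v in row]

total = sum(flat)          # reduction='sum'
mean = total / len(flat)   # reduction='mean' (the default)
print(none, total, mean)   # [[2.0, 1.0], [0.0, 0.0]] 3.0 0.75
```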

torch.nn.MSELoss

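Theory

If L1Loss can be understood as the 1-norm of the difference vector, then MSE (mean squared error) can be understood as its squared 2-norm, or the squared Frobenius norm for a matrix, in both cases averaged over the number of elements:

$$\text{loss}(x, y) = \frac{1}{n}\sum_{i=1}^{n} (x_i - y_i)^2$$

x is the predicted value and y is the true value.

Practice

torch.nn.MSELoss()(x, y)

x can have any shape, and y has the same shape as x.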

import torch
import torch.nn as nn

pred = torch.FloatTensor([[3, 1], [1, 0]])
target = torch.FloatTensor([[1, 0], [1, 0]])
loss_fun = nn.MSELoss()
loss = loss_fun(pred, target)
print(loss)  # tensor(1.2500)

The internal implementation of MSELoss is:

def forward(self, input, target):
    return F.mse_loss(input, target, reduction=self.reduction)

In essence it is a wrapper around the mse_loss function in F.
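The norm interpretation can be verified on the same numbers. In this plain-Python sketch, the L1 loss is the 1-norm of the difference divided by the number of elements, and the MSE loss is the squared 2-norm (Frobenius norm for a matrix) divided by the number of elements; the variable names are chosen here for illustration.

```python
import math

pred = [[3.0, 1.0], [1.0, 0.0]]
target = [[1.0, 0.0], [1.0, 0.0]]

diff = [p - t for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]

l1_norm = sum(abs(d) for d in diff)              # 1-norm of the difference
fro_norm = math.sqrt(sum(d * d for d in diff))   # 2-norm / Frobenius norm

l1_loss = l1_norm / len(diff)          # L1 loss with mean reduction
mse_loss = fro_norm ** 2 / len(diff)   # MSE loss with mean reduction
print(l1_loss, round(mse_loss, 4))     # 0.75 1.25
```

Note that the norm is squared before averaging, so MSE penalizes the single large error (3 versus 1) much more heavily than L1 does.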

torch.nn.NLLLoss

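Theory

NLLLoss (Negative Log Likelihood Loss) has the following mathematical form:

$$\text{loss}(x, \text{class}) = -x[\text{class}]$$

The nll_loss used inside CrossEntropyLoss was mentioned earlier. In fact, this loss is a wrapper around F.nll_loss and does the same thing as nll_loss.

As mentioned before, if the input x first goes through softmax, then log, and is then fed into this function, the result is CrossEntropyLoss.

Practice

torch.nn.NLLLoss()(x, y)

x is the predicted value, with shape (batch, dim)

y is the true value, with shape (batch)

The shape requirements are the same as for CrossEntropyLoss.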

import torch
import torch.nn as nn

pred = torch.FloatTensor([[3, 1], [2, 4]])
target = torch.LongTensor([0, 1])  # target must be of type Long
loss_fun = nn.NLLLoss()
loss = loss_fun(pred, target)
print(loss)  # tensor(-3.5000)

Internally it simply calls F.nll_loss():

def forward(self, input, target):
    return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
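The result -3.5 above follows directly from picking out the entry at the target index in each row, negating it, and averaging. The plain-Python sketch below reproduces that, and also shows that feeding log-softmax outputs instead of raw scores turns the same operation into the cross-entropy loss. The helpers `nll` and `log_softmax` are defined here for illustration.

```python
import math

def nll(rows, targets):
    # Pick the entry at the target index in each row, negate, then average.
    return sum(-row[t] for row, t in zip(rows, targets)) / len(rows)

def log_softmax(xs):
    log_sum = math.log(sum(math.exp(x) for x in xs))
    return [x - log_sum for x in xs]

pred = [[3.0, 1.0], [2.0, 4.0]]
target = [0, 1]

print(nll(pred, target))  # -3.5, raw scores give a negative "loss"

# Feeding log-probabilities instead gives the cross-entropy loss.
ce = nll([log_softmax(row) for row in pred], target)
print(round(ce, 4))  # 0.1269
```

This is why NLLLoss is normally fed log-probabilities: on raw scores the value is not a meaningful likelihood and can be negative.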

torch.nn.KLDivLoss

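Theory

KL divergence is commonly used to measure how much two probability distributions differ. The more similar the two distributions are, the closer the KL divergence is to 0. For two discrete distributions $x$ and $y$:

$$KL(x \,\|\, y) = \sum_i x_i \log \frac{x_i}{y_i}$$

KL divergence is also called relative entropy. For the theory, see: https://lhyxx.top/2019/09/15/%E4%BF%A1%E6%81%AF%E8%AE%BA%E5%9F%BA%E7%A1%80-%E7%86%B5/

Note that $x$ and $y$ here are both distributions, which means the elements of each sum to 1.

Then, taking the values used in the practice section below, $x = [0.1, 0.2, 0.7]$ and $y = [0.5, 0.2, 0.3]$:

$$KL(x \,\|\, y) = 0.1 \log\frac{0.1}{0.5} + 0.2 \log\frac{0.2}{0.2} + 0.7 \log\frac{0.7}{0.3} \approx 0.4322$$

All logarithms in this example are natural logarithms (base e).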

Practice

torch.nn.KLDivLoss()(input, target)

Test torch.nn.KLDivLoss by computing KL(pred || target):

import numpy as np
import torch
import torch.nn as nn

pred = torch.FloatTensor([0.1, 0.2, 0.7])
target = torch.FloatTensor([0.5, 0.2, 0.3])
loss_fun = nn.KLDivLoss(reduction='sum')  # reduction options: 'none', 'sum', 'mean', 'batchmean'
loss = loss_fun(target.log(), pred)
print(loss)  # tensor(0.4322)

# The computation above is equivalent to the following
a = (0.1 * np.log(1/5) + 0.2 * np.log(1) + 0.7 * np.log(7/3))
print(a)  # 0.43216

input should be log-probabilities, and target should be probabilities. input and target have the same shape.

This class is a wrapper around F.kl_div(input, target, reduction=self.reduction), whose prototype is: torch.nn.functional.kl_div(input, target, size_average=None, reduce=None, reduction='mean')

Note that when using nn.KLDivLoss to compute KL(pred || target), pred and target must be swapped, and target must first be passed through log:

loss_fun(target.log(), pred)
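The argument swap is easier to follow when the computation is written out. The plain-Python sketch below evaluates KL(pred || target) directly, then repeats it in the term-by-term form second_argument * (log(second_argument) - first_argument), which is the pointwise quantity that the call above sums. It also computes the reverse direction to show that KL divergence is not symmetric. The helper `kl` is defined here for illustration.

```python
import math

def kl(p, q):
    # KL(p || q) = sum_i p_i * log(p_i / q_i), natural log
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

pred = [0.1, 0.2, 0.7]
target = [0.5, 0.2, 0.3]

# Term by term: second_argument * (log(second_argument) - first_argument),
# where the first argument is already a log-probability.
log_input = [math.log(t) for t in target]
via_formula = sum(p * (math.log(p) - li) for p, li in zip(pred, log_input))

print(round(kl(pred, target), 4), round(via_formula, 4))  # 0.4322 0.4322
print(round(kl(target, pred), 4))                          # 0.5505
```

Because the two directions give different values, it matters which distribution is passed as the log-probability input and which as the target.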
