
Theoretically Principled Trade-off between Robustness and Accuracy

Published: 2023/12/15

Contents: Overview · Main content · Notation · Error · Classification-calibrated surrogate loss · Lemma 2.1 · Theorem 3.1 · Theorem 3.2 · The resulting TRADES algorithm · Experiments overview · Code

Zhang H, Yu Y, Jiao J, et al. Theoretically Principled Trade-off between Robustness and Accuracy[J]. arXiv: Learning, 2019.

@article{zhang2019theoretically,
title={Theoretically Principled Trade-off between Robustness and Accuracy},
author={Zhang, Hongyang and Yu, Yaodong and Jiao, Jiantao and Xing, Eric P and Ghaoui, Laurent El and Jordan, Michael I},
journal={arXiv: Learning},
year={2019}}

Starting from the binary classification problem, the paper decomposes \(\mathcal{R}_{rob}\) into \(\mathcal{R}_{nat}\) and \(\mathcal{R}_{bdy}\), builds a loss function from an upper bound on \(\mathcal{R}_{rob}-\mathcal{R}_{nat}^*\), and then extends this idea to general multiclass problems.

Main content

Notation

\((X, Y)\): random variables;
\(x\in \mathcal{X},\ y\): a sample and its label (taking values \(\pm 1\));
\(f\): the classifier (e.g. a neural network);
\(\mathbb{B}(x, \epsilon)\): \(\{x'\in \mathcal{X}: \|x'-x\| \le \epsilon\}\);
\(\mathbb{B}(DB(f),\epsilon)\): \(\{x \in \mathcal{X}: \exists x'\in \mathbb{B}(x,\epsilon)\ \mathrm{s.t.}\ f(x)f(x')\le 0\}\), the \(\epsilon\)-neighborhood of the decision boundary;
\(\psi^*(u)\): \(\sup_v \{u^T v-\psi(v)\}\), the conjugate function;
\(\phi\): the surrogate loss.

Error

\[\tag{e.1}
\mathcal{R}_{rob}(f):= \mathbb{E}_{(X,Y)\sim \mathcal{D}}\,\mathbf{1}\{\exists X' \in \mathbb{B}(X, \epsilon)\ \mathrm{s.t.}\ f(X')Y \le 0\},
\]

where \(\mathbf{1}(\cdot)\) denotes the indicator function. Clearly \(\mathcal{R}_{rob}(f)\) is the measure of the points for which the classifier \(f\) admits adversarial examples.

\[\tag{e.2}
\mathcal{R}_{nat}(f) :=\mathbb{E}_{(X,Y)\sim \mathcal{D}}\,\mathbf{1}\{f(X)Y \le 0\},
\]

so \(\mathcal{R}_{nat}(f)\) is the probability that \(f\) misclassifies a natural sample, and clearly \(\mathcal{R}_{rob} \ge \mathcal{R}_{nat}\).

\[\tag{e.3}
\mathcal{R}_{bdy}(f) :=\mathbb{E}_{(X,Y)\sim \mathcal{D}}\,\mathbf{1}\{X \in \mathbb{B}(DB(f), \epsilon),\ f(X)Y > 0\},
\]

and clearly

\[\tag{1}
\mathcal{R}_{rob}-\mathcal{R}_{nat}=\mathcal{R}_{bdy}.
\]
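As a sanity check on decomposition (1), here is a small self-contained toy example (my own construction, not from the original note) for the 1-D classifier \(f(x)=x\), whose decision boundary is \(x=0\):

```python
import numpy as np

def errors(xs, ys, eps):
    """Empirical R_nat, R_rob, R_bdy for f(x) = x on points xs with labels ys.

    A point is naturally wrong if f(x)y <= 0; robustly wrong if some x' with
    |x' - x| <= eps is wrong; in the boundary strip if |x| <= eps yet x is
    correctly classified.
    """
    correct = np.sign(xs) * ys > 0
    nat = ~correct                          # f(x)y <= 0
    bdy = (np.abs(xs) <= eps) & correct     # in B(DB(f), eps), yet correct
    rob = nat | (np.abs(xs) <= eps)         # the eps-ball reaches the boundary
    return nat.mean(), rob.mean(), bdy.mean()

xs = np.array([2.0, 0.3, -1.5, -0.3])
ys = np.array([1, 1, -1, 1])
r_nat, r_rob, r_bdy = errors(xs, ys, eps=0.4)
assert abs(r_rob - r_nat - r_bdy) < 1e-12   # decomposition (1)
```

On this sample \(\mathcal{R}_{nat}=0.25\), \(\mathcal{R}_{bdy}=0.25\), \(\mathcal{R}_{rob}=0.5\), so (1) holds exactly: the two events are disjoint by construction.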

因?yàn)橄胍顑?yōu)化(0-1)loss是很困難的, 我們往往用替代的loss (phi), 定義:

\[
\mathcal{R}_{\phi}(f):= \mathbb{E}_{(X, Y) \sim \mathcal{D}}\, \phi(f(X)Y), \qquad
\mathcal{R}^*_{\phi}:= \min_f \mathcal{R}_{\phi}(f).
\]
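For concreteness, two standard classification-calibrated surrogates (my examples, not from the original note) are the hinge and logistic losses:

\[
\phi_{\mathrm{hinge}}(\alpha) = \max\{1-\alpha,\, 0\}, \qquad
\phi_{\mathrm{logistic}}(\alpha) = \log\big(1 + e^{-\alpha}\big).
\]

Both are convex upper bounds on the 0-1 loss \(\mathbf{1}\{\alpha \le 0\}\) and satisfy \(\phi(0)\ge 1\), as required by Assumption 1 below.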

Classification-calibrated surrogate loss

This part is important but gets very little space; I will discuss it further after going back over the papers it cites.

Lemma 2.1

Theorem 3.1

Under Assumption 1 (which gives \(\phi(0)\ge 1\)), for any measurable \(f:\mathcal{X} \rightarrow \mathbb{R}\), any probability distribution on \(\mathcal{X} \times \{\pm 1\}\), and any \(\lambda > 0\), we have

\[
\begin{array}{ll}
& \mathcal{R}_{rob}(f) - \mathcal{R}_{nat}^* \\
\le & \psi^{-1}(\mathcal{R}_{\phi}(f)-\mathcal{R}_{\phi}^*) + \mathbf{Pr}[X \in \mathbb{B}(DB(f), \epsilon),\ f(X)Y > 0] \\
\le & \psi^{-1}(\mathcal{R}_{\phi}(f)-\mathcal{R}_{\phi}^*) + \mathbb{E}\, \max_{X' \in \mathbb{B}(X, \epsilon)} \phi(f(X')f(X)/\lambda).
\end{array}
\]

As for the last inequality, I understand it holds because \(\phi(f(X')f(X)/\lambda) \ge 1\) on the event in question.

Theorem 3.2

Combining Theorems 3.1 and 3.2 shows that this bound is tight.

The resulting TRADES algorithm

For the binary problem, optimize the upper bound, i.e.:
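The displayed objective appears to have been lost here (likely an embedded image). Reconstructed from the TRADES paper, the binary-classification objective minimizes the upper bound of Theorem 3.1:

\[
\min_f \mathbb{E}\Big\{ \phi(f(X)Y) + \max_{X' \in \mathbb{B}(X, \epsilon)} \phi\big(f(X)f(X')/\lambda\big) \Big\}.
\]

The first term pushes for natural accuracy, the second pushes the decision boundary away from the data; \(1/\lambda\) trades the two off.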

To extend to the multiclass problem, one only needs to:
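Again the displayed formula seems to be missing; per the paper, the multiclass version simply replaces the binary surrogate with a multiclass loss \(\mathcal{L}\) (e.g. cross-entropy or KL divergence between the two prediction vectors) in both terms:

\[
\min_f \mathbb{E}\Big\{ \mathcal{L}(f(X), Y) + \max_{X' \in \mathbb{B}(X, \epsilon)} \mathcal{L}\big(f(X), f(X')\big)/\lambda \Big\}.
\]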

The algorithm is as follows:
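The algorithm box did not survive extraction. Below is a minimal, self-contained sketch of the TRADES training loss in the KL-based multiclass form; the function name, defaults, and step sizes are my own choices, with \(\beta\) playing the role of \(1/\lambda\):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def trades_loss(net, x, y, eps=0.031, eta=0.007, k=10, beta=6.0):
    """Natural cross-entropy plus a KL robustness term (beta = 1/lambda).

    The inner maximization runs k steps of projected gradient ascent on the
    KL divergence between predictions at x and at x' in B(x, eps).
    """
    p_nat = F.softmax(net(x), dim=1).detach()
    # start from a small random point inside the ball
    x_adv = x.detach() + 0.001 * torch.randn_like(x)
    for _ in range(k):
        x_adv.requires_grad_(True)
        kl = F.kl_div(F.log_softmax(net(x_adv), dim=1), p_nat,
                      reduction='batchmean')
        grad = torch.autograd.grad(kl, x_adv)[0]
        # gradient-sign ascent step, projected back onto the eps-ball
        x_adv = x_adv.detach() + eta * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
    loss_nat = F.cross_entropy(net(x), y)
    loss_rob = F.kl_div(F.log_softmax(net(x_adv), dim=1), p_nat,
                        reduction='batchmean')
    return loss_nat + beta * loss_rob
```

In a training loop one would call `trades_loss(net, x, y).backward()` in place of the plain cross-entropy loss; everything else stays a standard SGD step.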

Experiments overview

5.1: measures how tight the theoretical upper bound is under this algorithm;
5.2: on MNIST and CIFAR10, studies the effect of \(\lambda\): the larger \(\lambda\), the smaller \(\mathcal{R}_{nat}\) and the larger \(\mathcal{R}_{rob}\), with the effect more pronounced on CIFAR10;
5.3: compares different algorithms under different adversarial attacks;
5.4: results on the NIPS 2018 Adversarial Vision Challenge.

Code










import torch
import torch.nn as nn





def quireone(func):  # decorator: lets an optimizer factory be configured in two steps
    def wrapper1(*args, **kwargs):
        def wrapper2(arg):
            result = func(arg, *args, **kwargs)
            return result
        wrapper2.__doc__ = func.__doc__
        wrapper2.__name__ = func.__name__
        return wrapper2
    return wrapper1


class AdvTrain:

    def __init__(self, eta, k, lam,
                 net, lr = 0.01, **kwargs):
        """
        :param eta: step size for adversarial attacks
        :param lr: learning rate
        :param k: number of iterations K in inner optimization
        :param lam: lambda
        :param net: network
        :param kwargs: other configs for optim
        """
        kwargs.update({'lr':lr})
        self.net = net
        self.criterion = nn.CrossEntropyLoss()
        self.opti = self.optim(self.net.parameters(), **kwargs)
        self.eta = eta
        self.k = k
        self.lam = lam

    @quireone
    def optim(self, parameters, **kwargs):
        """
        the quireone decorator is defined above
        :param parameters: net.parameteres()
        :param kwargs: other configs
        :return:
        """
        return torch.optim.SGD(parameters, **kwargs)


    def normal_perturb(self, x, sigma=1.):

        return x + sigma * torch.randn_like(x)

    @staticmethod
    def calc_jacobian(loss, inp):
        jacobian = torch.autograd.grad(loss, inp, retain_graph=True)[0]
        return jacobian

    @staticmethod
    def sgn(matrix):
        return torch.sign(matrix)

    def pgd(self, inp, y, perturb):
        # projected gradient ascent on the loss, staying in the L-inf ball
        boundary_low = inp - perturb
        boundary_up = inp + perturb
        inp_new = inp.detach().clone()
        for _ in range(self.k):
            inp_new.requires_grad_(True)
            out = self.net(inp_new)
            loss = self.criterion(out, y)
            # recompute the gradient at the current iterate on every step
            delta = self.sgn(self.calc_jacobian(loss, inp_new)) * self.eta
            inp_new = torch.max(
                torch.min(inp_new.detach() + delta, boundary_up),
                boundary_low
            )
        return inp_new.detach()

    def ipgd(self, inps, ys, perturb):
        N = len(inps)
        adversarial_samples = []
        for i in range(N):
            inp_new = self.pgd(
                inps[[i]], ys[[i]],
                perturb
            )
            adversarial_samples.append(inp_new)

        return torch.cat(adversarial_samples)

    def train(self, trainloader, epochs=50, perturb=1, normal=1):

        for epoch in range(epochs):
            running_loss = 0.
            for i, data in enumerate(trainloader, 1):
                inps, labels = data

                adv_inps = self.ipgd(self.normal_perturb(inps, normal),
                                     labels, perturb)

                out1 = self.net(inps)
                out2 = self.net(adv_inps)

                loss1 = self.criterion(out1, labels)
                loss2 = self.criterion(out2, labels)

                # weight the robust term by 1/lambda, following the TRADES
                # objective (self.lam was otherwise unused)
                loss = loss1 + loss2 / self.lam

                self.opti.zero_grad()
                loss.backward()
                self.opti.step()

                running_loss += loss.item()

                if i % 10 == 0:  # "is 0" compared identity, not value
                    print("epoch {0:<3} part {1:<5} loss: {2:<.7f}".format(
                        epoch, i, running_loss))
                    running_loss = 0.




















