Attacking Neural Networks
Fast Gradient Method (Explaining and Harnessing Adversarial Examples)
Linear model:
$$
y' = w^T x' = w^T x + w^T \sigma
$$
where $\sigma$ is the perturbation added to the original input $x$, and we set:
$$
\sigma = \varepsilon \, \mathrm{sign}(w)
$$
Understanding: with $\|\sigma\|_\infty$ held fixed at $\varepsilon$, the change in $y'$ is maximized by this choice of $\sigma$, which achieves $w^T\sigma = \varepsilon\|w\|_1$.
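This can be checked numerically; a minimal NumPy sketch (the random $w$ and the budget $\varepsilon$ are illustrative):

```python
import numpy as np

# For a linear model y = w^T x, among all perturbations with
# ||sigma||_inf <= eps, sigma = eps * sign(w) maximizes w^T sigma,
# and the maximum equals eps * ||w||_1.
rng = np.random.default_rng(0)
w = rng.normal(size=5)
eps = 0.1

sigma = eps * np.sign(w)
best = w @ sigma  # equals eps * ||w||_1
assert np.isclose(best, eps * np.abs(w).sum())

# Any random perturbation within the same budget does no better.
for _ in range(100):
    s = rng.uniform(-eps, eps, size=5)
    assert w @ s <= best + 1e-12
```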
DNN
$$
\sigma = \varepsilon \, \mathrm{sign}\big(\nabla_x J(\theta, x, y_{\mathrm{true}})\big)
$$
where $J$ is the loss function — in most cases the cross-entropy loss of a softmax classifier layer.
Understanding: to change $y'$ as much as possible, $x$ should move in the direction that increases the loss. That is, if $\nabla_x J(\theta, x, y)$ (the gradient of the loss with respect to $x$) is negative, then $x$ should decrease, so $\sigma$ must be negative.
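As a runnable illustration, the sketch below applies the FGSM step to a tiny logistic model with a hand-derived input gradient; the model, its weights, and the `grad_x` helper are stand-ins, not the paper's network:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, b, x, y):
    # binary cross-entropy J for a one-layer logistic "network"
    p = sigmoid(w @ x + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def grad_x(w, b, x, y):
    # dJ/dx = (p - y) * w  (chain rule through the sigmoid)
    return (sigmoid(w @ x + b) - y) * w

rng = np.random.default_rng(1)
w = rng.normal(size=4); b = 0.0
x = rng.normal(size=4); y = 1.0
eps = 0.25

# FGSM: step in the direction that increases the loss.
x_adv = x + eps * np.sign(grad_x(w, b, x, y))
assert loss(w, b, x_adv, y) > loss(w, b, x, y)
```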
Iterative method
Basic iterative method
Update step: $x'_{n+1} = x'_n + \sigma$, starting from $x'_1 = x$; the sign-gradient perturbation is recomputed at each step, and in practice each iterate is clipped so the total perturbation stays within the $\varepsilon$-ball around $x$.
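The iteration can be sketched on the same kind of toy logistic model; the step size `alpha`, budget `eps`, and per-step clipping follow the basic iterative method, while the model itself is a stand-in:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_x(w, b, x, y):
    # gradient of the binary cross-entropy w.r.t. the input x
    return (sigmoid(w @ x + b) - y) * w

rng = np.random.default_rng(2)
w = rng.normal(size=4); b = 0.0
x0 = rng.normal(size=4); y = 1.0
eps, alpha, steps = 0.3, 0.05, 10

x_adv = x0.copy()
for _ in range(steps):
    # one FGSM-style step with a small step size alpha
    x_adv = x_adv + alpha * np.sign(grad_x(w, b, x_adv, y))
    # keep the total perturbation inside the eps-ball around x0
    x_adv = np.clip(x_adv, x0 - eps, x0 + eps)

assert np.max(np.abs(x_adv - x0)) <= eps + 1e-12
```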
Targeted
For a targeted attack, the sign flips and the perturbation becomes:
$$
\sigma = -\varepsilon \, \mathrm{sign}\big(\nabla_x J(\theta, x, y_{\mathrm{targeted}})\big)
$$
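In the targeted case we descend the loss measured against the target label instead of ascending the loss on the true label. A toy sketch on the same kind of logistic model (all names and values are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, b, x, y):
    # binary cross-entropy against label y
    p = sigmoid(w @ x + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def grad_x(w, b, x, y):
    # dJ/dx = (p - y) * w
    return (sigmoid(w @ x + b) - y) * w

rng = np.random.default_rng(3)
w = rng.normal(size=4); b = 0.0
x = rng.normal(size=4)
y_target = 0.0   # the label we want the model to output
eps = 0.25

# Targeted step: move *against* the gradient of the loss taken
# with respect to the target label, driving that loss down.
x_adv = x - eps * np.sign(grad_x(w, b, x, y_target))
assert loss(w, b, x_adv, y_target) < loss(w, b, x, y_target)
```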
Our Method
What do we know, and what can we obtain now?
Printable image:
$P(i,j)$: the indexed (black/white) boxes, each of size $P_h \times P_w$.
Actual image ($100\times 100$) per frame.
$S_I$: starting index of the billboard for each frame.
$R_I$: range of the billboard for each frame.
$G_I$: gradient of each pixel in the billboard.
What we need to calculate:
$G_P$: gradient of each box in $P$.
How?
Transform:
Nearest
SPP:
Binary
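A hypothetical sketch of the "Nearest" transform: under a nearest-neighbor mapping every pixel of the rendered billboard copies the value of exactly one printable box, so by the chain rule the gradient of a box in $G_P$ is the sum of the pixel gradients in $G_I$ over the pixels that box covers. The box size $P_h \times P_w$ comes from the notes; the function name, the aligned layout, and the toy sizes are assumptions.

```python
import numpy as np

def box_gradients(G_I, P_h, P_w):
    # Map per-pixel gradients G_I back to per-box gradients G_P,
    # assuming (hypothetically) that boxes tile the image exactly
    # and each pixel belongs to one P_h x P_w box.
    H, W = G_I.shape
    assert H % P_h == 0 and W % P_w == 0
    # reshape so each (P_h, P_w) tile sits on its own axes, then sum
    tiles = G_I.reshape(H // P_h, P_h, W // P_w, P_w)
    return tiles.sum(axis=(1, 3))

G_I = np.arange(100.0).reshape(10, 10)  # stand-in pixel gradients
G_P = box_gradients(G_I, 5, 5)          # 2x2 grid of boxes
assert G_P.shape == (2, 2)
assert np.isclose(G_P.sum(), G_I.sum())  # no gradient is lost
```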