
Adversarial attack

Attacking neural networks

Fast Gradient Method (Explaining & Harnessing Adversarial Examples)

linear model:

$$
y' = w^T x' = w^T x + w^T\sigma
$$

where $\sigma$ is the perturbation added to the original input $x$, and we have:
$$
\sigma = \varepsilon\,\mathrm{sign}(w)
$$
Intuition: with $\|\sigma\|_\infty$ held fixed at $\varepsilon$, the change in $y'$ is largest when $w^T\sigma = \varepsilon\|w\|_1$, which is exactly what this choice of $\sigma$ achieves.
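A quick numerical check of this claim (toy weights, not from the paper): under the constraint $\|\sigma\|_\infty \le \varepsilon$, no perturbation beats $\sigma = \varepsilon\,\mathrm{sign}(w)$.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(5)   # toy weight vector
eps = 0.1

# Optimal perturbation under the L-infinity constraint |sigma_i| <= eps
sigma_opt = eps * np.sign(w)

# w^T sigma_opt equals eps * ||w||_1, the maximum achievable value
assert np.isclose(w @ sigma_opt, eps * np.abs(w).sum())

# Any other sigma with the same L-infinity bound gives a smaller inner product
for _ in range(1000):
    sigma = rng.uniform(-eps, eps, size=5)
    assert w @ sigma <= w @ sigma_opt + 1e-12
```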

DNN
$$
\sigma = \varepsilon\,\mathrm{sign}(\nabla_x J(\theta, x, y_{true}))
$$

where $J$ is the loss function, in most cases cross-entropy with a softmax classifier layer.

Intuition: to change $y'$ the most, $x$ should move in the direction that increases the loss. For example, if $\nabla_x J(\theta, x, y)$ (the gradient of the loss with respect to $x$) is negative, then $x$ should decrease, i.e. $\sigma$ should be negative.
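A minimal FGSM sketch, using logistic regression as the "network" so the input gradient has a closed form (all values here are illustrative, not from the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y_true, w, b, eps):
    """One FGSM step for a logistic-regression model.

    For cross-entropy loss J, the input gradient is (p - y) * w,
    where p = sigmoid(w.x + b).
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y_true) * w          # dJ/dx
    return x + eps * np.sign(grad_x)   # move in the direction that increases J

# toy model and input
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.2, -0.1, 0.3])
y_true = 1.0

x_adv = fgsm(x, y_true, w, b, eps=0.3)

# the adversarial example has strictly higher loss than the clean input
loss = lambda v: -np.log(sigmoid(w @ v + b))   # cross-entropy, y_true = 1
assert loss(x_adv) > loss(x)
```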

Iterative method

Basic iterative method

update: $x_{t+1}' = x_t' + \sigma$ (the FGSM step applied repeatedly with a small $\varepsilon$)
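The iterative update can be sketched as follows, again with a logistic-regression stand-in for the network; the standard trick of clipping back into the $\varepsilon$-ball around $x$ is assumed here:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bim(x, y_true, w, b, eps, alpha, steps):
    """Basic iterative method: repeat small FGSM steps of size alpha,
    clipping the total perturbation into the eps L-infinity ball."""
    x_adv = x.copy()
    for _ in range(steps):
        p = sigmoid(w @ x_adv + b)
        grad_x = (p - y_true) * w
        x_adv = x_adv + alpha * np.sign(grad_x)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay within the eps-ball
    return x_adv

w = np.array([1.0, -2.0, 0.5]); b = 0.0
x = np.array([0.2, -0.1, 0.3]); y_true = 1.0

x_adv = bim(x, y_true, w, b, eps=0.3, alpha=0.1, steps=10)
assert np.max(np.abs(x_adv - x)) <= 0.3 + 1e-12
```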

Targeted

For a targeted attack, the FGSM perturbation becomes:
$$
\sigma = -\varepsilon\,\mathrm{sign}(\nabla_x J(\theta, x, y_{targeted}))
$$
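The sign flip turns the attack into gradient descent on the loss measured against the target label, pushing the prediction toward it. A sketch with the same toy logistic model (illustrative values only):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def targeted_fgsm(x, y_target, w, b, eps):
    """Targeted step: sigma = -eps * sign(grad_x J(theta, x, y_target)),
    i.e. descend the loss with respect to the *target* label."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y_target) * w
    return x - eps * np.sign(grad_x)

w = np.array([1.0, -2.0, 0.5]); b = 0.0
x = np.array([0.2, -0.1, 0.3])
y_target = 0.0   # push the prediction toward class 0

x_adv = targeted_fgsm(x, y_target, w, b, eps=0.3)

# probability of class 1 drops, so the target class 0 gains probability
assert sigmoid(w @ x_adv + b) < sigmoid(w @ x + b)
```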

Our Method

What do we already know, and what can we obtain now?
  • Printable image size:

    $P(i,j)$: the indexed (black/white) boxes, each of size $P_h \times P_w$.

  • Actual image ($100\times 100$) per frame.

    $S_I$: starting index of the billboard for each frame.

    $R_I$: range of the billboard for each frame.

    $G_I$: gradient of each pixel in the billboard.

What we need to calculate

$G_P$: gradient of each box in $P$.

How?

Transform:

  • Nearest

  • SPP:

  • Binary
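For the nearest-neighbor transform, the chain rule gives a simple recipe for $G_P$: if each box $P(i,j)$ is mapped onto a $P_h \times P_w$ block of billboard pixels, its gradient is the sum of the pixel gradients $G_I$ in that block. A sketch under that assumption (shapes and names are illustrative):

```python
import numpy as np

def box_gradient(G_I, P_h, P_w):
    """Aggregate per-pixel billboard gradients G_I into per-box
    gradients G_P for the printable image P.

    Assumes a nearest-neighbor transform: each box covers a
    P_h x P_w block of pixels, so by the chain rule its gradient
    is the sum of the pixel gradients in that block.
    """
    H, W = G_I.shape
    n_h, n_w = H // P_h, W // P_w
    # reshape into (boxes_h, P_h, boxes_w, P_w) and sum within each block
    G_P = G_I[:n_h * P_h, :n_w * P_w].reshape(n_h, P_h, n_w, P_w).sum(axis=(1, 3))
    return G_P

G_I = np.ones((100, 100))            # per-pixel gradient of one frame's billboard
G_P = box_gradient(G_I, P_h=10, P_w=10)
assert G_P.shape == (10, 10)
assert np.allclose(G_P, 100.0)       # each box sums 10*10 unit gradients
```

The SPP and binary transforms would need their own backward mappings, but the same block-wise aggregation structure applies whenever the transform is piecewise constant.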