Towards Total Recall in Industrial Anomaly Detection

Jul 21, 2021 3 min read Anomaly Detection

Go to Project Site

1. What is it?

symbol	description
$x_i$	the $i$-th input image
$\phi_j(x_i) = \phi_{i, j} \in \mathbb{R}^{c^* \times h^* \times w^*}$	feature map from layer $j$ of the pretrained model $\phi$
$\phi_{i, j}(h, w) = \phi_j(x_i, h, w) \in \mathbb{R}^{c^*}$	feature vector of $\phi_{i, j}$ at position $(h, w)$
$p$	patch size
$f_{\text{agg}}$	function that aggregates neighboring features, e.g. average pooling
$\psi \colon \mathbb{R}^d \rightarrow \mathbb{R}^{d^*}$	random linear projection ($d^* < d$)

Define the neighborhood of the patch centered at position $(h, w)$:
$\displaystyle \begin{aligned} \mathcal{N}_{p}^{(h, w)}=\{(a, b) \mid &a \in[h-\lfloor p / 2\rfloor, \ldots, h+\lfloor p / 2\rfloor]\\ &b \in[w-\lfloor p / 2\rfloor, \ldots, w+\lfloor p / 2\rfloor]\} \end{aligned}$
Define the feature corresponding to that patch:
- $f_{\text{agg}}$ is a function that aggregates the features in the neighborhood, e.g. average pooling $\displaystyle \phi_{i, j}\left(\mathcal{N}_{p}^{(h, w)}\right)=f_{\text{agg}}\left(\left\{\phi_{i, j}(a, b) \mid(a, b) \in \mathcal{N}_{p}^{(h, w)}\right\}\right)$
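The neighborhood aggregation above can be sketched in NumPy as follows. This is a minimal illustration, assuming $f_{\text{agg}}$ is plain average pooling and that border positions average only over in-bounds neighbors; the function name `aggregate_patches` is hypothetical and not taken from the paper's code.

```python
import numpy as np

def aggregate_patches(feature_map: np.ndarray, p: int = 3) -> np.ndarray:
    """Average each position of a (c, h, w) feature map over its p x p neighborhood.

    Assumption: neighbors that fall outside the map are simply ignored.
    """
    c, h_star, w_star = feature_map.shape
    r = p // 2  # floor(p / 2), the neighborhood radius
    out = np.empty_like(feature_map, dtype=float)
    for h in range(h_star):
        for w in range(w_star):
            # Clip the neighborhood N_p^{(h, w)} to the valid index range
            a0, a1 = max(h - r, 0), min(h + r, h_star - 1)
            b0, b1 = max(w - r, 0), min(w + r, w_star - 1)
            out[:, h, w] = feature_map[:, a0:a1 + 1, b0:b1 + 1].mean(axis=(1, 2))
    return out
```

The output keeps the input shape, so each position $(h, w)$ now holds a locally aware feature vector in $\mathbb{R}^{c^*}$.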
Define the set of patch features generated from $\phi_{i, j}$, sampled with stride $s$:
$\displaystyle \begin{aligned} &\mathcal{P}_{s, p}\left(\phi_{i, j}\right)=\left\{\phi_{i, j}\left(\mathcal{N}_{p}^{(h, w)}\right) \mid\right. \\ &\left.h, w \bmod s=0, h<h^{*}, w<w^{*}, h, w \in \mathbb{N}\right\} \end{aligned}$
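PatchCore resizes the layer-$j$ and layer-$(j+1)$ features to a common resolution and concatenates them.
Define the feature memory bank over all (normal) training data $\mathcal{X}_N$: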
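A rough sketch of the resize-and-concat step. It assumes nearest-neighbor upsampling of the deeper (lower-resolution) map for simplicity; the interpolation actually used may differ, and `resize_nearest` and `concat_layers` are hypothetical names.

```python
import numpy as np

def resize_nearest(feature_map: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbor resize of a (c, h, w) feature map to (c, out_h, out_w)."""
    _, h, w = feature_map.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return feature_map[:, rows[:, None], cols[None, :]]

def concat_layers(phi_j: np.ndarray, phi_j1: np.ndarray) -> np.ndarray:
    """Resize layer j+1 to the resolution of layer j, then concatenate along channels."""
    _, h, w = phi_j.shape
    return np.concatenate([phi_j, resize_nearest(phi_j1, h, w)], axis=0)
```

The result has the spatial size of layer $j$ and the channel counts of both layers added together.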
$\displaystyle \mathcal{M} = \bigcup_{x_i \in \mathcal{X}_N}{\mathcal{P}_{s,p}(\phi_j(x_i))}$
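Building the memory bank amounts to stacking the strided patch features of every normal training image. The sketch below assumes each input is an already aggregated `(c, h, w)` feature map; `patch_collection` and `build_memory_bank` are hypothetical names.

```python
import numpy as np

def patch_collection(feature_map: np.ndarray, s: int = 1) -> np.ndarray:
    """Return patch features at positions with h % s == 0 and w % s == 0, one per row."""
    c = feature_map.shape[0]
    strided = feature_map[:, ::s, ::s]
    return strided.reshape(c, -1).T  # shape (num_patches, c)

def build_memory_bank(feature_maps, s: int = 1) -> np.ndarray:
    """Stack the patch collections of all normal training images into one array."""
    return np.concatenate([patch_collection(f, s) for f in feature_maps], axis=0)
```

With $N$ training images the bank has on the order of $N \cdot h^* w^* / s^2$ rows, which is what motivates the subsampling step that follows.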

Computing distances against every training sample, as SPADE does, is slow → build a coreset to shorten this.
Coreset definition: $\displaystyle \mathcal{M}^*_C = \operatorname*{arg\,min}_{\mathcal{M}_C \subset \mathcal{M}} \max_{m \in \mathcal{M}} \min_{n \in \mathcal{M}_C} \| m - n \|_2$
Finding $\mathcal{M}^*_C$ exactly is NP-hard, so it is approximated with an iterative greedy approximation.
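One common greedy approximation to this minimax objective is farthest-point (k-center greedy) selection: repeatedly add the point that is farthest from the current selection. The sketch below assumes that variant and a random starting point; it omits the random linear projection $\psi$ from the symbol table, and `greedy_coreset` is a hypothetical name.

```python
import numpy as np

def greedy_coreset(memory_bank: np.ndarray, n_select: int, seed: int = 0) -> np.ndarray:
    """Return indices of n_select rows chosen by farthest-point greedy selection."""
    rng = np.random.default_rng(seed)
    first = int(rng.integers(memory_bank.shape[0]))  # assumption: random start
    selected = [first]
    # min_dist[i] = distance from row i to its nearest already-selected row
    min_dist = np.linalg.norm(memory_bank - memory_bank[first], axis=1)
    for _ in range(n_select - 1):
        nxt = int(np.argmax(min_dist))  # the currently worst-covered row
        selected.append(nxt)
        new_dist = np.linalg.norm(memory_bank - memory_bank[nxt], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return np.array(selected)
```

Each iteration costs one pass over the bank, so selecting $k$ points from $n$ takes $O(nk)$ distance evaluations.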

For each feature in the patch feature set $\mathcal{P}(x^{\text{test}}) = \mathcal{P}_{s,p}(\phi_j(x^{\text{test}}))$ of the test image $x^{\text{test}}$, find the nearest feature vector in the coreset, then take the pair with the largest such distance (from here on, $\mathcal{M}$ denotes the memory bank after coreset subsampling):
$\displaystyle m^{\text{test},*}, m^{*}=\underset{m^{\text{test}} \in \mathcal{P}\left(x^{\text{test}}\right)}{\arg\max}\ \underset{m \in \mathcal{M}}{\arg\min}\left\|m^{\text{test}}-m\right\|_{2}$
The maximum distance $s^*$ is then
$\displaystyle s^* = \| m^{\text{test}, *} - m^* \|_2$
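The max-over-min search can be written with a brute-force distance matrix. This is a sketch for small inputs only, since it materializes all pairwise differences; `max_min_distance` is a hypothetical name.

```python
import numpy as np

def max_min_distance(test_patches: np.ndarray, memory_bank: np.ndarray):
    """Return (index of m^{test,*}, index of m^*, s^*).

    test_patches: (n_test, c); memory_bank: (n_mem, c).
    """
    # Pairwise Euclidean distances, shape (n_test, n_mem)
    diff = test_patches[:, None, :] - memory_bank[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    nn_idx = dist.argmin(axis=1)  # nearest memory feature per test patch
    nn_dist = dist[np.arange(len(test_patches)), nn_idx]
    t_star = int(nn_dist.argmax())  # test patch with the largest nearest distance
    return t_star, int(nn_idx[t_star]), float(nn_dist[t_star])
```

For realistic memory-bank sizes an approximate nearest-neighbor index would replace the dense distance matrix.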
The image-level anomaly score that takes neighboring patches into account is
- Let $\mathcal{N}_b(m^*)$ be the set of the $b$ patch features in $\mathcal{M}$ nearest to $m^*$; then $\displaystyle s=\left(1-\frac{\exp \left\|m^{\text{test}, *}-m^{*}\right\|_{2}}{\sum_{m \in \mathcal{N}_{b}\left(m^{*}\right)} \exp \left\|m^{\text{test}, *}-m\right\|_{2}}\right) \cdot s^{*}$
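A sketch of the reweighting. It assumes that $\mathcal{N}_b(m^*)$ includes $m^*$ itself (it is its own nearest neighbor), which keeps the weight in $[0, 1)$; this reading is an assumption, as are the name `reweighted_image_score` and the default `b=3`. The exponentials are shifted by a common constant to avoid overflow, which leaves the ratio unchanged.

```python
import numpy as np

def reweighted_image_score(m_test_star: np.ndarray, m_star_idx: int,
                           memory_bank: np.ndarray, b: int = 3) -> float:
    """Image-level score s = (1 - softmax-style weight) * s^*."""
    m_star = memory_bank[m_star_idx]
    s_star = np.linalg.norm(m_test_star - m_star)
    # N_b(m^*): the b memory features closest to m^* (assumed to include m^*)
    d_to_m_star = np.linalg.norm(memory_bank - m_star, axis=1)
    neighbor_idx = np.argsort(d_to_m_star)[:b]
    d_test = np.linalg.norm(memory_bank[neighbor_idx] - m_test_star, axis=1)
    # Subtract a common shift before exponentiating for numerical stability
    shift = max(d_test.max(), s_star)
    weight = 1.0 - np.exp(s_star - shift) / np.exp(d_test - shift).sum()
    return float(weight * s_star)
```

Under this reading, `b=1` gives a weight of zero, so $b$ needs to be larger than 1 for the score to be informative.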
The patch-level anomaly scores are
$\displaystyle \mathcal{S} = \{ \|m^{\text{test}} - m\|_2 \mid m^{\text{test}} \in \mathcal{P}(x^{\text{test}}),\ m=\operatorname*{arg\,min}_{m' \in \mathcal{M}} \| m^{\text{test}} - m' \|_2 \}$
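Arranging the per-patch nearest-neighbor distances back on the feature grid gives a coarse anomaly map. The sketch assumes stride $s = 1$ and an already aggregated `(c, h, w)` test feature map; `patch_anomaly_map` is a hypothetical name.

```python
import numpy as np

def patch_anomaly_map(test_feature_map: np.ndarray, memory_bank: np.ndarray) -> np.ndarray:
    """Return an (h, w) map of distances to the nearest memory-bank feature."""
    c, h, w = test_feature_map.shape
    patches = test_feature_map.reshape(c, -1).T  # (h * w, c), row-major over (h, w)
    dist = np.linalg.norm(patches[:, None, :] - memory_bank[None, :, :], axis=2)
    return dist.min(axis=1).reshape(h, w)
```

The largest value in this map equals $s^*$ from the image-level computation.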