model
conference
paper
first author
institute
loss function
DeepFace: Closing the Gap to Human-Level Performance in Face Verification
softmax loss + contrastive loss
Deep Learning Face Representation from Predicting 10,000 Classes
The Chinese University of Hong Kong
softmax loss + contrastive loss
Deep Learning Face Representation by Joint Identification-Verification
The Chinese University of Hong Kong
softmax loss + contrastive loss
FaceNet: A Unified Embedding for Face Recognition and Clustering
A Discriminative Feature Learning Approach for Deep Face Recognition
Shenzhen Key Lab of Computer Vision and Pattern Recognition
Large-Margin Softmax Loss for Convolutional Neural Networks
Weiyang Liu & Yandong Wen
Peking University, South China University of Technology
SphereFace: Deep Hypersphere Embedding for Face Recognition
Georgia Institute of Technology
CosFace: Large Margin Cosine Loss for Deep Face Recognition
ArcFace: Additive Angular Margin Loss for Deep Face Recognition
Imperial College London, InsightFace
additive angular margin loss
Face recognition falls within metric learning, and the learned face features have the following properties
Comparison of open-set and closed-set recognition
Contrastive Loss
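The verification (contrastive) term in this section can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names `l2_distance` and `verification_loss` and the default `margin=1.0` are assumptions made for the example.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verification_loss(f_i, f_j, y_ij, margin=1.0):
    """Contrastive term: y_ij = 1 pulls a same-identity pair together,
    y_ij = -1 pushes a different-identity pair at least `margin` apart.
    `margin` plays the role of m in the formula (default is illustrative)."""
    d = l2_distance(f_i, f_j)
    if y_ij == 1:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2
```

A negative pair that is already farther apart than the margin contributes zero loss, so only "hard" negative pairs produce a gradient.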
$$
\begin{aligned}
\operatorname{Ident}\left(f, t, \theta_{id}\right) &= -\sum_{i=1}^{n} p_{i} \log \hat{p}_{i} = -\log \hat{p}_{t} \\
\operatorname{Verif}\left(f_{i}, f_{j}, y_{ij}, \theta_{ve}\right) &=
\begin{cases}
\frac{1}{2}\left\|f_{i}-f_{j}\right\|_{2}^{2} & \text{if } y_{ij}=1 \\
\frac{1}{2} \max\left(0,\, m-\left\|f_{i}-f_{j}\right\|_{2}\right)^{2} & \text{if } y_{ij}=-1
\end{cases}
\end{aligned}
$$
Triplet Loss
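As a rough illustration of the triplet formula in this section, the sketch below sums the hinge term over a list of (anchor, positive, negative) embeddings. The names `squared_l2` and `triplet_loss` and the default `alpha=0.2` are assumptions for the example; triplet mining is deliberately left out.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(triplets, alpha=0.2):
    """Sum of [d(a, p) - d(a, n) + alpha]_+ over all triplets.
    Each triplet is (anchor, positive, negative), already embedded by f."""
    total = 0.0
    for anchor, positive, negative in triplets:
        gap = squared_l2(anchor, positive) - squared_l2(anchor, negative) + alpha
        total += max(0.0, gap)  # the [.]_+ hinge
    return total
```

A triplet whose negative is already farther from the anchor than the positive by at least `alpha` adds nothing to the sum.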
$$
\sum_{i}^{N}\left[\left\|f\left(x_{i}^{a}\right)-f\left(x_{i}^{p}\right)\right\|_{2}^{2}-\left\|f\left(x_{i}^{a}\right)-f\left(x_{i}^{n}\right)\right\|_{2}^{2}+\alpha\right]_{+}
$$
Center Loss
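The two center-loss quantities in this section, the loss itself and the per-class center update, can be sketched as follows. This is a minimal example under stated assumptions: `center_loss` and `center_delta` are hypothetical names, and features and centers are plain Python lists rather than tensors.

```python
def center_loss(features, labels, centers):
    """L_C = 1/2 * sum_i ||x_i - c_{y_i}||^2 over a mini-batch."""
    return 0.5 * sum(
        sum((x - c) ** 2 for x, c in zip(feat, centers[y]))
        for feat, y in zip(features, labels)
    )

def center_delta(features, labels, centers, j):
    """Delta c_j = sum_i [y_i == j] * (c_j - x_i) / (1 + number of class-j samples)."""
    members = [f for f, y in zip(features, labels) if y == j]
    return [
        sum(centers[j][d] - f[d] for f in members) / (1 + len(members))
        for d in range(len(centers[j]))
    ]
```

The `1 +` in the denominator keeps the update defined for classes that do not appear in the current mini-batch: their delta is simply zero.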
$$
\begin{aligned}
\mathcal{L}_{C} &= \frac{1}{2} \sum_{i=1}^{m}\left\|\boldsymbol{x}_{i}-\boldsymbol{c}_{y_{i}}\right\|_{2}^{2} \\
\Delta \boldsymbol{c}_{j} &= \frac{\sum_{i=1}^{m} \delta\left(y_{i}=j\right) \cdot\left(\boldsymbol{c}_{j}-\boldsymbol{x}_{i}\right)}{1+\sum_{i=1}^{m} \delta\left(y_{i}=j\right)} \\
\mathcal{L} &= \mathcal{L}_{S}+\lambda \mathcal{L}_{C} \\
&= -\sum_{i=1}^{m} \log \frac{e^{W_{y_{i}}^{T} \boldsymbol{x}_{i}+b_{y_{i}}}}{\sum_{j=1}^{n} e^{W_{j}^{T} \boldsymbol{x}_{i}+b_{j}}}+\frac{\lambda}{2} \sum_{i=1}^{m}\left\|\boldsymbol{x}_{i}-\boldsymbol{c}_{y_{i}}\right\|_{2}^{2}
\end{aligned}
$$
L-Softmax Loss
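The piecewise function ψ(θ) used in this section replaces cos(mθ), which is not monotonic on [0, π], with a monotonically decreasing surrogate. A minimal sketch of its closed form follows; the function name `psi` and the clamping of `k` at the right endpoint are choices made for this example.

```python
import math

def psi(theta, m):
    """psi(theta) = (-1)^k * cos(m * theta) - 2k for theta in [k*pi/m, (k+1)*pi/m].
    theta is an angle in [0, pi]; m is a positive integer margin."""
    # Index of the interval containing theta; clamp so theta = pi uses k = m - 1.
    k = min(int(theta * m / math.pi), m - 1)
    return (-1) ** k * math.cos(m * theta) - 2 * k
```

With `m = 1` the function reduces to `cos(theta)`, which recovers the ordinary softmax decision rule.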
$$
\begin{gathered}
\left\|\boldsymbol{W}_{1}\right\|\|\boldsymbol{x}\| \cos\left(\theta_{1}\right) > \left\|\boldsymbol{W}_{2}\right\|\|\boldsymbol{x}\| \cos\left(\theta_{2}\right) \\
\left\|\boldsymbol{W}_{1}\right\|\|\boldsymbol{x}\| \cos\left(\theta_{1}\right) \geq \left\|\boldsymbol{W}_{1}\right\|\|\boldsymbol{x}\| \cos\left(m \theta_{1}\right) > \left\|\boldsymbol{W}_{2}\right\|\|\boldsymbol{x}\| \cos\left(\theta_{2}\right) \\
L_{i} = -\log\left(\frac{e^{\left\|\boldsymbol{W}_{y_{i}}\right\|\left\|\boldsymbol{x}_{i}\right\| \psi\left(\theta_{y_{i}}\right)}}{e^{\left\|\boldsymbol{W}_{y_{i}}\right\|\left\|\boldsymbol{x}_{i}\right\| \psi\left(\theta_{y_{i}}\right)}+\sum_{j \neq y_{i}} e^{\left\|\boldsymbol{W}_{j}\right\|\left\|\boldsymbol{x}_{i}\right\| \cos\left(\theta_{j}\right)}}\right) \\
\psi(\theta) =
\begin{cases}
\cos(m \theta), & 0 \leq \theta \leq \frac{\pi}{m} \\
\mathcal{D}(\theta), & \frac{\pi}{m} < \theta \leq \pi
\end{cases} \\
\psi(\theta) = (-1)^{k} \cos(m \theta) - 2k, \quad \theta \in \left[\frac{k \pi}{m}, \frac{(k+1) \pi}{m}\right]
\end{gathered}
$$
SphereFace
$$
\begin{aligned}
&\text{Weight Norm and zero bias} \quad \left\|\boldsymbol{W}_{i}\right\| = 1,\; b_{i} = 0 \\
&\text{Classification boundary} \quad \cos\left(m \theta_{1}\right) = \cos\left(\theta_{2}\right) \\
&L_{\mathrm{ang}} = \frac{1}{N} \sum_{i} -\log\left(\frac{e^{\left\|\boldsymbol{x}_{i}\right\| \psi\left(\theta_{y_{i}, i}\right)}}{e^{\left\|\boldsymbol{x}_{i}\right\| \psi\left(\theta_{y_{i}, i}\right)}+\sum_{j \neq y_{i}} e^{\left\|\boldsymbol{x}_{i}\right\| \cos\left(\theta_{j, i}\right)}}\right)
\end{aligned}
$$
CosFace
Weight Norm and Feature Norm
$$
\begin{aligned}
W &= \frac{W^{*}}{\left\|W^{*}\right\|} \\
x &= \frac{x^{*}}{\left\|x^{*}\right\|} \\
\cos\left(\theta_{j, i}\right) &= W_{j}^{T} x_{i}
\end{aligned}
$$
classification boundary
$$
\cos\left(\theta_{1}\right) - m > \cos\left(\theta_{2}\right) \ \text{(for class 1)} \quad \text{and} \quad \cos\left(\theta_{2}\right) - m > \cos\left(\theta_{1}\right) \ \text{(for class 2)}
$$
loss function
$$
L_{lmc} = \frac{1}{N} \sum_{i} -\log \frac{e^{s\left(\cos\left(\theta_{y_{i}, i}\right)-m\right)}}{e^{s\left(\cos\left(\theta_{y_{i}, i}\right)-m\right)}+\sum_{j \neq y_{i}} e^{s \cos\left(\theta_{j, i}\right)}}
$$
Here, NSL is the Normalized version of Softmax Loss.
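The large margin cosine loss in this section combines three steps: normalize weights and features, subtract the margin m from the target-class cosine, and scale by s before the softmax. A minimal plain-Python sketch follows. The names `l2_normalize` and `cosface_loss` are hypothetical, and the defaults `s=30.0` and `m=0.35` are placeholder values for the example, not settings taken from this post.

```python
import math

def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosface_loss(features, labels, weights, s=30.0, m=0.35):
    """Mean large margin cosine loss over a batch.
    features: raw feature vectors x*; weights: one raw vector W* per class."""
    w_hat = [l2_normalize(w) for w in weights]
    total = 0.0
    for feat, y in zip(features, labels):
        x_hat = l2_normalize(feat)
        cosines = [sum(a * b for a, b in zip(w, x_hat)) for w in w_hat]
        # Margin is applied to the target class only.
        logits = [s * (c - m) if j == y else s * c for j, c in enumerate(cosines)]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum - logits[y]
    return total / len(features)
```

Because both factors are normalized, rescaling a feature vector leaves the loss unchanged; only the angle to each class weight matters.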
ArcFace
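The additive angular margin in this section is applied to the angle itself rather than to its cosine. The sketch below takes precomputed cosines between a normalized feature and the normalized class weights. The names `arcface_logits` and `arcface_loss` are assumptions for the example, the default `s` and `m` are illustrative, and the case θ + m > π, which practical implementations treat specially, is ignored here.

```python
import math

def arcface_logits(cosines, target, s=64.0, m=0.5):
    """Scaled logits; the target class uses cos(theta + m) instead of cos(theta)."""
    logits = []
    for j, c in enumerate(cosines):
        if j == target:
            theta = math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error
            c = math.cos(theta + m)
        logits.append(s * c)
    return logits

def arcface_loss(cosines, target, s=64.0, m=0.5):
    """Softmax cross-entropy on the margin-adjusted logits for one sample."""
    logits = arcface_logits(cosines, target, s, m)
    peak = max(logits)  # subtract the max for numerical stability
    return peak + math.log(sum(math.exp(z - peak) for z in logits)) - logits[target]
```

Setting `m = 0` gives the plain normalized softmax loss, so the margin's effect can be checked by comparing the two.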
$$
\begin{aligned}
&\cos\left(\theta_{1}+m\right) > \cos\left(\theta_{2}\right) \\
L_{3} &= -\frac{1}{N} \sum_{i=1}^{N} \log \frac{e^{s \cos\left(\theta_{y_{i}}+m\right)}}{e^{s \cos\left(\theta_{y_{i}}+m\right)}+\sum_{j=1, j \neq y_{i}}^{n} e^{s \cos \theta_{j}}} \\
L_{4} &= -\frac{1}{N} \sum_{i=1}^{N} \log \frac{e^{s\left(\cos\left(m_{1} \theta_{y_{i}}+m_{2}\right)-m_{3}\right)}}{e^{s\left(\cos\left(m_{1} \theta_{y_{i}}+m_{2}\right)-m_{3}\right)}+\sum_{j=1, j \neq y_{i}}^{n} e^{s \cos \theta_{j}}}
\end{aligned}
$$
Face loss visualization
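Using the general face loss formula with the different parameter settings listed below, the visualization results on MNIST can be found in Visualization of Face Loss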
loss name
w_norm
x_norm
s
m1
m2
m3
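The columns above (w_norm, x_norm, s, m1, m2, m3) suggest a single parameterized loss built on the combined margin cos(m1·θ + m2) − m3. The sketch below is one possible reading, not the code behind the visualization: the function name `general_face_loss` is hypothetical, and the way un-normalized lengths are multiplied back into the logit is an assumption. It uses plain cos(m1·θ + m2), not the piecewise ψ, so it is only monotonic while m1·θ + m2 stays within [0, π].

```python
import math

def general_face_loss(feature, weights, target, w_norm=True, x_norm=True,
                      s=1.0, m1=1.0, m2=0.0, m3=0.0):
    """Softmax cross-entropy for one sample with margin cos(m1*theta + m2) - m3
    applied to the target class. feature and weights must be non-zero vectors."""
    x_len = math.sqrt(sum(v * v for v in feature))
    logits = []
    for j, w in enumerate(weights):
        w_len = math.sqrt(sum(v * v for v in w))
        cosine = sum(a * b for a, b in zip(w, feature)) / (w_len * x_len)
        cosine = max(-1.0, min(1.0, cosine))  # clamp against rounding error
        if j == target:
            cosine = math.cos(m1 * math.acos(cosine) + m2) - m3
        # Assumption: a normalized factor contributes 1, otherwise its length is kept.
        magnitude = (1.0 if w_norm else w_len) * (1.0 if x_norm else x_len)
        logits.append(s * magnitude * cosine)
    peak = max(logits)  # subtract the max for numerical stability
    return peak + math.log(sum(math.exp(z - peak) for z in logits)) - logits[target]
```

With both norms enabled, `m1=1, m2=0` and a positive `m3` matches the CosFace form, while `m1=1, m3=0` and a positive `m2` matches the ArcFace form; disabling both norms with default margins and `s=1` falls back to a bias-free softmax loss.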