Efficient Discriminative Learning of Parts-based Models
M. Pawan Kumar
Andrew Zisserman
Philip Torr
http://www.robots.ox.ac.uk/~vgg
http://cms.brookes.ac.uk/research/visiongroup
Aim: To efficiently learn parts-based models which discriminate between positive and negative poses of the object category.
Parts-based Model

G = (V, E), restricted to a tree.
f : V → {1, ..., h}, the pose of V (h values per part).

Q(f) = Σ_a Q_a(f(a)) + Σ_(a,b) Q_ab(f(a), f(b))

Q_a(f(a)): unary potential for f(a), computed using features.
Q_ab(f(a), f(b)): pairwise potential for the validity of (f(a), f(b)), restricted to Potts.

Both are linear in the weights:
Q_a(f(a)) = w_a^T Φ_a(f(a))
Q_ab(f(a), f(b)) = w_ab^T Φ_ab(f(a), f(b)),
where Φ_ab(f(a), f(b)) = 1 if (f(a), f(b)) ∈ L_ab, and 0 otherwise.
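The energy above can be sketched on a toy model. Everything numeric here is hypothetical (parts, features, weights, and valid-pair sets are made up for illustration, not taken from the poster):

```python
import numpy as np

# Toy 3-part chain a-b-c with h = 2 poses per part. Unaries are linear in
# the weights w_a; the pairwise term is Potts: w_ab is added iff the pose
# pair is in the valid set L_ab. All numbers below are invented.
phi = {  # hypothetical unary feature vectors, one per (part, pose)
    ('a', 0): np.array([1.0, 0.0]), ('a', 1): np.array([0.0, 1.0]),
    ('b', 0): np.array([0.5, 0.5]), ('b', 1): np.array([1.0, 1.0]),
    ('c', 0): np.array([0.2, 0.8]), ('c', 1): np.array([0.9, 0.1]),
}
w = {'a': np.array([1.0, -1.0]), 'b': np.array([0.5, 0.5]),
     'c': np.array([-1.0, 2.0])}
w_pair = {('a', 'b'): 0.3, ('b', 'c'): -0.2}   # Potts weights w_ab
valid = {('a', 'b'): {(0, 0), (1, 1)},          # L_ab: valid pose pairs
         ('b', 'c'): {(0, 1), (1, 0)}}

def energy(f):
    """Q(f) = sum_a w_a.phi_a(f(a)) + sum_(a,b) w_ab [(f(a),f(b)) in L_ab]."""
    q = sum(w[p] @ phi[(p, f[p])] for p in 'abc')
    q += sum(wp for (a, b), wp in w_pair.items()
             if (f[a], f[b]) in valid[(a, b)])
    return q

f = {'a': 0, 'b': 1, 'c': 1}
print(round(energy(f), 3))
```

Because both terms are inner products with the weights, Q(f) is linear in w, which is what the learning formulation below exploits.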
The Learning Problem

Q(f) is linear in the weights: Q(f) = w^T Ψ(f).
Maximize the margin and minimize the hinge loss: high energy for all positive examples, low energy for all negative examples.

min ||w||^2 + C Σ_i ξ_i
s.t. w^T Ψ(f_i^+) ≥ 1 - ξ_i
     w^T Ψ(f_ij^-) ≤ -1 + ξ_i, for all j.

The number of negative constraints is exponential in |V|.
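The two families of constraints can be made concrete by evaluating the tightest slack they imply for one example. This is an illustrative sketch with invented feature vectors, not the poster's data:

```python
import numpy as np

# Given a joint feature map Psi (vectors below are hypothetical), the
# positive pose must score >= 1 - xi and every negative pose <= -1 + xi,
# so the smallest feasible slack for example i is:
def slack(w, psi_pos, psi_negs):
    xi_pos = 1.0 - w @ psi_pos               # from w.Psi(f+) >= 1 - xi
    xi_neg = max(1.0 + w @ p for p in psi_negs)  # from w.Psi(f-) <= -1 + xi
    return max(0.0, xi_pos, xi_neg)

w = np.array([1.0, -0.5])
psi_pos = np.array([2.0, 0.0])       # scores 2.0: positive margin satisfied
negs = [np.array([-2.0, 0.0]),       # scores -2.0: satisfied
        np.array([0.5, 0.0])]        # scores 0.5: violates the <= -1 side
print(slack(w, psi_pos, negs))
```

In the real problem the `negs` list ranges over every incorrect pose, which is why the constraint set is exponential in |V| before the reformulation.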
Related Work

Local Iterative Support Vector Machine (ISVM-1):
- Start with a small subset of negative examples (1 per image)
- Solve for w and b
- Replace the negative examples with the current MAP estimates
- Converges to a local optimum

Global Iterative Support Vector Machine (ISVM-2):
- Start with a small subset of negative examples (1 per image)
- Solve for w and b
- Add the current MAP estimates to the set of negative examples
- Converges to the global optimum

Drawback: both require obtaining the MAP estimate of each image at each iteration, which is computationally expensive.
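The ISVM-2 loop can be sketched end to end on a toy separable problem. This is not the authors' code: the tiny subgradient routine stands in for a real SVM solver, and the four "poses" are invented feature vectors:

```python
import numpy as np

# ISVM-2 sketch: keep a small working set of negatives, train, then add
# the most-violating negative (the MAP estimate under the current w) and
# repeat until the MAP estimate is already in the working set.
def train(w, pos, negs, C=1.0, steps=500, lr=0.05):
    w = w.copy()
    for _ in range(steps):
        g = w.copy()                        # gradient of ||w||^2 / 2
        for p in pos:
            if w @ p < 1.0:
                g -= C * p                  # hinge on w.Psi(f+) >= 1
        for n in negs:
            if w @ n > -1.0:
                g += C * n                  # hinge on w.Psi(f-) <= -1
        w -= lr * g
    return w

pos = [np.array([1.0, 1.0])]                # Psi(f+), hypothetical
all_negs = [np.array([-1.0, -1.0]), np.array([0.5, -1.5]),
            np.array([-2.0, 0.5]), np.array([1.0, -1.0])]
work = [all_negs[0]]                        # start with one negative
w = np.zeros(2)
while True:
    w = train(w, pos, work)
    map_neg = max(all_negs, key=lambda n: float(w @ n))  # MAP estimate
    if any((map_neg == n).all() for n in work):
        break                               # no new violating pose: stop
    work.append(map_neg)                    # ISVM-2: grow the negative set
print(len(work), w @ pos[0] > max(w @ n for n in all_negs))
```

The expensive step in practice is computing `map_neg` on every image at every iteration, which is exactly the drawback the efficient reformulation removes.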
Efficient Reformulation

Replace the exponentially many negative constraints with constraints over messages M_ba(k); the number of constraints becomes linear in |V|:

M_ba(k) ≥ w_b^T Φ_b(l), for all l
M_ba(k) ≥ w_b^T Φ_b(l) + w_ab, for all (k, l) ∈ L_ab
w_a^T Φ_a(k) + Σ_b M_ba(k) ≤ -1 + ξ_i

M_ba(k) is analogous to messages in Belief Propagation (BP).
Efficient BP using the distance transform: Felzenszwalb and Huttenlocher, 2004.
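The tightest messages satisfying these constraints are obtained by maximizing the right-hand sides over l, just as in max-product BP. A small numeric sketch (all weights and valid pairs are hypothetical):

```python
import numpy as np

# Tightest M_ba(k): the max over l of w_b.Phi_b(l), plus w_ab when (k, l)
# is a valid pair. Numbers below are invented for illustration.
h = 3
wb_phi = np.array([0.2, -0.4, 0.7])   # w_b^T Phi_b(l), one value per pose l
w_ab = 0.5                            # Potts weight
L_ab = {(0, 1), (2, 2)}               # valid pose pairs (k, l)

M = np.empty(h)
for k in range(h):
    cands = [wb_phi[l] + (w_ab if (k, l) in L_ab else 0.0) for l in range(h)]
    M[k] = max(cands)                 # tightest message value M_ba(k)

print(np.round(M, 3).tolist())
```

Per value of k there are h constraints of the first type plus one per valid pair, matching the "linear in h" and "linear in |L_ab|" counts.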
The first set of message constraints is linear in h; the second is linear in |L_ab|.

[Figure: training-time comparison of ISVM-1, ISVM-2, and our method]
Results - Sign Language

100 training images, 95 test images.
Our: 86.4%; Buehler et al., 2008: 87.7%.
ISVMs run for twice as long.

Solving the Dual
Problem (1):
max α_ab^T 1 - α_ab^T K_ab α_ab
s.t. α_ab^T y = 0, α_ab ≥ 0,
     0 ≤ α_i^ab(k) + α_i^ab(k, l) ≤ C

Problem (2):
max α_ba^T 1 - α_ba^T K_ba α_ba
s.t. α_ba^T y = 0, α_ba ≥ 0,
     0 ≤ α_i^ba(k) + α_i^ba(k, l) ≤ C

Problem (1) learns the unary weight vector w_a and the pairwise weight w_ab.
Problem (2) learns the unary weight vector w_b and the pairwise weight w_ab.

Constraint (3): Σ_k α_ab(k) = Σ_l α_ba(l)

Enforcing constraint (3) directly results in a large minimal problem.
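As a sketch of what one of these box-constrained duals looks like numerically, here is projected gradient ascent on a tiny instance. The α^T y = 0 equality and the constraint (3) coupling are omitted to keep the sketch short, and K and C are made-up values:

```python
import numpy as np

# Toy Problem (1)-style dual: max 1^T a - a^T K a over 0 <= a <= C.
K = np.array([[2.0, 0.0], [0.0, 2.0]])    # a hypothetical PSD kernel matrix
C = 1.0
a = np.zeros(2)
for _ in range(1000):
    grad = 1.0 - 2.0 * K @ a              # gradient of 1^T a - a^T K a
    a = np.clip(a + 0.05 * grad, 0.0, C)  # ascent step, then box projection
print(np.round(a, 3).tolist())
```

For this K the unconstrained maximizer already lies inside the box, so the projection is inactive at the solution; with the coupling of constraint (3) added, the problems can no longer be solved independently, which motivates dual decomposition below.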
Dual Decomposition

Original problem: min Σ_i g_i(x), subject to x ∈ P
Reformulation:    min Σ_i g_i(x_i), s.t. x_i ∈ P, x_i = x
Dual:             max_λ min Σ_i g_i(x_i) + λ_i^T (x_i - x), s.t. x_i ∈ P
KKT condition:    Σ_i λ_i = 0

Master: update the Lagrange multipliers of constraint (3), λ_i ← λ_i + x_i*, then project so that Σ_i λ_i = 0.
Slaves: solve min g_i(x_i) + λ_i^T x_i for Problem (1) and Problem (2): SVM-like problems, with a minimal problem size of 2.

Implementation Details

Solver: modified SVM-Light.
Features: Shape: HOG. Appearance: (x, x^2), where x is the fraction of skin pixels.
Data: Positive examples provided by the user; negative examples are all other poses.
Occlusion: each putative pose can be occluded (twice the number of labels).

Results - Buffy

196 training images, 204 test images.
Our: 39.2%; Ferrari et al., 2008: 41.0%.
[Figure: comparison of ISVM-1, ISVM-2, and our method]
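The master/slave scheme of the Dual Decomposition section can be demonstrated on a scalar toy objective. The functions g_1, g_2 and the step size η below are illustrative choices, not taken from the poster:

```python
import numpy as np

# Dual decomposition on min_x (x-1)^2 + (x-3)^2. Each slave minimizes its
# own g_i(x_i) + lam_i * x_i in closed form; the master takes a subgradient
# step on lam and projects onto sum(lam) = 0 (the KKT condition).
lam = np.zeros(2)
eta = 0.1                                # hypothetical step size
for _ in range(500):
    x1 = 1.0 - lam[0] / 2.0              # slave 1: argmin (x-1)^2 + lam1*x
    x2 = 3.0 - lam[1] / 2.0              # slave 2: argmin (x-3)^2 + lam2*x
    lam += eta * np.array([x1, x2])      # master: subgradient step
    lam -= lam.mean()                    # project onto sum(lam) = 0
print(round(x1, 3), round(x2, 3))
```

At convergence the slaves agree (x1 = x2 = 2, the minimizer of the sum), mirroring how the two SVM-like slave problems are driven to agree on the shared pairwise weight w_ab.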