EPML: Expanded Parts based Metric Learning for Occlusion Robust Face Verification

Gaurav Sharma1, Frédéric Jurie2 and Patrick Pérez1

GREYC CNRS UMR 6072, University of Caen Basse-Normandy

Abstract. We propose a novel Expanded Part-based Metric Learning (EPML) model for face verification. The model is capable of mining out the discriminative regions at the right locations and scales, for identity based matching of face images. It performs well in the presence of occlusions, by avoiding the occluded regions and selecting the next best visible regions. We show quantitatively, by experiments on the standard benchmark dataset Labeled Faces in the Wild (LFW), that the model works much better than the traditional method of face representation with metric learning, both (i) in the presence of heavy random occlusions and (ii) in the case of focussed occlusions of discriminative face regions such as eyes or mouth. Further, we present qualitative results which demonstrate that the method is capable of ignoring the occluded regions while exploiting the visible ones.

1 Introduction

Face verification technology is critical for many modern systems. Handling occlusions is a major challenge in its real world application. In the present paper, we propose a novel Expanded Parts based Metric Learning (EPML) model which is capable of identifying many discriminative parts of the face, for the task of predicting if two faces are of the same person or not, especially in the presence of occlusions.

Metric learning approaches [1–3] have recently shown promise for the task of face verification. However, the face representation is usually fixed and is separate from the task of learning the model. The faces are usually represented either as features computed on face landmarks [3] or over a fixed grid of cells [4]. Once such fixed representations are obtained, a discriminative metric learning model is learned, with annotated same and not-same faces, for comparing faces based on identity. Since the representation is fixed, it is entirely the model's responsibility to tackle the challenge of occlusions that might occur in in-the-wild applications. In the present paper, the proposed EPML model learns a collection of discriminative parts (of the faces) along with the discriminative model to compare faces with such a collection of parts. The collection of discriminative parts is automatically mined from a large set of randomly sampled candidate parts, in the learning phase. The distance function used considers the distances between all the n different parts in the model and computes the final distance between the two faces with the closest (small number of) k ≪ n parts. Such a min operation lends non-linearity to the model and allows it to selectively choose/reject parts at runtime, which is in contrast to the traditional representation where the model has no choice but to consider the whole face. This capability is especially useful in case of occlusion: while the traditional method is misguided by the occluded part(s), the proposed method can simply choose to ignore significantly occluded parts and use only the visible parts. We discuss this further later, along with qualitative results, in §3.

In the following, we first set the context by briefly describing the traditional metric learning approaches (§1.1). We then motivate our method (§2) and present it in detail (§2.1), and finally we give experimental results (§3) and conclude the article (§4).
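As a rough illustration of the part selection described above, the sketch below scores a face pair from a list of per-part distances by keeping only the k closest parts. The function names, the choice of averaging the k retained distances, and the toy numbers are assumptions made for illustration only; they are not the paper's exact formulation.

```python
import heapq


def topk_part_distance(part_dists, k):
    """Average of the k smallest per-part distances.

    Averaging (rather than, say, summing) is an assumption of this sketch.
    """
    if not 0 < k <= len(part_dists):
        raise ValueError("k must be between 1 and the number of parts")
    closest = heapq.nsmallest(k, part_dists)
    return sum(closest) / k


def all_parts_distance(part_dists):
    """Baseline with no selection: every part contributes."""
    return sum(part_dists) / len(part_dists)


# Hypothetical per-part distances for a same-person pair, n = 6 parts.
visible = [0.2, 0.3, 0.1, 0.4, 0.25, 0.35]
# Same pair, but the last two parts fall on an occluded region.
occluded = [0.2, 0.3, 0.1, 0.4, 5.0, 6.0]

for name, dists in (("visible", visible), ("occluded", occluded)):
    print(name, topk_part_distance(dists, k=3), all_parts_distance(dists))
```

With these toy numbers the top-k score moves only slightly when two parts are occluded, whereas the all-parts baseline is dominated by the two large distances.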

1.1 Background: Face verification using metric learning

Given a face image dataset $X$ of positive (of the same person) pairs of images and negative pairs of images, represented with some feature vectors, i.e.,

$$X = \{(x_i, x_j, y_{ij})\},$$

with $x_i \in \mathbb{R}^D$ a face feature vector and $y_{ij} \in \{-1, +1\}$, the task is to learn a function to predict if two unseen face images, potentially of unseen person(s), are of the same person or not.

Metric learning approaches, along with standard image representations, have recently been shown to be well suited to this task [1–3]. Such approaches learn from $X$ a Mahalanobis-like metric parametrized by a matrix $M \in \mathbb{R}^{D \times D}$, i.e.,

$$D_M^2(x_i, x_j) = (x_i - x_j)^\top M (x_i - x_j), \qquad M = L^\top L,$$

with $L \in \mathbb{R}^{d \times D}$ and $d \le D$. The metric learning can then be seen as an embedding learning problem: to compare two vectors, first project them on the $d$-dimensional row space of $L$ and then compare them with their $\ell_2$ distance in the projected space, i.e.,

$$D_L^2(x_i, x_j) = \| L x_i - L x_j \|_2^2.$$
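The embedding view can be checked numerically with a small sketch. It assumes the usual factorization M = LᵀL of the Mahalanobis-like matrix, so that the squared distance under M equals the squared ℓ2 distance between the projected vectors. The function names and the toy values of L, xi and xj are hypothetical.

```python
def project(L, x):
    """Map a D-dim vector x to d dims with the d x D matrix L."""
    return [sum(l * v for l, v in zip(row, x)) for row in L]


def sq_dist_projected(L, xi, xj):
    """Squared l2 distance between L*xi and L*xj."""
    pi, pj = project(L, xi), project(L, xj)
    return sum((a - b) ** 2 for a, b in zip(pi, pj))


def sq_dist_mahalanobis(L, xi, xj):
    """(xi - xj)^T M (xi - xj), with M = L^T L built explicitly."""
    d, D = len(L), len(xi)
    M = [[sum(L[r][a] * L[r][b] for r in range(d)) for b in range(D)]
         for a in range(D)]
    diff = [a - b for a, b in zip(xi, xj)]
    return sum(diff[a] * M[a][b] * diff[b]
               for a in range(D) for b in range(D))


# Toy example with D = 3 and d = 2 (values chosen only for illustration).
L = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]
xi = [1.0, 2.0, 3.0]
xj = [0.0, 1.0, 1.0]

print(sq_dist_projected(L, xi, xj))    # distance in the projected space
print(sq_dist_mahalanobis(L, xi, xj))  # same value via M = L^T L
```

Working with L directly avoids forming the D × D matrix M, which matters when D is large and d is small.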

Many regularized loss minimization algorithms have been proposed to learn such metrics with the loss functions arising from probabilistic (likelihood) or margin-maximizing motivations [1–3].

1.2 Related work

The recognition of faces under occlusions has a long history in the computer vision literature. One of the pioneering works was that of Leonardis et al. [5], who proposed to make the Eigenface method more robust to occlusion by computing the coefficients of the eigenimages with a hypothesize-and-test paradigm using subsets of image points. Since then, more efficient face matching algorithms have been proposed, raising the question of how to make them more robust to occlusions. The best performing state-of-the-art methods (e.g. [6, 7]) are holistic in the sense that they represent the whole face by a single vector and are, hence, expected to be sensitive to occlusions.

The impact of occlusions on face recognition has been studied by Rama et al. [8], who evaluated three different approaches based on Principal Component Analysis (PCA) (i.e., the eigenface approach, a component-based method built on the eigen-features and an extension of the Lophoscopic Principal Component Analysis). They analysed the three different strategies and compared them when used for identifying partially occluded faces. The paper also explored how prior knowledge about occlusions, which might be present, can be used to improve the recognition performance.

Generally speaking, two different methodologies have been proposed in the literature. One consists of detecting the occlusions and reconstructing occluded parts prior to doing face recognition, while the other one relies on integrated approaches (of description and recognition together) i.e., those that are robust to occlusions by construction.

Within the family of detect-and-reconstruct approaches, one can mention the works of Colombo et al. [9, 10], who detect occlusions by comparing the input image with a generic model of the face and reconstruct the missing parts with Gappy Principal Component Analysis (GPCA) [11]. Lin et al. [12] propose another approach for automatically detecting and recovering the occluded facial regions. They consider the formation of an occluded image as a generative process which is then used to guide the procedure of recovery. More recently, Oh et al. [13] proposed to detect occluded parts by dividing the image into a finite number of disjoint local patches coded by PCA and then using a 1-NN threshold classifier. Once detected, only the occlusion-free image patches are used for the later face recognition stage, where the matching is done with a nearest neighbor classifier using the Euclidean distance. Wright et al. [14] and Zhou et al. [15] explored the use of sparse coding, proposing a way to efficiently and reliably identify corrupted regions and exclude them from the sparse representation.

Sparse coding has also been used efficiently by Ou et al. [16], while Morelli et al. [17] have proposed using compressed sensing. Min et al. [18] proposed to detect the occlusion using Gabor wavelets, PCA and support vector machines (SVM), and then do recognition with the non-occluded facial parts, using the block-based Local Binary Patterns of Ojala et al. [19]. Tajima et al. [20] suggested detecting occluded regions using Fast-Weighted Principal Component Analysis (FW-PCA) and using the occluded regions for weighting the blocks for face representation. Alyuz et al. [21] proposed to deal with occlusions by using a fully automatic 3-D face recognition system in which face alignment is done through an adaptively selected model based registration scheme (where only the valid non-occluded patches are utilized), while during the classification stage, they proposed a masking strategy to enable the use of subspace analysis techniques with incomplete data. Min et al. [22] proposed to compute an occlusion mask indicating which pixels in a face image are occluded and to use a variant of local Gabor binary pattern histogram sequences (LGBPHS) to represent occluded faces by excluding features extracted from the occluded pixels. Finally, different from previous approaches, Colombo et al. [23] addressed the question of detection and reconstruction of faces using 3D data.

The second paradigm for addressing the recognition of occluded faces, which is to develop methods that are intrinsically robust to occlusion, has received less attention in the past. Liao et al. [24] developed an alignment-free face representation method based on Multi-Keypoint Descriptors matching, where the descriptor size of a face is determined by the actual content of the image. Any probe face image (holistic or partial) can hence be sparsely represented by a large dictionary of gallery descriptors, allowing partial matching of face components.

Weng et al. [25] recently proposed a new partial face recognition approach by aligning partial face patches to holistic gallery faces automatically, hence being robust to occlusions and illumination changes. Zhao [26] used a robust holistic feature relying on stable intensity relationships of multiple point pairs, being intrinsically invariant to changes in facial features, and exhibiting robustness to illumination variations or even occlusions.

Face verification in real world scenarios has recently attracted much attention, especially fueled by the availability of the excellent benchmark Labeled Faces in the Wild (LFW) [27]. Many recent papers address the problem with novel approaches, e.g. the discriminative part-based approach by Berg and Belhumeur [28], the probabilistic elastic model by Li et al. [29], Fisher vectors with metric learning by Simonyan et al. [2], a novel regularization for similarity metric learning by Cao et al. [30], fusion of many descriptors using multiple metric learning by Cui et al. [31], deep learning by Sun et al. [32], and a method using fast high dimensional vector multiplication by Barkan et al. [33]. Many of the most competitive approaches on LFW combine different features, e.g. [34–36], and/or use external data, e.g. [37, 38].

The works of Liao et al. [24] and Weng et al. [25] are the most closely related and competing works to the proposed EPML. They are based on feature set matching (via dictionaries obtained with sparse representation). As in image retrieval, there are two ways of doing occlusion robust face matching: (i) match local features detected around keypoints from the two faces or (ii) aggregate the local features to make a global (per cell) feature vector and then match two image vectors. These works fall in the first category while the proposed method falls in the second. The first type of methods are robust to occlusions due to the matching process, while for the second type, the model and aggregation together have to account for the occlusion. Also, the first type of methods claim that they do not need alignment of faces. If a face detector is used then, by the statistical properties of the detector, the faces will already be approximately aligned (LFW is made this way and strong models already give good results without further alignment). So the first type of methods are arguably more useful when an operator outlines a difficult 'unaligned' face manually and gives it as an input. In that case, we could also ask the operator to approximately align the faces. And in the case when the face detector is trained to detect faces under large variations in pose, the pose will probably come out as latent information from the detector itself, which can then be used to align the face approximately. In summary, we argue that both approaches have merit and that the second type, which is the subject of this paper, has the potential to be highly competitive when used with recently developed strong features and a model-and-aggregation designed to be robust to occlusion, like the proposed EPML.

Our work could also be contrasted with feature selection algorithms, e.g. Viola and Jones [39], and many other works in similar spirit, where a subset of features (in a cascaded fashion) are selected from a very large set of features. The proposed method is similar to feature selection methods as we are selecting a subset of parts from among a large set of potential candidate parts. However, it is distinctly different as it performs a dynamic test time selection of most reliable parts, from among the parts selected at training, which are available for the current test pair.

Finally, our work is also reminiscent of the mid-level features stream of work, e.g. see Doersch et al. [40] and the references within, which aim at extracting visually similar recurring and discriminative parts in the images. In a similar spirit, we are interested in finding parts of faces which are discriminative for verification, after the learnt projection.

2 Motivation and Approach

A critical component in computer vision applications is the image representation. The state-of-the-art image representation methods first compute local image statistics (or features as they are usually called) and then aggregate them to form a fixed length representation of the images. This aggregation/pooling step reduces a relatively large number of local features to a smaller fixed length vector and there is a trade-off involved here, especially at a spatial level; it is now commonly accepted, e.g. for image classification, that, instead of a global image level aggregation, including finer spatial partitions of the image leads to better results [41]. Learning such partitions does better still [42, 43].
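To make the aggregation trade-off concrete, here is a minimal sketch of average pooling of local features over a regular grid of cells; a 1 × 1 grid corresponds to global image level aggregation. The function name `grid_pool`, the use of average pooling, and the toy features are assumptions for illustration, not the representation used in the paper.

```python
def grid_pool(features, positions, width, height, nx, ny):
    """Average-pool local features inside each cell of an nx x ny grid.

    features:  list of local feature vectors (all of the same length)
    positions: list of (x, y) pixel locations, one per feature
    Returns the per-cell averages concatenated row by row; empty cells
    contribute zeros.
    """
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(nx * ny)]
    counts = [0] * (nx * ny)
    for feat, (x, y) in zip(features, positions):
        cx = min(int(x * nx / width), nx - 1)   # column index of the cell
        cy = min(int(y * ny / height), ny - 1)  # row index of the cell
        cell = cy * nx + cx
        counts[cell] += 1
        for t in range(dim):
            sums[cell][t] += feat[t]
    pooled = []
    for cell_sum, n in zip(sums, counts):
        pooled.extend(v / n if n else 0.0 for v in cell_sum)
    return pooled


# Three hypothetical 2-dim local features on a 10 x 10 image.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
positions = [(1, 1), (2, 2), (9, 9)]

global_vec = grid_pool(features, positions, 10, 10, 1, 1)  # length 2
grid_vec = grid_pool(features, positions, 10, 10, 2, 2)    # length 8
print(global_vec)
print(grid_vec)
```

The global vector mixes all three features, while the 2 × 2 grid keeps the top-left and bottom-right content in separate blocks, at the cost of a vector four times longer.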
