IMAGE PROCESSING: FACE RECOGNITION USING EIGENFACES
ABSTRACT
Images containing faces are essential
to intelligent vision-based human computer interaction, and research efforts in
face processing include face recognition, face tracking, pose estimation, and
expression recognition. However, many reported methods assume that the faces in
an image or an image sequence have been identified and localized. To build
fully automated systems that analyze the information contained in face images,
robust and efficient face detection algorithms are required. Given a single
image, the goal of face detection is to identify all image regions which
contain a face regardless of its three-dimensional position, orientation, and
lighting conditions.
We describe a near-real-time computer system that can identify an unknown subject's face and then recognize the person by comparing characteristics of the face to those of known individuals. The computational approach taken in this system is motivated by both physiology and information theory, as well as by the practical requirements of near-real-time performance and accuracy.
The system functions by projecting face
images onto a feature space that spans the significant variations among known
face images. The significant features are known as Eigenfaces because they are
the eigenvectors of the set of faces. The projection operation characterizes an
individual face by a weighted sum of the eigenface features, and so to
recognize a particular face it is necessary only to compare these weights to
those of known individuals. A particular advantage of our approach is that it provides the ability to learn and later recognize new faces in an unsupervised manner.
INTRODUCTION
The face is our primary focus of attention in
social intercourse, playing a major role in conveying identity and emotion.
Although the ability to infer intelligence or character from facial appearance
is suspect, the human ability to recognize faces is remarkable. We can
recognize thousands of faces learned throughout our lifetime and identify
familiar faces at a glance even after years of separation.
This skill is quite robust,
despite large changes in the visual stimulus due to viewing conditions,
expression, aging, and distractions such as glasses, beards or changes in hair
style. Face recognition has become an important issue in many applications such
as security systems, credit card verification and criminal identification. For
example, the ability to model a particular face and distinguish it from a large
number of stored face models would make it possible to vastly improve criminal
identification. Even the ability to merely detect faces, as opposed to
recognizing them, can be important. Detecting faces in photographs for
automating color film development can be very useful, since the effect of many
enhancement and noise reduction techniques depends on the image content.
Although it is clear that people are good at face recognition, it is not at all obvious how faces are encoded or decoded by the human brain. Human face recognition has been studied for more than twenty years, and the field of automatic face recognition emerged from the need for systems that, guided by observations from human psychophysics, can identify a person much as the human visual system does. Unfortunately, developing a computational model of face recognition is quite difficult, because faces are complex, multi-dimensional visual stimuli. Face recognition is therefore a very high-level computer vision task, in which many early vision techniques can be involved.
The wide variety of applications of real-time face recognition is what makes it so attractive to government and industry. Possible applications include facility access control, user authentication at ATMs, airport security, remote access to networks, and aiding in the recovery of missing children and fugitives. Real-time face recognition helps create a safer, more secure world through the ability to recognize a person by a physical trait, while being less invasive than other biometrics. It is cost-effective, requiring only a video camera and a PC, and it eliminates the need for multiple passwords, PINs, and access cards.
INTRODUCTION TO PRINCIPAL COMPONENT ANALYSIS (PCA)
The objective of Principal Component Analysis (PCA) is to take the total variation in the training set of faces and represent that variation with just a few variables. An observation described by a few variables is easier to manipulate and understand than one described by a large number of variables, and when working with large collections of images this reduction of the space dimension becomes essential. PCA reduces the dimension of a data set, re-expressing it in a new basis that describes the typical model of the group; in our case, the model is a training set of faces. The new basis is built through linear combination: the components in this face space are uncorrelated and maximize the variance accounted for in the original variables. Principal Component Analysis was first developed in statistics and was later reformulated in the neural network paradigm, so there are two ways to explain its principles; the two points of view are complementary and together give a good understanding of PCA.
The image space is highly redundant when it describes faces, because each pixel in a face is highly correlated with the other pixels. The objective of PCA is to reduce the dimension of the working space. The maximum number of principal components is the number of variables in the original space, so to reduce the dimension some principal components must be omitted. Components can be discarded because they carry only a small share of the data's variance; the larger part of the information is contained in the remaining principal components.
Face space dimension is smaller
than image space dimension. In this approach, the principal components
(eigenvectors) are in descending order in relation to its eigenvalues and the
last ones are discarded, that represents a great reduction of the PCA
dimension. This fact is very important because the decrease of the eigenvalues
is exponential in the face space, where few larger eigenvalues contain the main
information.
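As an illustration of this ordering-and-truncation step, the following minimal Python sketch computes the covariance of a set of vectorized images, sorts the eigenvectors by descending eigenvalue and keeps only the leading components. The image size, sample count and the cutoff k are assumptions for illustration, not values from this project.

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 32 * 32))    # 100 hypothetical images, 32x32 pixels each

mean = X.mean(axis=0)
A = X - mean                               # center the data on the mean image

cov = A.T @ A / (len(A) - 1)               # covariance in image space
eigvals, eigvecs = np.linalg.eigh(cov)     # eigh: the covariance matrix is symmetric

order = np.argsort(eigvals)[::-1]          # descending order of eigenvalues
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 20                                     # discard the weak trailing components
components = eigvecs[:, :k]                # columns span the reduced face space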
The proposed model is based on PCA, decomposing the face images into a small group of characteristics, the eigenfaces, using concepts from linear algebra. The eigenfaces are the principal components of the original face images, obtained by the PCA decomposition, and together they form the face space. Face recognition is performed by projecting the analysed face into the face space and measuring the Euclidean distance between the new face and each face class. If the smallest distance falls inside the threshold of a certain class, the face is recognized as belonging to that class. The face space is described by the group of eigenfaces, and each face is represented by its projection onto the space spanned by them.
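A minimal sketch of this decision rule follows. The names (components, class_weights, threshold) are illustrative assumptions; class_weights is taken to hold the mean face-space projection of each known person.

import numpy as np

def recognize(probe, mean, components, class_weights, threshold):
    """Return the name of the nearest class, or None if no class is close enough."""
    w = components.T @ (probe - mean)      # project the new face into face space
    dists = {name: np.linalg.norm(w - cw)  # Euclidean distance to each face class
             for name, cw in class_weights.items()}
    best = min(dists, key=dists.get)       # the class at the smallest distance
    return best if dists[best] < threshold else None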
This approach extracts the significant information from the face images, encodes it as efficiently as possible, and compares the resulting coefficients with a database of known faces. The encoding captures the variations across the whole group of faces used for training, independently of individual face characteristics, and is later used when comparing analysed faces.
Because reconstruction can be based on just a few eigenvectors with the largest eigenvalues, repeated experiments with several eigenfaces, comparing the eigenvalues against the representativeness of the chosen eigenfaces, led to a rule that allows an analysed face to be recognized.
FACE RECOGNITION METHODS
This sub-topic gives a brief history of face recognition and discusses the existing techniques. Three methods that most of us are familiar with are:
- Artificial Neural Network
- Active Appearance Model
- Eigenface Approach
ARTIFICIAL NEURAL NETWORK
A neural network or some other classifier is trained using supervised learning with 'face' and 'non-face' examples, thereby enabling it to classify a particular pixel region in an image as a 'face' or 'non-face'. Unfortunately, while it is relatively easy to find face examples, it is hard to assemble a representative sample of images that represent non-faces. Face detection systems using example-based learning therefore need literally thousands of 'face' and 'non-face' example images for effective training. In this study we used a deformable template to detect the image invariants of a human face; this technique did not need the extensive training of a neural-network-based approach, yet yielded a perfect detection rate for frontal-view face images with a reasonably plain background.
- Load the pictures into the network.
- Note that what enters the network are the picture's coefficients, calculated using the spanning base, not the picture itself.
- Train the network on these pictures for a number of iterations.
- Load a picture to identify (and calculate its coefficients).
- Identify the picture by using the network.
This method first extracts the features from the input image. The features are then fed into the neural network, which outputs a value between 0 and 1. The network is a back-propagation feedforward network with three layers: input, hidden and output. For every class, we construct a network during training. Whenever an image is given for classification, it is fed into each of the networks, and the network which gives the maximum output names the matched class, as in the sketch below.
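The following hedged sketch illustrates the one-network-per-class scheme, using scikit-learn's MLPClassifier as a stand-in for the back-propagation network described above; the hidden-layer size and iteration count are assumptions, since the text does not specify them.

import numpy as np
from sklearn.neural_network import MLPClassifier

def train_per_class_networks(features, labels, classes):
    """Train one binary network per class on the extracted feature vectors."""
    nets = {}
    for c in classes:
        targets = (labels == c).astype(int)          # 1 for this person, 0 otherwise
        net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000)
        net.fit(features, targets)
        nets[c] = net
    return nets

def classify(nets, x):
    """Feed x into every network; the strongest response names the matched class."""
    scores = {c: net.predict_proba(x.reshape(1, -1))[0, 1] for c, net in nets.items()}
    return max(scores, key=scores.get)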
ACTIVE APPEARANCE MODEL
In particular,
statistical models of appearance, which can synthesize both face shape and
texture, are becoming popular as they can encode a face in a relatively compact
parameter vector and fast algorithms have been developed for matching such
models to new images.
A statistical
appearance model contains models of the shape and grey-level appearance of the
object of interest which can ’explain’ almost any valid example in terms of a
compact set of model parameters. The appearance model is built based on a set
of labelled images, where key landmark points are marked on each example
object. The marked examples are aligned to a common co-ordinate frame, and each can be represented by a vector x. Applying principal component analysis (PCA) to the data, the shape model can be written as

x = x̄ + P_s b_s        (1)

where x̄ is the mean shape, P_s is a set of orthogonal modes of variation, and b_s is a set of shape parameters.
After warping the texture within the region of interest to the mean shape, the texture vector g (a raster scan of the grey-levels) can be modelled similarly as

g = ḡ + P_g b_g        (2)

where ḡ is the mean normalised grey-level vector, P_g is a set of orthogonal modes of variation, and b_g is a set of grey-level parameters.
A further PCA can be applied to the shape and texture parameters to obtain a combined appearance model

x = x̄ + Q_s c
g = ḡ + Q_g c        (3)

where c is a vector of parameters controlling both shape and texture together, and Q_s, Q_g are matrices describing the modes of variation derived from the training set. A facial appearance model has been built from a database containing 1267 images of 103 different people. The trained model covers 99.5% of the variation of the faces in the training data set, which includes both head pose and expression change, and has 349 modes controlling facial appearance.
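As a rough illustration of equation (3), the sketch below concatenates the shape and grey-level parameters of each training example and applies a further PCA to obtain the combined modes. All array shapes and the shape-versus-texture weight are assumptions, not values from the model described above.

import numpy as np

def combined_appearance_model(Bs, Bg, w=1.0):
    """Bs: (n_examples, n_shape_params), Bg: (n_examples, n_texture_params).
    w weights shape parameters against grey-level parameters."""
    B = np.hstack([w * Bs, Bg])            # one joint parameter vector per example
    B = B - B.mean(axis=0)
    cov = np.cov(B, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]
    Q = vecs[:, order]                     # columns: combined modes of variation
    c = B @ Q                              # parameters c for each training example
    return Q, c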
EIGENFACE PRINCIPLE
Eigenfaces is a well-known Principal Component Analysis (PCA) based face recognition algorithm developed by researchers at MIT. Though the mathematical underpinnings of Eigenfaces are complex, the algorithm itself is simple and its structure is quite amenable to implementation. Training images are represented as a set of flattened vectors and assembled together into a single matrix. The eigenvectors of this set are then extracted and stored in a database. The training face images are projected onto a feature space, called face space, defined by the eigenvectors; this captures the variation between the set of faces without emphasis on any one facial region such as the eyes or nose. The projected face-space representation of each training image is also saved to the database. To identify a face, the test image is projected into face space using the saved eigenvectors.
Consider the set of all possible images: those representing a face make up only a small fraction of it. We represent images as very long vectors, instead of the usual matrix representation; this makes up the image space, in which each image is a point. Since faces have similar structure (eye, nose and mouth positions, etc.), the vectors representing them will be correlated, and faces will group at a certain location in the image space. We might say that faces lie in a small and "separate" region, away from other images. The idea behind eigenimages (in our case eigenfaces) is to find a lower-dimensional space in which shorter vectors describe face images well, and only those. The following figure illustrates this idea graphically.
Fig: Image Space and Face Space
An eigenvector of a matrix is a vector such that, if multiplied by the matrix, the result is always a scalar multiple of that vector. This scalar is the corresponding eigenvalue of the eigenvector. The relationship can be described by the equation M × u = λ × u, where u is an eigenvector of the matrix M and λ is the corresponding eigenvalue.
Eigenvectors possess the following properties:
- They can be determined only for square matrices.
- An n × n matrix has at most n linearly independent eigenvectors (and corresponding eigenvalues).
- The eigenvectors of a symmetric matrix, such as the covariance matrices used here, are perpendicular, i.e. at right angles to each other.
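A quick numerical check of these statements, on a tiny symmetric matrix chosen only for illustration:

import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])                          # a symmetric 2 x 2 matrix
eigvals, eigvecs = np.linalg.eigh(M)

u, lam = eigvecs[:, 0], eigvals[0]
print(np.allclose(M @ u, lam * u))                  # True: M u = lambda u
print(np.isclose(eigvecs[:, 0] @ eigvecs[:, 1], 0)) # True: eigenvectors orthogonal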
The projected test image is then
compared against each saved projected training image for similarity. The
identity of the person in the test image is assumed to be the same as the
person depicted in the most similar training image.
Different studies have shown that there are 22 critical features in a human face which constitute a feature set that can be used for identification. However, not all 22 features can be obtained if we also consider slight changes in orientation and expression. The following features constitute the basic feature set:
- Inter-ocular distance
- Distance between the lips and the nose
- Distance between the nose tip and the eyes
- Distance between the lips and the line joining the two eyes
- Eccentricity of the face
- Ratio of the dimensions of the bounding box of the face
- Width of the lips
How does it work?
The task of facial
recognition is discriminating input signals (image data) into several classes
(persons). The input signals are highly noisy (e.g. the noise is caused by
differing lighting conditions, pose etc.), yet the input images are not
completely random and in spite of their differences there are patterns which
occur in any input signal. Such patterns, which can be observed in all signals,
could be - in the domain of facial recognition - the presence of some objects
(eyes, nose, mouth) in any face as well as relative distances between these
objects. These characteristic features are called Eigenfaces in the facial
recognition domain (or principal components generally). They can be extracted
out of original image data by means of a mathematical tool called Principal
Component Analysis (PCA).
By means of PCA one can
transform each original image of the training set into a corresponding
Eigenface. An important feature of PCA is that one can reconstruct any original
image from the training set by combining the Eigenfaces. Remember that
Eigenfaces are nothing less than characteristic features of the faces.
Therefore one could say that the original face image can be reconstructed from
Eigenfaces if one adds up all the Eigenfaces (features) in the right
proportion.
Each Eigenface represents only certain features of the face, which may or may not be present in the original image. If the feature is present in the original image to a higher degree, the share of the corresponding Eigenface in the sum of the Eigenfaces should be greater. If, on the contrary, the particular feature is not (or almost not) present in the original image, then the corresponding Eigenface should contribute a smaller part (or none at all) to the sum. So, in order to reconstruct the original image from the Eigenfaces, one builds a weighted sum of all Eigenfaces: the reconstructed image is the sum of all Eigenfaces, with each Eigenface carrying a weight that specifies to what degree the corresponding feature is present in the original image.
If one uses all the Eigenfaces extracted from the original images, one can reconstruct the original images exactly. One can also use only a subset of the Eigenfaces, in which case the reconstructed image is an approximation of the original; the loss due to omitting Eigenfaces is minimized by keeping only the most important ones. Omitting Eigenfaces is necessary because of limited computational resources, as the sketch below illustrates.
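A minimal sketch of this weighted-sum reconstruction; it assumes mean and components (eigenfaces as columns) from a PCA step like the one sketched earlier. Keeping more eigenfaces gives a closer approximation.

import numpy as np

def reconstruct(img, mean, components, k):
    """Approximate img as the mean plus a weighted sum of the first k eigenfaces."""
    U = components[:, :k]          # the k most important eigenfaces
    w = U.T @ (img - mean)         # weight: degree to which each feature is present
    return mean + U @ w            # weighted sum of eigenfaces

# The approximation error ||img - reconstruct(img, mean, components, k)||
# shrinks as k grows, and vanishes when all eigenfaces are kept.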
How does this relate to facial
recognition?
The clue is that it is possible not only to reconstruct the face from the Eigenfaces given a set of weights, but also to go the opposite way: to extract the weights from the Eigenfaces and the face to be recognized. These weights tell nothing less than the amount by which the face in question differs from the "typical" face represented by the Eigenfaces. Therefore, using these weights one can determine two important things:
- Whether the image in question is a face at all. If the weights of the image differ too much from the weights of face images (i.e. images we know for sure to be faces), the image probably is not a face.
- Similar faces (images) possess similar features (Eigenfaces) to similar degrees (weights). If one extracts the weights from all the images available, the images can be grouped into clusters: all images having similar weights are likely to be similar faces.
The procedure that defines the Eigenfaces is (a sketch follows the list):
- Collect a set (call this number N) of face images and crop them so that the eyes and chin are included, but not much else.
- Convert each image (which is x by y pixels) into a vector of length xy.
- Subtract the average image from each vector, and pack the centered vectors as the columns of a large matrix A.
- Rather than diagonalizing the huge xy-by-xy covariance matrix directly, compute the eigenvectors of the much smaller N-by-N matrix AᵀA; if v is an eigenvector of AᵀA, then Av is an eigenvector of the covariance matrix.
- Sort the resulting vectors according to their corresponding eigenvalues. These vectors are your Eigenfaces. Keep the M Eigenfaces with the largest associated eigenvalues.
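A sketch of this procedure in Python; image loading and the value of M are assumptions, and the AᵀA trick stands in for the impractical xy-by-xy diagonalization, as noted above.

import numpy as np

def train_eigenfaces(images, M):
    """images: (N, xy) array of flattened, cropped faces; returns M eigenfaces."""
    mean = images.mean(axis=0)
    A = (images - mean).T                     # xy x N matrix of centered images
    small = A.T @ A                           # N x N: cheap to diagonalize
    vals, vecs = np.linalg.eigh(small)
    order = np.argsort(vals)[::-1][:M]        # M largest eigenvalues first
    eigenfaces = A @ vecs[:, order]           # map back: u = A v is an eigenface
    eigenfaces /= np.linalg.norm(eigenfaces, axis=0)   # unit-length columns
    return mean, eigenfaces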
Face Detection
With two-dimensional images it is first necessary to unroll each image into a vector; it is enough to concatenate each row with the preceding row, and afterwards to create a matrix whose columns contain the image vectors. The analysis of principal components can then be carried out. The P eigenvectors are given by the solution to the following equation:
Var(X) P = P L

and the projection of the images on P will be

Y = P' X

where X is the matrix that contains the images in its columns, Var(X) is the dispersion (covariance) matrix of the initial data matrix X, L is the diagonal matrix of eigenvalues of Var(X), and P is the matrix of eigenvectors associated with L (P' denotes the transpose of P).
The analysis of principal components is a holistic method of feature extraction: the characteristics are determined by how consistently they appear across the images. So, if all the images have an oval structure forming a face, this will appear in the first eigenvector, and so on. If the initial group of images consists of faces, the first eigenfaces will represent a general average, like a low-pass filter capturing the coarse structure of a face. If an image containing a face is projected onto that eigenface, the result will be close to one; if the image does not contain a face, the result will differ enough that a threshold can be established which determines whether the image contains a face or not.
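A hedged sketch of this detection rule: a face image is reconstructed well from the eigenfaces, so its distance from face space is small. The threshold must be calibrated on known face and non-face examples; it is left as a parameter here.

import numpy as np

def is_face(img, mean, eigenfaces, threshold):
    """Return True if img lies close enough to the face space."""
    w = eigenfaces.T @ (img - mean)            # project onto the face space
    approx = mean + eigenfaces @ w             # reconstruct from the eigenfaces
    distance = np.linalg.norm(img - approx)    # distance from face space
    return distance < threshold                # faces reconstruct with small error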
Comparing all the existing implementations above, the performance of Eigenfaces is better than that of the other two methods. The following comparison supports this statement.
We have tested with 10
classes (5 males, 5 females). We trained each of the networks with some of
these images (negative cases, for which the network should output 0, were also
included). We worked on Pentium-II, 500MHz machines, and it took around 50
hours for the training of all the 14 networks (some of them however were
trained in 2 hours). We tested with sets of 12 images. The following results
were obtained.
            NEURAL NETWORK            EIGENFACE
            Best match   2nd match    Best match   2nd match
Set A       5            3            8            3
Set B       7            2            9            2
Set C       7            3            8            4
Similar results were obtained for a larger number of classes (14).
Fig: Recognition rate vs. number of training images per person

In this graph, the number of face images (per person) taken for training is shown on the x-axis, and the number of successful matches on the y-axis. The graph shows that the recognition rate increases with the size of the training set; the slope of the curve decreases for large values.
From the above results, we can see that the performance of Eigenfaces is better than the feature-based approach. This has been observed across different implementations by different people. In fact, the Eigenfaces approach is used in commercially available systems for automated face classification.
The scope of this project determines the applications and the user environment of the proposed system. The subject of face recognition is very vast and still not fully explored.
The proposed software would be a simple, easy-to-use product. Face recognition systems like these are a very important means of safeguarding a person's identity, with wide-ranging applications in security and defense systems. Where pieces of information such as passwords and credit card numbers cannot be fully secured, face recognition helps keep a person's identity out of harm's way.
This project will be a valuable resource for those looking to develop a secure infrastructure for confidential information. The software could be part of a larger system with which the user interacts in a transparent manner; the user will be provided with menu options to supply the input.
The efficiency of the product
will directly depend on the procedure followed to implement the system. Hence,
special care will be taken to ensure that the techniques and the algorithms
used are efficient.