Facial Recognition with Celebrities

Benjamin Cho
Oct 16, 2018

There is a vast amount of visual data in the world, and it is important that we are able to utilize and interpret it. This project is a first step into computer vision using deep learning techniques: how accurately can we predict the correct name of the celebrity in a given set of images?

The Data

In order to build an image classification model on faces, collecting and preprocessing the data is a crucial step of the process. The dataset used for my model was collected from: https://github.com/prateekmehta59/Celebrity-Face-Recognition-Dataset. Since there were many people's images to choose from, I decided to start small by classifying five of my favorite celebrities, a friend, and myself.

Examples of messy data images

First of all, the dataset needs as many frontal face images as possible of the people I am trying to predict. After gathering the data, the first task is to make sure that there is only one face in each image; this way we can be sure that OpenCV's face detector captures the face we want. In other words, the images have to be clean, containing the correct celebrity's face and no other noise (i.e., any other face).

Visualizations

Once the data was clean, there are multiple ways to inspect images in Python, such as isolating individual color channels or highlighting the linear features (edges) in the images.

Only green color channel on images
Linear features in each image

Preprocessing

The next step was to install the OpenCV library, because it ships with a prebuilt face detector that helps crop each face. In the future I would like to use the Dlib library to capture the facial landmarks myself; however, for the sake of time I used OpenCV's Haar face cascades. The face detector required some fine-tuning to find the optimal settings for cropping as many faces as possible. At this point you also have to resize the cropped images to a fixed width and height of your choice so that all the data dimensions are the same, unless they already are. This is important when you feed the data into your model. Below is a copy of my function that crops the faces.

```python
import glob
import cv2

# grab each image, detect the face, crop the face, save the face image
def crop_faces(path, scale):
    # grab all image paths in the folder
    img_list = glob.glob(path + '/*.jpg')

    # prebuilt face cascade from OpenCV
    haar_face_cascade = cv2.CascadeClassifier('./haarcascade_frontalface_alt.xml')

    for img_name in img_list:
        img = cv2.imread(img_name)
        faces = haar_face_cascade.detectMultiScale(img, scaleFactor=scale, minNeighbors=5)

        # crop and resize each detected face to 175x175
        for (x, y, w, h) in faces:
            face_cropped = img[y:y+h, x:x+w]
            face_resized_img = cv2.resize(face_cropped, (175, 175), interpolation=cv2.INTER_AREA)

            # save the cropped face, overwriting the original image
            cv2.imwrite(img_name, face_resized_img)
```

Once the pictures are cropped, resized, and saved, the dataset is complete and ready to be separated into your X (the image arrays) and y (the image labels).

I labeled my image folders this way so that I could extract both the class name and the class label from the folder name for predictions. This is how I set up my data folders:

images
---Anne_Hathaway1
------1.jpg
------2.jpg
------... .jpg
---Dave_Chappelle2
------1.jpg
------2.jpg
------... .jpg
.
.
.

Model

This was my first time working with image classification, especially with faces, so I wasn’t sure how complex my model needed to be. However, this model turned out to work the best with the small dataset that I had.

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

def build_model():
    model = Sequential()

    model.add(Conv2D(32, kernel_size=(5,5), activation='relu', input_shape=(175, 175, 3), padding='same'))
    model.add(Conv2D(32, kernel_size=(3,3), activation='relu', padding='same'))
    model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2)))
    model.add(Dropout(0.25))

    model.add(Conv2D(64, kernel_size=(5,5), activation='relu', padding='same'))
    model.add(Conv2D(64, kernel_size=(3,3), activation='relu', padding='same'))
    model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2)))
    model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(7, activation='softmax'))  # seven classes

    return model
```

There are additional parameters in Keras that you can play around with to get better results, such as adjusting the Adam optimizer's learning rate. It also definitely helped to use `ImageDataGenerator` to augment, or "oversample," the images I already had, which is basically another way to increase my dataset. Finally, I had to find the right number of epochs for early stopping and the optimal batch size to use. All of this got me to an accuracy of 98%.
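A sketch of that setup might look like the following. The specific values here (learning rate, augmentation ranges, patience) are illustrative placeholders, not necessarily the exact settings I used:

```python
import numpy as np
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping

# a tuned learning rate for the Adam optimizer (value is illustrative)
optimizer = Adam(learning_rate=1e-4)

# augmentation: random shifts, rotations, and flips to "oversample"
datagen = ImageDataGenerator(rotation_range=15,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             horizontal_flip=True)

# stop training once validation loss stops improving
early_stop = EarlyStopping(monitor='val_loss', patience=5,
                           restore_best_weights=True)

# demo: flow a tiny random batch through the generator
X_demo = np.random.rand(4, 175, 175, 3)
y_demo = np.eye(7)[np.random.randint(0, 7, size=4)]  # one-hot labels
batch_x, batch_y = next(datagen.flow(X_demo, y_demo, batch_size=4))
```

These pieces would be wired together with `model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])` and then passed to the fit call along with `callbacks=[early_stop]`.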

Loss and Accuracy Curve for my best model

Final Thoughts

I was able to test out my model by implementing real-time predictions through the webcam with OpenCV's video capture tool. The model works great until you start adding more people, at which point it starts to overfit. This shows how important a substantial amount of data is to an image classification model.

The more classes to predict, the more images you will need for each class!

Although my dataset only includes images of seven people, the model should run successfully for other people as long as there is enough data to train on. In this case, I used the faces of celebrities because the data is more attainable and easier to demonstrate. In the real world, if the data is accessible, this technology could provide efficiency and convenience for everyone. One example is the recent iPhone models that recognize your face to unlock the phone. But ideally, we should be able to replace many of our old methods with facial recognition: using our face to buy train tickets, unlock the door to our building, or enhance security systems for law enforcement. The opportunities for implementation are numerous and worth exploring.
