Facial Keypoint Detection Using PyTorch

First off, happy new year! I can't believe I spent the entirety of 2020 sitting at home. But I have to admit I was a bit productive in terms of gearing up my technical skills. Knowing me, though, I tend to forget a whole bunch of them in a snap. That is why I am blogging them here, so I can go back to these notes any time I want. I hope you find this useful as well.

As I mentioned in my previous blog, I have been taking part in this really cool nanodegree program on Computer Vision by Udacity. This post shall be dedicated to my first project: Facial Keypoint Detection. This might be a long post.

Let's go!

This project will be all about defining and training a convolutional neural network to perform facial keypoint detection, and using computer vision techniques to transform images of faces. 

Facial keypoints (also called facial landmarks) are the small magenta dots shown on each of the faces in the image above. In each training and test image, there is a single face and 68 keypoints, with coordinates (x, y), for that face. These keypoints mark important areas of the face: the eyes, corners of the mouth, the nose, etc. These keypoints are relevant for a variety of tasks, such as face filters, emotion recognition, pose recognition, and so on. Here they are, numbered, and you can see that specific ranges of points match different portions of the face.
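
To make the numbering concrete, here's how the index ranges map to parts of the face. This assumes the standard 68-point ordering (the iBUG convention), which this dataset appears to follow, so treat it as a reference rather than gospel.

```python
# Approximate index ranges for the 68 keypoints, zero-indexed and
# end-exclusive. Assumes the standard iBUG-style ordering.
KEYPOINT_GROUPS = {
    'jaw':           range(0, 17),
    'right_eyebrow': range(17, 22),
    'left_eyebrow':  range(22, 27),
    'nose':          range(27, 36),
    'right_eye':     range(36, 42),
    'left_eye':      range(42, 48),
    'mouth':         range(48, 68),
}
```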

Step 1: Load and Visualize the Data

This set of image data has been extracted from the YouTube Faces Dataset, which consists of videos of people sourced from YouTube. These videos have been fed through some processing steps and turned into sets of image frames, each containing one face and its associated keypoints.

Training and Testing Data

This facial keypoints dataset consists of 5770 color images. All of these images are separated into either a training or a test set of data.

  • 3462 of these images are training images
  • 2308 are test images

In a nutshell, I looked into the dataset, made sure the samples are stored in a nice, accessible format (a dictionary with an image and its keypoints), and ensured all images have the same size by rescaling, cropping, normalizing, and converting the arrays into tensors.
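
Here's a minimal sketch of that pipeline, following the Udacity project's dict-based sample format. The CSV layout, file paths, and the keypoint normalization constants (mean 100, scale 50) are assumptions for illustration, so adjust to taste.

```python
import os
import cv2
import pandas as pd
import torch
from torch.utils.data import Dataset

class FacialKeypointsDataset(Dataset):
    """Each sample is a dict: {'image': HxWx3 array, 'keypoints': 68x2 array}."""

    def __init__(self, csv_file, root_dir, transform=None):
        # CSV: one row per image -- filename followed by 136 coordinates
        self.key_pts_frame = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.key_pts_frame)

    def __getitem__(self, idx):
        name = os.path.join(self.root_dir, self.key_pts_frame.iloc[idx, 0])
        image = cv2.cvtColor(cv2.imread(name), cv2.COLOR_BGR2RGB)
        key_pts = self.key_pts_frame.iloc[idx, 1:].values.astype('float').reshape(-1, 2)
        sample = {'image': image, 'keypoints': key_pts}
        return self.transform(sample) if self.transform else sample

def preprocess(sample, out_size=224):
    """Rescale, grayscale, normalize, and convert to tensors, all in one go."""
    image, key_pts = sample['image'], sample['keypoints']
    h, w = image.shape[:2]
    image = cv2.resize(image, (out_size, out_size))
    key_pts = key_pts * [out_size / w, out_size / h]        # move keypoints with the pixels
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY) / 255.0  # pixel values into [0, 1]
    key_pts = (key_pts - 100) / 50.0                        # roughly zero-center the targets
    return {'image': torch.from_numpy(gray).unsqueeze(0).float(),
            'keypoints': torch.from_numpy(key_pts).float()}
```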



Step 2: Define the Convolutional Neural Network

In a nutshell, here's what I did:
  1. Define a CNN with images as input and keypoints as output
  2. Construct the transformed FaceKeypointsDataset, just as before
  3. Train the CNN on the training data, tracking loss
  4. See how the trained model performs on test data
  5. If necessary, modify the CNN structure and model hyperparameters, so that it performs well

A CNN's architecture can be defined by the following types of layers (a minimal sketch follows the list):
  • Convolutional layers
  • Maxpooling layers
  • Fully-connected layers
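
Here's a minimal sketch of such a network in PyTorch, matching what I describe further down (four convolutional layers, a dropout of 0.4, one dense layer). The channel counts and kernel sizes are illustrative choices, not a prescription.

```python
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    """Four conv/pool blocks, dropout, and one dense layer mapping to 136 outputs."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, 5)         # input: 1 x 224 x 224 grayscale
        self.conv2 = nn.Conv2d(32, 64, 3)
        self.conv3 = nn.Conv2d(64, 128, 3)
        self.conv4 = nn.Conv2d(128, 256, 3)
        self.pool = nn.MaxPool2d(2, 2)           # halves height and width
        self.drop = nn.Dropout(p=0.4)
        self.fc = nn.Linear(256 * 12 * 12, 136)  # 68 keypoints x (x, y)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))     # 224 -> 220 -> 110
        x = self.pool(F.relu(self.conv2(x)))     # 110 -> 108 -> 54
        x = self.pool(F.relu(self.conv3(x)))     # 54 -> 52 -> 26
        x = self.pool(F.relu(self.conv4(x)))     # 26 -> 24 -> 12
        x = self.drop(x.flatten(start_dim=1))
        return self.fc(x)
```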

Step 3: Face and Facial Keypoint Detection

Test before:

Test after:



Can't say it's good. Can't say it's bad either. Well, OK, it's not THAT good. I only used four convolutional layers, a dropout layer with p = 0.4, and one dense layer. I also used MSE as my loss function and Adam as my optimizer. There is still room for improvement. There always is. So there's that!
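
A condensed version of that training loop, assuming the FacialKeypointsDataset, preprocess, and Net sketched above; the CSV path, batch size, learning rate, and epoch count are illustrative.

```python
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader

# Wrap the dataset from the earlier sketch (paths are placeholders)
dataset = FacialKeypointsDataset('data/training_frames_keypoints.csv',
                                 'data/training/', transform=preprocess)
train_loader = DataLoader(dataset, batch_size=32, shuffle=True)

net = Net()
criterion = nn.MSELoss()                           # mean squared error on coordinates
optimizer = optim.Adam(net.parameters(), lr=0.001)

n_epochs = 5
for epoch in range(n_epochs):
    running_loss = 0.0
    for batch in train_loader:
        images = batch['image']
        targets = batch['keypoints'].flatten(start_dim=1)  # (N, 68, 2) -> (N, 136)
        optimizer.zero_grad()
        loss = criterion(net(images), targets)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"epoch {epoch + 1}: avg loss {running_loss / len(train_loader):.4f}")
```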

It's not so bad, given that I was also able to detect the Obamas. First, face detection using the not-so-powerful Haar cascade.
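
The detection itself boils down to a few lines of OpenCV. The image path and cascade path below are placeholders; the haarcascade_frontalface_default.xml file comes bundled with OpenCV (and with the project's starter files, if memory serves).

```python
import cv2

# Load the image and the pre-trained Haar cascade (paths are placeholders)
image = cv2.imread('images/obamas.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Returns one (x, y, w, h) bounding box per detected face
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=2)

for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 3)
```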


Followed by the actual detection of facial keypoints.


One thing I found annoying here is that I needed to resize the pictures I was using, since they came in sizes different from the images the network was trained on.
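
For completeness, here's roughly how each detected face goes through the network: crop with some padding around the Haar box, resize to the training input size, normalize, predict, then undo the keypoint normalization. The padding value and the constants mirror the earlier sketches and are assumptions, not the one true recipe.

```python
import cv2
import torch

pad = 50  # extra context around the Haar box so the whole face fits
net.eval()
with torch.no_grad():
    for (x, y, w, h) in faces:  # boxes from the Haar cascade step above
        roi = gray[max(y - pad, 0):y + h + pad, max(x - pad, 0):x + w + pad]
        roi = cv2.resize(roi, (224, 224)) / 255.0    # match training size and scale
        tensor = torch.from_numpy(roi).float().view(1, 1, 224, 224)
        key_pts = net(tensor).view(68, 2).numpy()
        key_pts = key_pts * 50.0 + 100               # undo the (kp - 100) / 50 normalization
```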

I uploaded this project to my GitHub page: https://github.com/avpresbitero/Project-Facial-Key-Features.




