FACE ALIGNMENT AND SEGMENTATION SYSTEM
ABSTRACT
Detection, extraction, segmentation, and recognition of faces are vital parts of intelligent applications in our daily lives. Alignment and segmentation in particular play a significant role in face recognition systems. The landmarks derived by alignment methods from the shape of a face are used in many applications, including real face classification and virtual face animation. In this paper, sklearn, Keras, skimage, and TensorFlow are used to train a model on the provided datasets, align the image data to identify facial landmarks, and segment the data to obtain face vector points for the segmented image.
INTRODUCTION
Image identification is fundamental in application areas such as biometric systems and computer vision. The objective of this paper is to implement a face segmentation system that can partition a human face into sections and identify and align the corresponding landmarks of that image. This paper also focuses on implementing simple graphical effects that use the information extracted from the human face. The focus of this paper is face segmentation rather than face detection, which only involves determining whether objects are faces or not.
This paper is significant due to many ongoing and emerging applications in the field of computer vision. Such applications include facial recognition systems, which can identify and recognize human faces uniquely based on pixel properties. Facial recognition systems have several applications, such as login credentials in Android and computer systems and video surveillance using the concept of face tracking, among others.
Extensive and rapidly accelerating research has been done on face segmentation, but the problem is still far from fully and convincingly solved. It remains difficult because of the complex content of images and the variety of their applications.
This paper is organized into distinct parts: preprocessing, face alignment, face segmentation, and graphical-effect implementation. The paper closes with recommendations and a conclusion.
PREPROCESSING
Machine learning algorithms work with vectors alone, so the training data must first be converted into vectors. In this paper, the dataset provided is in matrix form. We take advantage of the Keras ImageDataGenerator class (used here through TensorFlow), from which an iterator is created to preprocess the data.
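As a minimal illustration of this conversion (using a hypothetical NumPy array; the actual conversion for our dataset appears in the code implementation below), an image in matrix form can be flattened into a feature vector with a single reshape:
import numpy as np

# Hypothetical batch of 4 grayscale images, each 96x96 (matrix form)
images = np.random.rand(4, 96, 96)

# Flatten each image matrix into one feature vector per sample
vectors = images.reshape(images.shape[0], -1)
print(vectors.shape)  # (4, 9216): one 9216-dimensional vector per image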
FACE ALIGNMENT
Face alignment is the process of locating the image components, or landmarks, that describe the shape and pose of a face. It is sometimes referred to as feature finding or face landmark detection. Fig 1 and Fig 2 below show the landmarks and pose of a human face and an animal face, respectively.
Fig 1: landmarks and pose of a human face
Fig 2: landmarks and pose of an animal face
FACE ALIGNMENT METHODS
A training model is created to achieve face alignment, using the image data provided for training. The model learns how to align faces from already-aligned facial images. The model then predicts facial landmarks by itself and outputs their coordinates.
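As a minimal sketch of such a model (assuming 96x96 grayscale inputs to match the resizing step below and, purely for illustration, 42 outputs corresponding to 21 (x, y) landmark pairs; the layer sizes are illustrative rather than the exact architecture used here), a convolutional landmark regressor could be built from the Keras layers imported in the code implementation:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(96, 96, 1)),  # low-level features
    MaxPool2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),  # higher-level features
    MaxPool2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(42)  # one output per landmark coordinate (illustrative count)
])
model.compile(optimizer='rmsprop', loss='mse')  # regress coordinates with mean squared error
# model.fit(training_imagesp, Y, epochs=10, batch_size=32)  # variables defined below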
Discussion
Face alignment was achieved using TensorFlow, sklearn, skimage, and Keras.
The training model was able to learn from the aligned images and then to make predictions on its own.
Because the method learns by progressively reducing the training error, its accuracy compares favourably with the other methods considered, and it was able to extract different pixel features of a face.
The predicted facial landmark points were saved in a file named results.csv.
FACE SEGMENTATION
Segmentation is the process of labelling each pixel in the face image as belonging to a particular object. In this paper, the coordinates of each pixel are estimated using TensorFlow and sklearn. The provided images are used to train a model to learn how to predict the coordinates of the various landmarks. Face segmentation is discussed further in the code implementation below.
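As a minimal sketch of per-pixel labelling (using the KMeans import from the code below; the number of clusters is an assumption for illustration, not a value fixed by this paper):
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical grayscale face image
img = np.random.rand(96, 96)

# Treat each pixel as a one-feature sample and cluster it into a region
pixels = img.reshape(-1, 1)
kmeans = KMeans(n_clusters=3, random_state=0).fit(pixels)

# Each pixel now carries a segment label (e.g. skin, hair, background)
labels = kmeans.labels_.reshape(img.shape)
print(labels.shape)  # (96, 96) map of per-pixel labels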
GRAPHICAL EFFECTS
The face landmarks obtained from face alignment and face segmentation can be used for graphical effects in applications such as modelling human anatomy in medicine, detecting expressions such as crying or sad faces, and facial recognition, among others.
These effects are illustrated below.
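For instance (a sketch assuming the predicted landmarks are available as an (N, 2) integer array; the image, points, and drawing parameters here are hypothetical), the landmarks can be overlaid on a face with OpenCV to produce a simple effect:
import numpy as np
import cv2

# Hypothetical 8-bit face image and predicted landmarks of shape (N, 2)
face = np.zeros((96, 96, 3), dtype=np.uint8)
landmarks = np.array([[30, 40], [66, 40], [48, 60], [48, 75]], dtype=np.int32)

# Mark each predicted landmark with a small filled circle
for (x, y) in landmarks:
    cv2.circle(face, (int(x), int(y)), 2, (0, 255, 0), -1)

# Connect the points to outline the region as a simple graphical effect
cv2.polylines(face, [landmarks.reshape(-1, 1, 2)], isClosed=True, color=(255, 0, 0), thickness=1)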
Code implementation
Links to download the training images from Sussex:
!wget "https://sussex.box.com/shared/static/cbvuazevif0oczabom06zl2x7ukakbqj.npz" -O training_images.npz
# The test images (without points)
!wget "http://users.sussex.ac.uk/~is321/test_images.npz" -O test_images.npz
# The example images are here
!wget "http://users.sussex.ac.uk/~is321/examples.npz" -O examples.npz
Importing all required libraries
import tensorflow as tf  # TensorFlow package
import numpy as np
import matplotlib.pyplot as plt  # for visualization
%matplotlib inline
from sklearn.model_selection import train_test_split
from keras.utils.np_utils import to_categorical  # convert to one-hot encoding
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
from keras.optimizers import RMSprop
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau
from sklearn.preprocessing import MinMaxScaler
from skimage.transform import resize, rescale
import cv2
from sklearn.cluster import KMeans
from PIL import Image
from sklearn.linear_model import LinearRegression
import sklearn.metrics as metrics
Loading the dataset images into the variable training_images, with the corresponding training points kept in training_pts:
training_data = np.load('training_images.npz', allow_pickle=True)
test_data = np.load('test_images.npz', allow_pickle=True)
examples = np.load('examples.npz', allow_pickle=True)
training_images = training_data['images']
training_pts = training_data['points']
test_images = test_data['images']
# and the example images
examples_images = examples['images']
Visualization of the images with their respective landmark points:
print(training_images.shape, training_pts.shape)
print("There are {} images in the dataset".format(len(training_images)))
print("Size of each image is {}x{}".format(training_images.shape[1], training_images.shape[2]))
print("Pixel values scale: {}".format(training_images[0][0, :4]))

def visualise_pts(img, pts):
    plt.imshow(img)
    plt.plot(pts[:, 0], pts[:, 1], '+r')
    plt.show()

for i in range(3):
    idx = np.random.randint(0, training_images.shape[0])
    visualise_pts(training_images[idx, ...], training_pts[idx, ...])
The output of the visualization:
Preprocessing of image data
image_data_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    featurewise_center=True, samplewise_center=True,
    featurewise_std_normalization=True, samplewise_std_normalization=True,
    rotation_range=0, brightness_range=None, zoom_range=0.0, fill_mode='nearest',
    horizontal_flip=True, vertical_flip=True, rescale=1/255,
    validation_split=0.0, dtype=None
)
# NumpyArrayIterator acts as a for loop, applying all the functions above to the dataset
iterator = tf.keras.preprocessing.image.NumpyArrayIterator(
    training_images, training_pts, image_data_generator, batch_size=32, shuffle=True,
    sample_weight=None, seed=None, data_format=None, save_to_dir=None, subset=None, dtype=None
)
The array values vary widely in size: some are over 100 while others are less than 10. To ensure that all values fall between 0 and 1, we divide the image array by 255.
training_images_r=training_images/255 # training images
We then resize the images from 236x236 to 96x96; this is necessary to reduce the algorithm's training time and to prevent the model from learning from noise. The cells are kept separate because the process uses a lot of RAM; if executed together, it would exhaust the RAM allocated by Colab and crash the session.
training_imagesp = resize(training_images_r, (2811, 96, 96, 1))  # resizing training images
test_images_r = resize(test_images, (554, 96, 96, 1))  # resizing test images
test_images_r.shape  # checking the new shape
training_imagesp.shape  # checking the new shape
Reshape the arrays into vectors; this is because the algorithm does not accept 3D or 4D arrays, only a 2D array of samples by features.
X = training_imagesp.reshape((training_imagesp.shape[0], training_imagesp.shape[1]*training_imagesp.shape[2]*training_imagesp.shape[3]))
Y = training_pts.reshape((training_pts.shape[0], training_pts.shape[1]*training_pts.shape[2]))
test = test_images_r.reshape((test_images_r.shape[0], test_images_r.shape[1]*test_images_r.shape[2]*test_images_r.shape[3]))
Splitting the dataset to get a set for training and a set for validating the model:
x_train, x_test, y_train, y_test=train_test_split(X,Y,test_size=0.3 , random_state=0)
print("X_train shape:", x_train.shape)
print("y_train shape:", y_train.shape)
print("X_test shape:", x_test.shape)
print("y_test shape:", y_test.shape)
lr = LinearRegression() # creating the model
lr.fit(x_train, y_train)  # training the model
pred_y = lr.predict(x_test) #making predictions
pred_y
pred_y.shape # checking the shape of the predictions
def euclid_dist(pred_pts, gt_pts):
    """
    Calculate the Euclidean distance between pairs of points.
    :param pred_pts: The predicted points
    :param gt_pts: The ground-truth points
    :return: An array of shape (no_points,) containing the distance of each predicted point from the ground truth
    """
    pred_pts = np.reshape(pred_pts, (-1, 2))
    gt_pts = np.reshape(gt_pts, (-1, 2))
    return np.sqrt(np.sum(np.square(pred_pts - gt_pts), axis=-1))
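For example, the validation predictions can be scored by averaging this distance over all points and images (a usage sketch based on the variables defined above):
# Mean landmark error, in pixels, for each validation image
errors = np.array([euclid_dist(p, g).mean() for p, g in zip(pred_y, y_test)])
print("Mean Euclidean distance per image: {:.2f}".format(errors.mean()))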
The predicted image points are then saved to Google Drive as results.csv and will be submitted in support of the method.
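A minimal sketch of producing that file (the exact layout required of results.csv is an assumption here):
# Predict points for the unseen test images and save them as results.csv
test_pred = lr.predict(test)
np.savetxt('results.csv', test_pred, delimiter=',')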
The colab link for the worksheet:
https://colab.research.google.com/drive/1gLm6TLGUxG3vtStnZD5OsKUJ0sroFcXr?usp=sharing
RECOMMENDATIONS
This paper has focused explicitly on methods for aligning and segmenting human faces in an image. It does not look into classifying or determining the characteristics of the facial landmarks extracted from images. A computer vision system that can extract properties from these landmarks and analyze them further is therefore proposed.
CONCLUSIONS
This paper describes an algorithm for facial alignment and facial segmentation. A machine learning approach was used to create a training model and train it until it was fit to make predictions on alignment and segmentation on its own. Although this paper achieved its objectives, that does not mean the face alignment and segmentation problems have been solved. Challenges remain for complex facial images and multiple pose angles, and there are still future endeavors to explore and solve this problem.