FACE ALIGNMENT AND SEGMENTATION SYSTEM
ABSTRACT
Detection, extraction, segmentation, and recognition of faces are vital parts of intelligent applications in our daily lives. Alignment and segmentation in particular play a significant role in face recognition systems. The landmarks derived by alignment methods from the shape of a face are used in many applications, including real face classification and virtual face animation. In this paper, sklearn, Keras, skimage, and TensorFlow are used to train a model on the provided datasets, align the image data to identify facial landmarks, and segment the data to obtain face vector points for the segmented image.
INTRODUCTION
Image identification is fundamental in application areas such as biometric systems and computer vision. The objective of this paper is to implement a face segmentation system that can partition a human face into sections and identify and align the corresponding landmarks of that image. This paper also focuses on implementing simple graphical effects that use the information extracted from the human face. The focus of this paper is face segmentation rather than face detection, which only involves determining whether objects are faces or not.
This paper is significant due to many ongoing and emerging applications in the field of computer vision. Such applications include facial recognition systems, which can identify and recognize human faces uniquely based on pixel properties. Facial recognition systems have several applications, such as login credentials in Android and computer systems and video surveillance using the concept of face tracking, among others.
Extensive and rapidly accelerating research has been done on face segmentation, but the problem is still far from fully and convincingly solved. It remains difficult because of the complex content of images and the variety of their applications.
This paper is organized into distinct parts: preprocessing, face alignment, face segmentation, and graphical-effect implementation. The paper closes with recommendations and a conclusion.
PREPROCESSING
Machine learning algorithms work with vectors alone, so the training data must first be converted into vectors. In this paper, the dataset provided is in matrix form. We take advantage of the Keras ImageDataGenerator class (used here through TensorFlow), from which an iterator is created to preprocess the data.
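As a minimal illustration of this conversion (using a hypothetical NumPy array; the actual conversion for our dataset appears in the code implementation below), an image in matrix form can be flattened into a feature vector with a single reshape:
import numpy as np

# Hypothetical batch of 4 grayscale images, each 96x96 (matrix form)
images = np.random.rand(4, 96, 96)

# Flatten each image matrix into one feature vector per sample
vectors = images.reshape(images.shape[0], -1)
print(vectors.shape)  # (4, 9216): one 9216-dimensional vector per image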
FACE ALIGNMENT
Face alignment is the process of locating the image components, or landmarks, that describe the shape and pose of a face. It is sometimes referred to as feature finding or face landmark detection. Fig 1 and Fig 2 below show the landmarks and pose of a human face and an animal face, respectively.
Fig 1: landmarks and pose of a human face
Fig 2: landmarks and pose of an animal face
FACE ALIGNMENT METHODS
A training model is created to achieve face alignment, using the image data provided for training. The model learns how to align faces from already-aligned facial images. The model then predicts facial landmarks by itself and outputs their coordinates.
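As a minimal sketch of such a model (assuming 96x96 grayscale inputs to match the resizing step below and, purely for illustration, 42 outputs corresponding to 21 (x, y) landmark pairs; the layer sizes are illustrative rather than the exact architecture used here), a convolutional landmark regressor could be built from the Keras layers imported in the code implementation:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(96, 96, 1)),  # low-level features
    MaxPool2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),  # higher-level features
    MaxPool2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(42)  # one output per landmark coordinate (illustrative count)
])
model.compile(optimizer='rmsprop', loss='mse')  # regress coordinates with mean squared error
# model.fit(training_imagesp, Y, epochs=10, batch_size=32)  # variables defined below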
Discussion
Face alignment was achieved using TensorFlow, sklearn, skimage, and Keras.
The training model was able to learn from the aligned images and then to make predictions on its own.
Because the method learns by progressively reducing the training error, its accuracy compares favourably with the other methods considered, and it was able to extract different pixel features of a face.
The predicted facial landmark points were saved in a file named results.csv.
FACE SEGMENTATION
Segmentation is the process of labelling each pixel in the face image as belonging to a particular object. In this paper, the coordinates of each pixel are estimated using TensorFlow and sklearn. The provided images are used to train a model to learn how to predict the coordinates of the various landmarks. Face segmentation is discussed further in the code implementation below.
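As a minimal sketch of per-pixel labelling (using the KMeans import from the code below; the number of clusters is an assumption for illustration, not a value fixed by this paper):
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical grayscale face image
img = np.random.rand(96, 96)

# Treat each pixel as a one-feature sample and cluster it into a region
pixels = img.reshape(-1, 1)
kmeans = KMeans(n_clusters=3, random_state=0).fit(pixels)

# Each pixel now carries a segment label (e.g. skin, hair, background)
labels = kmeans.labels_.reshape(img.shape)
print(labels.shape)  # (96, 96) map of per-pixel labels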
GRAPHICAL EFFECTS
The face landmarks obtained from face alignment and face segmentation can be used for graphical effects in applications such as modelling human anatomy in medicine, detecting expressions such as crying or sad faces, and facial recognition, among others.
These effects are illustrated below.
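For instance (a sketch assuming the predicted landmarks are available as an (N, 2) integer array; the image, points, and drawing parameters here are hypothetical), the landmarks can be overlaid on a face with OpenCV to produce a simple effect:
import numpy as np
import cv2

# Hypothetical 8-bit face image and predicted landmarks of shape (N, 2)
face = np.zeros((96, 96, 3), dtype=np.uint8)
landmarks = np.array([[30, 40], [66, 40], [48, 60], [48, 75]], dtype=np.int32)

# Mark each predicted landmark with a small filled circle
for (x, y) in landmarks:
    cv2.circle(face, (int(x), int(y)), 2, (0, 255, 0), -1)

# Connect the points to outline the region as a simple graphical effect
cv2.polylines(face, [landmarks.reshape(-1, 1, 2)], isClosed=True, color=(255, 0, 0), thickness=1)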
Code implementation
Links to download the training images from Sussex:
!wget "https://sussex.box.com/shared/static/cbvuazevif0oczabom06zl2x7ukakbqj.npz" -O training_images.npz
# The test images (without points)
!wget "http://users.sussex.ac.uk/~is321/test_images.npz" -O test_images.npz
# The example images are here
!wget "http://users.sussex.ac.uk/~is321/examples.npz" -O examples.npz
Importing all required libraries
import tensorflow as tf  # TensorFlow package
import numpy as np
import matplotlib.pyplot as plt  # for visualization
%matplotlib inline
from sklearn.model_selection import train_test_split
from keras.utils.np_utils import to_categorical  # convert to one-hot encoding
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
from keras.optimizers import RMSprop
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau
from sklearn.preprocessing import MinMaxScaler
from skimage.transform import resize, rescale
import cv2
from sklearn.cluster import KMeans
from PIL import Image
from sklearn.linear_model import LinearRegression
import sklearn.metrics as metrics
Loading the dataset images into the variable training_images, with the corresponding training points kept in training_pts:
training_data = np.load('training_images.npz', allow_pickle=True)
test_data = np.load('test_images.npz', allow_pickle=True)
examples = np.load('examples.npz', allow_pickle=True)
training_images = training_data['images']
training_pts = training_data['points']
test_images = test_data['images']
# and the example images
examples_images = examples['images']
Visualization of the images with their respective landmark points:
print(training_images.shape, training_pts.shape)
print("There are {} images in the dataset".format(len(training_images)))
print("Size of each image is {}x{}".format(training_images.shape[1], training_images.shape[2]))
print("Pixel values scale: {}".format(training_images[0][0, :4]))

def visualise_pts(img, pts):
    plt.imshow(img)
    plt.plot(pts[:, 0], pts[:, 1], '+r')
    plt.show()

for i in range(3):
    idx = np.random.randint(0, training_images.shape[0])
    visualise_pts(training_images[idx, ...], training_pts[idx, ...])
The output of the visualization:
Preprocessing of image data
image_data_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    featurewise_center=True, samplewise_center=True,
    featurewise_std_normalization=True, samplewise_std_normalization=True,
    rotation_range=0, brightness_range=None, zoom_range=0.0, fill_mode='nearest',
    horizontal_flip=True, vertical_flip=True, rescale=1/255,
    validation_split=0.0, dtype=None
)
# NumpyArrayIterator acts as a for loop, applying all the functions above to the dataset
iterator = tf.keras.preprocessing.image.NumpyArrayIterator(
    training_images, training_pts, image_data_generator, batch_size=32, shuffle=True,
    sample_weight=None, seed=None, data_format=None, save_to_dir=None, subset=None, dtype=None
)
The array values vary widely in size: some are over 100 while others are less than 10. To ensure that all values fall between 0 and 1, we divide the image array by 255.
training_images_r=training_images/255 # training images
We then resize the images from 236x236 to 96x96; this is necessary to reduce the algorithm's training time and to prevent the model from learning from noise. The cells are kept separate because the process uses a lot of RAM; if executed together, it would exhaust the RAM allocated by Colab and crash the session.
training_imagesp = resize(training_images_r, (2811, 96, 96, 1))  # resizing training images
test_images_r = resize(test_images, (554, 96, 96, 1))  # resizing test images
test_images_r.shape  # checking the new shape
training_imagesp.shape  # checking the new shape
Reshape the arrays into vectors; this is because the algorithm does not accept 3D or 4D arrays, only a 2D array of samples by features.
X = training_imagesp.reshape((training_imagesp.shape[0], training_imagesp.shape[1]*training_imagesp.shape[2]*training_imagesp.shape[3]))
Y = training_pts.reshape((training_pts.shape[0], training_pts.shape[1]*training_pts.shape[2]))
test = test_images_r.reshape((test_images_r.shape[0], test_images_r.shape[1]*test_images_r.shape[2]*test_images_r.shape[3]))
Splitting the dataset to get a set for training and a set for validating the model:
x_train, x_test, y_train, y_test=train_test_split(X,Y,test_size=0.3 , random_state=0)
print("X_train shape:", x_train.shape)
print("y_train shape:", y_train.shape)
print("X_test shape:", x_test.shape)
print("y_test shape:", y_test.shape)
lr = LinearRegression() # creating the model
lr.fit(x_train, y_train)  # training the model
pred_y = lr.predict(x_test) #making predictions
pred_y
pred_y.shape # checking the shape of the predictions
def euclid_dist(pred_pts, gt_pts):
    """
    Calculate the Euclidean distance between pairs of points.
    :param pred_pts: The predicted points
    :param gt_pts: The ground-truth points
    :return: An array of shape (no_points,) containing the distance of each predicted point from the ground truth
    """
    pred_pts = np.reshape(pred_pts, (-1, 2))
    gt_pts = np.reshape(gt_pts, (-1, 2))
    return np.sqrt(np.sum(np.square(pred_pts - gt_pts), axis=-1))
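For example, the validation predictions can be scored by averaging this distance over all points and images (a usage sketch based on the variables defined above):
# Mean landmark error, in pixels, for each validation image
errors = np.array([euclid_dist(p, g).mean() for p, g in zip(pred_y, y_test)])
print("Mean Euclidean distance per image: {:.2f}".format(errors.mean()))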
The predicted image points are then saved to Google Drive as results.csv and will be submitted in support of the method.
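A minimal sketch of producing that file (the exact layout required of results.csv is an assumption here):
# Predict points for the unseen test images and save them as results.csv
test_pred = lr.predict(test)
np.savetxt('results.csv', test_pred, delimiter=',')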
The colab link for the worksheet:
https://colab.research.google.com/drive/1gLm6TLGUxG3vtStnZD5OsKUJ0sroFcXr?usp=sharing
RECOMMENDATIONS
This paper has focused explicitly on methods for aligning and segmenting human faces in an image. It does not look into classifying or determining the characteristics of the facial landmarks extracted from images. A computer vision system that can extract properties from these landmarks and analyze them further is therefore proposed.
CONCLUSIONS
This paper describes an algorithm for facial alignment and facial segmentation. A machine learning approach was used to create a training model and train it until it was fit to make predictions on alignment and segmentation on its own. Although this paper achieved its objectives, that does not mean the face alignment and segmentation problems have been solved. Challenges remain for complex facial images and multiple pose angles, and there are still future endeavors to explore and solve this problem.