Monday, March 30, 2020

Sinhala Sign Classification using Convolutional Neural Networks

A Convolutional Neural Network (CNN) is a type of deep neural network.




In our project, we have used a CNN to classify the gestures. The image shown above is the summary of the model. There are five classes of Sinhala gestures that we used for classification ("ආයුබෝවන්", "ඔබ", "හමුවීම", "සතුටක්", "මම ඔබට ආදරෙයි"). There are 3,000 images for each sign, plus black images containing only background noise, used to train the model, giving 18,000 images in total.

As input to the CNN, each image is 64×64 pixels in width and height. We used three convolution layers, each with 32 filters. After a convolution layer, it is common to apply a pooling layer. This is important because pooling reduces the dimensionality of the feature maps, which in turn reduces the network training time. The convolved images can contain negative values, so a rectified linear unit (ReLU) activation is used to replace negative values with zero. The outputs of this layer are called feature maps. Most importantly, the model sets aside 30% (a 0.3 split) of the images as validation data.
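The following is a minimal Keras sketch of a model along these lines, not our exact configuration: the kernel sizes, dense layer width and optimizer are assumptions made for illustration.

```python
# Minimal sketch (not the exact project model): a small CNN for 64x64
# gesture images with three 32-filter convolution layers, ReLU activations,
# pooling after each convolution, and a 0.3 validation split.
# Kernel sizes, dense width and optimizer are assumptions.
from tensorflow.keras import layers, models

num_classes = 5  # "ආයුබෝවන්", "ඔබ", "හමුවීම", "සතුටක්", "මම ඔබට ආදරෙයි"

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

# x_train: (N, 64, 64, 1) pre-processed images, y_train: one-hot labels
# model.fit(x_train, y_train, epochs=10, validation_split=0.3)
```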

CNNs are one of the preferred techniques for deep learning on images. Their architecture allows automatic extraction of diverse image features, such as edges, circles, lines, and textures. The extracted features are progressively refined in the deeper layers.


Thursday, March 26, 2020

Speech to Sign mode

Our project has two main objectives:
  • Sign to Text mode
  • Speech to Sign mode
Considering the Speech to Sign mode, the process is as follows: an ordinary user launches this part and speaks in Sinhala into the microphone. The system captures the speech, converts it into Sinhala text, and separates that text into words. This part was done by my team member. Finally, I take the converted Sinhala text and map it to pre-animated sign animations. For example, if the user says "මම යනවා" ("I am going"), the system separates it into the words "මම" and "යනවා". I then map each of these Sinhala words to its particular pre-animated sign, which is then displayed.
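As a rough illustration of the speech-capture and word-separation step (which my team member implemented), the sketch below uses the SpeechRecognition package with Google's recognizer and the "si-LK" language code; these library choices are assumptions, not necessarily what our system uses.

```python
# Rough sketch of speech capture and word separation.
# The SpeechRecognition package and the "si-LK" language code are
# assumptions; the actual implementation may differ.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    print("Speak now...")
    audio = recognizer.listen(source)

sinhala_text = recognizer.recognize_google(audio, language="si-LK")
words = sinhala_text.split()   # e.g. "මම යනවා" -> ["මම", "යනවා"]
print(words)
```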

Going deeper, I use Blender 2.79 for the whole process. Using Blender, I can animate the character according to the Sinhala sign gestures. After creating the animations for the Sinhala sign gestures, I export each animation's name together with its start and end keyframe values to a text file. This lets me easily map the contents of that text file to the separated words mentioned above. These steps are done in Python, because Blender is used not only for animating but also for scripting.
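A minimal sketch of such an export script, run inside Blender's Python console, is shown below; the output path and the "name,start,end" line format are assumptions for illustration.

```python
# Sketch: export each animation action's name and its start/end keyframe
# values to a text file from inside Blender (run in Blender's Python console).
# The output path and the "name,start,end" format are assumptions.
import bpy

with open("/tmp/animations.txt", "w") as f:
    for action in bpy.data.actions:
        start, end = action.frame_range
        f.write("%s,%d,%d\n" % (action.name, int(start), int(end)))
```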

Scripting is the major part of this objective, because without scripting we could only do the animation of the character. I used Python to map the separated Sinhala words to the pre-animated gestures, to write the start and end keyframe values and animation names to a text file, and to run the program from the user's keyboard input. Most importantly, playing the animations according to the user's speech is done through scripting in Blender.
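A simplified sketch of how the mapping and playback could look inside Blender follows; the text-file format matches the export sketch above, and the frame-range handling and playback call are simplified assumptions rather than the exact project script.

```python
# Sketch: map separated Sinhala words to pre-animated gestures and play
# them in Blender. The file format follows the export sketch above;
# playback details are simplified assumptions.
import bpy

# Load "name,start,end" entries keyed by the animation (word) name.
animations = {}
with open("/tmp/animations.txt") as f:
    for line in f:
        name, start, end = line.strip().split(",")
        animations[name] = (int(start), int(end))

def play_word(word):
    """Play the pre-animated gesture whose action name matches the word."""
    if word not in animations:
        return  # no sign animation for this word
    start, end = animations[word]
    scene = bpy.context.scene
    scene.frame_start, scene.frame_end = start, end
    scene.frame_set(start)
    bpy.ops.screen.animation_play()

for word in ["මම", "යනවා"]:
    play_word(word)
```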
The video below shows me at work: I create an animation and then show what the final output looks like.

Note: the video playback speed has been increased.



Monday, March 16, 2020

Capturing and Pre-processing the Sign gesture

  
First, we capture the sign gesture using the web camera and apply a Gaussian blur filter to the captured image to reduce noise. After that, we convert the image from RGB to grayscale. We then locate and extract the features of the hand using OTSU thresholding. From the above step, the detectMultiScale function returns four values: the x-coordinate, y-coordinate, width (w) and height (h) of the detected hand region. Based on these four values, we draw a rectangle around the hand. Finally, we can see the pre-processed image.
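A condensed OpenCV sketch of this pre-processing pipeline is shown below; the hand-detection cascade file name is a placeholder assumption, and the kernel and detection parameters are illustrative rather than our exact values.

```python
# Sketch of the pre-processing pipeline: Gaussian blur, grayscale conversion,
# OTSU thresholding, and a cascade-based hand detection whose detectMultiScale
# call returns (x, y, w, h). The cascade XML file name is a placeholder.
import cv2

# Hypothetical hand cascade file; substitute the actual trained cascade.
hand_cascade = cv2.CascadeClassifier("hand_cascade.xml")

cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()

blurred = cv2.GaussianBlur(frame, (5, 5), 0)           # reduce noise
gray = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)       # RGB -> grayscale
_, thresh = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # OTSU

for (x, y, w, h) in hand_cascade.detectMultiScale(gray, 1.3, 5):
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imshow("Pre-processed", thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()
```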

As shown in this picture, we capture the gesture with the web camera by positioning the hand in this rectangular region. There should be proper light intensity in front of the camera, and the background should be clear, without any distracting material.

Below images show some pre-processed gestures.
Pre-processed images 
                      

Wednesday, March 4, 2020

User Interface Designing of the System

Designing user interfaces plays a major role in the system design phase. Users interact with the system through the user interface, so every detail of the interface should be considered carefully to deliver a great experience. A user interface with low complexity makes it easier for users to learn the system. When developing a system for speaking-impaired people, the user interface plays an even more important role.

We selected PyQt5 as the design tool for the user interfaces. PyQt5 is a GUI widgets toolkit: a Python interface to Qt, one of the most powerful and popular cross-platform GUI libraries. PyQt5 is a blend of the Python programming language and the Qt library.

To keep the user interface as simple as possible, the system has only three buttons, for selecting the introduction, Sign to Text mode, or Speech to Sign mode. In Sign to Text mode there is a text area that displays the meaning of the performed sign gesture, along with tips on how to use the system. The layout of Speech to Sign mode consists of a 3D avatar that displays the sign gesture related to the speech and a text area that displays the speech as text.
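A bare-bones PyQt5 sketch of a main window with the three mode buttons is shown below; the window title and button wiring are placeholders, not our full interface.

```python
# Bare-bones sketch of the main window: three buttons for the Introduction,
# Sign to Text mode and Speech to Sign mode. The title and button wiring
# are placeholders, not the full project interface.
import sys
from PyQt5.QtWidgets import QApplication, QWidget, QVBoxLayout, QPushButton

app = QApplication(sys.argv)

window = QWidget()
window.setWindowTitle("Sinhala Sign Language System")

layout = QVBoxLayout(window)
for label in ("Introduction", "Sign to Text Mode", "Speech to Sign Mode"):
    button = QPushButton(label)
    # In the real system each button opens the corresponding mode.
    button.clicked.connect(lambda _, name=label: print(name, "selected"))
    layout.addWidget(button)

window.show()
sys.exit(app.exec_())
```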
Main User Interface
Introduction
Alert box showing whether the camera can or cannot be accessed
Sign to Text Mode
Instructions for how to use Sign to Text mode



Usability of the system

We measured the usability of the system by getting feedback from both speaking-impaired and ordinary users. The questionnaire and some of the feedback given by the users are shown below.
Their feedback was the best way to correct our mistakes in the system. Based on all of this feedback, we have drawn a graph to measure the usability of the system interface. The figure is shown below.

Our Team