
According to the World Health Organization [1], there are more than 28 million
deaf and hard-of-hearing people around the world, including up to 3 million
deaf people in Egypt, and 37 million people worldwide are blind. It is
necessary to find basic means of communication among deaf or hard-of-hearing
people, blind people and hearing people. Systems that act as interpreters
between speaking and hearing-impaired people would make life easier for deaf
and blind people by integrating them into society. Such systems should perform
bidirectional translation between sign language and spoken language. Few
systems have been developed to support communication between deaf people and
blind or hearing people. Each system is restricted by limitations that stem
from either the direction of its inputs and outputs or the methodology it uses
(hardware or software, vision-based or hybrid).

The Arabic sign dictionary system [1] is a vision-based software system working
in the text-to-sign direction. It generates static signs (letters, numbers) as
a typing font and can be integrated with other software such as Microsoft Word.
Glove-based systems, for example the portable glove system [2], are hardware
systems working in the sign-to-text direction. In such a system the signer
wears electronic gloves that measure hand and face motion with up to 6 degrees
of freedom. These systems are very accurate in their measurements and can
describe each moving object exactly. However, they are expensive and require
the signer to stay in one fixed place. Moreover, users find the hardware hard
and time-consuming to adapt to. The Tessa system [3] translates sentences
spoken by clerks in UK-accented English into British Sign Language (BSL) signs
using a 3D virtual human. Transactions take longer with the system than
without it. The system restricts the sentences it accepts to a pre-defined
set, which caused these negative results. The system described in [4] supports
medical conversation between a hearing physician and a deaf patient by showing
a written transcription of the physician's spoken sentences together with
medical images (e.g., diet plans) on a tabletop display. It relies on forms of
communication without signs and does not integrate sign language support.
iCommunicator [5] makes effective two-way communication possible for persons
who are deaf, hard-of-hearing or experience unique communication challenges.
It translates in real time: speech to text, speech/text to video sign
language, and speech/text to computer-generated voice. Its limitation is the
difficulty of adapting to the hardware, as it contains many hardware pieces
with different purposes.

Current hardware-based systems, as mentioned above, are very costly and prevent
the signer from signing freely. On the other hand, hardware-based systems can
measure 3D movements and differentiate exactly among them, while vision-based
systems use 2D imaging, which loses the 3D information. Vision-based systems
are therefore less effective than hardware-based ones, but their simplicity
makes them highly desirable.

The SVBiComm system aims to provide new e-services to improve communication.
This is done by developing a new vision-based tool that translates sign
language (a sequence of images) to text and then to speech in one direction,
and translates spoken words to text and then to signs represented by a 3D
model in the other direction.

First direction: as a preprocessing phase, a video stream is captured by the
deaf user's camera. Frames of interest are extracted and passed to the server
for processing. Alternatively, the input can be text, which is sent directly
to the server using a special-purpose deaf keyboard. The server then applies
image processing techniques to the received frames. A skin detection mechanism
is applied to the image to obtain a black-and-white image in which the white
parts are the detected skin. The image is next passed through a median filter
to remove noise. The image is then partitioned into pieces, the pieces are
passed to a scale-down function so that they all have the same size, and they
are merged again into one image in a fixed order starting from the top left.
The last step is to feed the image into a trained neural network, which
recognizes it and returns the corresponding text. Finally, the text is
converted to speech using a TTS tool, and this speech is played on the hearing
or blind client's machine.

Second direction: the system records an audio stream from the hearing or blind
client and sends it to the server to be recognized. The speech is filtered
before recognition, and then a feature extraction function is applied. The
speech is recognized using Dynamic Time Warping (DTW), and the corresponding
text is returned. The text is then passed to the 3D graphical model on the
deaf user's machine, which animates the text as visual signs.
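
As an illustration of the recognition step in the second direction, the following minimal Python sketch matches an extracted feature sequence against stored word templates using Dynamic Time Warping. The function names `dtw_distance` and `recognize`, the template dictionary and the Euclidean frame distance are assumptions made for this example; the features and matching details of the actual system are not specified here.

```python
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two frames (assumed local cost)
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            # Extend the cheapest of: insertion, deletion, match
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(features, templates):
    """Return the label of the template with the smallest DTW cost.

    `templates` is a hypothetical dict mapping a word to its reference
    sequence of feature vectors.
    """
    return min(templates, key=lambda word: dtw_distance(features, templates[word]))
```

Because DTW aligns sequences of different lengths, a word spoken more slowly than its stored template can still match it, which is the usual reason for choosing DTW in isolated-word recognition.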
The SVBiComm system, with its video and voice capturing devices, uses an
object-oriented programming language to implement the image processing
algorithms. The image can be captured from any distance with a non/black
background without any skin-colored objects; speech should be recorded in a
quiet space with minimal noise. The proposed vision-based hand tracking system
does not require any special markers or gloves and can operate on a commodity
PC with low-cost cameras. The SVBiComm system provides the user with: static
sign translation; isolated word animation; isolated word translation;
continuous sentence translation; continuous sentence playback by voice; and
input from the deaf keyboard. The SVBiComm system analyses video and voice
streams and their content in real time, given a high-speed network connection
and high-performance computing capabilities.
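
The image preprocessing steps of the first direction (skin detection, median filtering, partitioning, scaling down and merging) can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions: the RGB skin rule is a common rule-of-thumb threshold and not necessarily the detector used by the system, the pieces are assumed to be tiles of a fixed grid, and all function names and default parameters are hypothetical.

```python
from statistics import median


def is_skin(r, g, b):
    # Assumed rule-of-thumb RGB skin test; the system's real detector is not specified.
    return (r > 95 and g > 40 and b > 20 and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_mask(rgb):
    """Binary image: 1 (white) where a pixel looks like skin, else 0 (black)."""
    return [[1 if is_skin(*px) else 0 for px in row] for row in rgb]


def median_filter(img):
    """3x3 median filter; border pixels are left unchanged for simplicity."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


def partition(img, rows, cols):
    """Split the image into rows*cols tiles, listed row by row from the top left."""
    h, w = len(img), len(img[0])
    tiles = []
    for r in range(rows):
        y0, y1 = r * h // rows, (r + 1) * h // rows
        for c in range(cols):
            x0, x1 = c * w // cols, (c + 1) * w // cols
            tiles.append([row[x0:x1] for row in img[y0:y1]])
    return tiles


def scale(tile, size):
    """Nearest-neighbour resize of a tile to size x size pixels."""
    h, w = len(tile), len(tile[0])
    return [[tile[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def merge(tiles, cols):
    """Reassemble equally sized tiles into one image, starting from the top left."""
    merged = []
    for start in range(0, len(tiles), cols):
        band = tiles[start:start + cols]
        for y in range(len(band[0])):
            merged.append([v for tile in band for v in tile[y]])
    return merged


def preprocess(rgb, rows=2, cols=2, size=4):
    """Skin mask -> median filter -> partition -> scale down -> merge."""
    mask = median_filter(skin_mask(rgb))
    tiles = [scale(t, size) for t in partition(mask, rows, cols)]
    return merge(tiles, cols)
```

The result of `preprocess` always has `rows * size` by `cols * size` pixels regardless of the frame resolution, which is what a recognizer with a fixed number of inputs, such as the neural network mentioned above, would need.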
