Research On Gesture Spotting Techniques
Volume 2

Research On Gesture Spotting Techniques: A Review

Gaurav G. Sambhe
Department of M.Tech (VLSI Design)
Jhulelal Institute of Technology, Lonara, Nagpur (India)

Prof. Nakul Nagpal
Asst. Professor
Jhulelal Institute of Technology, Lonara, Nagpur (India)


Gesture recognition can be seen as a way for computers to begin to understand human body language, building a richer bridge between machines and humans than primitive text user interfaces or even graphical user interfaces (GUIs), which still limit most input to the keyboard and mouse. The applications of gesture recognition are manifold, ranging from sign language through medical rehabilitation to virtual reality. With this in mind, we propose a system that allows humans to control machines through simple gestures.

Keywords – Gesture Spotting, Gesture Recognition, Pattern Recognition, Computer Vision.


Gesture recognition is a topic of modern technology with the goal of interpreting human gestures via mathematical algorithms.[1] Reliable and natural human-robot interaction has been studied extensively by researchers over the last few decades. Nevertheless, in most cases, human beings continue to interact with robots by resorting to traditional methods, probably because the “more natural” interaction modalities have not yet reached the desired level of maturity and reliability. Gestures are one of the most important modes of communicating with a computer in an interactive environment. Recent advances in computer vision and machine learning have led to a number of techniques for modeling gestures in real-time environments.[1][2] Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Human motion recognition is a field with a wide variety of applications. Of particular interest are gesture recognition for new modes of human-computer interaction and gait recognition for video surveillance systems and intrusion detection.
The purpose of communication is to transfer information from one entity to another. Hand gestures are a powerful human-to-human communication channel and form a major part of information transfer in everyday life. There are many ways to perform and interpret a human action using the hands, the arms, or both. A gesture is a spatio-temporal pattern, which may be static, dynamic, or both. Hand gestures are easy to use and a convenient way for humans to interact with computers. It is very common to see one person explaining something to another using hand gestures. By analogy, and given the demand for natural human-robot interfaces, gestures can be used to interact with machines in an intuitive way.
Recent research in gesture spotting has been aimed at applications in many different fields, such as sign language (SL) recognition, electronic appliance control, video-game control and human-computer/robot interaction.[2][4] The development of reliable and natural human-robot interaction platforms can open the door to new robot users and thus help increase the number of robots in use.


The gestural equivalents of direct manipulation interfaces are those that use gesture alone.[1][11] These can range from interfaces that recognize a few symbolic gestures to those that implement fully-fledged sign language interpretation. Similarly, interfaces may recognize static hand poses, dynamic hand motion, or a combination of both. In all cases, each gesture has an unambiguous semantic meaning associated with it that can be used in the interface.

A. Tracking Technologies

Gesture-only interfaces with a syntax of many gestures typically require precise hand pose tracking. A common technique is to instrument the hand with a glove equipped with a number of sensors, which provide information about hand position, orientation, and the flex of the fingers.[2][4]
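As a rough illustration of the glove data described above, the following Python sketch groups position, orientation, and finger flex into one record and labels a static pose from flex alone. The `GloveSample` type, the units, the 0.6 threshold, and the pose names are all assumptions made for illustration; they do not come from any particular glove or from the cited works.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GloveSample:
    """One reading from a hypothetical instrumented glove."""
    position: Tuple[float, float, float]     # hand position (units assumed: metres)
    orientation: Tuple[float, float, float]  # roll, pitch, yaw (units assumed: degrees)
    # Flex per finger, thumb to little finger: 0.0 = straight, 1.0 = fully bent (assumed scale)
    flex: Tuple[float, float, float, float, float]


def classify_static_pose(sample: GloveSample, bent_threshold: float = 0.6) -> str:
    """Label a static hand pose from finger flex only (illustrative rules)."""
    bent = [f >= bent_threshold for f in sample.flex]
    if not any(bent):
        return "open_hand"
    if all(bent):
        return "fist"
    # Index finger straight while thumb and the remaining fingers are bent.
    if not bent[1] and bent[0] and all(bent[2:]):
        return "point"
    return "unknown"
```

A real tracker would also use position and orientation over time to handle dynamic gestures; this sketch covers only the static case.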

B. Natural Gesture Only Interfaces

At the simplest level, effective gesture interfaces can be developed that respond to natural gestures, especially dynamic hand motion.


In computer interfaces, two types of gestures are distinguished:

  1. Offline gestures: Gestures that are processed after the user's interaction with the object is complete. An example is the gesture to activate a menu.
  2. Online gestures: Direct manipulation gestures, processed while the interaction is still under way. They are used, for example, to scale or rotate a tangible object.
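The difference between the two types can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the online case is modelled as a two-finger scaling gesture that yields a result on every frame, and the offline case as a completed stroke that is inspected only once the user has finished. The function names, the downward-swipe rule for opening a menu, and the 50-unit minimum length are hypothetical.

```python
import math
from typing import Iterable, Iterator, List, Tuple

Point = Tuple[float, float]


def online_scale_factors(finger_pairs: Iterable[Tuple[Point, Point]]) -> Iterator[float]:
    """Online: yield a scale factor for every frame while the gesture is in progress."""
    pairs = iter(finger_pairs)
    first = next(pairs, None)
    if first is None:
        return
    base = math.dist(*first)  # finger separation in the first frame
    if base == 0:
        raise ValueError("initial finger separation must be non-zero")
    yield 1.0
    for a, b in pairs:
        yield math.dist(a, b) / base


def offline_command(stroke: List[Point], min_length: float = 50.0) -> str:
    """Offline: interpret the whole stroke only after the interaction has ended."""
    if len(stroke) < 2:
        return "none"
    dx = stroke[-1][0] - stroke[0][0]
    dy = stroke[-1][1] - stroke[0][1]
    if math.hypot(dx, dy) < min_length:
        return "none"
    # Assumed rule: a mostly vertical stroke with increasing y opens the menu.
    return "open_menu" if abs(dy) > abs(dx) and dy > 0 else "none"
```

The generator makes the online behaviour explicit: each new frame produces feedback immediately, whereas `offline_command` cannot return anything useful until the full stroke is available.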

A. Types of Gestures

  1. Gesticulation: Spontaneous movements of the hands and arms that accompany speech.
  2. Language-like gestures: Gesticulation that is integrated into a spoken utterance, replacing a particular spoken word or phrase.
  3. Pantomimes: Gestures that depict objects or actions, with or without accompanying speech.
  4. Emblems: Familiar gestures such as V for victory, thumbs up, and assorted rude gestures.
  5. Sign languages: Linguistic systems, such as American Sign Language, which are well defined.


In a typical system, the user's action is captured by a camera and the image input is fed into the gesture recognition system, where it is processed and matched by a recognition algorithm. The virtual object or 3D model is then updated accordingly, and the user interacts with the machine through a user interface display. [4][7]
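The flow from captured input to updated model can be sketched in Python as below. This is only a minimal illustration, not the method of any cited work: each camera frame is assumed to have already been reduced to a small feature vector, recognition is done by nearest-template matching with Euclidean distance, and the gesture names, the 0.5 distance cutoff, and the model fields `yaw_deg` and `scale` are hypothetical.

```python
import math
from typing import Dict, Iterable, Optional, Sequence

Templates = Dict[str, Sequence[float]]


def recognize(features: Sequence[float], templates: Templates,
              max_distance: float = 0.5) -> Optional[str]:
    """Return the label of the nearest template, or None if nothing is close enough."""
    best_label, best_dist = None, float("inf")
    for label, template in templates.items():
        d = math.dist(features, template)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label if best_dist <= max_distance else None


def update_model(model: Dict[str, float], gesture: str) -> Dict[str, float]:
    """Apply a recognised gesture to the 3D model state and return the new state."""
    updated = dict(model)
    if gesture == "rotate_left":
        updated["yaw_deg"] -= 15.0
    elif gesture == "rotate_right":
        updated["yaw_deg"] += 15.0
    elif gesture == "zoom_in":
        updated["scale"] *= 1.25
    return updated


def run_pipeline(frames: Iterable[Sequence[float]], templates: Templates,
                 model: Dict[str, float]) -> Dict[str, float]:
    """Feed per-frame features through recognition and update the model."""
    for features in frames:
        gesture = recognize(features, templates)
        if gesture is not None:  # frames that match no gesture are ignored
            model = update_model(model, gesture)
    return model
```

Rejecting frames whose best match is still far from every template is what lets a system of this kind ignore movements that are not gestures at all.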


Gesture recognition is useful for processing information from humans that is not conveyed through speech or typing. Computers can identify various types of gestures. [1][5][6]

  1. Sign language recognition: Just as speech recognition can transcribe speech to text, certain types of gesture recognition software can transcribe the symbols represented through sign language into text.
  2. Directional indication through pointing: Pointing has a very specific purpose in our society: to reference an object or location based on its position relative to ourselves. The use of gesture recognition to determine where a person is pointing is useful for identifying the context of statements or instructions. This application is of particular interest in the field of robotics.
  3. Control through facial gestures: Controlling a computer through facial gestures is a useful application of gesture recognition for users who may not physically be able to use a mouse or keyboard. Eye tracking in particular may be of use for controlling cursor motion or focusing on elements of a display.
  4. Alternative computer interfaces: Instead of using the traditional keyboard and mouse setup to interact with a computer, robust gesture recognition could allow users to accomplish frequent or common tasks by making hand or face gestures to a camera.
  5. Immersive game technology: Gestures can be used to control interactions within video games to make the player’s experience more interactive or immersive.
  6. Virtual controllers: For systems where finding or acquiring a physical controller could take too much time, gestures can be used as an alternative control mechanism. Controlling secondary devices in a car or controlling a television set are examples of such usage.
  7. Affective computing: In affective computing, gesture recognition is used in the process of identifying emotional expression through computer systems.
  8. Remote control: With gesture recognition, “remote control with the wave of a hand” of various devices is possible. The signal must indicate not only the desired response, but also which device is to be controlled.
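The remote control item notes that a signal must carry both the target device and the desired response. One way to picture this, as a hedged Python sketch, is a command built from two recognised gestures: one that selects the device and one that selects the action. The gesture vocabularies, device names, and the `RemoteCommand` type below are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RemoteCommand:
    device: str  # which device is to be controlled
    action: str  # the desired response


# Hypothetical gesture vocabularies; a real system would define its own.
DEVICE_GESTURES = {"point_left": "television", "point_right": "lamp"}
ACTION_GESTURES = {"wave_up": "power_on", "wave_down": "power_off"}


def build_command(device_gesture: str, action_gesture: str) -> Optional[RemoteCommand]:
    """Combine a device-selecting gesture and an action gesture into one command."""
    device = DEVICE_GESTURES.get(device_gesture)
    action = ACTION_GESTURES.get(action_gesture)
    if device is None or action is None:
        return None  # incomplete signal: either the device or the response is unknown
    return RemoteCommand(device, action)
```

Returning `None` when either half is missing reflects the point made in the list: a wave alone is ambiguous unless the system also knows which device it is addressed to.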


There are many challenges associated with the accuracy and usefulness of gesture recognition software. For image-based gesture recognition, there are limitations imposed by the equipment used and by image noise. Images or video may not be captured under consistent lighting or in the same location. Items in the background or distinct features of the users may make recognition more difficult. [8][9][10]

Gesture Recognition Challenges:

  1. Latency: Image processing can be slow, creating unacceptable latency for video games and similar applications.
  2. Lack of Gesture Language: Different users make gestures differently, causing difficulty in identifying motions.
  3. Robustness: Many gesture recognition systems do not read motions accurately or optimally because of factors such as insufficient background light and high background noise.
  4. Performance: The image processing involved in gesture recognition is quite resource intensive, and applications may be difficult to run on resource-constrained devices.


I would like to express sincere gratitude and appreciation to all those who made it possible to complete this paper. Special thanks go to my project guide, Prof. Nakul Nagpal, whose help, stimulating suggestions and encouragement helped to coordinate the project, especially in writing this paper. Words often fail to express one’s gratitude; still, we would like to convey sincere thanks to our H.O.D., Prof. Sanjeev Sharma, without whose encouragement and guidance this project would not have materialized.


  1. S. Mitra and T. Acharya, “Gesture recognition: a survey,” IEEE Trans. Systems, Man and Cybernetics, vol. 37, no. 3, pp. 311–324, 2007.
  2. H. Francke, J. R. del Solar and R. Verschae, “Real-time hand gesture detection and recognition using boosted classifiers and active learning,” in Proc. Pacific Rim Adv. Image and Video Tech., 2007, pp. 533–547.
  3. T. Kirishima, K. Sato and K. Chihara, “Real-time gesture recognition by learning and selective control of visual interest points,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 27, no. 3, pp. 351–364, 2005.
  4. D. Weinland, R. Ronfard and E. Boyer, “Free viewpoint action recognition using motion history volumes,” Computer Vision and Image Understanding, vol. 104, no. 2/3, pp. 249–257, 2006.
  5. I. Mihara, Y. Yamauchi and M. Doi, “A real-time vision-based interface using motion processor and applications to robotics,” Systems and Computers in Japan, vol. 34, no. 3, pp. 10–19, 2003.
  6. L. Gupta and S. Ma, “Gesture-based interaction and communication: automated classification of hand gesture contours,” IEEE Trans. Systems, Man and Cybernetics, vol. 31, no. 1, pp. 114–120, 2001.
  7. J. Yang, W. Bang, E. Choi, S. Cho, J. Oh, J. Cho, S. Kim, E. Ki and D. Kim, “A 3D hand-drawn gesture input device using fuzzy ARTMAP-based recognizer,” J. of Systemics, Cybernetics and Informatics, vol. 4, no. 3, pp. 1–7, 2006.
  8. Y. Yamashita and J. Tani, “Emergence of functional hierarchy in a multiple timescale neural network model: a humanoid robot experiment,” PLoS Comput. Biol., vol. 4, no. 11, pp. 1–17, 2008.
  9. B. Peng and G. Qian, “Online gesture spotting from visual hull data,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 33, no. 6, pp. 1175–1188, 2011.
  10. H. H. Avilés, W. Aguilar and L. A. Pineda, “On the selection of a classification technique for the representation and recognition of dynamic gestures,” in IBERAMIA 2008, H. Geffner et al., Ed. Berlin-Heidelberg: Springer, 2008, pp. 412–421.
  11. H. D. Yang, S. Sclaroff and S. W. Lee, “Sign language spotting with a threshold model based on conditional random fields,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 31, no. 7, pp. 1264–1277, 2009.
