Write individual characters in the air in front of a webcam and have them recognized and displayed in real time.
Use the voice output feature to have recognized characters spoken aloud for accessibility or kiosk demos.
Retrain the CNN on a custom character dataset to support a different language or script.
Build a contactless text input prototype for AR or kiosk applications using the gesture recognition pipeline.
Requires Python 3.10, a working webcam, and pip-installed dependencies. GPU is optional for inference but speeds up retraining.
This project is a Python application that lets you write characters in the air in front of a webcam, then recognizes what you wrote and displays the result on screen. You move your finger through the air as if writing on an invisible surface, and the system figures out which letter or character you intended.
The recognition pipeline works in a few steps. The webcam captures video continuously. A library called MediaPipe analyzes each frame to find your hand and locate your fingertip in space. As you move your fingertip, the system records the path and draws it onto a virtual canvas. That canvas image is then fed into a neural network trained to recognize handwritten characters, and the predicted character appears in real time along with a confidence score.
The machine learning part uses a Convolutional Neural Network built with TensorFlow and Keras. This type of network is commonly used for image classification tasks. The system also shows a frames-per-second counter and includes voice output so the recognized character can be spoken aloud.
The project has some noted limitations. Recognition accuracy drops under poor lighting or with a low-quality webcam. Fast hand movement reduces accuracy, and the current version is limited to individual characters rather than continuous word or sentence input. The README lists future work including full sentence recognition, multilingual support, and mobile or AR/VR integration, but those are not part of the current release.
To run it, you need Python 3.10, a working webcam, and the dependencies installed via pip. The main entry point is a single Python script. A GPU is optional and would help only if you are retraining the model yourself.
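Two steps of the pipeline described above, drawing the fingertip path onto a canvas and turning the network's raw outputs into a predicted character with a confidence score, can be illustrated with a small dependency-free sketch. This is not the repository's code. The function names, the 28x28 canvas size, and the three-letter label set are all hypothetical, and the fingertip positions are assumed to arrive as normalized (x, y) pairs in [0, 1], which is the form MediaPipe hand landmarks use.

```python
import math

CANVAS_SIZE = 28  # hypothetical; the project's real canvas/input size is not stated


def rasterize_path(points, size=CANVAS_SIZE):
    """Draw a fingertip path onto a size x size binary canvas.

    `points` is a list of normalized (x, y) pairs in [0, 1].
    Consecutive points are joined by straight line segments.
    """
    canvas = [[0] * size for _ in range(size)]

    def to_pixel(point):
        x, y = point
        # Scale to pixel indices and clamp so x == 1.0 stays on the canvas.
        col = min(size - 1, max(0, int(x * size)))
        row = min(size - 1, max(0, int(y * size)))
        return row, col

    pixels = [to_pixel(p) for p in points]
    if len(pixels) == 1:
        row, col = pixels[0]
        canvas[row][col] = 1

    for (r0, c0), (r1, c1) in zip(pixels, pixels[1:]):
        # Simple linear interpolation between the two pixels.
        steps = max(abs(r1 - r0), abs(c1 - c0), 1)
        for i in range(steps + 1):
            t = i / steps
            row = round(r0 + (r1 - r0) * t)
            col = round(c0 + (c1 - c0) * t)
            canvas[row][col] = 1
    return canvas


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, labels):
    """Return (label, confidence) for the highest-probability class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


if __name__ == "__main__":
    # A diagonal stroke from the top-left to the bottom-right corner.
    canvas = rasterize_path([(0.0, 0.0), (1.0, 1.0)])
    print(sum(map(sum, canvas)), "pixels drawn")

    # Made-up scores standing in for a classifier's output layer.
    label, confidence = top_prediction([1.0, 3.0, 2.0], ["A", "B", "C"])
    print(f"{label} ({confidence:.2f})")
```

In a real run the point list would come from the tracked index fingertip on each frame, and the scores would come from the trained CNN rather than a hard-coded list. Keeping the rasterizer separate from the capture loop makes it easy to test without a webcam.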
Verify against the repo before relying on details.