Wang et al., 2024 - Google Patents

Watch your mouth: silent speech recognition with depth sensing

Wang et al., 2024

Document ID
14780443601759146232
Author
Wang X
Su Z
Rekimoto J
Zhang Y
Publication year
2024
Publication venue
Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems

Snippet

Silent speech recognition is a promising technology that decodes human speech without requiring audio signals, enabling private human-computer interactions. In this paper, we propose Watch Your Mouth, a novel method that leverages depth sensing to enable …
Continue reading at dl.acm.org

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00268 Feature extraction; Face representation
    • G06K9/00281 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00335 Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20 Image acquisition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/24 Speech recognition using non-acoustical features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions

Similar Documents

Publication          Title
Wang et al.          Watch your mouth: silent speech recognition with depth sensing
US12452390B2 (en)    Word flow annotation
US11482134B2 (en)    Method, apparatus, and terminal for providing sign language video reflecting appearance of conversation partner
Ong et al.           Automatic sign language analysis: A survey and the future beyond lexical meaning
Luettin              Visual speech and speaker recognition
US12430833B2 (en)    Realtime AI sign language recognition with avatar
Von Agris et al.     Recent developments in visual sign language recognition
Su et al.            Liplearner: Customizable silent speech interactions on mobile devices
Zhang et al.         Speechin: A smart necklace for silent speech recognition
WO2017112813A1 (en)  Multi-lingual virtual personal assistant
US11861778B1 (en)    Apparatus and method for generating a virtual avatar
KR20120120858A (en)  Service and method for video call, server and terminal thereof
Rathipriya et al.    A comprehensive review of recent advances in deep neural networks for lipreading with sign language recognition
Lim et al.           Spellring: Recognizing continuous fingerspelling in american sign language using a ring
Yau                  Video analysis of mouth movement using motion templates for computer-based lip-reading
Zhang et al.         Speech-driven personalized gesture synthetics: Harnessing automatic fuzzy feature inference
Liu et al.           A survey on deep multi-modal learning for body language recognition and generation
Kunhoth et al.       VisualAid+: Assistive system for visually impaired with TinyML enhanced object detection and scene narration
CN111191490A (en)    Lip reading research method based on Kinect vision
Innocente et al.     Deep Learning-Based Lip-Reading for Vocal Impaired Patient Rehabilitation.
Cai et al.           SignGlass: First-Person View Comprehensive and Generalizable ASL Translation Using Wearable Glass
Chokchaitam et al.   A System for Detecting and Translating Thai Sign Language Using Image Processing and Artificial Intelligence
CN120877390B (en)    Sign language translation methods, devices, computer equipment and storage media
Vaishnavi et al.     AI Powered Trifocals for Sign Language Detection and Speech Recognition
Su et al.            Multimodal Silent Speech-based Text Entry with Word-initials Conditioned LLM