Please use this identifier to cite or link to this item: https://elibrary.tucl.edu.np/handle/123456789/18866
Title: GESTURE SYNTHESIS USING MULTIMODAL SUPERVISED LEARNING
Authors: BHANDARI, PRASUN
ROUNIYAR, RAHUL
SHRESTHA, RONAB
GURAU, SUNIL
Keywords: Gesture synthesis; Supervised learning; Human-Computer Interaction
Issue Date: May-2023
Publisher: I.O.E. Pulchowk Campus
Institute Name: Institute of Engineering
Level: Bachelor
Abstract: One of the long-standing ambitions of modern science and engineering has been to create a non-human entity that exhibits human-like intelligence and behavior. One step toward this goal is communicating the way humans do. Human speech is often accompanied by a variety of gestures, which add rich non-verbal information to the message the speaker is trying to convey. Gestures clarify the speaker’s intentions and emotions and enhance speech by adding visual cues alongside the audio signal. Our project aims to synthesize co-speech gestures by learning an individual speaker’s style. We follow a data-driven approach rather than a rule-based one, because rule-based systems capture the audio-gesture relation poorly owing to issues such as asynchrony and multi-modality. In line with the current trend, we train the model on in-the-wild videos with embedded audio instead of relying on motion capture of subjects in a lab for video annotation. To establish the ground truth for the dataset of video frames, we rely on an automatic pose detection system. Although this ground-truth signal tends to be less accurate than manually annotated frames, the approach spares us the time and labor expense of manual annotation. We perform cross-modal translation from the monologue speech of a single speaker to their hand and arm motion by learning the temporal correlation between the pose sequence and the audio samples.
URI: https://elibrary.tucl.edu.np/handle/123456789/18866
Appears in Collections: Electronics and Communication Engineering

Files in This Item:
File: Prasun Bhandari et al. be report electronics may 2023.pdf
Size: 1.81 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.