VOICE MORPHING BY LINEAR PREDICTIVE CODING COEFFICIENTS MAPPING USING ARTIFICIAL NEURAL NETWORK

Basnet, Binod

Please use this identifier to cite or link to this item: https://elibrary.tucl.edu.np/handle/123456789/7605

Title:	VOICE MORPHING BY LINEAR PREDICTIVE CODING COEFFICIENTS MAPPING USING ARTIFICIAL NEURAL NETWORK
Authors:	Basnet, Binod
Keywords:	Voice Morphing;Neural Network
Issue Date:	Nov-2015
Publisher:	Pulchowk Campus
Institute Name:	Institute of Engineering
Level:	Masters
Citation:	MASTER OF SCIENCE IN COMPUTER SYSTEM AND KNOWLEDGE ENGINEERING
Abstract:	Voice morphing (VM) modifies a source speaker's voice so that it is perceived as if a target speaker had uttered it. This work implements voice morphing by two methods: (a) LPC mapping with PSOLA, and (b) LPC mapping using a neural network. In the first method, PSOLA (Pitch Synchronous Overlap and Add) is applied to the source speech, guided by the target speech, to obtain an intermediate speech signal. A residual (excitation) signal that approximates the target speaker's excitation is then extracted and combined with the mapped spectral parameters of the target speaker (LPC coefficients, formants) to produce the morphed speech. In the second method, an artificial neural network (ANN) is trained to find a transformation that maps the source speaker's LPC (vocal tract) parameters and excitation to the target speaker's vocal tract parameters, which are then used to resynthesize the target speech. The ANN approximates the highly nonlinear mapping between the vocal tract and pitch characteristics of the source speaker and those of the target speaker. Morphing results are evaluated across 12-24-12, 14-28-14 and 16-32-16 neural network architectures, and the best NN architecture is compared with the PSOLA-based VM system. Both subjective and objective measures are used: the ANN architectures and the PSOLA-based system are compared on the quality, SNR and spectral properties of the converted speech.
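
As a rough illustration of the LPC-mapping idea described in the abstract, the following Python sketch extracts LPC coefficients from one speech frame and passes them through a 12-24-12 feed-forward network. It is not the thesis implementation. The autocorrelation method with the Levinson-Durbin recursion, the tanh hidden layer, the linear output layer, the random (untrained) weights, the frame length, and the helper names lpc, init_mlp and mlp_forward are all assumptions made for illustration. The abstract states only the layer sizes, and it is assumed here that the 12 inputs correspond to an order-12 LPC analysis.

```python
import numpy as np


def lpc(frame, order):
    """Autocorrelation-method LPC using the Levinson-Durbin recursion.

    Returns a[0..order] with a[0] = 1, so that the prediction-error
    filter is A(z) = 1 + sum_k a[k] z^-k.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:  # degenerate frame (e.g. silence): stop early
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def init_mlp(sizes, seed=0):
    """Random weights for a fully connected network, e.g. sizes=[12, 24, 12]."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp_forward(params, x):
    """Forward pass: tanh on hidden layers, linear output (assumed choices)."""
    h = np.asarray(x, dtype=float)
    for i, (w, b) in enumerate(params):
        h = h @ w + b
        if i < len(params) - 1:
            h = np.tanh(h)
    return h


if __name__ == "__main__":
    # Synthetic stand-in for a windowed source-speaker frame (hypothetical data)
    rng = np.random.default_rng(1)
    t = np.arange(240)
    frame = np.hamming(240) * (np.sin(2 * np.pi * 0.05 * t)
                               + 0.1 * rng.standard_normal(240))
    source_lpc = lpc(frame, 12)[1:]   # drop the leading 1
    net = init_mlp([12, 24, 12])      # untrained; training is not shown
    mapped = mlp_forward(net, source_lpc)
    print(mapped.shape)               # prints (12,)
```

In an actual system the network would be trained on time-aligned source and target frames before the mapped coefficients are used for resynthesis. The 14-28-14 and 16-32-16 variants differ from this sketch only in the sizes list and the LPC order.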
Description:	Voice morphing (VM) modifies a source speaker's voice so that it is perceived as if a target speaker had uttered it.
URI:	https://elibrary.tucl.edu.np/handle/123456789/7605
Appears in Collections:	Electronics and Computer Engineering

Files in This Item:

File	Description	Size	Format
069MSCS668.zip		2.05 MB	Unknown