Please use this identifier to cite or link to this item:
https://elibrary.tucl.edu.np/handle/123456789/7605
Title: | VOICE MORPHING BY LINEAR PREDICTIVE CODING COEFFICIENTS MAPPING USING ARTIFICIAL NEURAL NETWORK |
Authors: | Basnet, Binod |
Keywords: | Voice Morphing;Neural Network |
Issue Date: | Nov-2015 |
Publisher: | Pulchowk Campus |
Institute Name: | Institute of Engineering |
Level: | Masters |
Citation: | MASTER OF SCIENCE IN COMPUTER SYSTEM AND KNOWLEDGE ENGINEERING |
Abstract: | Voice Morphing (VM) modifies the voice of a source speaker so that it is perceived as if a target speaker had uttered it. Voice morphing has been carried out by two methods: a) voice morphing based on LPC mapping and PSOLA, and b) voice morphing based on LPC mapping using a neural network (NN). In the first method, PSOLA (Pitch Synchronous Overlap and Add) is applied to the source speech, guided by the target speech, to obtain an intermediate speech signal. The residual (excitation) signal is then extracted from it to approximate the target speaker's excitation, and is finally combined with the mapped spectral/LPC parameters (LPC coefficients, formants) of the target speaker to produce the morphed speech. The second method, voice morphing using LPC mapping by an Artificial Neural Network (ANN), trains a transformation network that maps the source speaker's LPC (vocal tract) parameters and excitation to the target speaker's vocal tract parameters. These parameters are then used to resynthesize the target speech. The ANN approximates the mapping function that captures the highly nonlinear relationship between the vocal tract and pitch of the source speaker and those of the target speaker. The voice morphing results are evaluated across the 12-24-12, 14-28-14 and 16-32-16 neural network architectures, and the best NN architecture is compared with the PSOLA-based VM architecture. Both subjective and objective measures are used for evaluation. The transformation results of the ANN architectures and the PSOLA-based voice morphing system are compared on the quality, SNR and spectral properties of the converted target speech. |
Description: | Voice Morphing (VM) modifies the voice of a source speaker so that it is perceived as if a target speaker had uttered it. |
URI: | https://elibrary.tucl.edu.np/handle/123456789/7605 |
Appears in Collections: | Electronics and Computer Engineering |
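
The LPC-mapping pipeline summarized in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes the autocorrelation method with the Levinson-Durbin recursion for LPC analysis, a tanh hidden layer with a linear output for the 12-24-12 network, untrained random weights, and the common energy-ratio definition of SNR. All function names and the synthetic test frame are hypothetical.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (unnormalized sums)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for LPC coefficients a[0..order] with a[0] == 1.

    The predictor is x[n] ~ -sum(a[k] * x[n-k] for k in 1..order).
    Returns (a, err), where err is the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def make_mlp(sizes, rng):
    """Random (untrained) weights for layer sizes such as [12, 24, 12]."""
    return [([[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
              for _ in range(n_out)], [0.0] * n_out)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def mlp_forward(layers, x):
    """Forward pass: tanh on hidden layers, linear output (an assumption)."""
    for idx, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if idx < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


def snr_db(reference, estimate):
    """SNR in dB: reference energy over error energy (assumed definition)."""
    signal = sum(s * s for s in reference)
    noise = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    return 10.0 * math.log10(signal / noise)


# Synthetic "source" frame: two sinusoids plus a little noise (illustrative only).
rng = random.Random(0)
frame = [math.sin(2 * math.pi * 0.05 * n) + 0.5 * math.sin(2 * math.pi * 0.12 * n)
         + 0.01 * rng.gauss(0.0, 1.0) for n in range(240)]

# Order-12 LPC analysis, then map the 12 coefficients through a 12-24-12 network.
source_lpc, _ = levinson_durbin(autocorrelation(frame, 12), 12)
network = make_mlp([12, 24, 12], rng)
mapped_lpc = mlp_forward(network, source_lpc[1:])  # drop the leading a[0] == 1
```

In a trained system the network weights would be fitted on time-aligned source and target LPC vectors; here they are random, so `mapped_lpc` only demonstrates the data flow and vector sizes, not a meaningful conversion.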
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
069MSCS668.zip | | 2.05 MB | Unknown | View/Open |