Cepstral Community

by **mikebayus** » Fri Feb 01, 2013 2:59 pm

I was just reading the Q and A page on the Cepstral web site regarding SSML tags, and it just hit me that you already have a way to implement vocal gestures for Cepstral voices. By vocal gestures, I mean those sounds that we all make that are not words.
Now that there is a trend toward making synthetic voices more emotional, and since I have nothing better to do just this here now minute, I thought I would compile a list of vocal gestures that I might use from time to time with my Amy voice. I do realize that her voice model is not there to record the wave files that are needed. Anyway, as I said, I have nothing better to do.
Here's a list of vocal gestures that I would wish for Amy:

|| '''Gesture ID''' || '''Gesture content''' ||
|| g0001_001 || cough ||
|| g0001_002 || cough twice ||
|| g0001_003 || cough hard ||
|| g0001_004 || clear throat ||
|| g0001_005 || breathe in ||
|| g0001_006 || sharp intake of breath ||
|| g0001_007 || breathe in through teeth ||
|| g0001_008 || sigh happy ||
|| g0001_009 || sigh sad ||
|| g0001_010 || hmm question ||
|| g0001_011 || hmm yes ||
|| g0001_012 || hmm thinking ||
|| g0001_013 || umm ||
|| g0001_014 || ummm ||
|| g0001_015 || err ||
|| g0001_016 || errr ||
|| g0001_017 || giggle ||
|| g0001_018 || giggle sarcastic ||
|| g0001_019 || laugh small ||
|| g0001_020 || laugh big ||
|| g0001_021 || laugh demonic ||
|| g0001_022 || laugh hysterical ||
|| g0001_023 || cry small ||
|| g0001_024 || cry big ||
|| g0001_025 || ah positive ||
|| g0001_026 || ah negative ||
|| g0001_027 || yeah question ||
|| g0001_028 || yeah positive ||
|| g0001_029 || yeah resigned ||
|| g0001_030 || sniff ||
|| g0001_031 || sniff twice ||
|| g0001_032 || argh ||
|| g0001_033 || arrgh ||
|| g0001_034 || ugh ||
|| g0001_035 || ocht ||
|| g0001_036 || yay ||
|| g0001_037 || oh positive ||
|| g0001_038 || oh negative ||
|| g0001_039 || sarcastic noise ||
|| g0001_040 || yawn ||
|| g0001_041 || yawn big ||
|| g0001_042 || snore ||
|| g0001_043 || snore phew ||
|| g0001_044 || zzz ||
|| g0001_045 || raspberry ||
|| g0001_046 || raspberry twice ||
|| g0001_047 || brrr cold ||
|| g0001_048 || snort ||
|| g0001_049 || ha ha (sarcastic) ||
|| g0001_050 || doh ||
|| g0001_051 || gasp ||

I think this is how to implement them:

<audio src='clear throat.wav' />"Excuse me, <audio src='cough twice.wav' />Hello"
Anyway, I think that I brought this up with Cepstral some years ago, when I first found out about Loquendo and how they did it, but no one took me quite seriously. I think that the voice models used for this task would have to, well, not be afraid to "let their hair down." Now that I think about it, I don't know if Amy's voice model would have been up to the task.
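
For anyone who wants to try the audio-tag idea from this post, here is a minimal Python sketch that builds the SSML string from gesture IDs instead of typing the tags by hand. The `GESTURE_FILES` mapping, the `build_ssml` helper, and the file names are assumptions made up for illustration; the recordings themselves do not exist yet, and whether a given engine resolves relative paths or file names containing spaces is not confirmed here.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical mapping from gesture ID (see the table above) to a wave file.
# The file names are placeholders; no such recordings ship with any voice.
GESTURE_FILES = {
    "g0001_002": "cough twice.wav",
    "g0001_004": "clear throat.wav",
}


def gesture(gesture_id):
    """Return an SSML <audio> element for a known gesture ID."""
    # quoteattr() adds the surrounding quotes and escapes special characters.
    return "<audio src={} />".format(quoteattr(GESTURE_FILES[gesture_id]))


def build_ssml(parts):
    """Join spoken text and gesture IDs into one SSML string.

    Items that are keys of GESTURE_FILES become <audio> tags;
    everything else is treated as text to be spoken.
    """
    out = []
    for part in parts:
        if part in GESTURE_FILES:
            out.append(gesture(part))
        else:
            out.append(escape(part))
    return "<speak>" + " ".join(out) + "</speak>"


ssml = build_ssml(["g0001_004", "Excuse me,", "g0001_002", "Hello"])
print(ssml)
```

The resulting string would then be handed to whatever SSML-aware synthesizer is in use; that step is left out because it depends on the engine and its API.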

by **AdamW** » Thu Feb 07, 2013 9:26 am

Hi Mike,

Yes, you could do this now if you had the sound files. If this were ever to be done, we would probably just include the same sound effect files with all our voices.

Thanks for the feature request. We will let you know if we decide to develop it.

-- AdamW
Cepstral Support

by **mikebayus** » Thu Feb 07, 2013 1:09 pm

Well, that's almost right. I think that each voice model, when he or she records, would have to record each vocal gesture. The remark that I made at the end of my original post on this thread was in no way meant to be a slight, or anything like that. I know that I would feel silly doing some of them. Cry big, for instance: "Waaaaa, haaaaah haaaaah haaaaah!"
Even though I would be in a room all by myself, I would be inhibited. I don't know if Amy's Voice model, not being an actress, would be able to do it convincingly enough.

it just hit me
