finding help writing software...

Postby RadSurfer » Sat Apr 14, 2007 1:28 pm

Is anyone in this forum writing their own software that supports
the Cepstral engine(s), particularly on Linux?

What little experimentation I've been able to do results in sluggish
interpretation of text strings and is not really very practical for real-world
applications.
Now, if I could see that the software I can compile myself is actually able
to speak text at a reasonable pace (a minimum of delay between discrete strings),
then I would start working immediately on a plugin for the Linux
version of Xchat, which would prove very useful indeed.

Therefore, I am hoping to learn the most efficient way to get
from a C string to hearing the rendered speech...

_cached_ speech is likely the best approach, but I get the impression that
swift doesn't really cache that much, nor is the user permitted any control
over the cache level...

The DecTalk ISA card, for example, caches too deeply, and getting it to
respond by ordinary methods once its cache is full is very troublesome.
Using the interrupt-driven DecTalk interface only works if you are fortunate
enough to have that technical information, which I have not been
able to obtain yet. DecTalk ISA cards are considered deprecated; I do
not see the harm in tracking down that information and using it
purely for my own personal use. I am not interested in any marketing.

Anyone here familiar with Speakupmodified.org? Wonderful people there too.

//RadSurfer//
RadSurfer
 
Posts: 8
Joined: Wed Apr 11, 2007 2:30 pm
Location: Pennsylvania, USA

Postby manojb » Sun Jun 17, 2007 2:20 am

Hi,
Included below are my responses, based on my limited knowledge of the swift API:

1.) Cepstral does offer a C API for speech synthesis. It is very intuitive to use. However, for most purposes the command-line utility swift should be sufficient. I have been using that primarily.

2.) To turn on caching, pass '-p voice/cache-policy=mmap' to swift. I have noticed over a 50% speed-up in synthesis of the same text when mmap is passed to swift.

If you are using the C API, the following call should do the job for you.

swift_port_set_param(port, "voice/cache-policy", swift_val_string("mmap"), SWIFT_ASYNC_NONE);

3.) I am not familiar with speakupmodified.org.
If you are familiar with high-quality female TTS voices (open source or commercial), please point me to them.
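
The speed-up claimed for mmap caching in point 2 is easy to check on your own machine. Below is a minimal sketch of a timing helper; the name elapsed_ms is hypothetical, and it assumes GNU date (for the %N nanosecond format), which is what CentOS and most Linux systems ship.

```shell
#!/bin/sh
# elapsed_ms (hypothetical helper): run the given command, discard its
# output, and print the wall-clock time it took in milliseconds.
# Assumes GNU date, whose %N format gives nanoseconds.
elapsed_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}
```

For example, compare "elapsed_ms swift -f mp_.txt" against "elapsed_ms swift -p voice/cache-policy=mmap -f mp_.txt". The figure includes audio playback time, so it is only meaningful for the same text, and it is worth running each variant more than once since file-system caching can skew the first run.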

Thanks,

Manoj.
manojb
 
Posts: 19
Joined: Fri Mar 09, 2007 3:35 pm

thank you

Postby RadSurfer » Sun Jun 17, 2007 10:44 am

Thank you for your informative reply.

Can you please offer any comments on how to shorten the incredibly ANNOYING delay following periods?
I do not understand why Cepstral sets an extraordinarily LONG delay after periods (and commas, for that matter), but I am looking for ways to REDUCE it.

This is what I use now to launch swift:
swift -n David -p "voice/cache-policy=mmap,speech/rate=210,audio/sampling-rate=16000" -t -f mp_.txt

under CentOS 5 linux.

The text is broken at . and ... and , on long text lines.
I cannot tell whether the Cepstral engine actually needs all that time to generate phonetics between sentences, or whether this DELAY is due to shelling out to the shared object library (.so) to perform the actual speech on the fly.
The current method definitely introduces a DELAY which is annoying, and it breaks my concentration compared to using
other speech engines.

As mentioned, I prefer the Dectalk ISA card, which I can run under MS-DOS, and Win98Se, and even Linux 2.6 kernel, and now I even have the Fonix Dectalk RT 4.64 speech package.
Their 'say' application, under Linux, is a pure joke. A royal waste of the $50.00 Fonix charges for it.

Thank you again for your reply. //Rich//
RadSurfer
 
Posts: 8
Joined: Wed Apr 11, 2007 2:30 pm
Location: Pennsylvania, USA

Postby manojb » Mon Jun 18, 2007 11:26 pm

1.) Here is a comment from multiport (a moderator) regarding the duration model.

"Cepstral's duration models as calculated for symbols such as periods, commas, etc., are fixed and not able to be managed by the user. "

Given the above limitation, the only solution I can think of is to replace periods/commas with <break time='1ms'/> before giving the text to swift.

You can do this with the following few lines of Perl.

#!/usr/bin/perl -w
use strict;

# Replace each run of periods or commas with a minimal SSML break.
while (my $line = <>) {
    $line =~ s@\.+|,+@<break time='1ms'/>@g;
    print $line;
}

Then simply run: cat mp.txt | perl foo.pl | swift -f - -p "voice/cache-policy=mmap,speech/rate=210,audio/sampling-rate=16000"

If this doesn't work, let me know.

2.) By Dectalk ISA do you mean the TTS software offered by fonixspeech? I have never tried it. I couldn't find a demo/evaluation version on their website.
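
One caveat with the substitution in point 1: it also rewrites periods and commas inside tokens such as "3.14" or "speakupmodified.org". Here is a rough sketch of a stricter variant using sed, which only touches punctuation that is followed by whitespace or the end of a line. The function name soften_pauses is hypothetical, and it assumes a sed that accepts -E for extended regular expressions (older GNU sed releases may need -r instead). Whether swift then shortens the pause is still subject to the same uncertainty as the Perl version.

```shell
#!/bin/sh
# soften_pauses (hypothetical helper): read text on stdin and write it to
# stdout, replacing each run of periods or commas with a 1 ms break tag,
# but only when the run is followed by whitespace or the end of the line.
# Punctuation inside tokens (e.g. 3.14, example.org) is left alone.
soften_pauses() {
    sed -E "s@(\.+|,+)([[:space:]]|\$)@<break time='1ms'/>\2@g"
}
```

It drops into the same pipeline in place of the Perl script: soften_pauses < mp.txt | swift -f - -p "voice/cache-policy=mmap,speech/rate=210,audio/sampling-rate=16000"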

Thanks,

Manoj.
manojb
 
Posts: 19
Joined: Fri Mar 09, 2007 3:35 pm

