Control over punctuation delays.

PostPosted: Wed Apr 11, 2007 6:01 pm
by RadSurfer
I have used many software-based voices over the years, and some are definitely better than others where understandability and features are concerned. Some, like AT&T, have nice features but slur the speech to the point that it's really unusable...

The Cepstral David voice that I ordered pronounces words much better than AT&T seems to, yet it has a very annoying pause between taking in text and delivering actual speech audio. On top of this, I find the delays for ; and , and . to be somewhat exaggerated, and not adjusted to the user-specified words-per-minute.

I wonder if the delay for ; , . is even adjusted at all for a given wpm?

I am not sure whether ; and , have different "weights". I'd use the same delay value for both, with . (period) a little longer than , (comma). Also, the
: (colon) should NOT cause delays, in my opinion.

I did discover that if you output to a WAVE FILE, the delay is not nearly as exaggerated! I am actually much more pleased with the performance of the wave output than with that pathetic Text-Aloud-MP3, which has become more garbage than usable for me in Windows.
Text-Aloud-MP3 crashes for no apparent reason on text files, and it is a nightmare to make any progress with it.

So far, saving out to wave using Cepstral under Linux has been BEAUTIFUL and flawless.

However, I still urge the addition of a user-defined delay in milliseconds for the pause taken up by , ; and . as it greatly enhances the user's ability to absorb what is spoken more uniformly. Delays actually break my concentration! If I need that extra time, I will set it to the proper value.

I also have DecTalk ISA and DoubleTalk ISA cards; of these I prefer listening to DecTalk. DecTalk is 100% absolutely consistent in the way it pronounces words! When I time DecTalk's reading of a file, I know I can make accurate predictions of how long a file will take to read through completely. And DecTalk does not suffer from exaggerated punctuation delays.

Just my 2 cents worth. Cepstral is the real winner for me, because it works
very well in Linux (CentOS and Fedora Core) and probably Win32.

PostPosted: Wed Apr 18, 2007 1:35 pm
by Spectral
RadSurfer wrote: Text-Aloud-MP3 crashes for no apparent reason on text files, and it is a nightmare to make any progress with it.

I quite agree that crashing apps are a nightmare.

Yes, Text-Aloud-MP3 may crash but typically this only happens in the following situations:

(i) the program hasn’t been legally acquired
(ii) the program has been registered with a fake key
(iii) antivirus/firewall software is not properly configured to work with the app
(iv) the program is an outdated version, e.g., v 1.xx. (Current version is v 2.240)

Issues (i-ii) are fixed by making a legal purchase.
Issues (iii-iv) can be fixed in a few hours, if not minutes, by contacting the developer support team directly. They are truly responsive.

Happy conversion!


Text-Aloud-MP3 not fully tested

PostPosted: Fri Apr 20, 2007 11:11 am
by RadSurfer
It is quite obvious you have not fully tested Text-Aloud-MP3.
Anyone who has tried using it over the long term quickly learns that the program
gags on the least expected paragraph, single sentence, or character.
Sometimes reformatting the text will actually get you past the point where the original text crashes,
for example by removing all CR/LFs and letting Text-Aloud perform its internal
wrapping (or whatever it does).
The cause is either Text-Aloud's own internal operation, or some combination
of its internal operation and the selected speech engine's performance.

Take a poll: ask whether others have run into a similar problem!

There are also reported problems with a memory leak in Text-Aloud, and other specific issues. These are well known.

I do not see myself purchasing another version, and I would really have to
be convinced in the future to want to use it.

I used it with AT&T voices, which I discovered tend to slur words too much anyway, and which also pronounce words differently based on Case And Other Issues. And there's no file describing how to cope with these reading characteristics. Just not worth my time anymore.

PostPosted: Fri Apr 20, 2007 4:15 pm
by Spectral
PostPosted: Fri May 25, 2007 8:43 pm
Can someone please comment on the original issue mentioned in this thread (Control over punctuation delays)?

Is there some way to control punctuation delays other than manually inserting SSML tags? IMHO, the auditory experience would be better if I could make the pauses at commas and periods slightly shorter without making the rest of the speech faster.



PostPosted: Sat Jun 09, 2007 8:43 am
by MultiPort
Manoj, RadSurfer,

This is a good point. Cepstral's duration models, as calculated for symbols such as periods, commas, etc., are fixed and cannot be adjusted by the user.

The user can influence non-grammatical pauses by inserting breaks via SSML. (See Example #1)
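
For anyone who would rather not type the tags by hand, here is a minimal Python sketch of automating the insertion. It swaps each , ; or . that ends a word for a standard SSML <break time="..."/> element. The function name and the millisecond values are purely illustrative, not anything shipped by Cepstral, and it is an untested assumption that removing the punctuation mark also removes the engine's built-in pause; dropping the period may also change the sentence intonation.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical pause lengths in milliseconds; tune them to taste.
DEFAULT_PAUSES_MS = {",": 150, ";": 150, ".": 300}

def punctuation_to_breaks(text, pauses_ms=None):
    """Swap word-final , ; . for SSML <break> elements of a chosen length."""
    pauses_ms = DEFAULT_PAUSES_MS if pauses_ms is None else pauses_ms
    marks = "".join(re.escape(m) for m in pauses_ms)
    # Only match a mark followed by whitespace or end of text,
    # so numbers such as 3.14 are left alone.
    pattern = re.compile("([" + marks + r"])(?=\s|$)")
    # With one capturing group, split() alternates text, mark, text, mark, ...
    pieces = pattern.split(text)
    out = []
    for i, piece in enumerate(pieces):
        if i % 2 == 1:
            out.append('<break time="%dms"/>' % pauses_ms[piece])
        else:
            # Escape &, < and > so the result stays well-formed XML.
            out.append(escape(piece))
    return "<speak>" + "".join(out) + "</speak>"

if __name__ == "__main__":
    print(punctuation_to_breaks("Wait, then go. AT&T is fine; done."))
```

The text is split before escaping so that the semicolon in an escaped &amp; is never mistaken for punctuation.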

Cepstral has considered adding a user XML vehicle to allow definitions such as the ones listed above, or even treatment of "$", a preferred date format, and more. This is non-trivial, but perhaps it will be added in the next major release.

Punctuation modulation

PostPosted: Sun Jun 15, 2008 10:09 pm
by TaoPhoenix
I have experimented a little with "stacking" punctuation. For texts such as SciFi stories, I found that certain complex phrases took the mind some time to process, and if the voice continued too quickly, it felt somewhat like a jolt. There, I simply added extra punctuation to force longer delays.

Occasionally a word sounds different if certain punctuation follows it. If you want the nuance of the word to be different, you can completely swap out the existing punctuation.

When a punctuation pause is too long, the sentence "hangs". Same theme: replace the period with something faster, like a dash.

You'd save the modified text file with a special label so a person reading it wouldn't be confused by the alterations.
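
A rough Python sketch of doing the stacking and swapping in bulk, in case it helps. The function names, the rule format, and the "tts-edit" label are all made up for illustration, and which marks actually give a longer or shorter pause depends on the voice, so treat the example rules as placeholders to experiment with.

```python
import os
import re

def retime_punctuation(text, rules):
    """Rewrite each word-final punctuation mark listed in `rules`.

    `rules` maps a single mark to its replacement, e.g. {",": ",,"} to
    stack commas or {".": " -"} to swap a period for a dash.
    """
    marks = "".join(re.escape(m) for m in rules)
    # Match a mark only when whitespace or end of text follows it.
    pattern = re.compile("([" + marks + r"])(?=\s|$)")
    return pattern.sub(lambda m: rules[m.group(1)], text)

def labeled_name(path, label="tts-edit"):
    """Return e.g. story.tts-edit.txt so the altered copy is easy to spot."""
    root, ext = os.path.splitext(path)
    return root + "." + label + ext

if __name__ == "__main__":
    sample = "Slow down, please. Done."
    print(retime_punctuation(sample, {",": ",,", ".": " -"}))
    print(labeled_name("story.txt"))
```

Writing the result to the labeled file name keeps the original text untouched for anyone who wants to read it normally.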

(Same thing with words... if your Mind's Ear has a dialect of its own, you can fiddle with the word into what looks like a nonsense spelling.)