Cepstral, LLC
Known Issues under Sparc-Solaris




SSML Implementation:
  • "interpret-as" attributes of "<say-as>" elements are not properly handled.
  • If the input text contains back-to-back <break> SSML elements, Swift handles only the last one in the sequence. For instance, the following pauses for only three seconds instead of five:
    "This is a test and now there should be a 5 second pause <break time="1s" /> <break time="1s" /> <break time="3s" /> Now i am back after a 5 second pause."
  • RIFF audio files with a sampling rate that differs from the native sampling rate of the voice are not correctly resampled when specified using the <audio> SSML element.
  • Applications using Swift TTS exit silently when a sentence that begins with an SSML <break> or <mark> tag is fed to a non-US English voice for synthesis.
  • Swift does not currently support the xml:lang attribute of SSML tags that make use of it, such as <speak>, <voice>, etc.
  • Swift does not reject bad SSML markup. It will complain about blatant syntactic errors, such as missing close tags, but not about semantic errors, such as invalid tags and attributes.
  • The SSML <s> and <p> tags may not have the proper effect. The following should be read as two sentences, but Swift reads it as one:
    "<speak>one sentence <s>and another</s></speak>"
  • Swift does not currently support the SSML <lexicon> tag for loading additional lexicons.
  • The SSML tag:attribute pair <voice gender="neutral"> should select a neuter-gendered voice, but it currently does nothing.
  • The "age" attribute of the SSML <voice> tag requires an exact match, when it should probably select the closest match. For instance, <voice age="10"> will only select a ten-year-old voice, or fall back to the default voice if one is not found.
  • Multiple voice selections cannot be provided in the SSML <voice> tag. The tag <voice name="David William"> should request David, and use William as a fallback. Instead, Swift interprets it as a request for a voice named "David William".
  • The default SSML voice may not be the default voice for that gender. To illustrate, Swift may switch voices while executing the following SSML code, when it should not. (Assume the default voice is David.)
    <speak>This is the default voice, which should also be the <voice gender="male">default voice</voice></speak>
  • Changing voice in SSML resets prosody and pitch. In the following examples, the <voice> tag should not reset the values of the <prosody> tag:
    "<prosody range='-95%'><voice name='David'>The cat jumped over the moon.</voice></prosody>"
    -and-
    "<prosody pitch='+200Hz'><voice name='David'>The cat jumped over the moon.</voice></prosody>"
  • The duration attribute of the SSML <prosody> tag is not supported. The following examples should play for exactly 1 second:
    "<prosody duration='1s'>hello world</prosody>"
    -and-
    "<prosody duration='1000ms'>hello world</prosody>"
  • Prosody as specified in the SSML prosody element is not applied to output generated using the SSML phoneme element. For example:
    "<prosody rate='-55%'>Test. <phoneme ph='t eh1 s t'> Test.</phoneme></prosody>"
  • Pitch contours passed in via the SSML <prosody> tag with values outside the range 0% to 100% are supposed to be ignored. Instead, Swift attempts to synthesize with these bogus values.
  • SAMPA stress markers in the SSML <phoneme> tag are not supported.
  • The SSML <mark> tag causes a crash in Canadian French, Americas Spanish, and Italian voices.
  • Text position synthesis event information is incorrect when the <sub> SSML element is encountered.
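
As a possible workaround for the back-to-back <break> issue above, input text could be preprocessed so that each run of consecutive breaks collapses into a single break with the summed duration. The following Python sketch is not part of Swift. The function name merge_breaks is hypothetical, it only recognizes time values given in s or ms, and it assumes that a single <break> carrying the combined time is handled correctly.

```python
import re

# One <break> element whose time attribute is given in seconds or milliseconds.
_BREAK = r'<break\s+time=["\'](?:\d+(?:\.\d+)?)(?:ms|s)["\']\s*/>'
# Two or more such elements separated only by optional whitespace.
_RUN = re.compile(_BREAK + r'(?:\s*' + _BREAK + r')+')
# Extracts the numeric value and unit from each time attribute.
_TIME = re.compile(r'time=["\'](\d+(?:\.\d+)?)(ms|s)["\']')


def merge_breaks(ssml):
    """Collapse each run of consecutive <break> tags into one summed break."""
    def _merge(match):
        total_ms = 0.0
        for value, unit in _TIME.findall(match.group(0)):
            total_ms += float(value) * (1000.0 if unit == "s" else 1.0)
        return '<break time="%dms"/>' % round(total_ms)

    # Isolated breaks do not match _RUN and are left untouched.
    return _RUN.sub(_merge, ssml)
```

Applied to the example above, the three breaks of 1s, 1s, and 3s would be rewritten as one break of 5000ms before the text is handed to the engine.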



Swift TTS Engine:
  • The audio/sampling-rate parameter is ignored when synthesizing to the audio device. It is honored when outputting to an audio file.
API issues:
  • swift_port_stop() and swift_port_pause() take a parameter naming an event at which to stop, such as SWIFT_EVENT_SENTENCE to stop before the next sentence. However, Swift stops or pauses more or less immediately regardless of which event is passed.
  • The global "swift_version" variable exported through the Swift API shows the version as "4.1.0-beta". This is not a beta build; the variable should show "4.1.0-release".



swift command line tool:
  • Setting the 'audio/sampling-rate' parameter does not resample if the output is played to the audio device. If, however, you use it in conjunction with the '-o output.wav' option to direct the output to a wave file, the audio is indeed resampled.
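
Until this is fixed, one way to get audio at a specific rate is to render to a wave file and play that file with a separate tool. The Python sketch below only builds the command line. The helper name swift_wav_command is hypothetical, and the -p flag for passing parameters is an assumption that should be checked against the usage message of your swift version; only the -o option and the audio/sampling-rate parameter come from the note above.

```python
def swift_wav_command(text, wav_path, rate_hz):
    """Build an argv list that renders text to a wave file at rate_hz."""
    return [
        "swift",
        "-p", "audio/sampling-rate=%d" % rate_hz,  # -p is an assumed flag
        "-o", wav_path,                            # write to a file, not the audio device
        text,
    ]
```

The returned list can be passed to subprocess.run() on a system where swift is installed.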



Installation:
  • The installer generates a swift.xml file that may not be readable by the Swift TTS Engine on non-English systems. The problem lies in the use of the `date` command when writing the "Generated by install.sh on..." comment. If you run into an issue with this, you can fix it by editing install.sh, line 164. Change this line:

    <!-- Generated by install.sh on `date` -->

    To:

    <!-- Generated by install.sh on `date +"%F at %k:%M:%S"` -->
  • Installation fails on Solaris systems that do not have a /usr/local/bin directory, because the installer cannot create its symlinks there.
  • Installation fails on some Solaris systems that do not have a /dev/null. The installation is actually complete at this point, but the test run to render an audio file to /dev/null has failed.
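
Both Solaris failures above come down to a path that the installer expects but that does not exist. A small preflight check can report such paths before install.sh is run. The Python sketch below is not part of the installer; the function name missing_paths is hypothetical, and it assumes these two paths are the only ones involved.

```python
import os


def missing_paths(paths=("/usr/local/bin", "/dev/null")):
    """Return the entries of paths that do not exist on this system."""
    return [p for p in paths if not os.path.exists(p)]


# Example: print a warning for each path the installer would trip over.
for path in missing_paths():
    print("missing before install: %s" % path)
```

An empty result means neither of the two conditions described above applies.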



I have an issue which isn't covered here

For all other technical support inquiries, please use our Support Request Form. Please provide as much technical information as possible. Thank you!