Exclusive Freebie: Text-To-Speech Utility

First of the month (more or less) and therefore time for another Exclusive Freebie! This month ActiveDen author flashanctuary offers up an interesting tool making use of the Google Text-To-Speech API. Check it out after the jump!


Using the Unofficial Text-To-Speech Google API

Not so long ago, Google added a cool new feature to Google Translate: text-to-speech. It isn’t an official API, but anyone can make use of the service. All you need to do is request a URL that returns an MP3 file, which can then be played in Flash like any other MP3.
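As a rough illustration of the idea, here is a minimal timeline sketch that requests one short phrase and plays it. It assumes the endpoint and the `tl` and `q` parameters used in the class built later in this tutorial; the variable names `request` and `hello` are only placeholders.

```actionscript
import flash.media.Sound;
import flash.net.URLRequest;

// "tl" selects the language and "q" carries the text to speak;
// encodeURIComponent keeps spaces and symbols from breaking the query string
var request : URLRequest = new URLRequest("http://translate.google.com/translate_tts?tl=en&q=" + encodeURIComponent("Hello world"));

// passing the request to the constructor starts loading straight away
var hello : Sound = new Sound(request);
hello.play();
```

Since the service is unofficial, it may change or stop responding without notice, so treat this as an experiment rather than a guaranteed interface.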


Step 1: Creating the text-to-speech Class

The first step is to create the text-to-speech class. Open Flash and choose File > New > ActionScript File, then save the file in a folder that matches your desired package. Let’s name the class Text2Speech and the file Text2Speech.as.

package com.flashanctuary.text2speech
{
	// flash imports
	import flash.events.Event;
	import flash.media.Sound;
	import flash.media.SoundChannel;
	import flash.media.SoundTransform;
	import flash.net.URLRequest;

	/**
	 * Text2Speech Main class
	 */
	public class Text2Speech
	{
		public function Text2Speech()
		{	

		}
	}
}

Step 2: Declaring Necessary Variables

First, we need a constant holding the URL of the Google API. A second constant records a limitation of that API: it cannot speak a string longer than 100 characters. Of course, my Flash file plays phrases longer than 100 characters; I’ll explain how later on.

Other variables that we need include: the text to be played, an array holding chunks of at most 100 characters taken from the text, the position of the chunk currently being played, and the language in which to speak the text. We also need three variables to play the sound: a Sound variable, a SoundChannel variable and a SoundTransform variable. Finally, we store the volume and the current playback position.

// google api link
private static const url : String = "http://translate.google.com/translate_tts?";

// maximum number of characters supported by google api
private static const noOfMaxChars : Number = 100;

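// text to speak, its chunks and the index of the chunk being played
private var _text : String;
private var _sentences : Array;
private var _sentencePosition : int;

// objects used to load, play and control the sound
private var sound : Sound;
private var soundChannel : SoundChannel;
private var sTransform : SoundTransform;

// playback volume (0 to 1) and the position, in milliseconds, to play from
private var _volume : Number = 0.5;
private var _currentPosition : Number = 0;

// language code passed to the API
private var _language : String;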

Step 3: Let’s Initialize it!

Now we need a public function to initialize the class: it takes the language and the text we want to play.

public function init(language : String, text : String) : void
{
	// set language
	_language = language;

	// if the text is longer than the maximum number of chars supported,
	// then the text will be split
	if (text.length <= noOfMaxChars)
	{
		// drop any chunks left over from an earlier call
		_sentences = null;
		initSound(url + "tl=" + language + "&q=" + encodeURIComponent(text));
	}
	else
	{
		_sentences = new Array();

		// split text
		divideText(text);
		_sentencePosition = 0;

		// begin playing the first part of the text
		initSound(url + "tl=" + language + "&q=" + encodeURIComponent(_sentences[_sentencePosition]));
	}
}

Step 4: Split the Text

If our text is longer than the supported number of characters, then we must split it into chunks of at most 100 characters. To do this, we create a recursive function that works through the text and pushes each chunk into an array, breaking at spaces so that words stay whole.

private function divideText(str : String) : void
{
	// a chunk that already fits is stored as-is and ends the recursion
	if (str.length <= noOfMaxChars)
	{
		_sentences.push(str);
		return;
	}

	// find the last space at or before the limit so words stay whole
	var index : int = str.lastIndexOf(" ", noOfMaxChars);
	var next : int = index + 1;
	if (index <= 0)
	{
		// no usable space: cut the word at the limit
		index = noOfMaxChars;
		next = noOfMaxChars;
	}

	_sentences.push(str.substring(0, index));

	// continue with the remainder, skipping the space we split on
	if (next < str.length)
	{
		divideText(str.substring(next));
	}
}
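To see how such splitting behaves, you can try it outside the class. The sketch below is a standalone, iterative variant written for illustration only; the function name `splitText` and its `limit` parameter are hypothetical and not part of the Text2Speech class. It uses a limit of 10 so the result is easy to check by eye.

```actionscript
// Hypothetical helper for experimenting on the timeline: splits text into
// chunks of at most `limit` characters, breaking on spaces where possible.
function splitText(text : String, limit : int) : Array
{
	var chunks : Array = [];
	var rest : String = text;

	while (rest.length > limit)
	{
		// last space at or before the limit
		var cut : int = rest.lastIndexOf(" ", limit);
		var next : int = cut + 1;
		if (cut <= 0)
		{
			// no usable space: cut the word at the limit
			cut = limit;
			next = limit;
		}
		chunks.push(rest.substring(0, cut));
		rest = rest.substring(next);
	}

	if (rest.length > 0)
	{
		chunks.push(rest);
	}
	return chunks;
}

trace(splitText("the quick brown fox jumps", 10));
// output: the quick,brown fox,jumps
```

A loop avoids deep recursion on very long texts, but the chunks it produces follow the same rule: never longer than the limit, and split on a space whenever one is available.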

Step 5: Play the Sentences

From now on, the process is similar to building an MP3 player. First, we initialize the sound variables declared in Step 2; then we begin playing the first chunk in the array.

When that chunk has finished playing (that is, when the SOUND_COMPLETE event fires), we play the next one, and so on until every chunk of our initial text has been played.

public function initSound(path : String) : void
{
	// reset the sound channel, sound and sound transform if they are not null
	if (soundChannel != null)
	{
		stopSound();
		soundChannel = null;
	}

	if (sound != null)
	{
		sound = null;
	}

	if (sTransform != null)
	{
		sTransform = null;
	}

	_currentPosition = 0;

	// create the new sound and begin playing;
	// playSound() obtains the sound channel from sound.play()
	sound = new Sound(new URLRequest(path));
	sTransform = new SoundTransform(_volume);
	playSound();
}

public function playSound() : void
{
	// play() returns null when no sound channel is available
	soundChannel = sound.play(_currentPosition);
	if (soundChannel != null)
	{
		soundChannel.addEventListener(Event.SOUND_COMPLETE, soundComplete);
		soundChannel.soundTransform = sTransform;
	}
}

public function stopSound() : void
{
	_currentPosition = soundChannel.position;
	if (soundChannel.hasEventListener(Event.SOUND_COMPLETE))
	{
		soundChannel.removeEventListener(Event.SOUND_COMPLETE, soundComplete);
	}
	soundChannel.stop();
}

private function soundComplete(evt : Event) : void
{
	stopSound();

	// when the current sound is complete go to the next group of characters and play that part
	if (_sentences != null)
	{
		if (_sentencePosition < _sentences.length - 1)
		{
			_sentencePosition++;
			initSound(url + "tl=" + _language + "&q=" + encodeURIComponent(_sentences[_sentencePosition]));
		}
	}
}

Step 6: How to use This Class

You may now be wondering how to use this class in one of your own projects. It is very simple: you just need to create an instance of the class and initialize it with your desired language and text. The supported languages are listed at http://translate.google.com.

// create the new text to speech variable
var t2s : Text2Speech = new Text2Speech();

// initialize the text to speech class instance with the language desired and the text
t2s.init(language_abbreviation, text_to_play);
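For a concrete call, something like the following should work. The language code "en" is an assumption (the abbreviation the service is expected to accept for English), and the variable name `speaker` is only illustrative.

```actionscript
import com.flashanctuary.text2speech.Text2Speech;

// "en" is assumed to be the abbreviation for English
var speaker : Text2Speech = new Text2Speech();
speaker.init("en", "Hello, and welcome to this tutorial.");
```

The text here is well under 100 characters, so it is sent as a single request; longer text is split into chunks and played one after another.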

Sound Playing Issues

You may have noticed in the preview that there is sometimes a short gap when one sound ends and the next begins. This problem can be solved by loading all the sounds before playing them back, although this means a longer loading time, especially when the chosen text is very long. The best solution for the time being seems to be the above one, but should anyone have a better one, you are most welcome to leave a comment!
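The preloading idea could be sketched roughly as follows. This is not part of the Text2Speech class: the names `urls`, `sounds`, `loadedCount`, `playIndex`, `preloadAll`, `onSoundLoaded` and `playNext` are all hypothetical, and `urls` is assumed to already hold one complete request URL per chunk. Error handling (for example, a failed download) is left out for brevity.

```actionscript
import flash.events.Event;
import flash.media.Sound;
import flash.media.SoundChannel;
import flash.net.URLRequest;

// assumed to be filled with one ready-made request URL per chunk
var urls : Array = [];
var sounds : Array = [];
var loadedCount : int = 0;
var playIndex : int = 0;

// start loading every chunk up front
function preloadAll() : void
{
	for (var i : int = 0; i < urls.length; i++)
	{
		var s : Sound = new Sound();
		s.addEventListener(Event.COMPLETE, onSoundLoaded);
		s.load(new URLRequest(urls[i]));
		sounds.push(s);
	}
}

// begin playback only once every file has finished loading
function onSoundLoaded(evt : Event) : void
{
	loadedCount++;
	if (loadedCount == urls.length)
	{
		playNext();
	}
}

// play the loaded sounds one after another
function playNext(evt : Event = null) : void
{
	if (playIndex >= sounds.length)
	{
		return;
	}
	var channel : SoundChannel = Sound(sounds[playIndex]).play();
	playIndex++;
	if (channel != null)
	{
		channel.addEventListener(Event.SOUND_COMPLETE, playNext);
	}
}
```

Because each sound is fully loaded before playback begins, the next chunk can start as soon as the previous one completes, at the cost of waiting for all downloads first.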


Conclusion

I’ve spent a long time trying to find a way to use text-to-speech in Flash. It seems Google has solved the problem, and the solution is now easy and accessible to all developers.

This is my first contribution to Activetuts+. I hope you have learned something useful. Enjoy the file, and thank you for reading!
