FreeSWITCH Tinkering Notes 4: Building Your Own TTS Server

2025-02-09 Technical Tutorial

For Chinese, FreeSWITCH's native TTS support generally relies on ekho, but the synthesized output is painful to listen to, so I decided to build my own TTS server.

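The first step was finding a TTS engine I was reasonably happy with. iFLYTEK (科大讯飞) sounds excellent but is expensive, and the other commercial engines are not very fluent in Chinese. By chance I found that the speech engine bundled with Windows 7 is passable, and the one in Windows 10 is better still. (If you are curious, you can try it first: the speech settings in the Windows Control Panel include a preview, although the preview handles mixed Chinese and English badly.)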

So the question is: how do I use this engine from my FS?

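The approach: when the FreeSWITCH on Debian (de-FS from here on) needs TTS, it bridges the call to a second FreeSWITCH running on Windows (win-FS from here on) and attaches the text to synthesize in an X- header of the SIP message. When win-FS receives the text, a Lua script calls an external program to convert it into an audio file, and win-FS plays that file. The caller who dialed into de-FS then hears the text as speech.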
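Because the text travels inside a single SIP header, it has to fit on one line: a stray line break in the value would corrupt the header. The following is a minimal sketch of a cleanup step, not part of my original scripts. The function name to_header_value is hypothetical, and it assumes that replacing line breaks and tabs with spaces is acceptable for text that will only be spoken.

```lua
-- Hypothetical helper: flatten text so it can be carried in one SIP header line.
function to_header_value(text)
    -- Replace runs of CR, LF and TAB with a single space.
    local s = text:gsub("[\r\n\t]+", " ")
    -- Trim leading and trailing spaces.
    s = s:gsub("^ +", ""):gsub(" +$", "")
    return s
end

print(to_header_value("line one\r\nline two\n"))
--> line one line two
```

In a de-FS script this would be applied to the text just before session:setVariable("sip_h_X-Text", ...).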

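After a week of tinkering I finally got this TTS service working. I am sharing it here, partly as a note to myself so I can find it next time I need it.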

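Enough talk, here are the scripts. (Note: de-FS was set up with FusionPBX, and the database access uses FusionPBX's own helpers. If you want to dig deeper, install FusionPBX and read its scripts.)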

1. de-FS side: database lookup, outbound call, and transfer script:

-- Convert text between character sets (used here for utf-8 -> gbk).
-- lua-iconv must be installed first: apt-get install lua-iconv
function createIconv(from, to, text)
    local iconv = require("iconv")
    local cd = iconv.new(to .. "//TRANSLIT", from)
    local ostr, err = cd:iconv(text)
    if err == iconv.ERROR_INCOMPLETE then
        return "ERROR: Incomplete input."
    elseif err == iconv.ERROR_INVALID then
        return "ERROR: Invalid input."
    elseif err == iconv.ERROR_NO_MEMORY then
        return "ERROR: Failed to allocate memory."
    elseif err == iconv.ERROR_UNKNOWN then
        return "ERROR: There was an unknown error."
    end
    return ostr
end

-- Insert a space between the characters of a long digit string
-- so that the voice reads the digits one by one.
function inser_str_for(in_num)
    local out_str
    if in_num == nil then
        out_str = 'error in_num'
    else
        -- print("converting: " .. in_num .. "\n")
        local in_num_len = string.len(in_num)
        local tmp_str = string.sub(in_num, 1, 1)
        for start_len = 2, in_num_len, 1 do
            local sub_str = string.sub(in_num, start_len, start_len)
            tmp_str = tmp_str .. " " .. sub_str
        end
        out_str = tmp_str
    end
    return out_str
end

-- require "resources.functions.config";
local Database = require "resources.functions.database";
dbh = Database.new('switch');  -- connect to the FreeSWITCH database
api = freeswitch.API();

session:answer();

-- Prompt the caller for a 12-digit tracking number.
local orderid
if (session:ready() == true) then
    -- orderid = session:playAndGetDigits(8, 8, 3, 5000, '#', 'firstsound:#', 'error_trysound', '\\d+|#');
    orderid = session:playAndGetDigits(12, 12, 3, 8000, '#', 'misc/enter_carryid.wav', 'misc/erro_carryid.wav', '\\d+|#')
end

local error_code
if (orderid == nil) then
    error_code = 0
else
    error_code = 1
end

if (error_code == 1) then
    session:execute("playback", "misc/carry_querying.wav")

    -- Look up the status of the shipment.
    local carry_staut
    local params = {carry_id_in = orderid}
    local sql = "SELECT * from carry_staut t where t.carry_id = :carry_id_in;"
    dbh:query(sql, params, function(row)
        carry_staut = row.info;  -- map the returned column
    end);

    if carry_staut == nil then
        session:streamFile("misc/erro_carryid.wav")
    else
        -- out_orderid = inser_str_for(orderid)  -- needed with the Microsoft voice
        out_orderid = orderid                     -- not needed with the "VMHui" voice
        tts_text = "您好，单号：" .. out_orderid .. "，7月11日，您从广州寄往深圳的快件当前状态：" .. carry_staut .. "，签收人李先生，感谢使用顺丰！"
        tts_text = createIconv("utf-8", "gbk", tts_text)  -- convert the text to gbk

        -- Pass the text to win-FS in a custom SIP header, then bridge the call to it.
        session:setVariable("sip_h_X-Text", tts_text)
        tts_session = freeswitch.Session("sofia/gateway/80fd35d7-d767-4fb3-907f-72daaa7896ea/1098", session)
        if (tts_session:ready() == true) then
            freeswitch.bridge(session, tts_session)
            tts_session:hangup()
        end
    end
else
    session:streamFile("misc/erro_carryid.wav")
end

session:sleep(500)

-- Ask what to do next and transfer accordingly.
local digits = session:playAndGetDigits(1, 1, 3, 8000, '', 'misc/continue_carryid.wav', 'ivr/ivr-that_was_an_invalid_entry', '[\\d{1}|*]')
if (digits == "1") then
    session:execute("transfer", "6200");
elseif (digits == "2") then
    session:execute("transfer", "6002");
elseif (digits == "*") then
    session:execute("transfer", "6000");
else
    session:execute("transfer", "6002");
end
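The digit-spacing step matters because some voices read a long tracking number as one huge number unless the digits are separated. As an alternative to the character loop, the same idea can be sketched with string.gsub. The name space_digits is hypothetical and not part of my script; unlike the loop, this version only adds spaces after digits.

```lua
-- Hypothetical alternative: put a space after every digit, then drop the last one.
function space_digits(digits)
    if digits == nil then
        return nil
    end
    -- gsub returns the new string and a match count; keep only the string.
    local spaced = digits:gsub("(%d)", "%1 ")
    -- Remove the trailing space left after the final digit.
    return (spaced:gsub(" $", ""))
end

print(space_digits("755123456789"))
--> 7 5 5 1 2 3 4 5 6 7 8 9
```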

2. win-FS receiving script:

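session:answer();
session:ready();

uuid = session:getVariable("uuid");
sounds_dir = session:getVariable("sounds_dir");

-- Text to synthesize, sent by de-FS in the X-Text SIP header.
text1 = session:getVariable("sip_h_X-Text");
if text1 == nil then
    -- No header present: fall back to the caller ID name.
    text1 = session:getVariable("caller_id_name");
end

-- Synthesize the text into <sounds_dir>/<uuid>.wav with the external program.
os.execute("G:\\fusionpbx\\TTS\\tts_app\\tts_app.exe" .. " -to " .. "\"" .. text1 .. "\"" .. " " .. "\"" .. sounds_dir .. "/" .. uuid .. ".wav" .. "\"");

-- Play the file to the caller, then clean up.
session:streamFile(sounds_dir .. "/" .. uuid .. ".wav");
session:sleep(1000);
os.remove(sounds_dir .. "/" .. uuid .. ".wav");
session:hangup();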
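One weak spot in the receiving script is that the header text is pasted straight into a command line, so a double quote inside the text would break the argument quoting. A minimal sketch of a more defensive way to build the command follows. The helper names quote_arg and build_tts_command are hypothetical, and the sketch assumes that simply dropping double quotes is fine for text that will only be read aloud.

```lua
-- Hypothetical helper: wrap one argument in double quotes for the command line.
function quote_arg(arg)
    -- Assumption: embedded double quotes can be dropped from spoken text.
    local cleaned = arg:gsub('"', '')
    return '"' .. cleaned .. '"'
end

-- Build the "-to <text> <wav file>" command line for tts_app.
function build_tts_command(exe, text, wav_path)
    return exe .. " -to " .. quote_arg(text) .. " " .. quote_arg(wav_path)
end

print(build_tts_command("tts_app.exe", 'say "hi"', "C:/sounds/abc.wav"))
--> tts_app.exe -to "say hi" "C:/sounds/abc.wav"
```

The result would be passed to os.execute in place of the hand-concatenated string.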

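3. Source of the program that calls the Windows TTS engine (a C# console program that calls the Windows speech API directly; it can list the installed voices, speak text directly, and save speech to a file):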

using System;
using System.Configuration;
using System.Speech.AudioFormat;
using System.Speech.Synthesis;

namespace tts_app
{
    class Program
    {
        static void Main(string[] args)
        {
            //Encoding utf8 = Encoding.GetEncoding("utf8");
            //Encoding gb2312 = Encoding.GetEncoding("gb2312");
            String tts_text;
            String outputfile;

            // Voice, speed and volume come from appSettings in the config file.
            String Voice_out = ConfigurationManager.AppSettings["Voice"];
            String Speed = ConfigurationManager.AppSettings["Speed"];
            String Volume = ConfigurationManager.AppSettings["Volume"];
            int Speed_out, Volume_out;

            // Speed must be an integer from -10 to 10; otherwise fall back to 0.
            if (!int.TryParse(Speed, out Speed_out) || Speed_out < -10 || Speed_out > 10)
            {
                Console.WriteLine("Invalid speed setting, using the default speed 0");
                Speed_out = 0;
            }

            // Volume must be an integer from 0 to 100; otherwise fall back to 100.
            if (!int.TryParse(Volume, out Volume_out) || Volume_out < 0 || Volume_out > 100)
            {
                Console.WriteLine("Invalid volume setting, using the default volume 100");
                Volume_out = 100;
            }

            if (args.Length == 1 && args[0] == "-l")
            {
                // Exactly one argument, "-l": list the installed voices.
                using (SpeechSynthesizer speech = new SpeechSynthesizer())
                {
                    Console.WriteLine("Voices installed on this machine:");
                    foreach (InstalledVoice voice in speech.GetInstalledVoices())
                    {
                        VoiceInfo info = voice.VoiceInfo;
                        string AudioFormats = "";
                        foreach (SpeechAudioFormatInfo fmt in info.SupportedAudioFormats)
                        {
                            AudioFormats += String.Format("{0}\n", fmt.EncodingFormat.ToString());
                        }
                        Console.WriteLine("Voice name: " + info.Name);
                        Console.WriteLine("Culture: " + info.Culture);
                        Console.WriteLine("Age: " + info.Age);
                        Console.WriteLine("Gender: " + info.Gender);
                        Console.WriteLine("Description: " + info.Description);
                        Console.WriteLine("ID: " + info.Id);
                        Console.WriteLine("Enabled: " + voice.Enabled);
                        if (info.SupportedAudioFormats.Count != 0)
                        {
                            Console.WriteLine("Audio formats: " + AudioFormats);
                        }
                        else
                        {
                            Console.WriteLine("No supported audio formats reported");
                        }
                        string AdditionalInfo = "";
                        foreach (string key in info.AdditionalInfo.Keys)
                        {
                            AdditionalInfo += String.Format("{0}:{1}\n", key, info.AdditionalInfo[key]);
                        }
                        Console.WriteLine("Additional info: " + AdditionalInfo);
                    }
                }
                Console.WriteLine("Finished listing voices.");
            }
            else if (args.Length == 2 && args[0] == "-t")
            {
                // "-t <text>": speak the text on the default audio device.
                tts_text = args[1];
                // The commented-out code below converts UTF-8 text passed from Linux
                // into the local code page (it needs "using System.Text;").
                //byte[] buffer1 = Encoding.Default.GetBytes(tts_text);
                //byte[] buffer2 = Encoding.Convert(Encoding.UTF8, Encoding.Default, buffer1, 0, buffer1.Length);
                //tts_text = Encoding.Default.GetString(buffer2, 0, buffer2.Length);
                using (SpeechSynthesizer speech = new SpeechSynthesizer())
                {
                    speech.SelectVoice(Voice_out); // voice
                    speech.Rate = Speed_out;       // speed
                    speech.Volume = Volume_out;    // volume
                    Console.WriteLine("Synthesizing:");
                    Console.WriteLine(tts_text);
                    speech.Speak(tts_text);
                    Console.WriteLine("Synthesis finished");
                }
            }
            else if (args.Length == 3 && args[0] == "-to")
            {
                // "-to <text> <file>": write the speech to a WAV file.
                tts_text = args[1];
                outputfile = args[2];
                // The commented-out code below converts UTF-8 text passed from Linux
                // into the local code page (it needs "using System.Text;").
                //byte[] buffer1 = Encoding.Default.GetBytes(tts_text);
                //byte[] buffer2 = Encoding.Convert(Encoding.UTF8, Encoding.Default, buffer1, 0, buffer1.Length);
                //tts_text = Encoding.Default.GetString(buffer2, 0, buffer2.Length);
                using (SpeechSynthesizer speech = new SpeechSynthesizer())
                {
                    speech.SelectVoice(Voice_out); // voice
                    speech.Rate = Speed_out;       // speed
                    speech.Volume = Volume_out;    // volume
                    Console.Write("Synthesizing: ");
                    Console.WriteLine(tts_text);
                    Console.Write("Output file: ");
                    Console.WriteLine(outputfile);
                    speech.SetOutputToWaveFile(outputfile);
                    speech.Speak(tts_text);
                    Console.WriteLine("Synthesis finished");
                    speech.SetOutputToNull(); // release the WAV file
                }
            }
            else
            {
                Console.WriteLine("Invalid arguments. Use \"-t <text>\", \"-to <text> <output file>\", or \"-l\" to list the installed voices");
            }
        }
    }
}

4. TTS program configuration file (used to adjust the speed and volume and to select the voice):

<?xml version="1.0" encoding="utf-8"?>
<configuration>
    <startup>
        <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5.2"/>
    </startup>
    <appSettings>
        <add key="Voice" value="Microsoft Lili"></add>
        <add key="Speed" value="0"></add>
        <add key="Volume" value="100"></add>
    </appSettings>
</configuration>

The attachment contains two samples of the synthesized speech and my compiled tts_app program (the .NET runtime must be installed).

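How to use the program:

1. List the installed voices: tts_app.exe -l (if nothing plays, check which voices are installed and put the voice name into the Voice setting in the configuration file)

2. Speak directly:

tts_app.exe -t "text to synthesize"

3. Save to a WAV file:

tts_app.exe -to "text to synthesize" "D:\output path\filename.wav"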

Attachment: http://down.51cto.com/data/2366909