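This example uses a C implementation of the FFT algorithm to transform a software-synthesized signal and prints the magnitude of each frequency bin after the transform. Signal synthesis program: FFT magnitude output: Performance test (STM32, 72 MHz):
2022-08-24 08:47:04 678KB Circuit design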
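The flow described in this entry (synthesize a signal, run an FFT, print the magnitude of each bin) can be sketched as follows. This is a minimal Java illustration, not the C routine from the download: the class name, the test signal (a DC offset plus two sines) and the 64-point length are all assumptions made for the example.

```java
// Sketch only: a textbook iterative radix-2 FFT, not the original C routine.
final class FftMagnitudeDemo {

    /** In-place forward FFT; the array length must be a power of two. */
    static void fft(double[] re, double[] im) {
        int n = re.length;
        if (n == 0 || (n & (n - 1)) != 0) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        // Bit-reversal permutation.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) j ^= bit;
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        // Butterfly stages with twiddle factor exp(-2*pi*i*k/len).
        for (int len = 2; len <= n; len <<= 1) {
            double ang = -2 * Math.PI / len;
            for (int start = 0; start < n; start += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(ang * k), wi = Math.sin(ang * k);
                    int a = start + k, b = a + len / 2;
                    double xr = re[b] * wr - im[b] * wi;
                    double xi = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - xr; im[b] = im[a] - xi;
                    re[a] += xr;        im[a] += xi;
                }
            }
        }
    }

    /** Magnitudes for bins 0..n/2-1, scaled so a sine of amplitude A reads as A. */
    static double[] magnitudes(double[] signal) {
        int n = signal.length;
        double[] re = signal.clone();
        double[] im = new double[n];
        fft(re, im);
        double[] mag = new double[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double m = Math.hypot(re[k], im[k]) / n;
            mag[k] = (k == 0) ? m : 2 * m; // the DC bin is not doubled
        }
        return mag;
    }

    /** Assumed test signal: offset 1.0, amplitude 2.0 at bin 5, amplitude 0.5 at bin 20. */
    static double[] synthesize(int n) {
        double[] x = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = 1.0
                 + 2.0 * Math.sin(2 * Math.PI * 5 * i / n)
                 + 0.5 * Math.sin(2 * Math.PI * 20 * i / n);
        }
        return x;
    }

    public static void main(String[] args) {
        double[] mag = magnitudes(synthesize(64));
        for (int k = 0; k < mag.length; k++) {
            System.out.printf("bin %2d: %.4f%n", k, mag[k]);
        }
    }
}
```

Because the assumed sines fall exactly on bin centers, their energy does not leak into neighboring bins; a real sampled signal would usually need a window function before the transform.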
androidtalk_语音朗读-语音识别-语音源码.rar (Android text-to-speech reading, speech recognition, speech source code)
2022-07-05 18:07:09 2.54MB android
Text-to-speech source code in Easy Programming Language (易语言), implemented by calling an external API
2022-06-18 22:05:17 205KB Source code
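Multiple voices; pitch and speaking rate are adjustable; the generated audio can be downloaded
2022-05-31 19:03:54 7.21MB speech, text, text-to-speech, source code
Qt text-to-speech source code
2022-03-12 16:54:50 43KB qt, speech recognition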
Java code written while learning to program the NAO robot:

package com.aldebaran.proxy;

import com.aldebaran.proxy.Variant;
import com.aldebaran.proxy.ALProxy;

public class ALTextToSpeechProxy extends ALProxy {

    static {
        System.loadLibrary("jnaoqi");
    }

    public ALProxy proxy;

    /// Default Constructor.
    public ALTextToSpeechProxy(String ip, int port) {
        super("ALTextToSpeech", ip, port);
    }

    /// Disables the notifications put in ALMemory during the synthesis
    /// (TextStarted, TextDone, CurrentBookMark, CurrentWord, ...)
    public void disableNotifications() {
        Variant result = call("disableNotifications");
        // no return value
    }

    /// Enables the notifications put in ALMemory during the synthesis
    /// (TextStarted, TextDone, CurrentBookMark, CurrentWord, ...)
    public void enableNotifications() {
        Variant result = call("enableNotifications");
        // no return value
    }

    /// Exits and unregisters the module.
    public void exit() {
        Variant result = call("exit");
        // no return value
    }

    /// Outputs the languages installed on the system.
    ///
    /// Array of std::string that contains the languages installed on the system.
    public String[] getAvailableLanguages() {
        Variant result = call("getAvailableLanguages");
        return (String[]) result.toStringArray();
    }

    /// Outputs the available voices. The returned list contains the voice IDs.
    ///
    /// Array of std::string containing the voices installed on the system.
    public String[] getAvailableVoices() {
        Variant result = call("getAvailableVoices");
        return (String[]) result.toStringArray();
    }

    /// Gets the name of the parent broker.
    ///
    /// The name of the parent broker.
    public String getBrokerName() {
        Variant result = call("getBrokerName");
        return result.toString();
    }

    /// Returns the language currently used by the text-to-speech engine.
    ///
    /// Language of the current voice.
    public String getLanguage() {
        Variant result = call("getLanguage");
        return result.toString();
    }

    /// Returns the encoding that should be used with the specified language.
    ///
    /// Language name (as a std::string). Must belong to the languages available in TTS.
    /// Encoding of the specified language.
    public String getLanguageEncoding(String pLanguage) {
        Variant vpLanguage;
        vpLanguage = new Variant(pLanguage);
        Variant result = call("getLanguageEncoding", vpLanguage);
        return result.toString();
    }

    /// Retrieves a method's description.
    ///
    /// The name of the method.
    /// A structure containing the method's description.
    public Variant getMethodHelp(String methodName) {
        Variant vmethodName;
        vmethodName = new Variant(methodName);
        Variant result = call("getMethodHelp", vmethodName);
        return result;
    }

    /// Retrieves the module's method list.
    ///
    /// An array of method names.
    public String[] getMethodList() {
        Variant result = call("getMethodList");
        return (String[]) result.toStringArray();
    }

    /// Retrieves the module's description.
    ///
    /// A structure describing the module.
    public Variant getModuleHelp() {
        Variant result = call("getModuleHelp");
        return result;
    }

    /// Returns the value of one of the voice parameters. The available parameters are:
    /// "pitchShift", "doubleVoice", "doubleVoiceLevel" and "doubleVoiceTimeShift"
    ///
    /// Name of the parameter.
    /// Value of the specified parameter
    public float getParameter(String pParameterName) {
        Variant vpParameterName;
        vpParameterName = new Variant(pParameterName);
        Variant result = call("getParameter", vpParameterName);
        return result.toFloat();
    }

    /// Gets the method usage string. This summarises how to use the method.
    ///
    /// The name of the method.
    /// A string that summarises the usage of the method.
    public String getUsage(String name) {
        Variant vname;
        vname = new Variant(name);
        Variant result = call("getUsage", vname);
        return result.toString();
    }

    /// Returns the voice currently used by the text-to-speech engine.
    ///
    /// Name of the current voice
    public String getVoice() {
        Variant result = call("getVoice");
        return result.toString();
    }

    /// Fetches the current volume of the text-to-speech output.
    ///
    /// Volume (integer between 0 and 100).
    public float getVolume() {
        Variant result = call("getVolume");
        return result.toFloat();
    }

    /// Returns true if the method is currently running.
    ///
    /// The ID of the method that was returned when calling the method using 'post'
    /// True if the method is currently running
    public Boolean isRunning(int id) {
        Variant vid;
        vid = new Variant(id);
        Variant result = call("isRunning", vid);
        return result.toBoolean();
    }

    /// Loads a set of voice parameters defined in an xml file contained in the preferences folder.
    /// The name of the xml file must begin with ALTextToSpeech_Voice_
    ///
    /// Name of the voice preference.
    public void loadVoicePreference(String pPreferenceName) {
        Variant vpPreferenceName;
        vpPreferenceName = new Variant(pPreferenceName);
        Variant result = call("loadVoicePreference", vpPreferenceName);
        // no return value
    }

    /// Just a ping. Always returns true
    ///
    /// returns true
    public Boolean ping() {
        Variant result = call("ping");
        return result.toBoolean();
    }

    /// Performs the text-to-speech operations: it takes a std::string as input and outputs a sound
    /// in both speakers. It logs an error if the std::string is empty. String encoding must be UTF8.
    ///
    /// Text to say, encoded in UTF-8.
    public void say(String StringToSay) {
        Variant vstringToSay;
        vstringToSay = new Variant(StringToSay);
        Variant result = call("say", vstringToSay);
        // no return value
    }

    /// Performs the text-to-speech operations: it takes a std::string as input and outputs the
    /// corresponding audio signal in the specified file.
    ///
    /// Text to say, encoded in UTF-8.
    /// RAW file where to store the generated signal. The signal is encoded with a sample rate of
    /// 22050Hz, format S16_LE, 2 channels.
    public void sayToFile(String pStringToSay, String pFileName) {
        Variant vpStringToSay;
        vpStringToSay = new Variant(pStringToSay);
        Variant vpFileName;
        vpFileName = new Variant(pFileName);
        Variant result = call("sayToFile", vpStringToSay, vpFileName);
        // no return value
    }

    /// This method performs the text-to-speech operations: it takes a std::string, outputs the
    /// synthesis resulting audio signal in a file, and then plays the audio file. The file is
    /// deleted afterwards. It is useful when you want to perform a short synthesis, when little CPU
    /// is available. Do not use it if you want a low-latency synthesis or to synthesize a long
    /// std::string.
    ///
    /// Text to say, encoded in UTF-8.
    public void sayToFileAndPlay(String pStringToSay) {
        Variant vpStringToSay;
        vpStringToSay = new Variant(pStringToSay);
        Variant result = call("sayToFileAndPlay", vpStringToSay);
        // no return value
    }

    /// Changes the language used by the Text-to-Speech engine. It automatically changes the voice
    /// used since each of them is related to a unique language. If you want that change to take
    /// effect automatically after reboot of your robot, refer to the robot web page (setting page).
    ///
    /// Language name. Must belong to the languages available in TTS (can be obtained with the
    /// getAvailableLanguages method). It should be an identifier std::string.
    public void setLanguage(String pLanguage) {
        Variant vpLanguage;
        vpLanguage = new Variant(pLanguage);
        Variant result = call("setLanguage", vpLanguage);
        // no return value
    }

    /// Changes the parameters of the voice. The available parameters are:
    ///
    /// pitchShift: applies a pitch shifting to the voice. The value indicates the ratio between
    /// the new fundamental frequencies and the old ones (examples: 2.0: an octave above, 1.5: a
    /// quint above). Correct range is (1.0 -- 4), or 0 to disable effect.
    ///
    /// doubleVoice: adds a second voice to the first one. The value indicates the ratio between
    /// the second voice fundamental frequency and the first one. Correct range is (1.0 -- 4), or 0
    /// to disable effect
    ///
    /// doubleVoiceLevel: the corresponding value is the level of the double voice (1.0: equal to
    /// the main voice one). Correct range is (0 -- 4).
    ///
    /// doubleVoiceTimeShift: the corresponding value is the delay between the double voice and the
    /// main one. Correct range is (0 -- 0.5)
    ///
    /// If the effect value is not available, the effect parameter remains unchanged.
    ///
    /// Name of the parameter.
    /// Value of the parameter.
    public void setParameter(String pEffectName, float pEffectValue) {
        Variant vpEffectName;
        vpEffectName = new Variant(pEffectName);
        Variant vpEffectValue;
        vpEffectValue = new Variant(pEffectValue);
        Variant result = call("setParameter", vpEffectName, vpEffectValue);
        // no return value
    }

    /// Changes the voice used by the text-to-speech engine. The voice identifier must belong to
    /// the installed voices, that can be listed using the 'getAvailableVoices' method. If the
    /// voice is not available, it remains unchanged. No exception is thrown in this case. For the
    /// time being, only two voices are available by default: Kenny22Enhanced (English voice) and
    /// Julie22Enhanced (French voice)
    ///
    /// The voice (as a std::string).
    public void setVoice(String pVoiceID) {
        Variant vpVoiceID;
        vpVoiceID = new Variant(pVoiceID);
        Variant result = call("setVoice", vpVoiceID);
        // no return value
    }

    /// Sets the volume of text-to-speech output.
    ///
    /// Volume (between 0.0 and 1.0).
    public void setVolume(float volume) {
        Variant vvolume;
        vvolume = new Variant(volume);
        Variant result = call("setVolume", vvolume);
        // no return value
    }

    /// returns true if the method is currently running
    ///
    /// the ID of the method to wait for
    public void stop(int id) {
        Variant vid;
        vid = new Variant(id);
        Variant result = call("stop", vid);
        // no return value
    }

    /// This method stops the current and all the pending tasks immediately.
    public void stopAll() {
        Variant result = call("stopAll");
        // no return value
    }

    /// Returns the version of the module.
    ///
    /// A string containing the version of the module.
    public String version() {
        Variant result = call("version");
        return result.toString();
    }

    /// Wait for the end of a long running method that was called using 'post'
    ///
    /// The ID of the method that was returned when calling the method using 'post'
    /// The timeout period in ms. To wait indefinitely, use a timeoutPeriod of zero.
    /// True if the timeout period terminated. False if the method returned.
    public Boolean wait(int id, int timeoutPeriod) {
        Variant vid;
        vid = new Variant(id);
        Variant vtimeoutPeriod;
        vtimeoutPeriod = new Variant(timeoutPeriod);
        Variant result = call("wait", vid, vtimeoutPeriod);
        return result.toBoolean();
    }
}
2021-12-11 20:45:52 13KB NAO robot
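Two of the numeric conventions documented in the proxy comments lend themselves to a small worked example: the pitchShift value is a frequency ratio (2.0 is an octave up, valid range 1.0 to 4), and sayToFile writes RAW audio at 22050 Hz, S16_LE, 2 channels. The helper below is a hedged sketch that runs without the jnaoqi library; the class and method names are hypothetical, and the semitone conversion assumes equal temperament, which the original comments do not mention.

```java
// Hypothetical helpers; nothing here is part of the Aldebaran SDK.
final class TtsParameterHelpers {

    /** Converts a pitch offset in equal-tempered semitones to a frequency ratio. */
    static float semitonesToPitchShift(double semitones) {
        double ratio = Math.pow(2.0, semitones / 12.0);
        // The proxy comments give (1.0 -- 4) as the valid range for pitchShift.
        if (ratio < 1.0 || ratio > 4.0) {
            throw new IllegalArgumentException("ratio outside 1.0..4.0: " + ratio);
        }
        return (float) ratio;
    }

    /** Duration in seconds of a RAW file written as 22050 Hz, 16-bit, 2 channels. */
    static double rawDurationSeconds(long fileBytes) {
        final int sampleRate = 22050;
        final int channels = 2;
        final int bytesPerSample = 2; // S16_LE
        return (double) fileBytes / (sampleRate * channels * bytesPerSample);
    }

    public static void main(String[] args) {
        // The result could be passed as the value of setParameter("pitchShift", ...).
        System.out.println("12 semitones -> " + semitonesToPitchShift(12));
        System.out.println("7 semitones  -> " + semitonesToPitchShift(7));
        System.out.println("88200 bytes  -> " + rawDurationSeconds(88200) + " s");
    }
}
```

Seven equal-tempered semitones give a ratio slightly below the 1.5 quoted in the comments, which corresponds to a just-intonation fifth.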
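Real-Time Voice Cloning
This repository is an implementation of SV2TTS with a vocoder that works in real time. If you are curious or looking for information I have not documented, feel free to take a look. Mostly, I recommend a quick look at the figures beyond the introduction. SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
Video demonstration (click the picture):
Papers implemented (URL, designation, title, implementation source):
SV2TTS: Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis; this repo
WaveRNN (vocoder): Efficient Neural Audio Synthesis
Tacotron 2 (synthesizer): Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
GE2E (encoder)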
2021-10-01 07:30:19 955KB python deep-learning tensorflow pytorch
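SEWUNet
Speech Enhancement through Deep Wave-U-Net. Check the full paper.
Introduction
In this paper we present an end-to-end approach for removing background context from speech signals, working directly on the raw waveform. The input to the network is audio sampled at 16 kHz and corrupted by additive noise at a signal-to-noise ratio uniformly distributed between 5 dB and 15 dB. The system aims to produce a signal with clear speech content. Many deep learning architectures currently exist for this task, ranging from spectral-based front ends to raw-waveform models, with encouraging results. Our method is based on the Wave-U-Net architecture with some adaptations to our problem, and proposes initializing the weights through an autoencoder before training on the main task. We show, through quantitative metrics, that our method outperforms classical Wiener filtering.
How to use
There are two ways to use this repository: 1. Train your own model with the data. 2. Only apply the technique to your data with a pre-trained model.
How to train
tl;dr: steps to train the best model in the same way as shown in the paper. Take the LibriSpeech dataset and UrbanSound8K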
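The corruption step mentioned in the introduction (additive noise at an SNR drawn uniformly between 5 dB and 15 dB) can be illustrated with a short sketch. This is one plausible reading of that sentence, not the repository's own data pipeline: the class name, the synthetic "speech" sine and the Gaussian noise are assumptions for the example, and the noise is simply scaled so that the power ratio matches the drawn SNR.

```java
import java.util.Random;

// Sketch under stated assumptions; not taken from the SEWUNet code base.
final class SnrMixDemo {

    /** Mean power (mean of squared samples). */
    static double power(double[] x) {
        double sum = 0;
        for (double v : x) sum += v * v;
        return sum / x.length;
    }

    /** Draws an SNR in dB uniformly from [5, 15). */
    static double drawSnrDb(Random rng) {
        return 5.0 + 10.0 * rng.nextDouble();
    }

    /** Returns speech + g * noise, with g chosen so that 10*log10(Ps / (g^2 * Pn)) = snrDb. */
    static double[] mixAtSnr(double[] speech, double[] noise, double snrDb) {
        if (speech.length != noise.length) {
            throw new IllegalArgumentException("signals must have equal length");
        }
        double pn = power(noise);
        if (pn == 0) throw new IllegalArgumentException("noise has zero power");
        double gain = Math.sqrt(power(speech) / (pn * Math.pow(10.0, snrDb / 10.0)));
        double[] mixed = new double[speech.length];
        for (int i = 0; i < speech.length; i++) {
            mixed[i] = speech[i] + gain * noise[i];
        }
        return mixed;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int n = 16000; // one second at the 16 kHz rate mentioned above
        double[] speech = new double[n];
        double[] noise = new double[n];
        for (int i = 0; i < n; i++) {
            speech[i] = Math.sin(2 * Math.PI * 220 * i / 16000.0); // stand-in for speech
            noise[i] = rng.nextGaussian();
        }
        double snrDb = drawSnrDb(rng);
        double[] mixed = mixAtSnr(speech, noise, snrDb);
        System.out.printf("target SNR %.2f dB, %d samples mixed%n", snrDb, mixed.length);
    }
}
```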
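Tutorial on integrating Baidu speech recognition into Unity. I wrote a dedicated article about it; take a look: https://blog.csdn.net/zhangay1998/article/details/119033698
2021-07-26 20:35:01 63.29MB speech recognition, Baidu speech, unity3d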
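Note: This design material was shared by the LCSC (立创) community; we hope it serves as a useful reference for anyone who needs it.
CH563 introduction: The CH563 is a 32-bit RISC CPU similar to ARM9. Its instruction set is compatible with ARMv5TE and supports 16-bit Thumb instructions and enhanced DSP instructions. The default system clock is 100 MHz, with a maximum of 130 MHz. Its highly integrated peripherals and high performance make it suitable for a wide range of embedded applications.
1. Abstract
Following the Mass Storage Class (MSC) protocol, the CH558, CH559 and CH563 are used to emulate full-speed and high-speed USB flash drives respectively. The external storage medium and the drive capacity can be adjusted freely, which covers uses such as data transfer or custom USB drives. The key parts are CH5XX USB device controller operation, the Bulk-Only transport protocol, SCSI command support, and reading and writing the storage medium.
2. Overview
The key functional components of the emulated USB drive are:
(1) USB Mass Storage Framework
(2) Taking a USB drive as the example, the figure below is an abstract logical block diagram of the PC and the USB drive internals. The CH558, CH559 and CH563 have a built-in USB device controller and PHY, so implementing a USB drive controller only requires configuring USB device mode and reading and writing the external storage medium.
(3) USB MSC CBI/BBB Transport. USB MSC Control/Bulk/Interrupt Transport can only be used for full-speed floppy drives and is not covered here; interested readers can look it up themselves. In Bulk-Only transport, both control and bulk traffic go through bulk endpoints: command blocks, data and status are all carried over Bulk endpoints. So, just as Control/Bulk/Interrupt is abbreviated CBI, Bulk/Bulk/Bulk is abbreviated BBB.
(4) USB MSC Protocol relation. Transport communication: data communication between Host and Device uses the UFI or SCSI protocol depending on the storage medium (Floppy or Flash). For deeper device feature configuration, see More Feature.
For further explanation, see the case analysis under "Related files".
Processors and microcontrollers / CH563Q purchase link: https://www.szlcsc.com/product/details_88564.html#
2021-06-15 17:14:59 4.55MB usb2.0 ch563 emulated USB drive ch559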
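To make the Bulk-Only (BBB) description more concrete: in that transport the host sends a 31-byte Command Block Wrapper (CBW) on the bulk OUT endpoint, data moves over bulk endpoints, and the device answers with a 13-byte Command Status Wrapper (CSW) on bulk IN. The sketch below parses a CBW and builds a CSW using the field layout from the USB Mass Storage Bulk-Only Transport specification. It is a host-side Java illustration with hypothetical class names, not the CH563 firmware from the download.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only; the CH5XX firmware itself is written in C and is not shown here.
final class BulkOnlyTransport {
    static final int CBW_SIGNATURE = 0x43425355; // bytes "USBC" on the wire
    static final int CSW_SIGNATURE = 0x53425355; // bytes "USBS" on the wire
    static final int CBW_LENGTH = 31;
    static final int CSW_LENGTH = 13;

    /** Decoded fields of a Command Block Wrapper. */
    static final class Cbw {
        final int tag;                 // echoed back in the CSW
        final long dataTransferLength; // bytes expected in the data stage
        final boolean dataIn;          // true: device-to-host
        final int lun;
        final byte[] commandBlock;     // the SCSI (or UFI) command

        Cbw(int tag, long len, boolean dataIn, int lun, byte[] commandBlock) {
            this.tag = tag;
            this.dataTransferLength = len;
            this.dataIn = dataIn;
            this.lun = lun;
            this.commandBlock = commandBlock;
        }
    }

    static Cbw parseCbw(byte[] packet) {
        if (packet.length != CBW_LENGTH) {
            throw new IllegalArgumentException("CBW must be 31 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(packet).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt() != CBW_SIGNATURE) {
            throw new IllegalArgumentException("bad CBW signature");
        }
        int tag = buf.getInt();
        long length = buf.getInt() & 0xFFFFFFFFL; // unsigned 32-bit
        int flags = buf.get() & 0xFF;             // bit 7 = direction
        int lun = buf.get() & 0x0F;
        int cbLength = buf.get() & 0x1F;
        if (cbLength < 1 || cbLength > 16) {
            throw new IllegalArgumentException("bad command block length");
        }
        byte[] commandBlock = new byte[cbLength];
        buf.get(commandBlock);
        return new Cbw(tag, length, (flags & 0x80) != 0, lun, commandBlock);
    }

    /** Builds a CSW; status 0 = passed, 1 = failed, 2 = phase error. */
    static byte[] buildCsw(int tag, long dataResidue, int status) {
        return ByteBuffer.allocate(CSW_LENGTH).order(ByteOrder.LITTLE_ENDIAN)
                .putInt(CSW_SIGNATURE)
                .putInt(tag)
                .putInt((int) dataResidue)
                .put((byte) status)
                .array();
    }

    public static void main(String[] args) {
        // Example CBW carrying a 10-byte SCSI READ(10) command (opcode 0x28).
        byte[] cbw = ByteBuffer.allocate(CBW_LENGTH).order(ByteOrder.LITTLE_ENDIAN)
                .putInt(CBW_SIGNATURE).putInt(1).putInt(512)
                .put((byte) 0x80).put((byte) 0).put((byte) 10).put((byte) 0x28)
                .array();
        Cbw parsed = parseCbw(cbw);
        System.out.printf("tag=%d len=%d in=%b opcode=0x%02X%n",
                parsed.tag, parsed.dataTransferLength, parsed.dataIn,
                parsed.commandBlock[0] & 0xFF);
        System.out.println("CSW bytes: " + buildCsw(parsed.tag, 0, 0).length);
    }
}
```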