人工智障聊天机器人
Table of Contents
Project Concept
Project Reflections
Project APIs
Project Language and Libraries
Project Directory
File Structure
Code Listing
main: 2.Speech_Recognition.py
1.Sound_Recording.py
3.tuling.py
4.Speech_Synthesis.py
Problem Summary
1. playsound does not release the audio file after playback, causing:
PermissionError: [Errno 13] Permission denied: "1.mp3"
Workaround: https://blog.csdn.net/liang4000/article/details/96766845
2. Read each API's documentation carefully
3. How to exit the infinite loop: check whether the recognized speech contains the keyword "退出" (exit)
Project Video Demo
-
Project Concept
Record a clip of audio and recognize it as text, send the text to the Turing robot to get a reply, then synthesize the reply into an audio file and play it.
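The record → recognize → reply → speak loop above can be sketched end to end. The four stage functions below are hypothetical stand-ins for the project's real modules (Sound_Recording, the iFlytek WebAPI, tuling, Speech_Synthesis); here they just pass strings through so the control flow is visible without any API keys:

```python
# Minimal sketch of one turn of the chatbot loop.
# All four stage functions are placeholders, not the real API calls.

def record_audio(path, seconds):       # stand-in for Sound_Recording.audio_record
    return path                        # pretend we recorded a WAV file

def recognize(path):                   # stand-in for the iFlytek recognition call
    return "你好"                      # pretend transcript

def chat_reply(text):                  # stand-in for the Turing robot API
    return "reply to: " + text

def synthesize(text, out="1.mp3"):     # stand-in for Baidu TTS
    return out                         # pretend we wrote an MP3 file

def one_turn():
    wav = record_audio("yinping.wav", 5)
    text = recognize(wav)
    if "退出" in text:                 # the exit keyword ends the outer loop
        return None
    reply = chat_reply(text)
    return synthesize(reply)           # the caller would play this file

print(one_turn())  # → 1.mp3
```

The real main loop in 2.Speech_Recognition.py follows exactly this shape, with each placeholder replaced by an API-backed function.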
-
Project Reflections
iFlytek's recognition accuracy is decent, but the project has little error tolerance, and since it mostly just calls APIs, I did not dig into how speech recognition is actually implemented. Still, it was rewarding: I learned how to solve the many problems that came up while building it. As long as you dare to imagine something and dare to build it, you will get something out of it.
-
Project APIs
iFlytek speech recognition, Baidu speech synthesis, and the Turing robot.
-
Project Language and Libraries
Python, with playsound, pyaudio, wave, and os, plus the Baidu, iFlytek, and Turing robot APIs.
-
Project Directory
-
File Structure
The speech-recognition script is the main entry point; it calls the other modules in order: 1. audio recording, 2. speech recognition, 3. Turing robot, 4. speech synthesis.
Files: Speech_Recognition.py (main) ----->> (1.Sound_Recording.py  2.Speech_Recognition.py  3.tuling.py  4.Speech_Synthesis.py)
-
Code Listing
main: 2.Speech_Recognition.py
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time   : 2019/12/27 16:10
# @Author : Cxk
# @File   : Speech_Recognition.py
#
# author: iflytek
# Tested on Windows + Python 3.7 with these third-party packages (installable via pip):
#   cffi==1.12.3, gevent==1.4.0, greenlet==0.4.15, pycparser==2.19,
#   six==1.12.0, websocket==0.2.1, websocket-client==0.56.0
#
# iFlytek streaming speech-to-text (iat) WebAPI demo.
# API docs (required reading): https://doc.xfyun.cn/rest_api/語音聽寫(流式版).html
# Reference thread: http://bbs.xfyun.cn/forum.php?mod=viewthread&tid=38947&extra=
# Hot words and dialects can be configured in the console at https://www.xfyun.cn/
# (hot words only raise the recognition weight of those entries; the effect is not guaranteed).
# Error codes: https://www.xfyun.cn/document/error-code (check when a non-zero code is returned)

import websocket
import datetime
import hashlib
import base64
import hmac
import json
from urllib.parse import urlencode
import time
import ssl
from wsgiref.handlers import format_date_time
from datetime import datetime
from time import mktime
import _thread as thread
from Speech_Synthesis import *
from tuling import *
from Sound_Recording import *
from playsound import playsound
import os

STATUS_FIRST_FRAME = 0     # marks the first audio frame
STATUS_CONTINUE_FRAME = 1  # marks an intermediate frame
STATUS_LAST_FRAME = 2      # marks the last frame


class Ws_Param(object):
    # Initialization
    def __init__(self, APPID, APIKey, APISecret, AudioFile):
        self.APPID = APPID
        self.APIKey = APIKey
        self.APISecret = APISecret
        self.AudioFile = AudioFile
        # common parameters
        self.CommonArgs = {"app_id": self.APPID}
        # business parameters; see the official docs for more options
        self.BusinessArgs = {"domain": "iat", "language": "zh_cn",
                             "accent": "mandarin", "vinfo": 1, "vad_eos": 10000}

    # Build the signed websocket URL
    def create_url(self):
        url = 'wss://ws-api.xfyun.cn/v2/iat'
        # RFC 1123 timestamp
        now = datetime.now()
        date = format_date_time(mktime(now.timetuple()))
        # String to sign
        signature_origin = "host: " + "ws-api.xfyun.cn" + "\n"
        signature_origin += "date: " + date + "\n"
        signature_origin += "GET " + "/v2/iat " + "HTTP/1.1"
        # Sign with HMAC-SHA256
        signature_sha = hmac.new(self.APISecret.encode('utf-8'),
                                 signature_origin.encode('utf-8'),
                                 digestmod=hashlib.sha256).digest()
        signature_sha = base64.b64encode(signature_sha).decode(encoding='utf-8')
        authorization_origin = "api_key=\"%s\", algorithm=\"%s\", headers=\"%s\", signature=\"%s\"" % (
            self.APIKey, "hmac-sha256", "host date request-line", signature_sha)
        authorization = base64.b64encode(authorization_origin.encode('utf-8')).decode(encoding='utf-8')
        # Combine the auth parameters into a dict and append them to the URL
        v = {
            "authorization": authorization,
            "date": date,
            "host": "ws-api.xfyun.cn"
        }
        url = url + '?' + urlencode(v)
        # print('websocket url :', url)  # uncomment to compare URLs when debugging
        return url


# Handle an incoming websocket message
def on_message(ws, message):
    global result
    try:
        code = json.loads(message)["code"]
        sid = json.loads(message)["sid"]
        if code != 0:
            errMsg = json.loads(message)["message"]
            print("sid:%s call error:%s code is:%s" % (sid, errMsg, code))
        else:
            data = json.loads(message)["data"]["result"]["ws"]
            for i in data:
                for w in i["cw"]:
                    result += w["w"]
    except Exception as e:
        print("receive msg, but parse exception:", e)
        return 'recognition error!'


# Handle a websocket error
def on_error(ws, error):
    print("### error:", error)


# Handle the websocket closing
def on_close(ws):
    print("### closed ###")


# On connection open: stream the audio file in frames
def on_open(ws):
    def run(*args):
        frameSize = 8000             # bytes per audio frame
        intervel = 0.04              # interval between frames (seconds)
        status = STATUS_FIRST_FRAME  # whether this is the first, middle, or last frame
        with open(wsParam.AudioFile, "rb") as fp:
            while True:
                buf = fp.read(frameSize)
                if not buf:          # end of file
                    status = STATUS_LAST_FRAME
                if status == STATUS_FIRST_FRAME:
                    # The first frame must carry the common and business
                    # parameters (app_id is only sent with this frame)
                    d = {"common": wsParam.CommonArgs,
                         "business": wsParam.BusinessArgs,
                         "data": {"status": 0, "format": "audio/L16;rate=16000",
                                  "audio": str(base64.b64encode(buf), 'utf-8'),
                                  "encoding": "raw"}}
                    ws.send(json.dumps(d))
                    status = STATUS_CONTINUE_FRAME
                elif status == STATUS_CONTINUE_FRAME:
                    d = {"data": {"status": 1, "format": "audio/L16;rate=16000",
                                  "audio": str(base64.b64encode(buf), 'utf-8'),
                                  "encoding": "raw"}}
                    ws.send(json.dumps(d))
                elif status == STATUS_LAST_FRAME:
                    d = {"data": {"status": 2, "format": "audio/L16;rate=16000",
                                  "audio": str(base64.b64encode(buf), 'utf-8'),
                                  "encoding": "raw"}}
                    ws.send(json.dumps(d))
                    time.sleep(1)
                    break
                # simulate the audio sampling interval
                time.sleep(intervel)
        ws.close()
    thread.start_new_thread(run, ())


def play(file):
    playsound("%s" % file)


if __name__ == "__main__":
    while True:
        # Record audio: output file name, duration in seconds
        audio_record("yinping.wav", 5)
        # iFlytek recognition; the global `result` accumulates the transcript
        global result
        result = ''
        time1 = datetime.now()
        wsParam = Ws_Param(APPID='your iFlytek APPID', APIKey='your iFlytek APIKey',
                           APISecret='your iFlytek APISecret', AudioFile=r'yinping.wav')
        websocket.enableTrace(False)
        wsUrl = wsParam.create_url()
        ws = websocket.WebSocketApp(wsUrl, on_message=on_message,
                                    on_error=on_error, on_close=on_close)
        ws.on_open = on_open
        ws.run_forever(sslopt={"cert_reqs": ssl.CERT_NONE})
        time2 = datetime.now()
        print("Recognition result: " + result)
        if "退出" in result:
            # say the keyword "退出" (exit) to break the loop
            print("Program exited!!")
            play("2.mp3")
            break
        else:
            # Get the Turing robot's reply, synthesize it with Baidu TTS
            # into 1.mp3, then play it
            strss = tuling(result)
            getBaiduVoice(strss)
            play("1.mp3")
            print("-------------------")
            continue
```
1.Sound_Recording.py
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time   : 2019/12/27 18:18
# @Author : Cxk
# @File   : Sound_Recording.py

import pyaudio
import os
import wave

# Record audio with PyAudio
# out_file: output audio file name
# rec_time: recording duration in seconds
def audio_record(out_file, rec_time):
    CHUNK = 1024
    FORMAT = pyaudio.paInt16  # 16-bit samples
    CHANNELS = 1              # mono
    RATE = 16000              # 16 kHz sample rate
    p = pyaudio.PyAudio()
    # open the input stream
    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)
    print("Recording started...")
    frames = []  # recorded audio chunks
    for i in range(0, int(RATE / CHUNK * rec_time)):
        data = stream.read(CHUNK)
        frames.append(data)
    # recording finished
    stream.stop_stream()
    stream.close()
    p.terminate()
    print("Recording finished...")
    # save as a WAV file
    wf = wave.open(out_file, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()

# audio_record("yinping.wav", 5)
```
3.tuling.py
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time   : 2019/12/27 17:50
# @Author : Cxk
# @File   : tuling.py

import requests
import json

def tuling(info):
    appkey = "your Turing robot key"
    url = "http://www.tuling123.com/openapi/api?key=%s&info=%s" % (appkey, info)
    req = requests.get(url)
    content = req.text
    data = json.loads(content)
    answer = data['text']
    print("Turing robot reply: " + answer)
    return answer
```
4.Speech_Synthesis.py
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time   : 2019/12/27 19:38
# @Author : Cxk
# @File   : Speech_Synthesis.py

from aip import AipSpeech
# import random

def getBaiduVoice(text):
    """Your Baidu APPID / API key / secret key"""
    APP_ID = 'your Baidu APP_ID'
    API_KEY = 'your Baidu API_KEY'
    SECRET_KEY = 'your Baidu SECRET_KEY'
    client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
    result = client.synthesis(text=text, options={'vol': 5, 'per': 4})
    # on success, synthesis returns the raw audio bytes; on failure, a dict
    if not isinstance(result, dict):
        # i = random.randint(1, 10)
        with open('1.mp3', 'wb') as f:
            f.write(result)
        # return i
    else:
        print(result)
```
-
Problem Summary
1. When playing audio with playsound, the library does not release the file after use, producing the following error:
PermissionError: [Errno 13] Permission denied: "1.mp3"
A workaround is described here: https://blog.csdn.net/liang4000/article/details/96766845
2. Read each API's documentation carefully.
3. How to exit the infinite loop: add an if check for whether the recognized speech contains the keyword "退出" (exit):
```python
if "退出" in result:
    # say the keyword "退出" (exit) to break the loop
    print("Program exited!!")
    play("2.mp3")
    break
```
-
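On problem 1 above (playsound keeping "1.mp3" locked), besides the patch described in the linked post, one common workaround (an assumption on my part, not necessarily the post's exact fix) is to never reuse the locked file name: write each synthesized reply to a fresh, uniquely named MP3 so the unreleased handle on the previous file cannot block the next write:

```python
# Sketch: avoid the PermissionError by giving every synthesized reply a
# unique file name instead of overwriting the locked "1.mp3".
import uuid

def fresh_mp3_name(prefix="reply"):
    # e.g. "reply_3f9c...mp3"; uuid4 makes collisions practically impossible
    return "%s_%s.mp3" % (prefix, uuid.uuid4().hex)

a = fresh_mp3_name()
b = fresh_mp3_name()
print(a != b and a.endswith(".mp3"))  # → True
```

getBaiduVoice would then write to fresh_mp3_name() and pass that name to play(); old files can be cleaned up on the next run, once playsound has released them.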
Project Video Demo
Python "Artificial Stupidity" Chatbot (video)