Chatbot
Most of us have teased the voice assistant on our phone when bored. So how do you build a voice assistant of your own in Python?
Motivation
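I recently came across this blog post: 【深度讲解】手把手教你python制作萝莉音智能对话语音机器人,附全部源码!速速学起来!! (a step-by-step guide to building a voice chatbot with a "loli" voice in Python, with full source code).
Copyright notice: that post is an original article by CSDN blogger 「川川菜鸟」, licensed under CC 4.0 BY-SA. Reposts must include the link to the original and this notice.
Original link: https://blog.csdn.net/weixin_46211269/article/details/119847830
One fairly big problem is that playsound does not release the audio file after playing it. The playsound source appears to include a close, but in practice it has no effect (my installed playsound seems to be a different version from the one in the original post). The original post works around this with a hand-written play.py that can play audio repeatedly. It is also adapted from playsound, but I did not find it pleasant to use. Instead, I followed the approach in the question below (Python playsound playing MP3) to fix the encoding problem:
Original link: https://stackoverflow.com/questions/58659364/how-to-use-unicode-version-windows-api-mcisendstring-python#
I then added one line of code to release the file: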
def _playsoundWin(sound, block = True):
    '''
    Modified to support Unicode, following the suggestion here:
    https://stackoverflow.com/questions/58659364/how-to-use-unicode-version-windows-api-mcisendstring-python#

    Utilizes windll.winmm. Tested and known to work with MP3 and WAVE on
    Windows 7 with Python 2.7. Probably works with more file formats.
    Probably works on Windows XP thru Windows 10. Probably works with all
    versions of Python.

    Inspired by (but not copied from) Michael Gundlach <gundlach@gmail.com>'s mp3play:
    https://github.com/michaelgundlach/mp3play

    I never would have tried using windll.winmm without seeing his code.
    '''
    from ctypes import create_unicode_buffer, windll
    from random import random

    def winCommand(*command):
        # The W (wide-character) MCI functions take str, so the return and
        # error buffers must be Unicode buffers as well.
        buf = create_unicode_buffer(255)
        command = ' '.join(command)
        errorCode = int(windll.winmm.mciSendStringW(command, buf, 254, 0))
        if errorCode:
            errorBuffer = create_unicode_buffer(255)
            windll.winmm.mciGetErrorStringW(errorCode, errorBuffer, 254)
            # command is already a str here, so it must not be decoded.
            exceptionMessage = ('\n    Error ' + str(errorCode) + ' for command:'
                                '\n        ' + command +
                                '\n    ' + errorBuffer.value)
            # PlaysoundException is defined elsewhere in playsound.py.
            raise PlaysoundException(exceptionMessage)
        return buf.value

    alias = 'playsound_' + str(random())
    winCommand('open "' + sound + '" alias', alias)
    # 'wait' makes the play command return only after playback finishes.
    winCommand('play', alias, 'wait')

    if block:
        # The added line: close the alias so that Windows releases the file.
        winCommand('close', alias)
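If patching playsound is not an option, another way around the locked file is to never reuse the same file name, so a file the player still holds open cannot block the next write. The following is only a sketch of that idea: `save_reply_audio` and `remove_quietly` are hypothetical helpers that do not appear in the original code, and cleanup is best-effort.

```python
import os
import tempfile
import uuid


def save_reply_audio(audio_bytes, directory=None):
    """Write synthesized audio to a fresh file and return its path.

    Hypothetical helper: each reply gets its own file name, so a file
    that is still locked by the player never blocks the next write.
    """
    directory = directory or tempfile.gettempdir()
    path = os.path.join(directory, "reply_" + uuid.uuid4().hex + ".mp3")
    with open(path, "wb") as f:
        f.write(audio_bytes)
    return path


def remove_quietly(path):
    """Best-effort cleanup: report False if the file is missing or locked."""
    try:
        os.remove(path)
        return True
    except OSError:
        return False
```

A caller would play the path returned by `save_reply_audio` and then call `remove_quietly` on it. If the file is still locked, the call simply returns False and the file can be removed later.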
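One last shortcoming remains: the bot answers in a cute loli voice, but we still type our side of the conversation by hand, which hardly feels like chatting.
Speech to Text
How do we turn our own voice into text? I use Baidu's speech recognition here, following this article: python实现语音录入识别 https://www.cnblogs.com/angelyan/p/12005974.html
In practice, my microphone may not be very good, or my voice may not be loud enough, so recognition occasionally fails. When that happens, the program raises an error and stops. I changed the code so that the function calls itself recursively until recognition succeeds: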
def ASR():
    # Relies on record() and the AipSpeech import from record.py.
    record()
    # Your APPID, AK, and SK
    APP_ID = 'your APP_ID'
    API_KEY = 'your API_KEY'
    SECRET_KEY = 'your SECRET_KEY'
    client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)

    def get_file_content(filePath):
        with open(filePath, 'rb') as fp:
            return fp.read()

    res = client.asr(get_file_content('output.wav'), 'wav', 16000, {
        'dev_pid': 1536,
    })
    if res['err_msg'] != 'success.':
        print('Recognition failed, please record again')
        # Record and recognize again until a result comes back.
        res = ASR()
    return res
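The recursive retry works, but every failed attempt adds a stack frame, and a microphone that never produces usable audio would recurse until Python's recursion limit is reached. A loop with an attempt cap is one alternative. This is a minimal sketch: `recognize_with_retry` and `max_attempts` are hypothetical names, and the success check mirrors the comparison with 'success.' in the code above.

```python
def recognize_with_retry(recognize, max_attempts=3):
    """Call recognize() until it reports success or attempts run out.

    recognize is any zero-argument callable that returns a dict with an
    'err_msg' key and, on success, a 'result' key.
    Returns the successful response, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        res = recognize()
        if res.get('err_msg') == 'success.':
            return res
        print('Recognition failed (attempt %d of %d)' % (attempt, max_attempts))
    return None
```

The caller then decides what to do with None, for example skipping the turn instead of crashing on a missing 'result' key.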
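OK, that completes the whole chain: recording your own voice, recognizing the speech, sending the text to the chatbot, and playing the reply as speech.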
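The chain can be pictured as three stages wired together. The sketch below uses hypothetical names (`chat_turn`, `listen`, `reply`, `speak`) and takes the stages as plain callables, so the flow can be tried without a microphone or network access. It is an illustration of the structure, not the code this post actually runs.

```python
def chat_turn(listen, reply, speak):
    """Run one conversation turn and return (user_text, bot_text).

    listen() -> str     record and transcribe what the user said
    reply(text) -> str  ask the chatbot for an answer
    speak(text)         synthesize and play the answer
    """
    user_text = listen()
    bot_text = reply(user_text)
    speak(bot_text)
    return user_text, bot_text
```

For a dry run, `chat_turn(lambda: 'hello', lambda t: 'you said: ' + t, print)` prints the reply instead of speaking it.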
Full Code
You need to apply for your own Baidu AI Cloud APPID, AK, and SK!
import requests
import json
from aip import AipSpeech
from playsound import playsound
from record import ASR

APP_ID = 'your APP_ID'
API_KEY = 'your API_KEY'
SECRET_KEY = 'your SECRET_KEY'
client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
url = 'https://api.ownthink.com/bot?appid=ce71072c2802ecfef630ea19d4b2436c&userid= DItLcaO6&spoken='


def main():
    print("Let's chat")
    while True:
        # Record from the microphone and take the first recognition result.
        talk_man = ASR()['result'][0]
        print(talk_man)
        # Send the recognized text to the chatbot and read its reply.
        res_robot = requests.get(url + talk_man).text
        talk_robot = json.loads(res_robot)['data']['info']['text']
        # Convert the reply to speech.
        result = client.synthesis(talk_robot, 'zh', 1, {
            'vol': 8,
            'spd': 5,
            'pit': 9,
            'per': 0,
        })
        print(talk_robot)
        # synthesis() returns a dict on error and audio bytes on success,
        # so only write and play the file when synthesis succeeded.
        if not isinstance(result, dict):
            with open('audio.mp3', 'wb') as f:
                f.write(result)
            try:
                playsound('audio.mp3')
            except Exception:
                pass


if __name__ == '__main__':
    main()
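Two spots in the main loop are fragile: the recognized text is appended to the URL without encoding, and the reply is read with three chained dictionary lookups that raise if any key is missing. The sketch below shows one way to harden both using only the standard library. `build_bot_url` and `extract_reply` are hypothetical helpers, and the data.info.text response shape is assumed from the lookup in the code above rather than from the API's documentation. With requests, passing the same dictionary as `params=` to `requests.get` does the encoding for you.

```python
import json
from urllib.parse import urlencode

BASE_URL = 'https://api.ownthink.com/bot'


def build_bot_url(spoken, appid, userid):
    """Return the request URL with every query value percent-encoded."""
    query = urlencode({'appid': appid, 'userid': userid, 'spoken': spoken})
    return BASE_URL + '?' + query


def extract_reply(body, default=''):
    """Read data.info.text from a JSON response body.

    Returns default when the body is not valid JSON or the expected
    keys are missing.
    """
    try:
        return json.loads(body)['data']['info']['text']
    except (ValueError, KeyError, TypeError):
        return default
```

An empty reply can then be handled explicitly, for example by skipping speech synthesis for that turn.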
The record.py used here is as follows:
import wave
import pyaudio
from aip import AipSpeech


def record():
    # 16 kHz, mono, 16-bit: the same rate that is passed to client.asr() below.
    CHUNK = 1024
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 16000
    RECORD_SECONDS = 5
    WAVE_OUTPUT_FILENAME = "output.wav"

    p = pyaudio.PyAudio()
    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)
    frames = []
    print("* recording")
    # Read the microphone chunk by chunk for RECORD_SECONDS seconds.
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)
    print("* done recording")

    stream.stop_stream()
    stream.close()
    p.terminate()

    # Save the recorded frames as a WAV file.
    wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()


def ASR():
    record()
    # Your APPID, AK, and SK
    APP_ID = 'your APP_ID'
    API_KEY = 'your API_KEY'
    SECRET_KEY = 'your SECRET_KEY'
    client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)

    def get_file_content(filePath):
        with open(filePath, 'rb') as fp:
            return fp.read()

    res = client.asr(get_file_content('output.wav'), 'wav', 16000, {
        'dev_pid': 1536,
    })
    if res['err_msg'] != 'success.':
        print('Recognition failed, please record again')
        # Record and recognize again until a result comes back.
        res = ASR()
    return res


if __name__ == '__main__':
    ASR()
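When recognition keeps failing, it can help to confirm that output.wav really has the format the recognizer is told to expect (mono, 16-bit, 16000 Hz). With the settings above the loop reads int(16000 / 1024 * 5) = 78 chunks, which is 79872 frames or about 4.99 seconds. The sketch below inspects a WAV file with the standard wave module. `describe_wav` and `write_silence` are hypothetical helpers, and `write_silence` only stands in for a real recording so the check can run without a microphone.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_width_bytes, frame_rate, seconds)."""
    with wave.open(path, 'rb') as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return wf.getnchannels(), wf.getsampwidth(), rate, frames / rate


def write_silence(path, seconds=1, rate=16000):
    """Write mono 16-bit silence as a stand-in for a real recording."""
    with wave.open(path, 'wb') as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample, the same width as paInt16
        wf.setframerate(rate)
        wf.writeframes(b'\x00\x00' * rate * seconds)
```

Calling `describe_wav('output.wav')` after a recording should report 1 channel, a sample width of 2 bytes, and a frame rate of 16000 if the settings in record() were applied.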