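Background
Dimensions of the two feature sets
IS09 and eGeMAPS are both extracted over the whole utterance rather than frame by frame, so each audio file yields a single fixed-length vector. IS09 -> 384 dimensions; eGeMAPS -> 88 dimensions.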
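As a sanity check on these sizes, the IS09 figure can be reproduced from the layout of the INTERSPEECH 2009 Emotion Challenge set: 16 low-level descriptors, each with its delta, summarized by 12 functionals. The sketch below does that arithmetic and adds a length check for an extracted vector. The names `EXPECTED_DIMS` and `check_dim` are illustrative and not part of openSMILE.

```python
# Expected utterance-level feature sizes, as stated in the text above.
EXPECTED_DIMS = {"IS09": 384, "eGeMAPS": 88}

# IS09: 16 low-level descriptors, doubled by their deltas, times 12 functionals.
assert 16 * 2 * 12 == EXPECTED_DIMS["IS09"]


def check_dim(features, feature_set):
    """Raise ValueError if the vector length does not match the feature set."""
    expected = EXPECTED_DIMS[feature_set]
    if len(features) != expected:
        raise ValueError(
            f"{feature_set}: expected {expected} values, got {len(features)}"
        )
    return True
```

Calling `check_dim(vector, "eGeMAPS")` right after extraction is a cheap way to notice that the wrong config file was passed.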
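Downloading and using openSMILE
Download: https://github.com/audeering/opensmile/releases/tag/v3.0.0 After downloading, unpack the archive somewhere easy to find; you will need that location later. From here on, the two folders that matter are bin and config.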
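Since the later steps depend on where the archive was unpacked, it can help to derive both locations from one root folder and fail early if something is missing. This is a minimal sketch, assuming the executable sits in `bin` under the name `SMILExtract.exe`, as in the paths used later in this post; `locate_opensmile` is a hypothetical helper.

```python
from pathlib import Path


def locate_opensmile(root):
    """Return (executable, config_dir) under an unpacked openSMILE folder."""
    root = Path(root)
    exe = root / "bin" / "SMILExtract.exe"
    config_dir = root / "config"
    if not exe.is_file():
        raise FileNotFoundError(f"SMILExtract not found: {exe}")
    if not config_dir.is_dir():
        raise FileNotFoundError(f"config folder not found: {config_dir}")
    return exe, config_dir
```

For the layout used in this post, the call would be `locate_opensmile(r"F:\AI\SER\opensmile")`.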
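Code implementation
Extracting IS09 and extracting eGeMAPS are identical except for the path of the .conf configuration file. The only difference is: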
pathConfig = r"F:\AI\SER\opensmile\config\is09-13\IS09_emotion.conf"  # IS09
pathConfig = r"F:\AI\SER\opensmile\config\egemaps\v01a\eGeMAPSv01a.conf"  # eGeMAPS
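Because only the config path changes, the choice can be reduced to a lookup. The sketch below uses the two relative paths quoted above; `CONFIGS` and `config_path` are hypothetical names, and `PureWindowsPath` is used only so that the joined path has the same backslash form as the paths in this post on any platform.

```python
from pathlib import PureWindowsPath

# Config locations relative to the openSMILE folder, taken from the two lines above.
CONFIGS = {
    "IS09": r"config\is09-13\IS09_emotion.conf",
    "eGeMAPS": r"config\egemaps\v01a\eGeMAPSv01a.conf",
}


def config_path(opensmile_root, feature_set):
    """Return the .conf path for the requested feature set as a string."""
    return str(PureWindowsPath(opensmile_root) / CONFIGS[feature_set])


print(config_path(r"F:\AI\SER\opensmile", "eGeMAPS"))
```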
To keep the walkthrough short, the code below extracts eGeMAPS as the example and uses only two wav files.
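import csv
import os
import subprocess


def excuteCMD(ExcuteFile, Config, Audio, Output):
    # Build the SMILExtract call as an argument list so that paths
    # containing spaces are passed through intact.
    cmd = [ExcuteFile, "-C", Config, "-I", Audio, "-csvoutput", Output]
    print(" ".join(cmd))
    return cmd


def CsvToList(nameFront):
    # The csv written by openSMILE is ';'-separated: one header row,
    # then one row holding the utterance-level features.
    float_l = []
    with open(nameFront) as f:
        f_csv = csv.reader(f)
        header = next(f_csv)  # skip the column names
        for row in f_csv:
            print("Raw row:\n", row)
            # Drop the first two fields (name and frame time); the rest are features.
            l = row[0].split(';')[2:]
            print("After dropping the first two fields and converting to a list:\n", l)
            print("Feature length:", len(l))
            float_l = list(map(float, l))
            print("After converting the strings to floats, the final features:")
    # Delete the temporary csv only after the file has been closed.
    os.remove(nameFront)
    return float_l


def ExtractFea():
    pathExcuteFile = r"F:\AI\SER\opensmile\bin\SMILExtract.exe"
    pathConfig = r"F:\AI\SER\opensmile\config\egemaps\v01a\eGeMAPSv01a.conf"
    pathAudioRoot = r"F:\corpus\test\1"  # folder holding the input wav files
    file_path = r"F:\corpus\test\2"      # folder for the temporary csv files
    for wav in os.listdir(pathAudioRoot):
        pathAudio = os.path.join(pathAudioRoot, wav)
        print(pathAudio)
        csv_filedir = os.path.join(file_path, os.path.splitext(wav)[0] + '.csv')
        print("Save path:", csv_filedir)
        # check=True raises an error if SMILExtract exits with a failure code.
        subprocess.run(excuteCMD(pathExcuteFile, pathConfig, pathAudio, csv_filedir), check=True)
        l = CsvToList(csv_filedir)
        print(l)
    return 0


if __name__ == "__main__":
    ExtractFea()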
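An alternative to splitting `row[0]` by hand is to let the `csv` module split on `;` and to keep the header, so that each value stays paired with its column name. This is a sketch rather than the method used above. It assumes the file has one header row followed by one data row whose first two fields are the instance name and a time stamp, as the output below suggests. `parse_smile_csv` is a hypothetical helper, and the column names in the usage lines are placeholders.

```python
import csv
import io


def parse_smile_csv(fileobj, skip=2):
    """Read a ';'-separated openSMILE csv and return {column name: float}."""
    reader = csv.reader(fileobj, delimiter=";")
    header = next(reader)  # column names
    row = next(reader)     # the single utterance-level row
    # Skip the leading non-feature fields, then convert the rest to floats.
    return {name: float(value) for name, value in zip(header[skip:], row[skip:])}


# Usage with an in-memory file; "feat_a" and "feat_b" are placeholder names.
sample = io.StringIO(
    "name;frameTime;feat_a;feat_b\n"
    "'unknown';0.000000;3.680355e+01;9.770308e-02\n"
)
print(parse_smile_csv(sample))
```

Keeping the names makes it possible to select individual features later instead of relying on their position in the list.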
Output:
E:\w\anaconda3\envs\pytorch\python.exe F:/Project/my_tools/opensmil/自带包.py
F:\corpus\test\1\speech001.wav
Save path: F:\corpus\test\2\speech001.csv
F:\AI\SER\opensmile\bin\SMILExtract.exe -C F:\AI\SER\opensmile\config\egemaps\v01a\eGeMAPSv01a.conf -I F:\corpus\test\1\speech001.wav -csvoutput F:\corpus\test\2\speech001.csv
(MSG) [2] SMILExtract: openSMILE starting!
(MSG) [2] SMILExtract: config file is: F:\AI\SER\opensmile\config\egemaps\v01a\eGeMAPSv01a.conf
(MSG) [2] cComponentManager: successfully registered 102 component types.
(MSG) [2] instance 'gemapsv01a_logSpectral': logSpecFloor = -140.00 (specFloor = 1.000000e-14)
(MSG) [2] instance 'egemapsv01a_logSpectral_flux': logSpecFloor = -140.00 (specFloor = 1.000000e-14)
(MSG) [2] instance 'lldsink': No filename given, disabling this sink component.
(MSG) [2] instance 'lldhtksink': No filename given, disabling this sink component.
(MSG) [2] instance 'lldarffsink': No filename given, disabling this sink component.
(MSG) [2] instance 'arffsink': No filename given, disabling this sink component.
(MSG) [2] instance 'htksink': No filename given, disabling this sink component.
(WRN) [1] instance 'gemapsv01a_formantLpc': minF and maxF should be set above zero and below Nyquist frequency (5500.000000Hz) to ensure numerical stability.
(MSG) [2] cComponentManager: successfully finished createInstances (76 component instances were finalised, 1 data memories were finalised)
(MSG) [2] cComponentManager: starting single thread processing loop
(WRN) [2] cComponentManager: The following component(s) could not perform any work because destination levels are full but no other component performed any work either: egemapsv01a_smoE. Processing will possibly be incomplete. For more details, enable the execDebug option of cComponentManager.
(WRN) [2] cComponentManager: The following component(s) could not perform any work because destination levels are full but no other component performed any work either: egemapsv01a_smoE. Processing will possibly be incomplete. For more details, enable the execDebug option of cComponentManager.
(WRN) [2] cComponentManager: The following component(s) could not perform any work because destination levels are full but no other component performed any work either: egemapsv01a_smoE. Processing will possibly be incomplete. For more details, enable the execDebug option of cComponentManager.
(MSG) [2] cComponentManager: Processing finished! System ran for 129 ticks.
Raw row:
["'unknown';0.000000;3.680355e+01;9.770308e-02;3.365072e+01;3.764083e+01;4.000022e+01;6.349503e+00;1.214337e+02;1.546369e+02;5.411725e+01;1.408990e+01;1.017253e+00;7.941121e-01;3.794944e-01;7.283563e-01;1.507197e+00;1.127702e+00;2.422774e+01;1.101786e+01;1.090528e+01;9.336731e+00;8.171201e-01;1.309137e+00;1.605902e+01;9.994079e-01;2.000773e+00;5.886363e+00;-1.868924e+00;-9.191002e+00;-2.293128e+01;-6.308594e-01;3.987048e-02;8.147252e-01;1.710789e+00;6.841631e-01;5.464337e+00;5.670821e-01;8.235865e-01;1.132386e+01;2.647956e+01;3.748299e-01;6.120276e+02;3.091320e-01;1.634083e+03;2.405461e-01;-1.265667e+02;-7.192033e-01;1.426784e+03;1.491545e-01;1.135315e+03;1.142887e-01;-1.202745e+02;-7.215747e-01;2.552135e+03;6.364776e-02;7.868579e+02;2.274594e-01;-1.251650e+02;-6.471764e-01;-1.230487e+01;-6.729668e-01;2.669224e+01;3.419868e-01;6.028610e-02;2.685980e-01;-3.073610e-02;-4.685379e-01;1.578635e+00;8.072857e-01;2.605691e+01;4.782025e-01;3.475485e+00;3.986405e+00;-1.507025e+01;-1.064250e+00;-3.030024e+01;-4.632347e-01;-1.616301e+00;9.780937e+00;6.193214e-02;-5.613314e-03;2.925960e-01;4.273504e+00;2.678571e+00;1.566667e-01;1.247219e-02;2.850000e-01;6.500000e-02;-1.575238e+01"]
After dropping the first two fields and converting to a list:
['3.680355e+01', '9.770308e-02', '3.365072e+01', '3.764083e+01', '4.000022e+01', '6.349503e+00', '1.214337e+02', '1.546369e+02', '5.411725e+01', '1.408990e+01', '1.017253e+00', '7.941121e-01', '3.794944e-01', '7.283563e-01', '1.507197e+00', '1.127702e+00', '2.422774e+01', '1.101786e+01', '1.090528e+01', '9.336731e+00', '8.171201e-01', '1.309137e+00', '1.605902e+01', '9.994079e-01', '2.000773e+00', '5.886363e+00', '-1.868924e+00', '-9.191002e+00', '-2.293128e+01', '-6.308594e-01', '3.987048e-02', '8.147252e-01', '1.710789e+00', '6.841631e-01', '5.464337e+00', '5.670821e-01', '8.235865e-01', '1.132386e+01', '2.647956e+01', '3.748299e-01', '6.120276e+02', '3.091320e-01', '1.634083e+03', '2.405461e-01', '-1.265667e+02', '-7.192033e-01', '1.426784e+03', '1.491545e-01', '1.135315e+03', '1.142887e-01', '-1.202745e+02', '-7.215747e-01', '2.552135e+03', '6.364776e-02', '7.868579e+02', '2.274594e-01', '-1.251650e+02', '-6.471764e-01', '-1.230487e+01', '-6.729668e-01', '2.669224e+01', '3.419868e-01', '6.028610e-02', '2.685980e-01', '-3.073610e-02', '-4.685379e-01', '1.578635e+00', '8.072857e-01', '2.605691e+01', '4.782025e-01', '3.475485e+00', '3.986405e+00', '-1.507025e+01', '-1.064250e+00', '-3.030024e+01', '-4.632347e-01', '-1.616301e+00', '9.780937e+00', '6.193214e-02', '-5.613314e-03', '2.925960e-01', '4.273504e+00', '2.678571e+00', '1.566667e-01', '1.247219e-02', '2.850000e-01', '6.500000e-02', '-1.575238e+01']
Feature length: 88
After converting the strings to floats, the final features:
[36.80355, 0.09770308, 33.65072, 37.64083, 40.00022, 6.349503, 121.4337, 154.6369, 54.11725, 14.0899, 1.017253, 0.7941121, 0.3794944, 0.7283563, 1.507197, 1.127702, 24.22774, 11.01786, 10.90528, 9.336731, 0.8171201, 1.309137, 16.05902, 0.9994079, 2.000773, 5.886363, -1.868924, -9.191002, -22.93128, -0.6308594, 0.03987048, 0.8147252, 1.710789, 0.6841631, 5.464337, 0.5670821, 0.8235865, 11.32386, 26.47956, 0.3748299, 612.0276, 0.309132, 1634.083, 0.2405461, -126.5667, -0.7192033, 1426.784, 0.1491545, 1135.315, 0.1142887, -120.2745, -0.7215747, 2552.135, 0.06364776, 786.8579, 0.2274594, -125.165, -0.6471764, -12.30487, -0.6729668, 26.69224, 0.3419868, 0.0602861, 0.268598, -0.0307361, -0.4685379, 1.578635, 0.8072857, 26.05691, 0.4782025, 3.475485, 3.986405, -15.07025, -1.06425, -30.30024, -0.4632347, -1.616301, 9.780937, 0.06193214, -0.005613314, 0.292596, 4.273504, 2.678571, 0.1566667, 0.01247219, 0.285, 0.065, -15.75238]
F:\corpus\test\1\speech002.wav
Save path: F:\corpus\test\2\speech002.csv
F:\AI\SER\opensmile\bin\SMILExtract.exe -C F:\AI\SER\opensmile\config\egemaps\v01a\eGeMAPSv01a.conf -I F:\corpus\test\1\speech002.wav -csvoutput F:\corpus\test\2\speech002.csv
(MSG) [2] SMILExtract: openSMILE starting!
(MSG) [2] SMILExtract: config file is: F:\AI\SER\opensmile\config\egemaps\v01a\eGeMAPSv01a.conf
(MSG) [2] cComponentManager: successfully registered 102 component types.
(MSG) [2] instance 'gemapsv01a_logSpectral': logSpecFloor = -140.00 (specFloor = 1.000000e-14)
(MSG) [2] instance 'egemapsv01a_logSpectral_flux': logSpecFloor = -140.00 (specFloor = 1.000000e-14)
(MSG) [2] instance 'lldsink': No filename given, disabling this sink component.
(MSG) [2] instance 'lldhtksink': No filename given, disabling this sink component.
(MSG) [2] instance 'lldarffsink': No filename given, disabling this sink component.
(MSG) [2] instance 'arffsink': No filename given, disabling this sink component.
(MSG) [2] instance 'htksink': No filename given, disabling this sink component.
(WRN) [1] instance 'gemapsv01a_formantLpc': minF and maxF should be set above zero and below Nyquist frequency (5500.000000Hz) to ensure numerical stability.
(MSG) [2] cComponentManager: successfully finished createInstances (76 component instances were finalised, 1 data memories were finalised)
(MSG) [2] cComponentManager: starting single thread processing loop
(WRN) [2] cComponentManager: The following component(s) could not perform any work because destination levels are full but no other component performed any work either: egemapsv01a_smoE. Processing will possibly be incomplete. For more details, enable the execDebug option of cComponentManager.
(WRN) [2] cComponentManager: The following component(s) could not perform any work because destination levels are full but no other component performed any work either: egemapsv01a_smoE. Processing will possibly be incomplete. For more details, enable the execDebug option of cComponentManager.
(WRN) [2] cComponentManager: The following component(s) could not perform any work because destination levels are full but no other component performed any work either: egemapsv01a_smoE. Processing will possibly be incomplete. For more details, enable the execDebug option of cComponentManager.
(MSG) [2] cComponentManager: Processing finished! System ran for 184 ticks.
Raw row:
["'unknown';0.000000;3.590475e+01;9.424488e-02;3.293277e+01;3.487373e+01;3.846412e+01;5.531357e+00;1.037303e+02;1.381955e+02;4.397307e+01;1.907073e+01;1.006887e+00;8.784395e-01;3.217286e-01;7.002578e-01;1.535076e+00;1.213347e+00;1.927636e+01;1.302068e+01;1.573453e+01;1.135695e+01;6.307563e-01;1.183189e+00;2.001542e+01;6.472682e-01;1.261846e+01;9.423985e-01;-1.384552e+00;-1.142180e+01;-1.567890e+01;-1.004838e+00;2.710485e-02;8.775703e-01;1.088429e+00;6.303176e-01;8.016915e+00;2.811550e-01;5.724525e+00;1.402286e+00;2.339863e+01;2.485168e-01;5.875970e+02;3.082011e-01;1.606794e+03;2.182210e-01;-9.622655e+01;-9.311187e-01;1.438749e+03;1.187004e-01;1.152617e+03;2.110209e-01;-8.964227e+01;-9.460410e-01;2.473076e+03;5.637287e-02;8.258076e+02;1.853558e-01;-9.617101e+01;-8.284190e-01;-1.251770e+01;-5.360814e-01;2.392104e+01;3.129213e-01;6.690726e-02;2.913131e-01;-1.720028e-02;-9.229200e-01;8.807204e-01;9.696324e-01;2.213963e+01;4.980111e-01;8.023051e+00;1.485536e+00;-4.492209e+00;-3.908178e+00;-1.290205e+01;-1.227675e+00;-1.173086e+01;1.945107e+01;6.088300e-02;-2.092529e-02;3.121853e-01;3.488372e+00;3.592814e+00;1.583333e-01;1.051058e-01;9.999999e-02;6.608076e-02;-1.599856e+01"]
After dropping the first two fields and converting to a list:
['3.590475e+01', '9.424488e-02', '3.293277e+01', '3.487373e+01', '3.846412e+01', '5.531357e+00', '1.037303e+02', '1.381955e+02', '4.397307e+01', '1.907073e+01', '1.006887e+00', '8.784395e-01', '3.217286e-01', '7.002578e-01', '1.535076e+00', '1.213347e+00', '1.927636e+01', '1.302068e+01', '1.573453e+01', '1.135695e+01', '6.307563e-01', '1.183189e+00', '2.001542e+01', '6.472682e-01', '1.261846e+01', '9.423985e-01', '-1.384552e+00', '-1.142180e+01', '-1.567890e+01', '-1.004838e+00', '2.710485e-02', '8.775703e-01', '1.088429e+00', '6.303176e-01', '8.016915e+00', '2.811550e-01', '5.724525e+00', '1.402286e+00', '2.339863e+01', '2.485168e-01', '5.875970e+02', '3.082011e-01', '1.606794e+03', '2.182210e-01', '-9.622655e+01', '-9.311187e-01', '1.438749e+03', '1.187004e-01', '1.152617e+03', '2.110209e-01', '-8.964227e+01', '-9.460410e-01', '2.473076e+03', '5.637287e-02', '8.258076e+02', '1.853558e-01', '-9.617101e+01', '-8.284190e-01', '-1.251770e+01', '-5.360814e-01', '2.392104e+01', '3.129213e-01', '6.690726e-02', '2.913131e-01', '-1.720028e-02', '-9.229200e-01', '8.807204e-01', '9.696324e-01', '2.213963e+01', '4.980111e-01', '8.023051e+00', '1.485536e+00', '-4.492209e+00', '-3.908178e+00', '-1.290205e+01', '-1.227675e+00', '-1.173086e+01', '1.945107e+01', '6.088300e-02', '-2.092529e-02', '3.121853e-01', '3.488372e+00', '3.592814e+00', '1.583333e-01', '1.051058e-01', '9.999999e-02', '6.608076e-02', '-1.599856e+01']
Feature length: 88
After converting the strings to floats, the final features:
[35.90475, 0.09424488, 32.93277, 34.87373, 38.46412, 5.531357, 103.7303, 138.1955, 43.97307, 19.07073, 1.006887, 0.8784395, 0.3217286, 0.7002578, 1.535076, 1.213347, 19.27636, 13.02068, 15.73453, 11.35695, 0.6307563, 1.183189, 20.01542, 0.6472682, 12.61846, 0.9423985, -1.384552, -11.4218, -15.6789, -1.004838, 0.02710485, 0.8775703, 1.088429, 0.6303176, 8.016915, 0.281155, 5.724525, 1.402286, 23.39863, 0.2485168, 587.597, 0.3082011, 1606.794, 0.218221, -96.22655, -0.9311187, 1438.749, 0.1187004, 1152.617, 0.2110209, -89.64227, -0.946041, 2473.076, 0.05637287, 825.8076, 0.1853558, -96.17101, -0.828419, -12.5177, -0.5360814, 23.92104, 0.3129213, 0.06690726, 0.2913131, -0.01720028, -0.92292, 0.8807204, 0.9696324, 22.13963, 0.4980111, 8.023051, 1.485536, -4.492209, -3.908178, -12.90205, -1.227675, -11.73086, 19.45107, 0.060883, -0.02092529, 0.3121853, 3.488372, 3.592814, 0.1583333, 0.1051058, 0.09999999, 0.06608076, -15.99856]
Process finished with exit code 0
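With one vector per file, a natural next step is to collect all vectors into a single table. The sketch below writes one row per wav file. It is an assumption about how the features might be stored, not part of the original script, and `write_feature_table` is a hypothetical helper.

```python
import csv
import io


def write_feature_table(features_by_file, out):
    """Write one row per audio file: file name first, then its features."""
    lengths = {len(v) for v in features_by_file.values()}
    if len(lengths) > 1:
        raise ValueError(f"inconsistent feature lengths: {sorted(lengths)}")
    writer = csv.writer(out, lineterminator="\n")
    for name, vector in features_by_file.items():
        writer.writerow([name, *vector])


# Usage with shortened vectors; real eGeMAPS rows would hold 88 values each.
buf = io.StringIO()
write_feature_table(
    {"speech001.wav": [36.80355, 0.09770308],
     "speech002.wav": [35.90475, 0.09424488]},
    buf,
)
print(buf.getvalue())
```

The length check catches the case where files extracted with different configs end up in the same table.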
Afterword
When you first try this, I suggest commenting out the line that deletes the file:
os.remove(nameFront)
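That way the .csv feature file stays on disk. Open it and work out its format; once you can pull out the fields you need, it is fine to delete the file right after reading it. The generated command can also be run by hand: open a cmd window in the openSMILE folder, paste the command, and press Enter. The result is the same: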
F:\AI\SER\opensmile\bin\SMILExtract.exe -C F:\AI\SER\opensmile\config\egemaps\v01a\eGeMAPSv01a.conf -I F:\corpus\test\1\speech001.wav -csvoutput F:\corpus\test\2\speech001.csv
Pressing Enter produces the same openSMILE log as in the run above.
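When a folder name contains a space, a command assembled by plain string concatenation breaks at that space. One way to get a command line that can be pasted into cmd is to build the arguments as a list and let Python quote them. The sketch below uses `subprocess.list2cmdline`, an undocumented helper in the standard library that applies Windows quoting rules. The `C:\Program Files\...` path is a made-up example, and `build_command` is a hypothetical name.

```python
import subprocess


def build_command(exe, config, audio, output):
    """Return the argument list and a cmd-ready string for SMILExtract."""
    args = [exe, "-C", config, "-I", audio, "-csvoutput", output]
    return args, subprocess.list2cmdline(args)


args, line = build_command(
    r"C:\Program Files\opensmile\bin\SMILExtract.exe",  # made-up path with a space
    r"F:\AI\SER\opensmile\config\egemaps\v01a\eGeMAPSv01a.conf",
    r"F:\corpus\test\1\speech001.wav",
    r"F:\corpus\test\2\speech001.csv",
)
print(line)  # the path containing a space is wrapped in double quotes
```

The list form can be handed to `subprocess.run(args)` directly, while the string form is only needed for pasting into a terminal.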