
Offline Speech Recognition in Java

Published: 2023/12/14

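Recognition works offline, but for now I only have a small speech model, so the accuracy is very low.

If anyone knows how to train a speech model, I would appreciate it if you shared the method. Thanks!

This is a small demo built on the Spring Boot framework.

The original post also includes the front-end page (HTML and JS), with small features such as recording, playback, and translation. For details, see the original post linked below.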

package com.example.gadgets.yysb;

import com.alibaba.fastjson.JSONObject;
import org.springframework.util.Assert;
import org.springframework.util.StringUtils;
import org.vosk.LibVosk;
import org.vosk.LogLevel;
import org.vosk.Model;
import org.vosk.Recognizer;

import javax.sound.sampled.*;
import java.io.*;
import java.nio.file.Files;
import java.nio.file.Paths;

public class VoiceUtil {

    // Path to the model, which has to be downloaded from the official site:
    // https://alphacephei.com/vosk/models
    // This uses vosk-model-small-cn-0.22, the small model from the Chinese section.
    // In testing, the small model transcribes with roughly 30% accuracy, possibly lower
    // when the speech is unclear. I will download a larger model tomorrow and try it.
    private static String VOSKMODELPATH = "D:/yuyinshibie/vosk-model-small-cn-0.22";

    public static String getWord(String filePath) throws IOException, UnsupportedAudioFileException {
        Assert.isTrue(StringUtils.hasLength(VOSKMODELPATH), "Invalid Vosk model path!");
        byte[] bytes = Files.readAllBytes(Paths.get(filePath));
        // Convert to 16 kHz (the file is overwritten in place)
        reSamplingAndSave(bytes, filePath);

        // Print the WAV header fields (assumes the canonical 44-byte header layout)
        short track;
        try (RandomAccessFile rdf = new RandomAccessFile(new File(filePath), "r")) {
            System.out.println("Audio size (RIFF chunk size): " + toInt(read(rdf, 4, 4)));
            System.out.println("Audio format: " + toShort(read(rdf, 20, 2)));
            track = toShort(read(rdf, 22, 2));
            System.out.println("Channels (1 = mono, 2 = stereo): " + track);
            System.out.println("Sample rate (16000 = 16 kHz): " + toInt(read(rdf, 24, 4)));
            // The byte rate is the 4-byte field at offset 28
            System.out.println("Bytes of waveform data per second: " + toInt(read(rdf, 28, 4)));
            System.out.println("Frame size: " + toShort(read(rdf, 32, 2)));
            System.out.println("Bits per sample: " + toShort(read(rdf, 34, 2)));
        }

        LibVosk.setLogLevel(LogLevel.WARNINGS);
        try (Model model = new Model(VOSKMODELPATH);
             InputStream ais = AudioSystem.getAudioInputStream(
                     new BufferedInputStream(new FileInputStream(filePath)));
             // The recognizer's sample rate is the audio sample rate times the channel count
             Recognizer recognizer = new Recognizer(model, 16000 * track)) {
            int nbytes;
            byte[] b = new byte[4096];
            while ((nbytes = ais.read(b)) >= 0) {
                if (recognizer.acceptWaveForm(b, nbytes)) {
                    // End of an utterance detected
                    // System.out.println(recognizer.getResult());
                } else {
                    // Still in the middle of an utterance
                    // System.out.println(recognizer.getPartialResult());
                }
            }
            String result = recognizer.getFinalResult();
            System.out.println("Recognition result: " + result);
            if (StringUtils.hasLength(result)) {
                JSONObject jsonObject = JSONObject.parseObject(result);
                // Strip the spaces between the recognized tokens
                return jsonObject.getString("text").replace(" ", "");
            }
            return "";
        }
    }

    // Little-endian 4 bytes -> int
    public static int toInt(byte[] b) {
        return ((b[3] & 0xff) << 24) | ((b[2] & 0xff) << 16) | ((b[1] & 0xff) << 8) | (b[0] & 0xff);
    }

    // Little-endian 2 bytes -> short; the low byte is masked so its sign bit cannot corrupt the result
    public static short toShort(byte[] b) {
        return (short) (((b[1] & 0xff) << 8) | (b[0] & 0xff));
    }

    // Read `length` bytes starting at byte offset `pos`
    public static byte[] read(RandomAccessFile rdf, int pos, int length) throws IOException {
        rdf.seek(pos);
        byte[] result = new byte[length];
        rdf.readFully(result);
        return result;
    }

    public static void reSamplingAndSave(byte[] data, String path) throws IOException, UnsupportedAudioFileException {
        // Uses the public AudioSystem API; the internal com.sun.media.sound
        // reader/writer classes are not exported on Java 9+
        try (AudioInputStream audioIn = AudioSystem.getAudioInputStream(new ByteArrayInputStream(data))) {
            AudioFormat srcFormat = audioIn.getFormat();
            int targetSampleRate = 16000;
            AudioFormat dstFormat = new AudioFormat(
                    srcFormat.getEncoding(),
                    targetSampleRate,
                    srcFormat.getSampleSizeInBits(),
                    srcFormat.getChannels(),
                    srcFormat.getFrameSize(),
                    targetSampleRate, // for PCM the frame rate equals the sample rate
                    srcFormat.isBigEndian());
            AudioInputStream convertedIn = AudioSystem.getAudioInputStream(dstFormat, audioIn);
            AudioSystem.write(convertedIn, AudioFileFormat.Type.WAVE, new File(path));
        }
    }

    public static void main(String[] args) {
        String path = "D:/yuyinshibie/test456.wav";
        File localFile = new File(path);
        try {
            // Start recognition
            String text = getWord(path);
            System.out.println("text:" + text);
        } catch (IOException | UnsupportedAudioFileException e) {
            e.printStackTrace();
        } finally {
            // The input file is deleted whether or not recognition succeeded
            localFile.delete();
        }
    }
}
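The listing above reads the WAV header at fixed byte offsets and handles stereo input by multiplying the recognizer's sample rate by the channel count. As a possible alternative, here is a minimal sketch that reads the same header fields with a little-endian ByteBuffer and averages the channels down to mono, so the recognizer could be created with the plain 16000 rate. The class WavPrepSketch and its methods readHeader and downmixToMono are hypothetical names that do not appear in the original post. The sketch assumes 16-bit little-endian PCM and the canonical 44-byte header (the same offsets the listing uses). It has not been tested against Vosk itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the original post.
public class WavPrepSketch {

    /** Format fields of a WAV file, read at the same fixed offsets as the listing above. */
    static final class Header {
        final int audioFormat;   // offset 20, 2 bytes: 1 means PCM
        final int channels;      // offset 22, 2 bytes
        final int sampleRate;    // offset 24, 4 bytes
        final int byteRate;      // offset 28, 4 bytes
        final int blockAlign;    // offset 32, 2 bytes: bytes per frame
        final int bitsPerSample; // offset 34, 2 bytes

        Header(ByteBuffer buf) {
            audioFormat = buf.getShort(20) & 0xffff;
            channels = buf.getShort(22) & 0xffff;
            sampleRate = buf.getInt(24);
            byteRate = buf.getInt(28);
            blockAlign = buf.getShort(32) & 0xffff;
            bitsPerSample = buf.getShort(34) & 0xffff;
        }
    }

    /** Assumption: the "fmt " chunk directly follows the RIFF header (canonical layout). */
    static Header readHeader(byte[] wav) {
        if (wav.length < 44) {
            throw new IllegalArgumentException("Too short for a canonical WAV header");
        }
        return new Header(ByteBuffer.wrap(wav).order(ByteOrder.LITTLE_ENDIAN));
    }

    /** Averages the channels of each 16-bit little-endian PCM frame into one mono sample. */
    static byte[] downmixToMono(byte[] pcm, int channels) {
        if (channels < 1) {
            throw new IllegalArgumentException("channels must be >= 1");
        }
        if (channels == 1) {
            return pcm.clone();
        }
        int frames = pcm.length / (2 * channels); // a trailing partial frame is dropped
        ByteBuffer in = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        ByteBuffer out = ByteBuffer.allocate(frames * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (int f = 0; f < frames; f++) {
            int sum = 0;
            for (int c = 0; c < channels; c++) {
                sum += in.getShort();
            }
            out.putShort((short) (sum / channels));
        }
        return out.array();
    }

    public static void main(String[] args) {
        // Two stereo frames: (100, 300) and (-200, 0)
        ByteBuffer stereo = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
        stereo.putShort((short) 100).putShort((short) 300)
              .putShort((short) -200).putShort((short) 0);
        ByteBuffer mono = ByteBuffer.wrap(downmixToMono(stereo.array(), 2))
                                    .order(ByteOrder.LITTLE_ENDIAN);
        System.out.println(mono.getShort() + " " + mono.getShort()); // prints "200 -100"
    }
}
```

In getWord, the idea would be to pass each buffer through downmixToMono before acceptWaveForm whenever the header reports more than one channel. That only works if each buffer holds whole frames, and the 4096-byte buffer in the listing does for 16-bit stereo.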

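Original post: "java 離線中文語音文字識別" (Java offline Chinese speech-to-text recognition) - Rolay - 博客園 (cnblogs). When reposting, credit the source: https://www.cnblogs.com/rolayblog/p/15237099.html

From the original post: the project needed voice control similar to 小愛同學 (XiaoAI), it had to work offline, and it could not cost the company a cent. The first step is to …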
