Firing Up the Furnace in the Cloud: Online Training and Inference with Bert-vits2-v2.2 (on Google Colab)

Published: 2023/12/24

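If getting started with deep learning has a barrier to entry, hardware cost is the part nobody can avoid. Deep learning models usually need substantial compute for training and inference, and larger models and more complex datasets need even more. Training therefore tends to require high-performance hardware such as a graphics processing unit (GPU) or a dedicated deep learning accelerator (such as a TPU), which puts off many people who have no NVIDIA card on their local machine.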

Google Colab is a free, cloud-based Jupyter notebook environment provided by Google. It lets newcomers run machine learning and deep learning experiments with very little setup.

For all its convenience and free features, Google Colab has limits. The compute resources available to each session are capped, sessions shut down automatically after a period of time, and usage is subject to Google's restrictions and policies.

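For broke folks like me, Google Colab is a light in the dark. Even with a cap on training time it will do; beggars can't be choosers. In this article we use Google Colab to train and run inference on the Bert-vits2-v2.2 project in the cloud, cloning the voice of Lilith from Diablo.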

Configuring the cloud device

First, open the Google Colab site:

https://colab.research.google.com/

Click New notebook and connect it to a runtime:

Select the T4 GPU as the hardware accelerator.

Then create a new cell:

#@title Check the GPU
!nvidia-smi

Run the cell. It returns:



  
Tue Dec 19 03:07:21 2023         
+---------------------------------------------------------------------------------------+  
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |  
|-----------------------------------------+----------------------+----------------------+  
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |  
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |  
|                                         |                      |               MIG M. |  
|=========================================+======================+======================|  
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |  
| N/A   54C    P8              10W /  70W |      0MiB / 15360MiB |      0%      Default |  
|                                         |                      |                  N/A |  
+-----------------------------------------+----------------------+----------------------+  
                                                                                           
+---------------------------------------------------------------------------------------+  
| Processes:                                                                            |  
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |  
|        ID   ID                                                             Usage      |  
|=======================================================================================|  
|  No running processes found                                                           |  
+---------------------------------------------------------------------------------------+


A Turing-architecture card with 16 GB of VRAM (nvidia-smi reports 15360 MiB): quite a showing for a free GPU, and a credit to the provider.
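To read the memory figures programmatically rather than by eye, the relevant row of the table can be parsed. The following is only a sketch: the helper name `parse_gpu_memory` is hypothetical, and it assumes the `used MiB / total MiB` layout shown in the output above.

```python
import re


def parse_gpu_memory(smi_output: str):
    """Return (used_mib, total_mib) from the first 'NMiB / NMiB' pair, or None."""
    match = re.search(r"(\d+)MiB\s*/\s*(\d+)MiB", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Row copied from the nvidia-smi output above.
sample_row = "| N/A   54C    P8              10W /  70W |      0MiB / 15360MiB |      0%      Default |"

used, total = parse_gpu_memory(sample_row)
print(f"{used} MiB used of {total} MiB ({total / 1024:.0f} GiB)")
```

In a notebook, the text passed to the helper could come from capturing the output of the `nvidia-smi` cell instead of a pasted string.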

Cloning the code repository

Then create a new cell:



#@title Clone the code repository
!git clone https://github.com/v3ucn/Bert-vits2-V2.2.git


The program returns:

Cloning into 'Bert-vits2-V2.2'...  
remote: Enumerating objects: 310, done.  
remote: Counting objects: 100% (310/310), done.  
remote: Compressing objects: 100% (210/210), done.  
remote: Total 310 (delta 97), reused 294 (delta 81), pack-reused 0  
Receiving objects: 100% (310/310), 12.84 MiB | 18.95 MiB/s, done.  
Resolving deltas: 100% (97/97), done.

Installing the required dependencies

Create a cell to install the dependencies:

#@title Install the required dependencies
%cd /content/Bert-vits2-V2.2  
!pip install -r requirements.txt

Installing the dependencies takes a while, so be patient.

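Downloading the required models

Next, download the required models, which include the BERT models and the emotion models:

#@title Download the required models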
!wget -P emotional/clap-htsat-fused/ https://huggingface.co/laion/clap-htsat-fused/resolve/main/pytorch_model.bin  
!wget -P emotional/wav2vec2-large-robust-12-ft-emotion-msp-dim/ https://huggingface.co/audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim/resolve/main/pytorch_model.bin  
!wget -P bert/chinese-roberta-wwm-ext-large/ https://huggingface.co/hfl/chinese-roberta-wwm-ext-large/resolve/main/pytorch_model.bin  
!wget -P bert/bert-base-japanese-v3/ https://huggingface.co/cl-tohoku/bert-base-japanese-v3/resolve/main/pytorch_model.bin  
!wget -P bert/deberta-v3-large/ https://huggingface.co/microsoft/deberta-v3-large/resolve/main/pytorch_model.bin  
!wget -P bert/deberta-v3-large/ https://huggingface.co/microsoft/deberta-v3-large/resolve/main/pytorch_model.generator.bin  
!wget -P bert/deberta-v2-large-japanese/ https://huggingface.co/ku-nlp/deberta-v2-large-japanese/resolve/main/pytorch_model.bin

If your inference task only needs Chinese speech, downloading the first three models is enough.
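For a Chinese-only setup, the reduced download list can be generated instead of edited by hand. This is a minimal sketch: the repository names and target directories are copied from the cell above, while the dictionary and the helper `wget_commands` are hypothetical names introduced for illustration.

```python
# Hugging Face repository -> local target directory, copied from the download cell above.
CHINESE_ONLY_MODELS = {
    "laion/clap-htsat-fused": "emotional/clap-htsat-fused/",
    "audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim": "emotional/wav2vec2-large-robust-12-ft-emotion-msp-dim/",
    "hfl/chinese-roberta-wwm-ext-large": "bert/chinese-roberta-wwm-ext-large/",
}


def wget_commands(models, filename="pytorch_model.bin"):
    """Build one wget command per repository, using the URL pattern from the cell above."""
    return [
        f"wget -P {target} https://huggingface.co/{repo}/resolve/main/{filename}"
        for repo, target in models.items()
    ]


for command in wget_commands(CHINESE_ONLY_MODELS):
    print(command)
```

Each printed line can be pasted into a Colab cell with a leading `!`.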

Downloading the base model files

Then download the Bert-vits2-v2.2 base models:

#@title Download the base model files
  
!wget -P Data/lilith/models/ https://huggingface.co/OedoSoldier/Bert-VITS2-2.2-CLAP/resolve/main/DUR_0.pth  
!wget -P Data/lilith/models/ https://huggingface.co/OedoSoldier/Bert-VITS2-2.2-CLAP/resolve/main/D_0.pth  
!wget -P Data/lilith/models/ https://huggingface.co/OedoSoldier/Bert-VITS2-2.2-CLAP/resolve/main/G_0.pth

Note that the base models must be placed in the character's models directory, and that the base model version must be 2.2.
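Since a misplaced base model is easy to miss, a quick existence check can help before moving on. This is a sketch under the assumption that the three files named in the download cell above are the only ones required; `missing_base_models` is a hypothetical helper.

```python
from pathlib import Path

# File names taken from the download cell above.
BASE_MODEL_FILES = ("DUR_0.pth", "D_0.pth", "G_0.pth")


def missing_base_models(model_dir):
    """Return the base model file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in BASE_MODEL_FILES if not (model_dir / name).is_file()]


# Run from the project root (/content/Bert-vits2-V2.2 in this walkthrough).
missing = missing_base_models("Data/lilith/models")
print("missing base models:", missing if missing else "none")
```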

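Uploading the audio material and resampling

Then open the file browser, right-click the lilith directory and create a new folder named raw, then right-click it and choose Upload to send the audio material to the cloud:

Upload the transcription file esd.list to the project's lilith directory in the same way: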

./Data/lilith/wavs/processed_0.wav|lilith|ZH|信仰,叫你們要否定心中的欲望。  
./Data/lilith/wavs/processed_1.wav|lilith|ZH|把你們*在自己的身體裡  
./Data/lilith/wavs/processed_2.wav|lilith|ZH|圣修雅瑞之母  
./Data/lilith/wavs/processed_3.wav|lilith|ZH|我有你要的東西  
./Data/lilith/wavs/processed_4.wav|lilith|ZH|你渴望知識
./Data/lilith/wavs/processed_5.wav|lilith|ZH|不惜帶著孩子尋遍圣修雅瑞  
./Data/lilith/wavs/processed_6.wav|lilith|ZH|這話你真的相信嗎  
./Data/lilith/wavs/processed_7.wav|lilith|ZH|不必再裝了  
./Data/lilith/wavs/processed_8.wav|lilith|ZH|你有問題,我有答案  
./Data/lilith/wavs/processed_9.wav|lilith|ZH|我洞悉整個宇宙的真理
./Data/lilith/wavs/processed_10.wav|lilith|ZH|你看了那么多  
./Data/lilith/wavs/processed_11.wav|lilith|ZH|知道的卻那麼少  
./Data/lilith/wavs/processed_12.wav|lilith|ZH|打碎枷鎖  
./Data/lilith/wavs/processed_13.wav|lilith|ZH|你願意接受我的提議嗎?
./Data/lilith/wavs/processed_14.wav|lilith|ZH|你很好奇想知道我  
./Data/lilith/wavs/processed_15.wav|lilith|ZH|為什麼饒了你的命  
./Data/lilith/wavs/processed_16.wav|lilith|ZH|你相信我嗎  
./Data/lilith/wavs/processed_17.wav|lilith|ZH|很好,現在你只需要知道。
./Data/lilith/wavs/processed_18.wav|lilith|ZH|我們要去見我兒子  
./Data/lilith/wavs/processed_19.wav|lilith|ZH|是的,但不止如此。  
./Data/lilith/wavs/processed_20.wav|lilith|ZH|他還是我計劃的關鍵
./Data/lilith/wavs/processed_21.wav|lilith|ZH|雖然我無法預料
./Data/lilith/wavs/processed_22.wav|lilith|ZH|在新世界里你是否愿意站在我身邊  
./Data/lilith/wavs/processed_23.wav|lilith|ZH|找出自己真正的本性  
./Data/lilith/wavs/processed_25.wav|lilith|ZH|可是我還是會為你
./Data/lilith/wavs/processed_27.wav|lilith|ZH|但現在所有的可能性
./Data/lilith/wavs/processed_28.wav|lilith|ZH|統統被奪走了
./Data/lilith/wavs/processed_29.wav|lilith|ZH|奪走啊  
./Data/lilith/wavs/processed_30.wav|lilith|ZH|這把鑰匙能打開的不僅是地獄的大門  
./Data/lilith/wavs/processed_31.wav|lilith|ZH|也會(huì)開啟我們的未來  
./Data/lilith/wavs/processed_32.wav|lilith|ZH|因為你的犧牲才得以實現到未來
./Data/lilith/wavs/processed_33.wav|lilith|ZH|打碎枷鎖  
./Data/lilith/wavs/processed_34.wav|lilith|ZH|接受美麗的罪惡  
./Data/lilith/wavs/processed_35.wav|lilith|ZH|這就是第一批  
./Data/lilith/wavs/processed_36.wav|lilith|ZH|腦筋動得很快
./Data/lilith/wavs/processed_37.wav|lilith|ZH|沒錯,我正是莉莉絲。

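For how to slice, transcribe, and label the audio, see the earlier article "Local training, ready while you wait: cloning Taylor Swift's voice speaking Chinese from 30 seconds of audio with Bert-VITS2 V2.0.2". To keep this article short, those steps are not repeated here.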
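Each line of esd.list above follows the layout `wav path|speaker|language|text`. Before uploading, the file can be sanity-checked with a few lines of Python. This is a sketch only: `parse_esd_line` and `check_esd_lines` are hypothetical helpers, and the allowed language tags (`ZH`, `JP`, `EN`) are an assumption about what the project accepts.

```python
from typing import NamedTuple


class Utterance(NamedTuple):
    wav_path: str
    speaker: str
    language: str
    text: str


def parse_esd_line(line: str) -> Utterance:
    """Split one 'path|speaker|language|text' line into its four fields."""
    parts = line.strip().split("|", 3)  # maxsplit keeps any '|' inside the text
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}: {line!r}")
    return Utterance(*parts)


def check_esd_lines(lines, speaker="lilith", allowed_languages=("ZH", "JP", "EN")):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            item = parse_esd_line(line)
        except ValueError as error:
            problems.append(f"line {number}: {error}")
            continue
        if item.speaker != speaker:
            problems.append(f"line {number}: unexpected speaker {item.speaker!r}")
        if item.language not in allowed_languages:
            problems.append(f"line {number}: unexpected language tag {item.language!r}")
        if not item.text:
            problems.append(f"line {number}: empty transcript")
    return problems


sample = [
    "./Data/lilith/wavs/processed_7.wav|lilith|ZH|不必再裝了",
    "./Data/lilith/wavs/processed_12.wav|lilith|ZH|打碎枷鎖",
]
print(check_esd_lines(sample))
```

To check the real file, pass `open("Data/lilith/esd.list", encoding="utf-8")` in place of `sample`.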

Once you have confirmed that the audio clips and the transcription file uploaded successfully, create a new cell:

#@title Resample
!python3 resample.py --sr 44100 --in_dir ./Data/lilith/raw/ --out_dir ./Data/lilith/wavs/

This resamples the clips in raw to 44100 Hz and writes them to wavs.
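To confirm the result, the sample rate of each output file can be read back with Python's standard `wave` module. This sketch assumes the resampled files are plain PCM WAV files that `wave` can open; `wrong_sample_rate` is a hypothetical helper.

```python
import wave
from pathlib import Path


def wrong_sample_rate(wav_dir, expected_rate=44100):
    """Map file name -> actual sample rate for every .wav that is not at expected_rate."""
    mismatched = {}
    for wav_path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(wav_path), "rb") as handle:
            rate = handle.getframerate()
        if rate != expected_rate:
            mismatched[wav_path.name] = rate
    return mismatched


# An empty dict means every file found in the directory is at 44100 Hz.
print(wrong_sample_rate("Data/lilith/wavs"))
```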

Preprocessing the label file

Next, create a new cell:

#@title Preprocess the label file
!python3 preprocess_text.py --transcription-path ./Data/lilith/esd.list --train-path ./Data/lilith/train.list --val-path ./Data/lilith/val.list --config-path ./Data/lilith/configs/config.json

The program returns:

pytorch_model.bin: 100% 1.32G/1.32G [00:26<00:00, 49.4MB/s]  
spm.model: 100% 2.46M/2.46M [00:00<00:00, 131MB/s]  
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.  
0it [00:00, ?it/s]  
[nltk_data] Downloading package averaged_perceptron_tagger to  
[nltk_data]     /root/nltk_data...  
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.  
[nltk_data] Downloading package cmudict to /root/nltk_data...  
[nltk_data]   Unzipping corpora/cmudict.zip.  
Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Some weights of EmotionModel were not initialized from the model checkpoint at ./emotional/wav2vec2-large-robust-12-ft-emotion-msp-dim and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0']  
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.  
  0% 0/36 [00:00<?, ?it/s]Building prefix dict from the default dictionary ...  
Dumping model to file cache /tmp/jieba.cache  
Loading model cost 0.686 seconds.  
Prefix dict has been built successfully.  
100% 36/36 [00:00<00:00, 40.28it/s]  
總重復音頻數:0,總未找到的音頻數:0
訓練集和驗證集生成完成!

At this point the training and validation sets, train.list and val.list, have been generated in the lilith directory.
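The training log later in this walkthrough reports 32 training items and 4 validation items, which together account for the 36 transcribed clips. The split can be pictured with the sketch below. It is an illustration of holding out a fixed number of lines, not the actual logic of preprocess_text.py; the function name, the `val_count` default, and the placeholder entries are all assumptions.

```python
import random


def split_train_val(lines, val_count=4, seed=0):
    """Shuffle a copy of the lines and hold out val_count of them for validation."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    return shuffled[val_count:], shuffled[:val_count]


# 36 placeholder entries standing in for the lines of esd.list.
entries = [f"./Data/lilith/wavs/clip_{i}.wav|lilith|ZH|placeholder" for i in range(36)]

train, val = split_train_val(entries)
print(len(train), len(val))
```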

Generating the BERT feature files

Next, create a new cell:

#@title Generate the BERT feature files
!python3 bert_gen.py --config-path ./Data/lilith/configs/config.json

The program returns:

0% 0/36 [00:00<?, ?it/s]Some weights of the model checkpoint at ./bert/chinese-roberta-wwm-ext-large were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']  
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).  
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).  
Some weights of the model checkpoint at ./bert/chinese-roberta-wwm-ext-large were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']  
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).  
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).  
100% 36/36 [00:21<00:00,  1.67it/s]  
bert生成完畢!, 共有36個bert.pt生成!

Count them: 36 in total, matching the number of audio clips.

Generating the clap feature files

Finally, generate the clap emotion feature files:

#@title Generate the clap feature files
#!wget -P emotional/clap-htsat-fused/ https://huggingface.co/laion/clap-htsat-fused/resolve/main/pytorch_model.bin  
!python3 clap_gen.py --config-path ./Data/lilith/configs/config.json

The program returns:

/content/Bert-vits2-V2.2/clap_gen.py:34: FutureWarning: Pass sr=48000 as keyword args. From version 0.10 passing these as positional arguments will result in an error  
  audio = librosa.load(wav_path, 48000)[0]  
  0% 0/36 [00:00<?, ?it/s]/content/Bert-vits2-V2.2/clap_gen.py:34: FutureWarning: Pass sr=48000 as keyword args. From version 0.10 passing these as positional arguments will result in an error  
  audio = librosa.load(wav_path, 48000)[0]  
/content/Bert-vits2-V2.2/clap_gen.py:34: FutureWarning: Pass sr=48000 as keyword args. From version 0.10 passing these as positional arguments will result in an error  
  audio = librosa.load(wav_path, 48000)[0]  
/content/Bert-vits2-V2.2/clap_gen.py:34: FutureWarning: Pass sr=48000 as keyword args. From version 0.10 passing these as positional arguments will result in an error  
  audio = librosa.load(wav_path, 48000)[0]  
100% 36/36 [00:44<00:00,  1.23s/it]  
clap生成完畢!, 共有36個emo.pt生成!

Again 36, which means every clip needs one bert file and one clap file.
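That one-to-one pairing can be verified before training starts. The sketch below assumes the feature files sit next to each wav and share its base name, with the `.bert.pt` and `.emo.pt` suffixes mentioned in the logs above; the exact location is an assumption and `missing_features` is a hypothetical helper.

```python
from pathlib import Path


def missing_features(wav_dir, suffixes=(".bert.pt", ".emo.pt")):
    """Return the names of feature files that are expected next to each .wav but absent."""
    missing = []
    for wav_path in sorted(Path(wav_dir).glob("*.wav")):
        for suffix in suffixes:
            feature_path = wav_path.with_name(wav_path.stem + suffix)
            if not feature_path.is_file():
                missing.append(feature_path.name)
    return missing


# An empty list means every clip found has both feature files.
print(missing_features("Data/lilith/wavs"))
```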

Starting training

Everything is ready, so start training:

#@title Start training
!python3 train_ms.py

The program returns:

2023-12-19 03:17:48.852966: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered  
2023-12-19 03:17:48.853057: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered  
2023-12-19 03:17:48.992178: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered  
2023-12-19 03:17:49.268092: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.  
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.  
2023-12-19 03:17:51.369993: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT  
加載config中的配置localhost  
加載config中的配置10086  
加載config中的配置1  
加載config中的配置0  
加載config中的配置0  
加載環境變量
MASTER_ADDR: localhost,  
MASTER_PORT: 10086,  
WORLD_SIZE: 1,  
RANK: 0,  
LOCAL_RANK: 0  
12-19 03:17:55 INFO     | data_utils.py:66 | Init dataset...  
100% 32/32 [00:00<00:00, 51901.67it/s]  
12-19 03:17:55 INFO     | data_utils.py:81 | skipped: 0, total: 32  
12-19 03:17:55 INFO     | data_utils.py:66 | Init dataset...  
100% 4/4 [00:00<00:00, 34100.03it/s]  
12-19 03:17:55 INFO     | data_utils.py:81 | skipped: 0, total: 4  
Using noise scaled MAS for VITS2  
Using duration discriminator for VITS2  
INFO:models:Loaded checkpoint 'Data/lilith/models/DUR_0.pth' (iteration 0)  
ERROR:models:emb_g.weight is not in the checkpoint  
INFO:models:Loaded checkpoint 'Data/lilith/models/G_0.pth' (iteration 0)  
INFO:models:Loaded checkpoint 'Data/lilith/models/D_0.pth' (iteration 0)  
******************檢測到模型存在,epoch為 1,gloabl step為 0*********************  
  0% 0/8 [00:00<?, ?it/s][W reducer.cpp:1346] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())  
INFO:models:Train Epoch: 1 [0%]  
INFO:models:[2.78941011428833, 2.49017596244812, 5.66870641708374, 25.731149673461914, 4.624840259552002, 3.6382224559783936, 0, 0.0002]  
Evaluating ...  
INFO:models:Saving model and optimizer state at iteration 1 to Data/lilith/models/G_0.pth  
INFO:models:Saving model and optimizer state at iteration 1 to Data/lilith/models/D_0.pth  
INFO:models:Saving model and optimizer state at iteration 1 to Data/lilith/models/DUR_0.pth  
100% 8/8 [00:40<00:00,  5.05s/it]  
INFO:models:====> Epoch: 1  
100% 8/8 [00:09<00:00,  1.20s/it]  
INFO:models:====> Epoch: 2  
100% 8/8 [00:09<00:00,  1.23s/it]  
INFO:models:====> Epoch: 3  
100% 8/8 [00:09<00:00,  1.24s/it]  
INFO:models:====> Epoch: 4  
100% 8/8 [00:09<00:00,  1.25s/it]  
INFO:models:====> Epoch: 5  
100% 8/8 [00:10<00:00,  1.26s/it]  
INFO:models:====> Epoch: 6  
 25% 2/8 [00:02<00:08,  1.41s/it]INFO:models:Train Epoch: 7 [25%]

Training now proceeds on top of the base model.
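The bare list of numbers in the log is easier to follow with names attached. The sketch below labels the values of the line logged above. The field order (discriminator, generator, feature-matching, mel, duration, and KL losses, then global step and learning rate) is an assumption about how train_ms.py assembles the list and should be checked against the script; `label_loss_line` is a hypothetical helper.

```python
import ast

# ASSUMED order of the logged values; verify against train_ms.py before relying on it.
ASSUMED_FIELDS = (
    "loss_disc", "loss_gen", "loss_fm", "loss_mel",
    "loss_dur", "loss_kl", "global_step", "learning_rate",
)


def label_loss_line(log_line: str) -> dict:
    """Parse the bracketed list at the end of a log line and attach the assumed names."""
    values = ast.literal_eval(log_line[log_line.index("["):])
    if len(values) != len(ASSUMED_FIELDS):
        raise ValueError(f"expected {len(ASSUMED_FIELDS)} values, got {len(values)}")
    return dict(zip(ASSUMED_FIELDS, values))


# Line copied from the training output above.
line = (
    "INFO:models:[2.78941011428833, 2.49017596244812, 5.66870641708374, "
    "25.731149673461914, 4.624840259552002, 3.6382224559783936, 0, 0.0002]"
)

for name, value in label_loss_line(line).items():
    print(f"{name:>14}: {value}")
```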

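Online inference

After 100 training steps we can take a first look at the result:

Make sure the model name in config.yml in the project root matches the file name of the checkpoint saved during training: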

# webui configuration
# Note: a space is required after ":"
webui:
  # Inference device
  device: "cuda"
  # Model path
  model: "models/G_100.pth"
  # Config file path
  config_path: "configs/config.json"
  # Port number
  port: 7860
  # Whether to deploy publicly, open to the internet
  share: false
  # Whether to enable debug mode
  debug: false
  # Language identification library, either langid or fastlid
  language_identification_library: "langid"

Here the model parameter is set to: models/G_100.pth
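If you are unsure which checkpoint is the newest, the models directory can be scanned for the highest step number. This sketch assumes generator checkpoints follow the `G_<step>.pth` naming seen in this walkthrough; `latest_generator_checkpoint` is a hypothetical helper.

```python
import re
from pathlib import Path


def latest_generator_checkpoint(model_dir):
    """Return the Path of the G_<step>.pth file with the largest step, or None."""
    best_step, best_path = -1, None
    for path in Path(model_dir).glob("G_*.pth"):
        match = re.fullmatch(r"G_(\d+)\.pth", path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path


newest = latest_generator_checkpoint("Data/lilith/models")
if newest is not None:
    # This is the value to write into the model field of config.yml.
    print(f"models/{newest.name}")
else:
    print("no generator checkpoint found")
```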

Then create a new cell:

#@title Start inference
!python3 webui.py

The program returns:

Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Ignored unknown kwarg option normalize  
Some weights of EmotionModel were not initialized from the model checkpoint at ./emotional/wav2vec2-large-robust-12-ft-emotion-msp-dim and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1']  
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.  
| numexpr.utils | INFO | NumExpr defaulting to 2 threads.  
/usr/local/lib/python3.10/dist-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.  
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")  
| utils | INFO | Loaded checkpoint 'Data/lilith/models/G_100.pth' (iteration 13)  
推理頁面已開啟!  
Running on local URL:  http://127.0.0.1:7860  
Running on public URL: https://40b8695e0a18b0e2eb.gradio.live

One is a local address and the other is a public address. Open the public address https://40b8695e0a18b0e2eb.gradio.live to run inference.
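The public address is different on every run, so it has to be picked out of the output each time. A small sketch of doing that with a regular expression, assuming the `Running on public URL:` wording shown in the log above; `find_public_url` is a hypothetical helper.

```python
import re


def find_public_url(log_text: str):
    """Return the URL printed after 'Running on public URL:', or None if absent."""
    match = re.search(r"Running on public URL:\s*(https://\S+)", log_text)
    return match.group(1) if match else None


# Lines copied from the webui output above.
sample_log = """Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://40b8695e0a18b0e2eb.gradio.live"""

print(find_public_url(sample_log))
```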

Finally, here is the link to the Google Colab notebook:

https://colab.research.google.com/drive/1LgewU9jevSovP9NTuqTtoxDop3qeWWKK?usp=sharing

A toast to you all.
