LIC 2022 Video Semantic Understanding Baseline (Quick-Start Edition)

Published: 2023/12/29


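Reposted from AI Studio; original link:

LIC 2022 Video Semantic Understanding Baseline (Quick-Start Edition) - PaddlePaddle AI Studio

LIC2022 Video Semantic Understanding Baseline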

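This is the quick-start edition. It trains on a subset of the competition training set to shorten the loading time of the online project, so that participants can run the whole pipeline quickly: one-click training and prediction that produces a result file which can be submitted and scored.

This baseline scores slightly lower than the full-data baseline (🚩 link to follow), e.g., 0.33 vs 0.38.

Please start a GPU environment (model training and prediction cannot run in a CPU environment, and switching from a CPU to a GPU environment is very slow).

Each time the environment is opened, loading and synchronizing files takes about 15 minutes; please wait until loading has finished.

😃 AI Studio provides 8 free compute points per day, enough for up to 16 hours of GPU use. In addition, participants can apply for the 100 hours of free V100 GPU compute codes offered by AI Studio.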

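1. Background

This project is the quick-start baseline model for the 👉 LIC2022 Video Semantic Understanding technical evaluation task 👈. It is adapted from the GitHub baseline and provides participants with storage and compute. The model has two parts: 1) the video classification tag model paddle-video-classify-tag; 2) the video semantic tag model paddle-video-semantic-tag.

The video classification tag model classifies video content within a closed two-level label taxonomy and outputs classification tags that describe the video. The video semantic tag model extracts entity semantic tags from the video's text (participants can upgrade it, for example by reasoning over the provided knowledge base, fusing multimodal information to improve tag understanding, or generating tags). The tags produced by the two models correspond to the classification tags and the semantic tags in the evaluation, respectively.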

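2. Quick Start

2.1 Environment Setup

Run the following commands to add the library path.

Note: the dependency packages in the commented-out lines are already installed persistently in the project directory and do not need to be installed again.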

In [?]

# !mkdir /home/aistudio/external-libraries
# !pip install opencv-python -i https://mirror.baidu.com/pypi/simple -t /home/aistudio/external-libraries
# !pip install paddlenlp==2.0.1 -i https://mirror.baidu.com/pypi/simple -t /home/aistudio/external-libraries
# !pip install tqdm wget -t /home/aistudio/external-libraries
!tar -xvf external-libraries.tar
!rm external-libraries.tar
import os
import sys
sys.path.append('/home/aistudio/external-libraries')
!mkdir paddle-video-classify-tag && cd paddle-video-classify-tag && mkdir data
!mkdir paddle-video-semantic-tag && cd paddle-video-semantic-tag && mkdir data
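Several later cells append `/home/aistudio/external-libraries` to `sys.path` again. A small guard that appends a directory only once keeps the path free of duplicates when cells are re-run. This is a minimal sketch: `add_library_path` is a hypothetical helper that is not part of the baseline, and the demo uses a temporary directory in place of the real library folder.

```python
import os
import sys
import tempfile


def add_library_path(path):
    """Append `path` to sys.path once; return True if it was added."""
    path = os.path.abspath(path)
    if not os.path.isdir(path):
        raise FileNotFoundError("library directory not found: " + path)
    if path in sys.path:
        return False
    sys.path.append(path)
    return True


# Demo: a temporary directory stands in for
# /home/aistudio/external-libraries (assumption for illustration only).
demo_dir = tempfile.mkdtemp()
first = add_library_path(demo_dir)   # added on the first call
second = add_library_path(demo_dir)  # skipped on the second call
print(first, second)
```

In the notebook, the corresponding call would be `add_library_path('/home/aistudio/external-libraries')`.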

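2.2 Data Loading

The data includes:

Video information for the sample training set (a sample drawn from the competition training set), plus the officially provided TSN visual features
Video information for the leaderboard-A test set (the full leaderboard-A test set used in the competition), plus the officially provided TSN visual features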

In [?]

!tar -zxvf /home/aistudio/data/data142559/dataset_sample.tar.gz
!cd paddle-video-classify-tag && unzip /home/aistudio/data/data142559/tsn_features_test_a.zip
!cd paddle-video-classify-tag && unzip /home/aistudio/data/data142559/tsn_features_train_sample.zip
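Once the archives are unpacked, reading a record or two is a quick way to confirm the data is in place. The baseline's preparation code reads these files one JSON object per line and accesses the `@id`, `title`, `category`, and `tag` fields. The sketch below follows that layout; the record itself is synthetic (its values are made up), and `load_json_lines` is a hypothetical helper name.

```python
import json
import os
import tempfile


def load_json_lines(path):
    """Read a file with one JSON object per line into a list of dicts."""
    with open(path, "r", encoding="utf-8") as inf:
        return [json.loads(line) for line in inf if line.strip()]


# Synthetic record for illustration only: the field names follow the
# baseline code, the values are invented.
demo_record = {
    "@id": "demo_0001",
    "title": "demo title",
    "category": [
        {"@meta": {"type": "level1"}, "@value": "A"},
        {"@meta": {"type": "level2"}, "@value": "A1"},
    ],
    "tag": [{"@value": "demo"}],
}
demo_path = os.path.join(tempfile.mkdtemp(), "demo.json")
with open(demo_path, "w", encoding="utf-8") as ouf:
    ouf.write(json.dumps(demo_record, ensure_ascii=False) + "\n")

records = load_json_lines(demo_path)
# Collapse the category list into a {level: label} mapping.
category = {c["@meta"]["type"]: c["@value"] for c in records[0]["category"]}
print(len(records), records[0]["@id"], category)
```

In the notebook, the same function could be pointed at `/home/aistudio/dataset_sample/train.sample.json`.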

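2.3 Video Classification Tag Baseline

This baseline is built on VideoTag, PaddlePaddle's large-scale video classification model. It classifies video content within a closed two-level label taxonomy and outputs classification tags that describe the video.

2.3.1 Data Preparation

The video classification tag model uses a TSN network to extract vector representations of the raw videos. Because this step is time-consuming, we provide the TSN features for all videos; running the commands in section 2.2 (Data Loading) is enough.

The dataset has two levels of labels. We run classification experiments separately under the level-1 label (level1) and level-2 label (level2) settings, and each setting needs its own training, validation, and test splits. The preparation step produces the following output:

paddle-video-classify-tag
   |-- weights
      |-- attention_lstm.pdmodel
      |-- attention_lstm.pdopt
      |-- attention_lstm.pdparams

Run the following code to prepare the label set of the video semantic understanding dataset and the sample lists for training, validation, and testing.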

In [?]

import os
import os.path as osp
import requests
import time
import codecs
import json
import argparse
import random


def create_splits_indice(n_samples, SPLITS):
    assert sum([v for k, v in SPLITS]) == 1.0
    indices = list(range(n_samples))
    random.shuffle(indices)
    split2indice = {}
    r_offset = 0
    for idx, (split, ratio) in enumerate(SPLITS):
        l_offset = r_offset
        if idx == len(SPLITS) - 1:
            r_offset = n_samples
        else:
            r_offset = int(n_samples * ratio) + l_offset
        split2indice[split] = indices[l_offset:r_offset]
    return split2indice


def prepare_split(data, split_name, test_only=False, gather_labels=False, classify_tag_dir='/home/aistudio/paddle-video-classify-tag'):
    '''
    1. Prepare ALL (unique) labels for classification from trainval-set.
    2. For each split, generate sample list for level1 & level2 classification.
    '''
    trainval_tsn_feature_dir = '/home/aistudio/paddle-video-classify-tag/tsn_features_train_sample'
    test_tsn_feature_dir = '/home/aistudio/paddle-video-classify-tag/tsn_features_test_a'
    cls_data_dir = os.path.join(classify_tag_dir, 'data')
    if not os.path.exists(cls_data_dir):
        os.mkdir(cls_data_dir)

    sample_nids = [sample["@id"] for sample in data]
    level1_labels = []
    level2_labels = []
    if not test_only:
        for sample in data:
            category = {
                each["@meta"]["type"]: each["@value"]
                for each in sample["category"]
            }
            level1_labels.append(category["level1"])
            level2_labels.append(category["level2"])

    def create_sample_list(sample_labels, level_name):
        save_label_file = cls_data_dir + "/{}_label.txt".format(level_name)
        if gather_labels:
            # For trainval set:
            # Gather candidate labels and dump to {level1,level2}_label.txt
            labels = sorted([str(label) for label in list(set(sample_labels))])
            with codecs.open(save_label_file, "w", encoding="utf-8") as ouf:
                ouf.writelines([label + "\n" for label in labels])
            print("Saved " + save_label_file)
        else:
            # For test set: load existing labels.
            with codecs.open(save_label_file, "r", encoding="utf-8") as inf:
                labels = [line.strip() for line in inf.readlines()]
        label2idx = {label: idx for idx, label in enumerate(labels)}
        sample_lines = []
        # Generate sample list: one sample per line (feature_path -> label)
        for i in range(len(sample_nids)):
            label_indice = label2idx[str(sample_labels[i])] if not test_only \
                else -1
            if split_name in ["train", "val", "trainval"]:
                tsn_feature_dir = trainval_tsn_feature_dir
            elif split_name in ["test"]:
                tsn_feature_dir = test_tsn_feature_dir
            feature_path = osp.join(tsn_feature_dir,
                                    "{}.npy".format(sample_nids[i]))
            if osp.exists(feature_path):
                line = "{} {}\n".format(feature_path, str(label_indice))
                sample_lines.append(line)
        save_split_file = cls_data_dir + "/{}_{}.list".format(level_name, split_name)
        with codecs.open(save_split_file, "w", encoding="utf-8") as ouf:
            ouf.writelines(sample_lines)
        print("Saved {}, size={}".format(save_split_file,
                                         len(sample_lines)))

    create_sample_list(level1_labels, "level1")
    create_sample_list(level2_labels, "level2")


random.seed(6666)
classify_tag_dir = '/home/aistudio/paddle-video-classify-tag'
if not os.path.exists(classify_tag_dir):
    os.mkdir(classify_tag_dir)
trainval_path = '/home/aistudio/dataset_sample/train.sample.json'
test_path = '/home/aistudio/dataset_sample/test_a.json'

# load data for train & validation (have labels).
with codecs.open(trainval_path, "r", encoding="utf-8") as inf:
    print("Loading {}...".format(trainval_path))
    lines = inf.readlines()
    trainval_data = [json.loads(line) for line in lines]

# load data for test (no labels).
with codecs.open(test_path, "r", encoding="utf-8") as inf:
    print("Loading {}...".format(test_path))
    lines = inf.readlines()
    test_data = [json.loads(line) for line in lines]

# split the trainval data into train-set(80%) and validation-set(20%).
split2indice = create_splits_indice(len(trainval_data), [
    ("train", 4.0 / 5.0),
    ("val", 1.0 / 5.0),
])
train_data = [trainval_data[idx] for idx in split2indice["train"]]
val_data = [trainval_data[idx] for idx in split2indice["val"]]

prepare_split(trainval_data, "trainval", gather_labels=True)
prepare_split(train_data, "train")
prepare_split(val_data, "val")
prepare_split(test_data, "test", test_only=True)

Loading /home/aistudio/dataset_sample/train.sample.json...
Loading /home/aistudio/dataset_sample/test_a.json...
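To see how the 80/20 split behaves, the splitting logic can be run on a small index range. The sketch below restates `create_splits_indice` from the cell above so that it runs on its own; the sample count of 10 is arbitrary, and the ratio check uses a small tolerance instead of exact float equality, which is a change from the original.

```python
import random


def create_splits_indice(n_samples, splits):
    """Shuffle indices and cut them into consecutive slices by ratio."""
    # Tolerance check instead of `== 1.0` (a deviation from the baseline).
    assert abs(sum(ratio for _, ratio in splits) - 1.0) < 1e-9
    indices = list(range(n_samples))
    random.shuffle(indices)
    split2indice = {}
    r_offset = 0
    for idx, (split, ratio) in enumerate(splits):
        l_offset = r_offset
        if idx == len(splits) - 1:
            # The last split takes everything that remains, so integer
            # truncation never drops a sample.
            r_offset = n_samples
        else:
            r_offset = int(n_samples * ratio) + l_offset
        split2indice[split] = indices[l_offset:r_offset]
    return split2indice


random.seed(6666)
demo = create_splits_indice(10, [("train", 4.0 / 5.0), ("val", 1.0 / 5.0)])
print(len(demo["train"]), len(demo["val"]))
```

Because the splits are consecutive slices of one shuffled list, no sample can appear in both the training and validation sets.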

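2.3.2 Training and Validation

Run the following commands to train the classification model.

Participants can refer to the model fine-tuning guide in the original code repository for more information.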

In [4]

import os
import sys
import argparse
import ast
import logging
import paddle
paddle.enable_static()
sys.path.append('/home/aistudio/external-libraries')
sys.path.append('/home/aistudio/work/paddle-video-classify-tag')

from utils.train_utils import train_with_dataloader
import models
from utils.config_utils import *
from reader import get_reader
from metrics import get_metrics
from utils.utility import check_cuda
from utils.utility import check_version

logging.root.handlers = []
FORMAT = '[%(levelname)s: %(filename)s: %(lineno)4d]: %(message)s'
logging.basicConfig(level=logging.INFO, format=FORMAT)
logger = logging.getLogger(__name__)


class Args():
    model_name = 'AttentionLSTM'
    config = '/home/aistudio/work/paddle-video-classify-tag/configs/attention_lstm-single-level1.yaml'
    batch_size = None
    learning_rate = None
    pretrain = '/home/aistudio/work/paddle-video-classify-tag/weights/attention_lstm'
    use_gpu = True
    no_memory_optimize = False
    epoch = None
    valid_interval = 1
    save_dir = os.path.join('paddle-video-classify-tag', 'data', 'checkpoints', 'level1')
    log_interval = 50
    fix_random_seed = False


def train(args):
    # parse config
    config = parse_config(args.config)
    train_config = merge_configs(config, 'train', vars(args))
    valid_config = merge_configs(config, 'valid', vars(args))
    # print_configs(train_config, 'Train')
    train_model = models.get_model(args.model_name, train_config, mode='train')
    valid_model = models.get_model(args.model_name, valid_config, mode='valid')

    # build model
    startup = paddle.static.Program()
    train_prog = paddle.static.Program()
    if args.fix_random_seed:
        startup.random_seed = 1000
        train_prog.random_seed = 1000
    with paddle.static.program_guard(train_prog, startup):
        with paddle.utils.unique_name.guard():
            train_model.build_input(use_dataloader=True)
            train_model.build_model()
            # for the input, has the form [data1, data2,..., label], so train_feeds[-1] is label
            train_feeds = train_model.feeds()
            train_fetch_list = train_model.fetches()
            train_loss = train_fetch_list[0]
            optimizer = train_model.optimizer()
            optimizer.minimize(train_loss)
            train_dataloader = train_model.dataloader()

    valid_prog = paddle.static.Program()
    with paddle.static.program_guard(valid_prog, startup):
        with paddle.utils.unique_name.guard():
            valid_model.build_input(use_dataloader=True)
            valid_model.build_model()
            valid_feeds = valid_model.feeds()
            valid_fetch_list = valid_model.fetches()
            valid_dataloader = valid_model.dataloader()

    place = paddle.CUDAPlace(0) if args.use_gpu else paddle.CPUPlace()
    exe = paddle.static.Executor(place)
    exe.run(startup)

    if args.pretrain:
        train_model.load_pretrain_params(exe, args.pretrain, train_prog)

    build_strategy = paddle.fluid.compiler.BuildStrategy()
    build_strategy.enable_inplace = True

    exec_strategy = paddle.static.ExecutionStrategy()

    compiled_train_prog = paddle.static.CompiledProgram(
        train_prog).with_data_parallel(
            loss_name=train_loss.name,
            build_strategy=build_strategy,
            exec_strategy=exec_strategy)
    compiled_valid_prog = paddle.static.CompiledProgram(
        valid_prog).with_data_parallel(
            share_vars_from=compiled_train_prog,
            build_strategy=build_strategy,
            exec_strategy=exec_strategy)

    # get reader
    bs_denominator = 1
    if args.use_gpu:
        # check number of GPUs
        gpus = os.getenv("CUDA_VISIBLE_DEVICES", "")
        if gpus == "":
            pass
        else:
            gpus = gpus.split(",")
            num_gpus = len(gpus)
            assert num_gpus == train_config.TRAIN.num_gpus, \
                "num_gpus({}) set by CUDA_VISIBLE_DEVICES " \
                "shoud be the same as that " \
                "set in {}({})".format(
                    num_gpus, args.config, train_config.TRAIN.num_gpus)
        bs_denominator = train_config.TRAIN.num_gpus

    train_config.TRAIN.batch_size = int(train_config.TRAIN.batch_size /
                                        bs_denominator)
    valid_config.VALID.batch_size = int(valid_config.VALID.batch_size /
                                        bs_denominator)
    train_reader = get_reader(args.model_name.upper(), 'train', train_config)
    valid_reader = get_reader(args.model_name.upper(), 'valid', valid_config)

    # get metrics
    train_metrics = get_metrics(args.model_name.upper(), 'train', train_config)
    valid_metrics = get_metrics(args.model_name.upper(), 'valid', valid_config)

    epochs = args.epoch or train_model.epoch_num()

    exe_places = paddle.static.cuda_places() if args.use_gpu else paddle.static.cpu_places()
    train_dataloader.set_sample_list_generator(train_reader, places=exe_places)
    valid_dataloader.set_sample_list_generator(valid_reader, places=exe_places)

    train_with_dataloader(
        exe,
        train_prog,
        compiled_train_prog,
        train_dataloader,
        train_fetch_list,
        train_metrics,
        epochs=epochs,
        log_interval=args.log_interval,
        valid_interval=args.valid_interval,
        save_dir=args.save_dir,
        save_model_name=args.model_name,
        fix_random_seed=args.fix_random_seed,
        compiled_test_prog=compiled_valid_prog,
        test_dataloader=valid_dataloader,
        test_fetch_list=valid_fetch_list,
        test_metrics=valid_metrics)


args = Args()

# check whether the installed paddle is compiled with GPU
check_cuda(args.use_gpu)
check_version()

args.model_name = 'AttentionLSTM'
args.log_interval = 50

# first layer
args.config = '/home/aistudio/work/paddle-video-classify-tag/configs/attention_lstm-single-level1.yaml'
args.save_dir = '/home/aistudio/paddle-video-classify-tag/data/checkpoints/level1'
if not os.path.exists(args.save_dir):
    os.makedirs(args.save_dir)
train(args)

# second layer
args.config = '/home/aistudio/work/paddle-video-classify-tag/configs/attention_lstm-single-level2.yaml'
args.save_dir = '/home/aistudio/paddle-video-classify-tag/data/checkpoints/level2'
if not os.path.exists(args.save_dir):
    os.makedirs(args.save_dir)
train(args)

[INFO: regularizer.py: 101]: If regularizer of a Parameter has been set by 'fluid.ParamAttr' or 'fluid.WeightNormParamAttr' already. The Regularization[L2Decay, regularization_coeff=0.000800] in Optimizer will not take effect, and it will only be applied to other Parameters!
[INFO: attention_lstm.py: 164]: Load pretrain weights from /home/aistudio/work/paddle-video-classify-tag/weights/attention_lstm, exclude fc layer.
[INFO: train_utils.py: 45]: ------- learning rate [0.000125], learning rate counter [-] -----
[INFO: metrics_util.py: 80]: [TRAIN 2022-04-28 22:05:55] Epoch 0, iter 0, time 2.767573356628418, , loss = 2923.538330, Hit@1 = 0.02, PERR = 0.02, GAP = 0.02
[INFO: train_utils.py: 122]: [TRAIN] Epoch 0 training finished, average time: 1.7017907901686065
share_vars_from is set, scope is ignored.
[INFO: metrics_util.py: 80]: [TEST] test_iter 0 , loss = 302.009247, Hit@1 = 0.53, PERR = 0.53, GAP = 0.60
[INFO: metrics_util.py: 124]: [TEST] Epoch0 Finish avg_hit_at_one: 0.505859375, avg_perr: 0.505859375, avg_loss :329.6025136311849, aps: [... per-class AP values omitted ...], gap:0.5390077300563219
[INFO: train_utils.py: 45]: ------- learning rate [0.000125], learning rate counter [-] -----
[INFO: metrics_util.py: 80]: [TRAIN 2022-04-28 22:08:10] Epoch 1, iter 0, time 2.6242427825927734, , loss = 364.424683, Hit@1 = 0.48, PERR = 0.48, GAP = 0.44
[INFO: train_utils.py: 122]: [TRAIN] Epoch 1 training finished, average time: 1.7430822508675712
[INFO: metrics_util.py: 80]: [TEST] test_iter 0 , loss = 248.946411, Hit@1 = 0.62, PERR = 0.62, GAP = 0.70
[INFO: metrics_util.py: 124]: [TEST] Epoch1 Finish avg_hit_at_one: 0.5944010416666666, avg_perr: 0.5944010416666666, avg_loss :282.882386525472, aps: [... per-class AP values omitted ...], gap:0.6346135299340665
[INFO: train_utils.py: 45]: ------- learning rate [0.000125], learning rate counter [-] -----
[INFO: metrics_util.py: 80]: [TRAIN 2022-04-28 22:10:24] Epoch 2, iter 0, time 1.708108901977539, , loss = 315.439453, Hit@1 = 0.55, PERR = 0.55, GAP = 0.56
[INFO: train_utils.py: 122]: [TRAIN] Epoch 2 training finished, average time: 1.795787981578282
[INFO: metrics_util.py: 80]: [TEST] test_iter 0 , loss = 235.408524, Hit@1 = 0.63, PERR = 0.63, GAP = 0.72
[INFO: metrics_util.py: 124]: [TEST] Epoch2 Finish avg_hit_at_one: 0.6002604166666666, avg_perr: 0.6002604166666666, avg_loss :271.76447041829425, aps: [... per-class AP values omitted ...], gap:0.6512923596529407
[INFO: regularizer.py: 101]: If regularizer of a Parameter has been set by 'fluid.ParamAttr' or 'fluid.WeightNormParamAttr' already. The Regularization[L2Decay, regularization_coeff=0.000800] in Optimizer will not take effect, and it will only be applied to other Parameters!
[INFO: attention_lstm.py: 164]: Load pretrain weights from /home/aistudio/work/paddle-video-classify-tag/weights/attention_lstm, exclude fc layer.
[INFO: train_utils.py: 45]: ------- learning rate [0.000125], learning rate counter [-] -----
[INFO: metrics_util.py: 80]: [TRAIN 2022-04-28 22:13:03] Epoch 0, iter 0, time 2.3079347610473633, , loss = 24682.550781, Hit@1 = 0.00, PERR = 0.00, GAP = 0.00
[INFO: train_utils.py: 122]: [TRAIN] Epoch 0 training finished, average time: 1.7917025527175592
share_vars_from is set, scope is ignored.
[INFO: metrics_util.py: 80]: [TEST] test_iter 0 , loss = 600.804932, Hit@1 = 0.30, PERR = 0.30, GAP = 0.24
[INFO: metrics_util.py: 124]: [TEST] Epoch0 Finish avg_hit_at_one: 0.21549479166666666, avg_perr: 0.21549479166666666, avg_loss :640.4301401774088, aps: [... per-class AP values omitted ...], gap:0.1536302281405276
[INFO: train_utils.py: 45]: ------- learning rate [0.000125], learning rate counter [-] -----
[INFO: metrics_util.py: 80]: [TRAIN 2022-04-28 22:15:21] Epoch 1, iter 0, time 2.288398265838623, , loss = 643.896606, Hit@1 = 0.26, PERR = 0.26, GAP = 0.17
[INFO: train_utils.py: 122]: [TRAIN] Epoch 1 training finished, average time: 1.7937103728858792
[INFO: metrics_util.py: 80]: [TEST] test_iter 0 , loss = 506.925232, Hit@1 = 0.42, PERR = 0.42, GAP = 0.35
[INFO: metrics_util.py: 124]: [TEST] Epoch1 Finish avg_hit_at_one: 0.322265625, avg_perr: 0.322265625, avg_loss :543.9810791015625, aps: [... per-class AP values omitted ...], gap:0.2629719332516419
[INFO: train_utils.py: 45]: ------- learning rate [0.000125], learning rate counter [-] -----
[INFO: metrics_util.py: 80]: [TRAIN 2022-04-28 22:17:41] Epoch 2, iter 0, time 2.2935118675231934, , loss = 554.811279, Hit@1 = 0.31, PERR = 0.31, GAP = 0.25
[INFO: train_utils.py: 122]: [TRAIN] Epoch 2 training finished, average time: 1.8651155549652723
[INFO: metrics_util.py: 80]: [TEST] test_iter 0 , loss = 457.075806, Hit@1 = 0.42, PERR = 0.42, GAP = 0.43
[INFO: metrics_util.py: 124]: [TEST] Epoch2 Finish avg_hit_at_one: 0.3645833333333333, avg_perr: 0.3645833333333333, avg_loss :490.911376953125, aps: [... per-class AP values omitted ...], gap:0.35114517815351143

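2.3.3 Generating Classification Tag Results

Run the following code block to generate the tag predictions.

The generated tag results are stored in ./predict_results/level{1, 2}_top1.json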

In [1]

import os
import sys
import time
import logging
import argparse
import ast
import numpy as np
import paddle
try:
    import cPickle as pickle
except:
    import pickle
sys.path.append('/home/aistudio/external-libraries')
sys.path.append('/home/aistudio/work/paddle-video-classify-tag')

from utils.config_utils import *
import models
from reader import get_reader
from metrics import get_metrics
from utils.utility import check_cuda
from utils.utility import check_version

logging.root.handlers = []
FORMAT = '[%(levelname)s: %(filename)s: %(lineno)4d]: %(message)s'
logging.basicConfig(level=logging.DEBUG, format=FORMAT)
logger = logging.getLogger(__name__)


class Args():
    model_name = 'AttentionLSTM'
    config = '/home/aistudio/work/paddle-video-classify-tag/configs/attention_lstm-single-level1.yaml'
    use_gpu = True
    batch_size = None
    weights = '/home/aistudio/paddle-video-classify-tag/data/checkpoints/level1/AttentionLSTM_epoch2.pdparams'
    filelist = None
    log_interval = 200
    infer_topk = 10
    save_dir = './predict_results'
    save_file = "top1.json"
    label_file = '/home/aistudio/paddle-video-classify-tag/data/level1_label.txt'
    video_path = None


def infer(args):
    # parse config
    config = parse_config(args.config)
    infer_config = merge_configs(config, 'infer', vars(args))
    print_configs(infer_config, "Infer")
    infer_model = models.get_model(args.model_name, infer_config, mode='infer')
    infer_model.build_input(use_dataloader=False)
    infer_model.build_model()
    infer_feeds = infer_model.feeds()
    infer_outputs = infer_model.outputs()

    place = paddle.CUDAPlace(0) if args.use_gpu else paddle.CPUPlace()
    exe = paddle.static.Executor(place)

    exe.run(paddle.static.default_startup_program())

    filelist = args.filelist or infer_config.INFER.filelist
    filepath = args.video_path or infer_config.INFER.get('filepath', '')
    if filepath != '':
        assert os.path.exists(filepath), "{} not exist.".format(filepath)
    else:
        assert os.path.exists(filelist), "{} not exist.".format(filelist)

    # get infer reader
    infer_reader = get_reader(args.model_name.upper(), 'infer', infer_config)

    if args.weights:
        assert os.path.exists(
            args.weights), "Given weight dir {} not exist.".format(args.weights)
    # if no weight files specified, download weights from paddle
    weights = args.weights or infer_model.get_weights()

    infer_model.load_test_weights(exe, weights,
                                  paddle.static.default_main_program())

    infer_feeder = paddle.fluid.DataFeeder(place=place, feed_list=infer_feeds)
    fetch_list = infer_model.fetches()

    infer_metrics = get_metrics(args.model_name.upper(), 'infer', infer_config)
    infer_metrics.reset()

    periods = []
    cur_time = time.time()
    for infer_iter, data in enumerate(infer_reader()):
        data_feed_in = [items[:-1] for items in data]
        video_id = [items[-1] for items in data]
        infer_outs = exe.run(fetch_list=fetch_list,
                             feed=infer_feeder.feed(data_feed_in))
        infer_result_list = [item for item in infer_outs] + [video_id]

        prev_time = cur_time
        cur_time = time.time()
        period = cur_time - prev_time
        periods.append(period)

        infer_metrics.accumulate(infer_result_list)

        if args.log_interval > 0 and infer_iter % args.log_interval == 0:
            logger.info('Processed {} samples'.format((infer_iter + 1) * len(
                video_id)))

    logger.info('[INFER] infer finished. average time: {}'.format(
        np.mean(periods)))

    if not os.path.isdir(args.save_dir):
        os.makedirs(args.save_dir)

    infer_metrics.finalize_and_log_out(
        savedir=args.save_dir,
        savefile=args.save_file,
        label_file=args.label_file)


args = Args()

# level-1 labels
args.config = '/home/aistudio/work/paddle-video-classify-tag/configs/attention_lstm-single-level1.yaml'
args.weights = '/home/aistudio/paddle-video-classify-tag/data/checkpoints/level1/AttentionLSTM_epoch2.pdparams'
args.label_file = '/home/aistudio/paddle-video-classify-tag/data/level1_label.txt'
args.save_file = 'level1_top1.json'
infer(args)

# level-2 labels
args.config = '/home/aistudio/work/paddle-video-classify-tag/configs/attention_lstm-single-level2.yaml'
args.weights = '/home/aistudio/paddle-video-classify-tag/data/checkpoints/level2/AttentionLSTM_epoch2.pdparams'
args.label_file = '/home/aistudio/paddle-video-classify-tag/data/level2_label.txt'
args.save_file = 'level2_top1.json'
infer(args)

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  def convert_to_list(value, n, name, dtype=np.int):
[INFO: config_utils.py: 70]: ---------------- Infer Arguments ----------------
[INFO: config_utils.py: 72]: MODEL:
[INFO: config_utils.py: 74]: name:AttentionLSTM
[INFO: config_utils.py: 74]: dataset:YouTube-8M
[INFO: config_utils.py: 74]: bone_nework:None
[INFO: config_utils.py: 74]: drop_rate:0.5
[INFO: config_utils.py: 74]: feature_names:['rgb']
[INFO: config_utils.py: 74]: feature_dims:[2048]
[INFO: config_utils.py: 74]: embedding_size:1024
[INFO: config_utils.py: 74]: lstm_size:512
[INFO: config_utils.py: 74]: num_classes:278
[INFO: config_utils.py: 74]: topk:20
[INFO: config_utils.py: 72]: TRAIN:
[INFO: config_utils.py: 74]: epoch:3
[INFO: config_utils.py: 74]: learning_rate:0.000125
[INFO: config_utils.py: 74]: decay_epochs:[5]
[INFO: config_utils.py: 74]: decay_gamma:0.1
[INFO: config_utils.py: 74]: weight_decay:0.0008
[INFO: config_utils.py: 74]: num_samples:35952
[INFO: config_utils.py: 74]: pretrain_base:None
[INFO: config_utils.py: 74]: batch_size:128
[INFO: config_utils.py: 74]: use_gpu:True
[INFO: config_utils.py: 74]: num_gpus:1
[INFO: config_utils.py: 74]: filelist:/home/aistudio/paddle-video-classify-tag/data/level2_train.list
[INFO: config_utils.py: 72]: VALID:
[INFO: config_utils.py: 74]: batch_size:128
[INFO: config_utils.py: 74]: filelist:/home/aistudio/paddle-video-classify-tag/data/level2_val.list
[INFO: config_utils.py: 72]: TEST:
[INFO: config_utils.py: 74]: batch_size:128
[INFO: config_utils.py: 74]: filelist:/home/aistudio/paddle-video-classify-tag/data/level2_val.list
[INFO: config_utils.py: 72]: INFER:
[INFO: config_utils.py: 74]: batch_size:1
[INFO: config_utils.py: 74]: filelist:/home/aistudio/paddle-video-classify-tag/data/level2_test.list
[INFO: config_utils.py: 75]: -------------------------------------------------
W0428 23:33:54.649835 8407 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1
W0428 23:33:54.655040 8407 device_context.cc:372] device: 0, cuDNN Version: 7.6.
[INFO: 3560385804.py: 106]: Processed 1 samples
[INFO: 3560385804.py: 106]: Processed 201 samples
[INFO: 3560385804.py: 106]: Processed 401 samples
... (one line every 200 samples) ...
[INFO: 3560385804.py: 106]: Processed 9801 samples
[INFO: 3560385804.py: 109]: [INFER] infer finished. average time: 0.03070716172327805
[INFO: metrics_util.py: 119]: Saved ./predict_results/level2_top1.json

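2.4 Video semantic tag baseline

This baseline provides the ability to understand video semantic tags. It extracts, from the textual information of a video, the semantic tags that represent the main theme of the video content. Participants are free to upgrade it, for example by reasoning over the provided knowledge base, by fusing multimodal information to improve tag understanding, or by generating tags.

2.4.1 Data processing

First, convert the data into the format required by the named entity recognition (NER) model and split it into a training set, a validation set, and so on. The PaddleNLP Chinese named entity recognition project can serve as a reference.

Note: during data processing we removed the semantic tags that do not appear in the title.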
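
The tagging scheme described above can be illustrated with a minimal sketch. The title and tags below are made up for illustration, and `tag_title` is a hypothetical helper, not part of the baseline; it mirrors the idea of marking the first occurrence of each tag in the title with B-ENT/I-ENT and dropping tags that do not occur in the title.

```python
# Toy illustration of the B-ENT / I-ENT / O scheme.
# The title and the tag list are hypothetical examples.

def tag_title(title, entities):
    """Return (text, tags, kept); kept lists the entities found in the title."""
    # Whitespace is removed first, as in the data processing cell.
    text = title.replace(" ", "").replace("\t", "")
    tags = ["O"] * len(text)
    kept = []
    for surf in entities:
        start = text.find(surf) if surf else -1
        if start == -1:
            continue  # the tag does not occur in the title, so it is dropped
        tags[start] = "B-ENT"
        for i in range(start + 1, start + len(surf)):
            tags[i] = "I-ENT"
        kept.append(surf)
    return text, tags, kept


text, tags, kept = tag_title("周杰伦 演唱会", ["周杰伦", "流行音乐"])
print(list(zip(text, tags)))
print(kept)
```

Only the first of the two tags survives here, because the second one never appears in the title text.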

In [8]

import os
import sys
import pandas as pd
import json
import codecs
import argparse
import random

sys.path.append('/home/aistudio/external-libraries')

TAG_NAMES = ["B-ENT", "I-ENT", "O"]


class Args():
    trainval_path = '/home/aistudio/dataset_sample/train.sample.json'
    test_path = '/home/aistudio/dataset_sample/test_a.json'


def gather_text_and_tags(sample, test_only=False):
    def fill_tags(surf):
        '''
        For entities that appear in text, replace their tags with 'B-ENT/I-ENT'.
        Only the first occurrence of the entity in the text is tagged.
        '''
        s_idx = text.find(surf)
        if s_idx != -1:
            tags[s_idx] = TAG_NAMES[0]
            for i in range(s_idx + 1, s_idx + len(surf)):
                tags[i] = TAG_NAMES[1]
            return 1
        return 0

    text = sample["title"].replace(" ", "").replace("\t", "")
    # init tag sequence with all 'O's.
    tags = [TAG_NAMES[2] for i in range(len(text))]
    entities = []
    if not test_only:
        entities = [each["@value"] for each in sample["tag"]]
    # annotate 'B-ENT' and 'I-ENT' tags.
    n_bingo_entities = sum(
        [fill_tags(surf) for surf in entities if len(surf) > 0])
    # statistics
    stats = {
        "txt_length": len(text),
        "n_entities": len(entities),
        "n_bingo_entities": n_bingo_entities,
    }
    return text, tags, stats


def stat_numberic_list(li, name="default"):
    assert isinstance(li, list)
    stat = {}
    stat["size"] = len(li)
    if all(isinstance(x, int) for x in li):
        stat["max"] = max(li)
        stat["min"] = min(li)
        stat["sum"] = sum(li)
        stat["avr"] = stat["sum"] / float(len(li))
    print("list-%s:\n\t%s" % (name, str(stat)))


def analyze_annots(stats_list):
    for key in ["txt_length", "n_entities", "n_bingo_entities"]:
        numbers = [stats[key] for stats in stats_list]
        stat_numberic_list(numbers, name=key)


def prepare_split(data, split_name, test_only=False):
    sample_lines = []
    nid_lines = []
    stats_list = []
    for idx in range(len(data)):
        text, tags, stats = gather_text_and_tags(data[idx], test_only=test_only)
        if len(text) == 0:
            continue
        # proper data format.
        text = '\002'.join([ch for ch in text])
        tags = '\002'.join(tags)
        sample_lines.append('\t'.join([text, tags]) + "\n")
        nid_lines.append(data[idx]["@id"] + "\n")
        stats_list.append(stats)

    if split_name == "trainval":
        # print statistics.
        analyze_annots(stats_list)

    save_split_file = "/home/aistudio/paddle-video-semantic-tag/data/{}.tsv".format(split_name)
    with codecs.open(save_split_file, "w", encoding="utf-8") as ouf:
        ouf.writelines(sample_lines)
    print("Saved {}, size={}".format(save_split_file, len(sample_lines)))

    # nids.txt is overwritten on every call, so it ends up holding the ids
    # of the split processed last (the test split below).
    with codecs.open("/home/aistudio/paddle-video-semantic-tag/data/nids.txt", "w", encoding="utf-8") as ouf:
        ouf.writelines(nid_lines)


def create_splits_indice(n_samples, SPLITS):
    assert sum([v for k, v in SPLITS]) == 1.0
    indices = list(range(n_samples))
    random.shuffle(indices)
    split2indice = {}
    r_offset = 0
    for idx, (split, ratio) in enumerate(SPLITS):
        l_offset = r_offset
        if idx == len(SPLITS) - 1:
            r_offset = n_samples
        else:
            r_offset = int(n_samples * ratio) + l_offset
        split2indice[split] = indices[l_offset:r_offset]
    return split2indice


args = Args()
random.seed(6666)

# load data for train & validation (have labels).
with codecs.open(args.trainval_path, "r", encoding="utf-8") as inf:
    print("Loading {}...".format(args.trainval_path))
    lines = inf.readlines()
    trainval_data = [json.loads(line) for line in lines]

# load data for test (no labels).
with codecs.open(args.test_path, "r", encoding="utf-8") as inf:
    print("Loading {}...".format(args.test_path))
    lines = inf.readlines()
    test_data = [json.loads(line) for line in lines]

# split the trainval data into train-set(75%) and validation-set(25%).
split2indice = create_splits_indice(len(trainval_data), [
    ("train", 3.0 / 4.0),
    ("val", 1.0 / 4.0),
])
train_data = [trainval_data[idx] for idx in split2indice["train"]]
val_data = [trainval_data[idx] for idx in split2indice["val"]]

label_map_file = "/home/aistudio/paddle-video-semantic-tag/data/label_map.json"
with open(label_map_file, "w") as ouf:
    json.dump({tag: idx for idx, tag in enumerate(TAG_NAMES)}, ouf)
print("Saved " + label_map_file)

prepare_split(trainval_data, "trainval")
prepare_split(train_data, "train")
prepare_split(val_data, "val")
prepare_split(test_data, "test", test_only=True)

Loading /home/aistudio/dataset_sample/train.sample.json...
Loading /home/aistudio/dataset_sample/test_a.json...
Saved /home/aistudio/paddle-video-semantic-tag/data/label_map.json
list-txt_length:
	{'size': 8079, 'max': 67, 'min': 1, 'sum': 118072, 'avr': 14.614680034657754}
list-n_entities:
	{'size': 8079, 'max': 38, 'min': 0, 'sum': 21425, 'avr': 2.6519371209308082}
list-n_bingo_entities:
	{'size': 8079, 'max': 8, 'min': 0, 'sum': 8292, 'avr': 1.0263646490902338}
Saved /home/aistudio/paddle-video-semantic-tag/data/trainval.tsv, size=8079
Saved /home/aistudio/paddle-video-semantic-tag/data/train.tsv, size=6061
Saved /home/aistudio/paddle-video-semantic-tag/data/val.tsv, size=2018
Saved /home/aistudio/paddle-video-semantic-tag/data/test.tsv, size=9938
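
The .tsv files written above hold one sample per line: the characters and the tags are each joined with the `\002` separator, and the two columns are separated by a tab. The following sketch round-trips one toy sample through that format. `encode_sample` and `decode_line` are hypothetical helper names introduced here for illustration; the sample itself is made up.

```python
SEP = "\002"  # the separator used by the data processing cell


def encode_sample(text, tags):
    """Build one .tsv line: chars and tags joined by SEP, columns by a tab."""
    return "\t".join([SEP.join(text), SEP.join(tags)]) + "\n"


def decode_line(line, tag2label):
    """Split one .tsv line back into tokens and integer labels."""
    chars, tags = line.strip().split("\t")
    return chars.split(SEP), [tag2label[t] for t in tags.split(SEP)]


# Same mapping that the cell stores in label_map.json.
tag2label = {"B-ENT": 0, "I-ENT": 1, "O": 2}

line = encode_sample("周杰伦演唱会", ["B-ENT", "I-ENT", "I-ENT", "O", "O", "O"])
tokens, labels = decode_line(line, tag2label)
print(tokens)
print(labels)
```

Keeping one character per token means the label sequence stays aligned with the title character by character, which is what a character-level BERT tokenizer expects for Chinese text.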

2.4.2 Training and validation

This model uses the bert-wwm-ext-chinese model from the PaddleNLP model library. For other available models, see the PaddleNLP Transformer API.

In [?]

import argparse
import os
import sys
import random
import time
import math
from functools import partial
import json

# Make the persisted packages importable before importing paddlenlp.
sys.path.append('/home/aistudio/external-libraries')

import numpy as np
import paddle
from paddle.io import DataLoader

import paddlenlp as ppnlp
from paddlenlp.transformers import LinearDecayWithWarmup
from paddlenlp.metrics import ChunkEvaluator
from paddlenlp.datasets import load_dataset
from paddlenlp.transformers import BertForTokenClassification, BertTokenizer
from paddlenlp.data import Stack, Tuple, Pad, Dict


class Args():
    model_name_or_path = None
    output_dir = None
    max_seq_length = 128
    batch_size = 8
    learning_rate = 5e-5
    weight_decay = 0.0
    adam_epsilon = 1e-8
    max_grad_norm = 1.0
    num_train_epochs = 3
    max_steps = -1
    warmup_steps = 0
    logging_steps = 1
    save_steps = 100
    seed = 42
    device = 'gpu'


def evaluate(model, loss_fct, metric, data_loader, label_num):
    model.eval()
    metric.reset()
    avg_loss, precision, recall, f1_score = 0, 0, 0, 0
    for batch in data_loader:
        input_ids, token_type_ids, length, labels = batch
        logits = model(input_ids, token_type_ids)
        loss = loss_fct(logits, labels)
        # Note: this keeps the loss of the last batch only.
        avg_loss = paddle.mean(loss)
        preds = logits.argmax(axis=2)
        num_infer_chunks, num_label_chunks, num_correct_chunks = metric.compute(
            None, length, preds, labels)
        metric.update(num_infer_chunks.numpy(),
                      num_label_chunks.numpy(), num_correct_chunks.numpy())
        precision, recall, f1_score = metric.accumulate()
    print("eval loss: %f, precision: %f, recall: %f, f1: %f" %
          (avg_loss, precision, recall, f1_score))
    model.train()


def tokenize_and_align_labels(example,
                              tokenizer,
                              no_entity_id,
                              max_seq_len=512):
    labels = example['labels']
    example = example['tokens']
    tokenized_input = tokenizer(
        example,
        return_length=True,
        is_split_into_words=True,
        max_seq_len=max_seq_len)

    # -2 for [CLS] and [SEP]
    if len(tokenized_input['input_ids']) - 2 < len(labels):
        labels = labels[:len(tokenized_input['input_ids']) - 2]
    tokenized_input['labels'] = [no_entity_id] + labels + [no_entity_id]
    tokenized_input['labels'] += [no_entity_id] * (
        len(tokenized_input['input_ids']) - len(tokenized_input['labels']))
    return tokenized_input


def _read(data_file, label_map_file):
    with open(label_map_file, "r") as inf:
        tag2label = json.load(inf)
    with open(data_file, 'r', encoding='utf-8') as inf:
        for line in inf:
            line_stripped = line.strip().split('\t')
            assert len(line_stripped) == 2
            tokens = line_stripped[0].split("\002")
            tags = line_stripped[1].split("\002")
            labels = [tag2label[tag] for tag in tags]
            yield {"tokens": tokens, "labels": labels}


def do_train(args):
    paddle.set_device(args.device)
    if paddle.distributed.get_world_size() > 1:
        paddle.distributed.init_parallel_env()

    # Create dataset, tokenizer and dataloader.
    train_ds = load_dataset(
        _read,
        data_file="/home/aistudio/paddle-video-semantic-tag/data/train.tsv",
        label_map_file="/home/aistudio/paddle-video-semantic-tag/data/label_map.json",
        lazy=False)
    test_ds = load_dataset(
        _read,
        data_file="/home/aistudio/paddle-video-semantic-tag/data/val.tsv",
        label_map_file="/home/aistudio/paddle-video-semantic-tag/data/label_map.json",
        lazy=False)
    train_ds.label_list = test_ds.label_list = ["B-ENT", "I-ENT", "O"]

    tokenizer = BertTokenizer.from_pretrained(args.model_name_or_path)

    label_list = train_ds.label_list
    label_num = len(label_list)
    no_entity_id = label_num - 1

    trans_func = partial(
        tokenize_and_align_labels,
        tokenizer=tokenizer,
        no_entity_id=no_entity_id,
        max_seq_len=args.max_seq_length)

    train_ds = train_ds.map(trans_func)

    ignore_label = -100

    batchify_fn = lambda samples, fn=Dict({
        'input_ids': Pad(axis=0, pad_val=tokenizer.pad_token_id),  # input
        'token_type_ids': Pad(axis=0, pad_val=tokenizer.pad_token_type_id),  # segment
        'seq_len': Stack(),  # seq_len
        'labels': Pad(axis=0, pad_val=ignore_label)  # label
    }): fn(samples)

    train_batch_sampler = paddle.io.DistributedBatchSampler(
        train_ds, batch_size=args.batch_size, shuffle=True, drop_last=True)

    train_data_loader = DataLoader(
        dataset=train_ds,
        collate_fn=batchify_fn,
        num_workers=0,
        batch_sampler=train_batch_sampler,
        return_list=True)

    test_ds = test_ds.map(trans_func)

    test_data_loader = DataLoader(
        dataset=test_ds,
        collate_fn=batchify_fn,
        num_workers=0,
        batch_size=args.batch_size,
        return_list=True)

    # Define the model network and its loss
    model = BertForTokenClassification.from_pretrained(
        args.model_name_or_path, num_classes=label_num)
    if paddle.distributed.get_world_size() > 1:
        model = paddle.DataParallel(model)

    num_training_steps = args.max_steps if args.max_steps > 0 else len(
        train_data_loader) * args.num_train_epochs

    lr_scheduler = LinearDecayWithWarmup(args.learning_rate,
                                         num_training_steps, args.warmup_steps)

    # Generate parameter names needed to perform weight decay.
    # All bias and LayerNorm parameters are excluded.
    decay_params = [
        p.name for n, p in model.named_parameters()
        if not any(nd in n for nd in ["bias", "norm"])
    ]
    optimizer = paddle.optimizer.AdamW(
        learning_rate=lr_scheduler,
        epsilon=args.adam_epsilon,
        parameters=model.parameters(),
        weight_decay=args.weight_decay,
        apply_decay_param_fun=lambda x: x in decay_params)

    loss_fct = paddle.nn.loss.CrossEntropyLoss(ignore_index=ignore_label)

    metric = ChunkEvaluator(label_list=label_list)

    global_step = 0
    last_step = args.num_train_epochs * len(train_data_loader)
    tic_train = time.time()
    for epoch in range(args.num_train_epochs):
        for step, batch in enumerate(train_data_loader):
            global_step += 1
            input_ids, token_type_ids, _, labels = batch
            logits = model(input_ids, token_type_ids)
            loss = loss_fct(logits, labels)
            avg_loss = paddle.mean(loss)
            if global_step % args.logging_steps == 0:
                print(
                    "global step %d, epoch: %d, batch: %d, loss: %f, speed: %.2f step/s"
                    % (global_step, epoch, step, avg_loss,
                       args.logging_steps / (time.time() - tic_train)))
                tic_train = time.time()
            avg_loss.backward()
            optimizer.step()
            lr_scheduler.step()
            optimizer.clear_grad()
            # Evaluate and save a checkpoint every save_steps and at the last step.
            if global_step % args.save_steps == 0 or global_step == last_step:
                if paddle.distributed.get_rank() == 0:
                    evaluate(model, loss_fct, metric, test_data_loader,
                             label_num)
                    paddle.save(model.state_dict(),
                                os.path.join(args.output_dir,
                                             "model_%d.pdparams" % global_step))


args = Args()
args.model_name_or_path = 'bert-wwm-ext-chinese'
args.max_seq_length = 128
args.batch_size = 32
args.learning_rate = 2e-5
args.num_train_epochs = 3
args.logging_steps = 1
args.save_steps = 500
args.output_dir = '/home/aistudio/paddle-video-semantic-tag/data/checkpoints/semantic_tag'
args.device = 'gpu'
do_train(args)

In [4]

!ls /home/aistudio/paddle-video-semantic-tag/data/checkpoints/se*

model_500.pdparams  model_567.pdparams
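
The two checkpoint names follow from the training settings. The arithmetic below is a small sketch that assumes a single GPU (one process), so that the batch sampler sees the whole training set; the sizes are the ones reported by the data processing cell.

```python
import math

train_size = 6061      # size of train.tsv reported by the data processing cell
test_size = 9938       # size of test.tsv reported by the data processing cell
batch_size = 32
num_train_epochs = 3
save_steps = 500

# drop_last=True discards the final incomplete training batch.
steps_per_epoch = train_size // batch_size
last_step = steps_per_epoch * num_train_epochs

# A checkpoint is written every save_steps and once more at the last step.
saved = [s for s in range(1, last_step + 1)
         if s % save_steps == 0 or s == last_step]
print(steps_per_epoch, last_step, saved)   # 189 567 [500, 567]

# The prediction loader keeps the last partial batch.
predict_batches = math.ceil(test_size / batch_size)
print(predict_batches)                     # 311
```

With more epochs or a smaller `save_steps`, more intermediate checkpoints would appear in the directory, and the name of the final checkpoint would change accordingly.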

2.4.3 Generating the semantic tag results

The recognition results are saved to ./predict_results/ents_results.json.

In [5]

import argparse
import os
import sys
import ast
import random
import time
import math
from functools import partial
import json
import codecs

# Make the persisted packages importable before importing tqdm and paddlenlp.
sys.path.append('/home/aistudio/external-libraries')

from tqdm import tqdm

import numpy as np
import paddle
from paddle.io import DataLoader

import paddlenlp as ppnlp
from paddlenlp.datasets import load_dataset
from paddlenlp.data import Stack, Tuple, Pad, Dict
from paddlenlp.transformers import BertForTokenClassification, BertTokenizer


class Args():
    model_name_or_path = None
    init_checkpoint_path = None
    output_dir = None
    max_seq_length = 128
    batch_size = 8
    learning_rate = 5e-5
    weight_decay = 0.0
    adam_epsilon = 1e-8
    max_grad_norm = 1.0
    num_train_epochs = 3
    max_steps = -1
    warmup_steps = 0
    logging_steps = 1
    save_steps = 100
    seed = 42
    device = 'gpu'


def tokenize_and_align_labels(example,
                              tokenizer,
                              no_entity_id,
                              max_seq_len=512):
    labels = example['labels']
    example = example['tokens']
    tokenized_input = tokenizer(
        example,
        return_length=True,
        is_split_into_words=True,
        max_seq_len=max_seq_len)

    # -2 for [CLS] and [SEP]
    if len(tokenized_input['input_ids']) - 2 < len(labels):
        labels = labels[:len(tokenized_input['input_ids']) - 2]
    tokenized_input['labels'] = [no_entity_id] + labels + [no_entity_id]
    tokenized_input['labels'] += [no_entity_id] * (
        len(tokenized_input['input_ids']) - len(tokenized_input['labels']))
    return tokenized_input


def parse_decodes(input_words, id2label, decodes, lens):
    # Flatten the per-batch predictions and lengths.
    decodes = [x for batch in decodes for x in batch]
    lens = [x for batch in lens for x in batch]

    outputs = []
    entities_list = []
    for idx, end in enumerate(lens):
        sent = "".join(input_words[idx]['tokens'])
        # Skip the [CLS] position.
        tags = [id2label[x] for x in decodes[idx][1:end]]
        sent_out = []
        tags_out = []
        words = ""
        for s, t in zip(sent, tags):
            if t.startswith('B-') or t == 'O':
                if len(words):
                    sent_out.append(words)
                if t.startswith('B-'):
                    tags_out.append(t.split('-')[1])
                else:
                    tags_out.append(t)
                words = s
            else:
                words += s
        if len(sent_out) < len(tags_out):
            sent_out.append(words)
        outputs.append(''.join(
            [str((s, t)) for s, t in zip(sent_out, tags_out)]))
        entities_list.append(
            [s for s, t in zip(sent_out, tags_out) if t == "ENT"])
    return outputs, entities_list


def _read(data_file, label_map_file):
    with open(label_map_file, "r") as inf:
        tag2label = json.load(inf)
    with open(data_file, 'r', encoding='utf-8') as inf:
        for line in inf:
            line_stripped = line.strip().split('\t')
            assert len(line_stripped) == 2
            tokens = line_stripped[0].split("\002")
            tags = line_stripped[1].split("\002")
            labels = [tag2label[tag] for tag in tags]
            yield {"tokens": tokens, "labels": labels}


def do_predict(args):
    paddle.set_device(args.device)

    # Create dataset, tokenizer and dataloader.
    train_ds = load_dataset(
        _read,
        data_file="/home/aistudio/paddle-video-semantic-tag/data/train.tsv",
        label_map_file="/home/aistudio/paddle-video-semantic-tag/data/label_map.json",
        lazy=False)
    predict_ds = load_dataset(
        _read,
        data_file="/home/aistudio/paddle-video-semantic-tag/data/test.tsv",
        label_map_file="/home/aistudio/paddle-video-semantic-tag/data/label_map.json",
        lazy=False)
    train_ds.label_list = predict_ds.label_list = ["B-ENT", "I-ENT", "O"]

    tokenizer = BertTokenizer.from_pretrained(args.model_name_or_path)

    label_list = train_ds.label_list
    label_num = len(label_list)
    no_entity_id = label_num - 1
    trans_func = partial(
        tokenize_and_align_labels,
        tokenizer=tokenizer,
        no_entity_id=no_entity_id,
        max_seq_len=args.max_seq_length)

    ignore_label = -100
    batchify_fn = lambda samples, fn=Dict({
        'input_ids': Pad(axis=0, pad_val=tokenizer.pad_token_id),  # input
        'token_type_ids': Pad(axis=0, pad_val=tokenizer.pad_token_type_id),  # segment
        'seq_len': Stack(),
        'labels': Pad(axis=0, pad_val=ignore_label)  # label
    }): fn(samples)

    raw_data = predict_ds.data

    id2label = dict(enumerate(predict_ds.label_list))

    predict_ds = predict_ds.map(trans_func)
    predict_data_loader = DataLoader(
        dataset=predict_ds,
        collate_fn=batchify_fn,
        num_workers=0,
        batch_size=args.batch_size,
        return_list=True)

    # Define the model network
    model = BertForTokenClassification.from_pretrained(
        args.model_name_or_path, num_classes=label_num)
    if args.init_checkpoint_path:
        model_dict = paddle.load(args.init_checkpoint_path)
        model.set_dict(model_dict)

    model.eval()

    pred_list = []
    len_list = []
    for step, batch in tqdm(enumerate(predict_data_loader)):
        input_ids, token_type_ids, length, labels = batch
        logits = model(input_ids, token_type_ids)
        pred = paddle.argmax(logits, axis=-1)
        pred_list.append(pred.numpy())
        len_list.append(length.numpy())

    preds, entities_list = parse_decodes(raw_data, id2label, pred_list,
                                         len_list)

    save_dir = "predict_results"
    if not os.path.isdir(save_dir):
        os.makedirs(save_dir)
    file_path = os.path.join(save_dir, "ner_results.txt")
    with open(file_path, "w", encoding="utf8") as fout:
        fout.write("\n".join(preds))
    # Print some examples
    # print(
    #     "The results have been saved in the file: %s, some examples are shown below: "
    #     % file_path)
    # print("\n".join(preds[:10]))

    # nids.txt holds the ids of the test split, in the same order as test.tsv.
    with open("/home/aistudio/paddle-video-semantic-tag/data/nids.txt", "r") as inf:
        lines = inf.readlines()
    nid2ents = {}
    for entities, nid in zip(entities_list, lines):
        nid2ents[nid.strip()] = entities
    save_json = os.path.join(save_dir, "ents_results.json")
    with codecs.open(save_json, "w", encoding="utf-8") as ouf:
        json.dump(nid2ents, ouf, ensure_ascii=False)
    print("Saved " + save_json)


args = Args()
args.model_name_or_path = 'bert-wwm-ext-chinese'
args.init_checkpoint_path = '/home/aistudio/paddle-video-semantic-tag/data/checkpoints/semantic_tag/model_567.pdparams'
args.max_seq_length = 128
args.batch_size = 32
args.device = 'gpu'

do_predict(args)

[2022-04-29 02:13:12,761] [ INFO] - Found /home/aistudio/.paddlenlp/models/bert-wwm-ext-chinese/bert-wwm-ext-chinese-vocab.txt
[2022-04-29 02:13:12,777] [ INFO] - Already cached /home/aistudio/.paddlenlp/models/bert-wwm-ext-chinese/bert-wwm-ext-chinese.pdparams
311it [00:13, 23.29it/s]
Saved predict_results/ents_results.json
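
To make the decoding step easier to follow, here is a simplified sketch of how a predicted tag sequence turns into entity strings. `extract_entities` is a hypothetical helper, not the `parse_decodes` function used above: it only collects B-ENT/I-ENT runs and ignores an I-ENT that has no preceding B-ENT, whereas the baseline appends such a character to the preceding segment. The characters and tags are made up.

```python
def extract_entities(chars, tags):
    """Collect maximal B-ENT/I-ENT runs as entity strings."""
    entities, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "B-ENT":
            # A new entity starts; flush the one in progress, if any.
            if current:
                entities.append(current)
            current = ch
        elif tag == "I-ENT" and current:
            current += ch
        else:
            # 'O' (or a stray I-ENT) ends the entity in progress.
            if current:
                entities.append(current)
            current = ""
    if current:
        entities.append(current)
    return entities


chars = "周杰伦演唱会"
tags = ["B-ENT", "I-ENT", "I-ENT", "O", "B-ENT", "I-ENT"]
entities = extract_entities(chars, tags)
print(entities)
```

Each list produced this way corresponds to one value in ents_results.json, keyed by the video id read from nids.txt.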

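3. Generating and submitting the result file

After running the code below, submit the generated result.txt file on the competition's result submission page.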

In [6]

import os
import os.path as osp
import codecs
import json


class Args():
    test_path = '/home/aistudio/dataset_sample/test_a.json'
    category_level1_result = '/home/aistudio/predict_results/level1_top1.json'
    category_level2_result = '/home/aistudio/predict_results/level2_top1.json'
    tag_result = '/home/aistudio/predict_results/ents_results.json'


if __name__ == "__main__":
    args = Args()
    with codecs.open(args.test_path, "r", encoding="utf-8") as inf:
        print("Loading {}...".format(args.test_path))
        lines = inf.readlines()
        nids = [json.loads(line)["@id"] for line in lines]

    # load the prediction results of 'paddle-video-classify-tag' model on test-set
    with codecs.open(args.category_level1_result, "r", encoding="utf-8") as inf:
        pred_level1 = json.load(inf)
    with codecs.open(args.category_level2_result, "r", encoding="utf-8") as inf:
        pred_level2 = json.load(inf)
    # load the prediction results of 'paddle-video-semantic-tag' model on test-set
    with codecs.open(args.tag_result, "r", encoding="utf-8") as inf:
        pred_tags = json.load(inf)

    # merge results and generate an entry for each nid.
    submission_lines = []
    for nid in nids:
        level1_category = pred_level1[nid]["class_name"] \
            if nid in pred_level1 else ""
        level2_category = pred_level2[nid]["class_name"] \
            if nid in pred_level2 else ""
        tags = pred_tags[nid] if nid in pred_tags else []
        result = {
            "@id": nid,
            "category": [
                {
                    "@meta": {"type": "level1"},
                    "@value": level1_category
                },
                {
                    "@meta": {"type": "level2"},
                    "@value": level2_category
                },
            ],
            "tag": [{"@value": tag} for tag in tags],
        }
        submission_lines.append(json.dumps(result, ensure_ascii=False) + "\n")

    with codecs.open("result.txt", "w", encoding="utf-8") as ouf:
        ouf.writelines(submission_lines)
    print("Saved result.txt")

Loading /home/aistudio/dataset_sample/test_a.json...
Saved result.txt
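
Before submitting, it can help to confirm that each line of result.txt parses as JSON and has the structure built above. The sketch below is only an illustration: `build_entry` and `check_line` are hypothetical helpers, the id and label values are made up, and the checks reflect the structure produced by the cell above rather than any official validation rule.

```python
import json


def build_entry(nid, level1, level2, tags):
    """Build one submission entry with the same layout as the cell above."""
    return {
        "@id": nid,
        "category": [
            {"@meta": {"type": "level1"}, "@value": level1},
            {"@meta": {"type": "level2"}, "@value": level2},
        ],
        "tag": [{"@value": t} for t in tags],
    }


def check_line(line):
    """Parse one line and verify the expected keys and category order."""
    entry = json.loads(line)
    assert set(entry) == {"@id", "category", "tag"}
    types = [c["@meta"]["type"] for c in entry["category"]]
    assert types == ["level1", "level2"]
    assert all(isinstance(t["@value"], str) for t in entry["tag"])
    return entry


# Hypothetical values, for illustration only.
line = json.dumps(build_entry("demo-id", "音乐", "演唱会", ["周杰伦"]),
                  ensure_ascii=False)
entry = check_line(line)
print(entry["@id"], len(entry["tag"]))
```

To check a real file, one could call `check_line` on every line read from result.txt with UTF-8 encoding.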

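Be sure to run this before exiting

Run the following command to delete the large temporary files; they are regenerated the next time the project starts. Leaving them in place may make the online project load slowly next time.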

In [7]

!rm -rf dataset_sample paddle-video-classify-tag paddle-video-semantic-tag predict_results tsn_features_test_a tsn_features_train_sample

