Detectron2 Study Notes

Published: 2023/12/15 · 豆豆


Contents

- I. Detectron2 Usage
  - 1.1 Training
  - 1.2 Testing
  - 1.3 Data and format requirements
  - 1.4 Load/Save model
  - 1.5 Model input format
  - 1.6 Model output
  - 1.7 Config usage
- II. Detectron2 Code Structure
  - 2.1 Data
  - 2.2 Model
  - 2.3 Trainer class implementation
  - 2.4 Training
  - 2.5 Inference
  - 2.6 Loading and saving models
  - 2.7 Evaluation
  - 2.8 Logging

I. Detectron2 Usage

Detectron2 source code: https://github.com/facebookresearch/detectron2

Detectron2 documentation: https://detectron2.readthedocs.io/index.html

After installing, build the package:

# build
python setup.py build develop

1.1 Training

1. There are two training scripts; tools/plain_train_net.py exposes fewer default options.

- tools/plain_train_net.py
- tools/train_net.py

2. Set up the corresponding dataset before training:

https://github.com/facebookresearch/detectron2/blob/master/datasets/README.md

3. Train

# single GPU
cd tools/
./train_net.py \
  --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml \
  --num-gpus 1 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025

# multiple GPUs
cd tools/
./train_net.py --num-gpus 8 \
  --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml

4. Evaluate model performance

./train_net.py \
  --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml \
  --eval-only MODEL.WEIGHTS /path/to/checkpoint_file

For more options, run:

./train_net.py -h

1.2 Testing

1. Download an officially trained model from the model zoo.

2. Run the demo

# run the demo
cd demo/
python demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
  --input input1.jpg input2.jpg \
  [--other-options] \
  --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl

Options you can change:

- To run on your webcam, replace --input files with --webcam.
- To run on a video, replace --input files with --video-input video.mp4.
- To run on cpu, add MODEL.DEVICE cpu after --opts.
- To save outputs to a directory (for images) or a file (for webcam or video), use --output.

1.3 Data and format requirements

https://detectron2.readthedocs.io/en/latest/tutorials/builtin_datasets.html

1.4 Load/Save model

1. Detectron2 models (and their sub-models) are built with functions such as build_model, build_backbone, and build_roi_heads:

from detectron2.modeling import build_model
model = build_model(cfg)  # returns a torch.nn.Module

2. Load/save a checkpoint:

from detectron2.checkpoint import DetectionCheckpointer
DetectionCheckpointer(model).load(file_path_or_url)  # load a file, usually from cfg.MODEL.WEIGHTS

checkpointer = DetectionCheckpointer(model, save_dir="output")
checkpointer.save("model_999")  # save to output/model_999.pth

Detectron2's checkpointer recognizes models in PyTorch's .pth format as well as the .pkl files in the model zoo. The former can be handled with torch.load / torch.save, the latter with pickle.dump / pickle.load.

1.5 Model input format

outputs = model(inputs)  # inputs is a list[dict]

Each dict may contain the following keys:

- "image": Tensor in (C, H, W) format. The meaning of the channels is defined by cfg.INPUT.FORMAT. Image normalization, if any, is performed inside the model using cfg.MODEL.PIXEL_{MEAN,STD}.
- "height", "width": the desired output height and width, which are not necessarily the same as the height or width of the image field. For example, the image field contains the resized image if resizing is used as a preprocessing step, but you may want the outputs in the original resolution. If provided, the model produces output in this resolution rather than in the resolution of the image fed into the model. This is more efficient and accurate.
- "instances": an Instances object for training, with the following fields:
  - "gt_boxes": a Boxes object storing N boxes, one for each instance.
  - "gt_classes": Tensor of long type, a vector of N labels, in range [0, num_categories).
  - "gt_masks": a PolygonMasks or BitMasks object storing N masks, one for each instance.
  - "gt_keypoints": a Keypoints object storing N keypoint sets, one for each instance.
- "sem_seg": Tensor[int] in (H, W) format. The semantic segmentation ground truth for training. Values represent category labels starting from 0.
- "proposals": an Instances object used only in Fast R-CNN style models, with the following fields:
  - "proposal_boxes": a Boxes object storing P proposal boxes.
  - "objectness_logits": Tensor, a vector of P scores, one for each proposal.

For inference with the builtin models, only the "image" key is required; "width" and "height" are optional.

1.6 Model output

Training mode: a dict[str->ScalarTensor] with all the losses.

Inference mode: a list[dict], one dict for each image.

Each dict may contain the following:

- "instances": Instances object with the following fields:
  - "pred_boxes": Boxes object storing N boxes, one for each detected instance.
  - "scores": Tensor, a vector of N confidence scores.
  - "pred_classes": Tensor, a vector of N labels in range [0, num_categories).
  - "pred_masks": a Tensor of shape (N, H, W), masks for each detected instance.
  - "pred_keypoints": a Tensor of shape (N, num_keypoint, 3). Each row in the last dimension is (x, y, score). Confidence scores are larger than 0.
- "sem_seg": Tensor of (num_categories, H, W), the semantic segmentation prediction.
- "proposals": Instances object with the following fields:
  - "proposal_boxes": Boxes object storing N boxes.
  - "objectness_logits": a torch vector of N confidence scores.
- "panoptic_seg": a tuple of (pred: Tensor, segments_info: Optional[list[dict]]). The pred tensor has shape (H, W) and contains the segment id of each pixel.
  - If segments_info exists, each dict describes one segment id in pred and has the following fields:
    - "id": the segment id
    - "isthing": whether the segment is a thing or stuff
    - "category_id": the category id of this segment.
    If a pixel's id does not exist in segments_info, it is considered to be the void label defined in Panoptic Segmentation.
  - If segments_info is None, all pixel values in pred must be ≥ -1. Pixels with value -1 are assigned void labels. Otherwise, the category id of each pixel is obtained by category_id = pixel // metadata.label_divisor.
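The last rule (segments_info is None) comes down to integer arithmetic. The sketch below is only an illustration: decode_category is a hypothetical helper, not a Detectron2 function, and the label_divisor of 1000 and the pixel values are made up.

```python
from typing import Optional

VOID = -1  # pixel value that marks the void label when segments_info is None


def decode_category(pixel: int, label_divisor: int) -> Optional[int]:
    """Return the category id encoded in a panoptic pixel value, or None for void."""
    if pixel < VOID:
        raise ValueError("pixel values must be >= -1")
    if pixel == VOID:
        return None
    # category_id = pixel // metadata.label_divisor
    return pixel // label_divisor


# Hypothetical values: with label_divisor = 1000, pixels 17003 and 17004
# both decode to category 17, and 2000 decodes to category 2.
label_divisor = 1000
row = [17003, 17004, 2000, -1]
categories = [decode_category(p, label_divisor) for p in row]
print(categories)
```

In real code the divisor would come from the dataset metadata (metadata.label_divisor) instead of a constant.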

1.7 Config usage

from detectron2.config import get_cfg
cfg = get_cfg()  # obtain detectron2's default config
cfg.xxx = yyy  # add new configs for your own custom components
cfg.merge_from_file("my_cfg.yaml")  # load values from a file
cfg.merge_from_list(["MODEL.WEIGHTS", "weights.pth"])  # can also load values from a list of str
print(cfg.dump())  # print formatted configs

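II. Detectron2 Code Structure

API reference

- engine: ties the data and the model together; implements training and testing
- data: the input to the model
- modeling: the input to the solver (optimizer)
- solver: the optimizer
- layers: the basic layers that modeling is built from
- evaluation: evaluation
- config: reads configuration files
- projects: example projects
- checkpoint: saves and loads model weights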

2.1 Data

./detectron2/data/

1. Reading images

./data/common.py

2. Data augmentation

./data/common.py

3. Batching

./data/build.py

4. Changing dataset paths

./data/datasets/

- pascal voc
- coco (register_coco.py & coco.py)

Changing hyperparameters:

./configs/Base-RCNN-FPN.yaml

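2.2 Model

./detectron2/modeling/

1. backbone

Abstract base class for backbones: ./modeling/backbone/backbone.py

Abstract classes:

- A class is an abstraction over a group of objects, for example a cat class, a dog class, a human class.
- An abstract class is an abstraction over a group of classes; for example, an animal class can be extracted from the three classes above.
- An abstract class cannot be instantiated; it can only be inherited. A subclass can be instantiated only after it implements the parent's abstract methods.
- Python's abc module provides the @abstractmethod decorator for declaring abstract methods.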
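The points above can be tried directly with the standard library. This is a minimal sketch with made-up class names (Animal, Cat) that follow the example in the notes; it is not Detectron2 code.

```python
from abc import ABC, abstractmethod


class Animal(ABC):
    """Abstract class: declares the interface and cannot be instantiated."""

    @abstractmethod
    def sound(self) -> str:
        ...


class Cat(Animal):
    """Concrete subclass: implements every abstract method of Animal."""

    def sound(self) -> str:
        return "meow"


def can_instantiate(cls) -> bool:
    """Return True if cls() succeeds, False if Python refuses with TypeError."""
    try:
        cls()
        return True
    except TypeError:
        return False


abstract_ok = can_instantiate(Animal)  # the abstract method is not implemented
concrete_ok = can_instantiate(Cat)     # all abstract methods are implemented
print(abstract_ok, concrete_ok, Cat().sound())
```

A backbone base class built this way forces every concrete backbone to provide the methods the rest of the model relies on.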

Building the backbone

./modeling/backbone/build.py

backbone/resnet.py defines several block types that inherit from CNNBlockBase. The ResNet class inherits from Backbone and assembles these blocks into the ResNet backbone.

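To keep some names from being imported by from module import *, a module can define the __all__ variable.

Its value is a list of the names of members (variables, functions, or classes) of the current module. When another file imports the module with from module import *, only the members listed in __all__ are imported; members that are not listed are left out. The restriction applies only to this star-import form; names outside __all__ can still be imported explicitly.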
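The effect of __all__ can be checked without creating files. The sketch below builds a throwaway in-memory module; the names demo_mod, public_fn, and hidden_fn are invented for the illustration.

```python
import sys
import types

# Build a throwaway module in memory (the name "demo_mod" is made up).
demo_mod = types.ModuleType("demo_mod")
demo_mod.__all__ = ["public_fn"]
demo_mod.public_fn = lambda: "public"
demo_mod.hidden_fn = lambda: "hidden"
sys.modules["demo_mod"] = demo_mod

# Run the star import in a fresh namespace so we can inspect what arrived.
namespace = {}
exec("from demo_mod import *", namespace)

# Ignore dunder entries such as __builtins__ that exec adds on its own.
star_imported = sorted(name for name in namespace if not name.startswith("__"))
print(star_imported)

# Explicit imports are not restricted by __all__.
from demo_mod import hidden_fn

print(hidden_fn())
```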

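Registering and calling:

Definition:

@BACKBONE_REGISTRY.register()
def build_resnet_backbone(cfg, input_shape):
    # ... stem, stages, out_features and freeze_at are built from cfg (omitted) ...
    return ResNet(stem, stages, out_features=out_features).freeze(freeze_at)

Call: ./build.py looks up the previously registered backbone by the name given in the config file and calls it:

backbone_name = cfg.MODEL.BACKBONE.NAME
backbone = BACKBONE_REGISTRY.get(backbone_name)(cfg, input_shape)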
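The register-then-look-up pattern can be sketched in a few lines of plain Python. The Registry class below is a toy written for this note and is not Detectron2's own implementation; build_toy_backbone and the dict-based cfg are likewise hypothetical stand-ins.

```python
from typing import Callable, Dict


class Registry:
    """Toy name -> callable registry (illustrative, not Detectron2's class)."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._obj_map: Dict[str, Callable] = {}

    def register(self) -> Callable:
        # Used as @REGISTRY.register(); stores the object under its __name__.
        def deco(obj: Callable) -> Callable:
            if obj.__name__ in self._obj_map:
                raise KeyError(f"{obj.__name__} already registered in {self._name}")
            self._obj_map[obj.__name__] = obj
            return obj

        return deco

    def get(self, name: str) -> Callable:
        if name not in self._obj_map:
            raise KeyError(f"no object named {name!r} in {self._name} registry")
        return self._obj_map[name]


BACKBONE_REGISTRY = Registry("BACKBONE")


@BACKBONE_REGISTRY.register()
def build_toy_backbone(cfg: dict, input_shape: int) -> str:
    # Stand-in for a real builder; returns a description instead of a module.
    return f"toy backbone, depth={cfg['depth']}, in_channels={input_shape}"


# Hypothetical config: the builder is chosen purely by its registered name.
cfg = {"backbone_name": "build_toy_backbone", "depth": 50}
backbone = BACKBONE_REGISTRY.get(cfg["backbone_name"])(cfg, 3)
print(backbone)
```

The point of the design is that a new backbone only needs the decorator and a config entry; the code that builds the model never has to import it by name.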

./modeling/backbone/fpn.py takes the ResNet produced by build_resnet_backbone as a sub-structure and extends it into different FPN backbones:

# resnet
@BACKBONE_REGISTRY.register()
def build_resnet_fpn_backbone(cfg, input_shape: ShapeSpec):
    ...

# retinanet
@BACKBONE_REGISTRY.register()
def build_retinanet_resnet_fpn_backbone(cfg, input_shape: ShapeSpec):
    ...

2. Proposal generation

./modeling/proposal_generator/build.py

The proposal generator named in the config file is looked up and called:

PROPOSAL_GENERATOR_REGISTRY.get(name)(cfg, input_shape)

# 1
@RPN_HEAD_REGISTRY.register()
class StandardRPNHead(nn.Module):
    ...

# 2
@PROPOSAL_GENERATOR_REGISTRY.register()
class RPN(nn.Module):
    ...

3. RoI Heads

Interface:

./modeling/roi_heads/roi_heads.py

Implementations:

# 1
@ROI_HEADS_REGISTRY.register()
class Res5ROIHeads(ROIHeads):
    ...

# 2
@ROI_HEADS_REGISTRY.register()
class StandardROIHeads(ROIHeads):
    ...

4. Mask head

def build_mask_head(cfg, input_shape):
    name = cfg.MODEL.ROI_MASK_HEAD.NAME
    return ROI_MASK_HEAD_REGISTRY.get(name)(cfg, input_shape)

5. Keypoint head

def build_keypoint_head(cfg, input_shape):
    name = cfg.MODEL.ROI_KEYPOINT_HEAD.NAME
    return ROI_KEYPOINT_HEAD_REGISTRY.get(name)(cfg, input_shape)

6. Execution flow

./modeling/meta_arch/

- Preprocess batched_inputs.
- Feed the images to the backbone for feature extraction.
- Pass the features and images to proposal_generator.
- Pass the proposals to the RoI heads.

The forward method of GeneralizedRCNN (./modeling/meta_arch/rcnn.py) follows these steps:

def forward(self, batched_inputs):
    # inference takes a separate path
    if not self.training:
        return self.inference(batched_inputs)

    images = self.preprocess_image(batched_inputs)
    if "instances" in batched_inputs[0]:
        gt_instances = [x["instances"].to(self.device) for x in batched_inputs]
    else:
        gt_instances = None

    features = self.backbone(images.tensor)

    if self.proposal_generator:
        proposals, proposal_losses = self.proposal_generator(images, features, gt_instances)
    else:
        # precomputed proposals must be supplied with the inputs
        assert "proposals" in batched_inputs[0]
        proposals = [x["proposals"].to(self.device) for x in batched_inputs]
        proposal_losses = {}

    _, detector_losses = self.roi_heads(images, features, proposals, gt_instances)
    if self.vis_period > 0:
        storage = get_event_storage()
        if storage.iter % self.vis_period == 0:
            self.visualize_training(batched_inputs, proposals)

    losses = {}
    losses.update(detector_losses)
    losses.update(proposal_losses)
    return losses

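2.3 Trainer class implementation

./detectron2/engine/train_loop.py

1. HookBase defines four stages:

- before_train
- after_train
- before_step
- after_step

2. TrainerBase calls the registered hooks at each stage, which is how the individual features are plugged in.

3. SimpleTrainer (./train_loop.py) inherits from TrainerBase and implements run_step(self), the core training step that TrainerBase leaves open. It runs the forward pass to compute the loss and then the backward pass.

4. DefaultTrainer (./defaults.py) inherits from SimpleTrainer and implements the full training workflow: it creates the model, optimizer, scheduler, and dataloader, and adds the auxiliary features from the hooks module according to the config file.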
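How the four hook stages wrap run_step can be seen in a stripped-down sketch. The classes below reuse the names HookBase and TrainerBase for readability but are a toy re-implementation written for this note, with no model or optimizer; ToyTrainer and EventLog are hypothetical.

```python
class HookBase:
    """Hooks override any of the four stage methods; the default is a no-op."""

    def before_train(self):
        pass

    def after_train(self):
        pass

    def before_step(self):
        pass

    def after_step(self):
        pass


class TrainerBase:
    """Runs the loop and calls every hook at each stage; run_step is left open."""

    def __init__(self):
        self.hooks = []
        self.iter = 0

    def register_hooks(self, hooks):
        for h in hooks:
            h.trainer = self  # lets a hook read trainer state such as iter
        self.hooks.extend(hooks)

    def train(self, start_iter, max_iter):
        for h in self.hooks:
            h.before_train()
        for self.iter in range(start_iter, max_iter):
            for h in self.hooks:
                h.before_step()
            self.run_step()
            for h in self.hooks:
                h.after_step()
        for h in self.hooks:
            h.after_train()

    def run_step(self):
        raise NotImplementedError


class ToyTrainer(TrainerBase):
    """Stand-in for a concrete trainer: records a fake loss instead of training."""

    def __init__(self):
        super().__init__()
        self.losses = []

    def run_step(self):
        self.losses.append(1.0 / (self.iter + 1))


class EventLog(HookBase):
    """Hook that records when it is called."""

    def __init__(self):
        self.events = []

    def before_train(self):
        self.events.append("before_train")

    def after_step(self):
        self.events.append(f"after_step:{self.trainer.iter}")

    def after_train(self):
        self.events.append("after_train")


trainer = ToyTrainer()
log = EventLog()
trainer.register_hooks([log])
trainer.train(0, 2)
print(log.events)
```

Features such as checkpointing, evaluation, or log writing then become separate hook classes, and the loop itself never changes.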

2.4 Training

./tools/

Changing which GPUs are visible:

os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2'

train_net.py adds one more layer of abstraction on top of the TrainerBase → SimpleTrainer → DefaultTrainer hierarchy, adding the evaluation functionality and inference with test-time augmentation.

2.5 Inference

./detectron2/engine/defaults.py

2.6 Loading and saving models

1. Two ways to save a model:

Save the weights only

# save
torch.save(model.state_dict(), path)

# load
model = Model()  # Model is your own nn.Module subclass
model.load_state_dict(torch.load(path))
model.eval()

Save the whole model together with its weights

# save
torch.save(model, path)

# load
model = torch.load(path)
model.eval()

2. Saving a checkpoint

A complete checkpoint usually stores the model's state_dict, the optimizer's state_dict, the epoch, and so on.

Saving a checkpoint

torch.save({
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': loss,
}, path)

Loading a checkpoint

checkpoint = torch.load(path)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']
model.eval()  # or model.train() to resume training

2.7 Evaluation

./detectron2/evaluation/evaluator.py

2.8 Logging

./engine/hooks.py

Logging is controlled through hooks: the after_step() method in hooks.py calls writer.write() to write the logs.
