Training Faster-RCNN on Your Own Data

In the previous post, the data was converted to VOC2007 format, so it can now be used to train Faster-RCNN.

1. Changes for the dataset

Edit datasets\VOCdevkit2007\VOCcode\VOCinit.m. I only have two classes:

VOCopts.classes={...

        'dog'

        'flower'};

Edit functions\fast_rcnn\fast_rcnn_train.m. val_iters must not exceed the number of validation images (I only have a few dozen).

ip.addParamValue('val_iters',       20,            @isscalar);

Edit functions\rpn\proposal_train.m in the same way.

ip.addParamValue('val_iters',           20,                @isscalar);
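
The constraint on val_iters can be sketched as a simple clamp. This is only an illustration in Python, not code from the toolbox: the helper name clamp_val_iters and the example numbers (500 requested, 40 validation images) are assumptions made up for the example.

```python
def clamp_val_iters(requested: int, num_val_images: int) -> int:
    """Return a val_iters value that never exceeds the validation set size.

    `requested` is the value you would like to use; `num_val_images` is the
    number of images in the validation split. Both are illustrative inputs.
    """
    if num_val_images < 1:
        raise ValueError("need at least one validation image")
    # Never go above the number of validation images, never below 1.
    return max(1, min(requested, num_val_images))


if __name__ == "__main__":
    # Hypothetical numbers: a large requested value and a small val split.
    print(clamp_val_iters(500, 40))
    # A value that already fits is returned unchanged.
    print(clamp_val_iters(20, 40))
```

With a few dozen validation images, a value such as 20 stays inside the limit, which is why it is used in both files.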

Edit train_val.prototxt and test.prototxt in each of the two folders under models\fast_rcnn_prototxts, adjusting the values that depend on the number of classes K (12 changes across 4 files).

input: "bbox_targets"

input_dim: 1  # to be changed on-the-fly to match num ROIs

input_dim: 12 # 4 * (K+1), with K+1 = 3 classes here

input_dim: 1

input_dim: 1

input: "bbox_loss_weights"

input_dim: 1  # to be changed on-the-fly to match num ROIs

input_dim: 12 # 4 * (K+1), with K+1 = 3 classes here

input_dim: 1

input_dim: 1

type: "InnerProduct"

    inner_product_param {

        num_output: 3  #K+1

        weight_filler {

            type: "gaussian"

            std: 0.01

        }

        bias_filler {

            type: "constant"

            value: 0

        }

    }

layer {

    bottom: "fc7"

    top: "cls_score"

    name: "cls_score"

    param {

        lr_mult: 1.0

    }

    param {

        lr_mult: 2.0

    }

    type: "InnerProduct"

    inner_product_param {

        num_output: 3  # K+1

        weight_filler {

            type: "gaussian"

            std: 0.01

        }

        bias_filler {

            type: "constant"

            value: 0

        }

    }

}

layer {

    bottom: "fc7"

    top: "bbox_pred"

    name: "bbox_pred"

    type: "InnerProduct"

    param {

        lr_mult: 1.0

    }

    param {

        lr_mult: 2.0

    }

    inner_product_param {

        num_output: 12  # 4 * (K+1)

        weight_filler {

            type: "gaussian"

            std: 0.001

        }

        bias_filler {

            type: "constant"

            value: 0

        }

    }

}
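
The numbers in the snippets above all follow from K. As a quick check, here is a small Python sketch that derives them. The function name fast_rcnn_dims and the dictionary keys are made up for this example; the only facts taken from the prototxt comments are K+1 for cls_score and 4 * (K+1) for bbox_pred, bbox_targets and bbox_loss_weights.

```python
def fast_rcnn_dims(num_classes: int) -> dict:
    """Compute the class-dependent sizes for a dataset with K object classes."""
    if num_classes < 1:
        raise ValueError("need at least one object class")
    k_plus_1 = num_classes + 1  # K object classes plus one background class
    return {
        "cls_score.num_output": k_plus_1,
        "bbox_pred.num_output": 4 * k_plus_1,  # 4 box coordinates per class
        "bbox_targets.input_dim": 4 * k_plus_1,
        "bbox_loss_weights.input_dim": 4 * k_plus_1,
    }


if __name__ == "__main__":
    classes = ["dog", "flower"]  # the two classes used in this post
    for key, value in fast_rcnn_dims(len(classes)).items():
        print(f"{key}: {value}")
```

For K = 2 this gives 3 and 12, matching the values above. If train_val.prototxt contains all four entries and test.prototxt only the two num_output entries, that would also account for 12 changes across 4 files, but that split is my reading rather than something stated above.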

In experiments\+Model\ZF_for_Faster_RCNN_VOC2007.m, change the three solver files to solver_30k40k.prototxt. The default 60k80k schedule takes too long.

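2. Changes for the hardware

My GPU is a GTX750 with 2 GB of memory. Even though the dataset is small, the default settings ran out of memory.

Edit functions\fast_rcnn\fast_rcnn_config.m. The values in trailing % comments are the defaults.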

%% training

    % whether use gpu

    ip.addParamValue('use_gpu',         gpuDeviceCount > 0, ...

                                                        @islogical);

    % Image scales -- the short edge of input image

    ip.addParamValue('scales',          60,            @ismatrix);  %600

    % Max pixel size of a scaled input image

    ip.addParamValue('max_size',        1000,           @isscalar);

    % Images per batch

    ip.addParamValue('ims_per_batch',   2,              @isscalar);

    % Minibatch size

    ip.addParamValue('batch_size',      32,            @isscalar);  %128

    % Fraction of minibatch that is foreground labeled (class > 0)

    ip.addParamValue('fg_fraction',     0.25,           @isscalar);

    % Overlap threshold for a ROI to be considered foreground (if >= fg_thresh)

    ip.addParamValue('fg_thresh',       0.5,            @isscalar);

    % Overlap threshold for a ROI to be considered background (class = 0 if

    % overlap in [bg_thresh_lo, bg_thresh_hi))

    ip.addParamValue('bg_thresh_hi',    0.5,            @isscalar);

    ip.addParamValue('bg_thresh_lo',    0.1,            @isscalar);

    % mean image, in RGB order

    ip.addParamValue('image_means',     128,            @ismatrix);

    % Use horizontally-flipped images during training?

    ip.addParamValue('use_flipped',     true,           @islogical);

    % Valid training sample (IoU > bbox_thresh) for bounding box regression

    ip.addParamValue('bbox_thresh',     0.5,            @isscalar);

    % random seed

    ip.addParamValue('rng_seed',        6,              @isscalar);

    %% testing

    ip.addParamValue('test_scales',     60,            @isscalar);  %600

    ip.addParamValue('test_max_size',   1000,           @isscalar);

    ip.addParamValue('test_nms',        0.3,            @isscalar);

    ip.addParamValue('test_binary',     false,          @islogical);
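
To get a feel for how much these settings shrink the workload, here is a rough Python sketch of the usual Fast R-CNN resizing rule (scale the short edge to scales, then cap the long edge at max_size) and of the per-image ROI split. The function names are hypothetical, the 375x500 image size is just an example, and the rounding is assumed to behave like the toolbox's; treat it as an estimate, not the toolbox's exact code.

```python
def resized_shape(height, width, target_short, max_size):
    """Return (new_height, new_width) after short-edge scaling with a long-edge cap."""
    short_edge, long_edge = min(height, width), max(height, width)
    scale = target_short / short_edge
    # If the long edge would exceed max_size, scale by the long edge instead.
    if round(scale * long_edge) > max_size:
        scale = max_size / long_edge
    return round(height * scale), round(width * scale)


def rois_per_image(batch_size, ims_per_batch, fg_fraction):
    """Return (ROIs per image, foreground ROIs per image) for one minibatch."""
    per_image = batch_size // ims_per_batch
    return per_image, round(per_image * fg_fraction)


if __name__ == "__main__":
    # Example image of 375 x 500 pixels (height x width), an assumed size.
    print(resized_shape(375, 500, 600, 1000))  # default scales = 600
    print(resized_shape(375, 500, 60, 1000))   # modified scales = 60
    print(rois_per_image(128, 2, 0.25))        # default batch_size = 128
    print(rois_per_image(32, 2, 0.25))         # modified batch_size = 32
```

Under these assumptions, dropping scales from 600 to 60 makes each side of the input ten times smaller, so the network sees about 100 times fewer pixels. That saves a lot of memory, but it also leaves very little detail for the detector to work with.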

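3. Start training

Before training, delete or back up output and imdb\cache, then run experiments/script_faster_rcnn_VOC2007_ZF.m to start training.

On my GPU, training finished after four hours.

Below is the output from re-running without deleting output (which is fast).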

***************

stage one proposal

***************

aver_boxes_num = 1090, select top 2000

aver_boxes_num = 1091, select top 2000

***************

stage one fast rcnn

***************

!!! dog : 0.8969 0.9418

!!! flower : 0.9006 0.9458

~~~~~~~~~~~~~~~~~~~~

Results:

   89.6920

   90.0606

   89.8763

~~~~~~~~~~~~~~~~~~~~

***************

stage two proposal

***************

aver_boxes_num = 1263, select top 2000

aver_boxes_num = 1271, select top 2000

***************

stage two fast rcnn

***************

***************

final test

***************

aver_boxes_num = 233, select top 300

!!! dog : 0.8893 0.9449

!!! flower : 0.8990 0.9445

~~~~~~~~~~~~~~~~~~~~

Results:

   88.9304

   89.9025

   89.4165

~~~~~~~~~~~~~~~~~~~~

Cleared 0 solvers and 2 stand-alone nets

please modify detection_test.prototxt file for sharing conv layers with proposal model (delete layers until relu5)

>>
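
In each Results block of the log, the third number appears to be the mean of the two per-class values, in other words the mAP in percent. A short Python check of that reading, using only the numbers printed above (the function name mean_ap is my own):

```python
def mean_ap(per_class_ap: dict) -> float:
    """Average the per-class AP values (given in percent)."""
    return sum(per_class_ap.values()) / len(per_class_ap)


# Per-class values copied from the two Results blocks in the log.
stage_one = {"dog": 89.6920, "flower": 90.0606}
final_test = {"dog": 88.9304, "flower": 89.9025}

if __name__ == "__main__":
    print(f"stage one fast rcnn: {mean_ap(stage_one):.4f}")
    print(f"final test:          {mean_ap(final_test):.4f}")
```

Both means agree with the third line of each Results block to within the rounding of the printed values.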

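4. Testing

As the message at the end of training says, detection_test.prototxt must be modified first.

Change data to 1*256*50*50, remove the layers before roi_pool5, and change its bottom to data.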

name: "Zeiler_conv5"

input: "data"

input_dim: 1

input_dim: 256

input_dim: 50

input_dim: 50

input: "rois"

input_dim: 1 # to be changed on-the-fly to num ROIs

input_dim: 5 # [batch ind, x1, y1, x2, y2] zero-based indexing

input_dim: 1

input_dim: 1

layer {

    bottom: "data"

    bottom: "rois"

    top: "pool5"

    name: "roi_pool5"

    type: "ROIPooling"

    roi_pooling_param {

        pooled_w: 6

        pooled_h: 6

        spatial_scale: 0.0625  # (1/16)

    }

}
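
Because the detection net now receives conv features instead of pixels, spatial_scale is what maps ROI coordinates from the image onto the feature map. The following Python sketch shows that mapping as I understand the Caffe ROIPooling layer (round each coordinate times spatial_scale, then split the region into pooled_h x pooled_w bins). The function names and the example ROI are made up, and the rounding details are an assumption.

```python
import math


def c_round(x: float) -> int:
    """Round half away from zero, like C's round(), unlike Python's round()."""
    sign = 1 if x >= 0 else -1
    return sign * int(math.floor(abs(x) + 0.5))


def roi_on_feature_map(roi, spatial_scale=0.0625, pooled_h=6, pooled_w=6):
    """Map an image-space ROI (x1, y1, x2, y2) to feature-map coordinates.

    Returns the scaled ROI and the (bin_height, bin_width) of each pooling
    cell, both measured in feature-map cells.
    """
    fx1, fy1, fx2, fy2 = (c_round(v * spatial_scale) for v in roi)
    # A degenerate ROI is forced to cover at least one feature-map cell.
    width = max(fx2 - fx1 + 1, 1)
    height = max(fy2 - fy1 + 1, 1)
    return (fx1, fy1, fx2, fy2), (height / pooled_h, width / pooled_w)


if __name__ == "__main__":
    # Hypothetical ROI in image pixels.
    scaled, bins = roi_on_feature_map((32, 48, 351, 239))
    print(scaled, bins)
```

At a scale of 1/16, a 50x50 feature map corresponds to an input of roughly 800x800 pixels.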

In experiments\script_faster_rcnn_demo.m, change the paths to the corresponding local paths. The thres value can be adjusted based on the test results.

model_dir                   = fullfile(pwd, 'output', 'faster_rcnn_final', 'faster_rcnn_VOC2007_ZF'); %% ZF_test

im_names = {'000001.jpg','000002.jpg','000034.jpg','000212.jpg','000213.jpg', '001150.jpg'};

thres = 0.3;  %0.6
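
What thres does can be sketched in a few lines of Python: detections whose score falls below the threshold are dropped before display. The (x1, y1, x2, y2, score) tuple layout, the function name and the sample detections are assumptions for illustration only.

```python
def filter_detections(detections, thres):
    """Keep only detections whose score (last element) is at least `thres`."""
    return [det for det in detections if det[4] >= thres]


if __name__ == "__main__":
    # Made-up detections: (x1, y1, x2, y2, score).
    dets = [
        (10, 20, 110, 220, 0.92),
        (15, 25, 100, 200, 0.45),
        (300, 40, 380, 160, 0.12),
    ]
    print(len(filter_detections(dets, 0.6)))  # stricter threshold
    print(len(filter_detections(dets, 0.3)))  # looser threshold used here
```

Lowering thres from 0.6 to 0.3 shows more boxes, including weaker ones, which helps when the model is not very confident but also lets through more false positives.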

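Detection is fast, but the results on my data this time are poor, possibly because there is too little data, the boxes were drawn carelessly, or some parameter is wrong in a way I haven't noticed -_-!.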