Paper Review: CenterNet: Keypoint Triplets for Object Detection


Problem the paper is solving

  • The paper aims to improve keypoint-based object detection

  • The earlier keypoint-based approach (CornerNet) produces many incorrect bounding boxes

  • This paper addresses the issue by detecting each object as a triplet of keypoints rather than a pair

  • It introduces two modules - cascade corner pooling and center pooling - that enrich the information collected by the corner and center keypoints

Issues with anchor based approaches

  • Anchor sizes and aspect ratios need to be designed manually

  • A large number of anchors is needed to achieve a high IoU with ground-truth boxes

  • Anchors are usually not aligned with ground-truth boxes

Problems with CornerNet

  • CornerNet - each object is represented by a pair of corner keypoints

  • but it has a weak ability to refer to the global information of an object

  • the corners are sensitive to object boundaries

Improvements

  • ability to perceive visual patterns within each proposed region

  • if a predicted bounding box has a high IoU with the ground truth, there is a high probability that the center keypoint lies in its central region

  • center pooling - helps the center keypoint obtain more recognizable visual patterns

  • it takes the maximum response in the horizontal and in the vertical direction and sums the two

  • cascade corner pooling - equips the original corner pooling module with the ability to perceive internal information


CornerNet as baseline

  • produces two heatmaps - one for top-left corners and one for bottom-right corners

  • heatmaps - represent the locations of keypoints and assign a confidence score to each point; the network also predicts an embedding and a group of offsets for each corner

  • embeddings - used to identify whether two corners come from the same object

  • offsets - learn to remap corners from the heatmaps to the input image

  • generating bounding boxes - the top-k top-left and bottom-right corners are selected according to their scores; the distance between the embedding vectors of each pair of corners is calculated, and a pair is treated as belonging to the same object if that distance is below a threshold

  • score of a bounding box - the average of the scores of its corner pair
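
The box-generation step above can be sketched roughly as follows. This is a minimal illustration, not the CornerNet code: the `Corner` tuple layout, the 1-D embedding, the function name `pair_corners` and the threshold value are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical corner layout: (x, y, score, embedding).
# A 1-D embedding is assumed here to keep the distance computation trivial.
Corner = Tuple[float, float, float, float]

def pair_corners(top_left: List[Corner], bottom_right: List[Corner],
                 k: int = 2, max_embed_dist: float = 0.5):
    """Return candidate boxes (x1, y1, x2, y2, score), best score first."""
    # Keep the k highest-scoring corners of each type.
    tl_top = sorted(top_left, key=lambda c: c[2], reverse=True)[:k]
    br_top = sorted(bottom_right, key=lambda c: c[2], reverse=True)[:k]
    boxes = []
    for x1, y1, s1, e1 in tl_top:
        for x2, y2, s2, e2 in br_top:
            # The bottom-right corner must lie below and to the right.
            if x2 <= x1 or y2 <= y1:
                continue
            # Corners of the same object should have similar embeddings.
            if abs(e1 - e2) > max_embed_dist:
                continue
            # Box score is the average of the two corner scores.
            boxes.append((x1, y1, x2, y2, (s1 + s2) / 2))
    return sorted(boxes, key=lambda b: b[4], reverse=True)

tl = [(10, 10, 0.9, 0.1), (50, 50, 0.8, 2.0)]
br = [(40, 40, 0.7, 0.2), (90, 90, 0.6, 2.1)]
boxes = pair_corners(tl, br)  # two boxes survive the embedding check
```

Nothing in this pairing looks inside the box, which is the weakness the center keypoint is meant to address.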


CenterNet

  • Each object is represented by a triplet - the two corners plus an additional keypoint at the center

  • Additionally predicts a heatmap for the center keypoints, along with their offsets

  • Use the CornerNet method to generate the top-k boxes

  • Procedure to filter the top-k boxes using the center keypoints -

    1. select the top-k center keypoints according to their scores

    2. use the offsets to remap these center keypoints to the input image

    3. define a central region for each box and check whether it contains a center keypoint of the same class as the corner keypoints

    4. if a center keypoint is detected in the central region, preserve the box; otherwise remove it

    5. score of a preserved box = average of the scores of its three keypoints

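  • Choosing the central region size - affects precision and recall: a region that is too small lowers recall for small boxes, and one that is too large lowers precision for large boxes

  • A scale-aware central region is proposed, whose relative size is controlled by an odd number n


  • n=3 for boxes less than 150 (width) and n=5 for boxes greater than 150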
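
The central-region check can be sketched as below. The formula in `central_region` is my reading of the paper's scale-aware definition (for n=3 it gives the middle third of the box, for n=5 the middle fifth). The function names, the `(x, y, class)` layout for center keypoints, and the use of box width as the size measure (following the note above) are assumptions for illustration.

```python
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]  # (tlx, tly, brx, bry)

def central_region(box: Box, n: int) -> Box:
    """Central region of a box; n is odd and sets the region's relative size."""
    tlx, tly, brx, bry = box
    ctlx = ((n + 1) * tlx + (n - 1) * brx) / (2 * n)
    ctly = ((n + 1) * tly + (n - 1) * bry) / (2 * n)
    cbrx = ((n - 1) * tlx + (n + 1) * brx) / (2 * n)
    cbry = ((n - 1) * tly + (n + 1) * bry) / (2 * n)
    return (ctlx, ctly, cbrx, cbry)

def has_center(box: Box, box_class: int,
               centers: Iterable[Tuple[float, float, int]],
               size_threshold: float = 150.0) -> bool:
    """True if a same-class center keypoint falls inside the central region."""
    # Assumption: box width decides between n=3 and n=5.
    n = 3 if (box[2] - box[0]) < size_threshold else 5
    x1, y1, x2, y2 = central_region(box, n)
    return any(cls == box_class and x1 <= x <= x2 and y1 <= y <= y2
               for x, y, cls in centers)

small_region = central_region((0, 0, 90, 90), 3)    # middle third of the box
large_region = central_region((0, 0, 200, 200), 5)  # middle fifth of the box
```

Larger boxes get a proportionally smaller central region, so a stray center keypoint is less likely to validate a large false box.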

Center Pooling

  • the geometric center of an object does not necessarily convey very recognizable visual patterns

  • backbone -> feature map -> for each location, find the maximum value in the horizontal direction and in the vertical direction and add them together
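
A minimal sketch of the pooling rule just described, on a single 2-D feature map stored as nested lists. In the actual network the pooling sits inside convolutional branches; that part is omitted here, and `center_pool` is a name chosen for the example.

```python
from typing import List

def center_pool(fmap: List[List[float]]) -> List[List[float]]:
    """For each location, add the max of its row and the max of its column."""
    row_max = [max(row) for row in fmap]        # horizontal maxima
    col_max = [max(col) for col in zip(*fmap)]  # vertical maxima
    return [[row_max[i] + col_max[j] for j in range(len(fmap[0]))]
            for i in range(len(fmap))]

pooled = center_pool([[1, 2],
                      [3, 0]])
# pooled == [[5, 4], [6, 5]]
```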


Cascade Corner Pooling

  • CornerNet uses corner pooling, which finds the maximum values along the boundary directions in order to determine corners; this makes the corners sensitive to edges

  • Cascade corner pooling - first finds the maximum value along the boundary, then looks inward from the location of that boundary maximum to find an internal maximum value, and adds the two values together
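
A literal, simplified reading of that description for the top boundary of a candidate top-left corner at `(i, j)`. This is only a sketch: the real module is built from convolutions and corner-pooling layers, and the function name, the scan directions and the zero fallback for the last row are assumptions.

```python
from typing import List

def cascade_top_pool_at(fmap: List[List[float]], i: int, j: int) -> float:
    """Boundary max along row i (from column j rightwards) plus the max
    found by looking inward (downwards) from where that boundary max sits."""
    row = fmap[i][j:]
    boundary_max = max(row)
    j_star = j + row.index(boundary_max)  # column of the boundary maximum
    # Look inside the object: rows strictly below the boundary row.
    internal_max = max((fmap[r][j_star] for r in range(i + 1, len(fmap))),
                       default=0.0)
    return boundary_max + internal_max

fmap = [[1, 5, 2],
        [0, 7, 9],
        [3, 4, 8]]
top_value = cascade_top_pool_at(fmap, 0, 0)  # 5 (boundary) + 7 (internal) = 12
```

Plain corner pooling would stop at the boundary maximum; the extra inward scan is what lets the corner see inside the object.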

Building Center Pooling and Cascade Corner Pooling

  • Both can be implemented by combining corner pooling in different directions

Training

  • Input image - 511×511 -> heatmaps of size 128×128

  • Training loss - focal loss for corner and center keypoints + pull loss for corners + push loss for corners + L1 losses to learn the offsets of corner and center keypoints

  • alpha, beta, gamma are weights for pull, push and offset losses (0.1, 0.1 and 1 by default)
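
Written out, the combined loss has roughly the following form. The superscripts co and ce stand for corner and center keypoints; the exact symbol names are my notation for the terms listed above rather than a quotation from the paper.

```latex
L = L_{\text{det}}^{\text{co}} + L_{\text{det}}^{\text{ce}}
  + \alpha \, L_{\text{pull}}^{\text{co}}
  + \beta \, L_{\text{push}}^{\text{co}}
  + \gamma \left( L_{\text{off}}^{\text{co}} + L_{\text{off}}^{\text{ce}} \right)
```

With the defaults above, the pull and push terms are down-weighted by 0.1 while the offset terms keep full weight.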

Inference

  • top 70 center keypoints and top 70 top-left corners and top 70 bottom-right corners are selected

  • Soft-NMS is used to remove redundant bounding boxes

  • top 100 bounding boxes are selected based on scores
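
The top-k selection used at inference can be sketched with the standard library alone. Here the heatmap is a small nested list and `top_k_keypoints` is a hypothetical helper; the real pipeline would apply it with k=70 to the center, top-left and bottom-right heatmaps.

```python
import heapq
from typing import List, Tuple

def top_k_keypoints(heatmap: List[List[float]],
                    k: int) -> List[Tuple[float, int, int]]:
    """Return the k highest-scoring locations as (score, y, x), best first."""
    scored = ((score, y, x)
              for y, row in enumerate(heatmap)
              for x, score in enumerate(row))
    return heapq.nlargest(k, scored)

heat = [[0.1, 0.9],
        [0.5, 0.3]]
best = top_k_keypoints(heat, 2)
# best == [(0.9, 0, 1), (0.5, 1, 0)]
```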

Results


Reference

All figures are taken from the link mentioned above.


 
 
 



©2020 by Akshat Mandloi
