Paper Review - GPS-Net: Graph Property Sensing Network for Scene Graph Generation

Introduction

  • Three key properties of a scene graph - a) edge direction information b) difference in priority between nodes c) long-tailed distribution of relationships

  • Importance of a node varies with the number of triplets in the graph that include it - existing works treat all nodes in a scene graph as equal

  • Motivation of GPS-Net -

    • Direction Aware Message Passing (DMP), which enhances node features with node-specific contextual information

    • Node-Priority Sensitive Loss (NPS Loss) - to encode difference in priority between different nodes

    • Adaptive Reasoning Module - for handling the long-tailed distribution of relationships

Approach

  • Faster R-CNN is used to obtain object proposals for each image

  • O - object categories (including background), R - relationship categories
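
To make the notation concrete, here is a minimal sketch of how a scene graph over O and R could be held in memory: nodes carry an object category, and each relationship is a directed (subject, predicate, object) triplet. The labels, the `nodes` / `triplets` names and the `triplet_counts` helper are all made up for illustration and do not come from the paper.

```python
from collections import Counter

# Hypothetical toy scene graph. Node ids map to object categories (from O),
# and each triplet is a directed edge labelled with a relationship (from R).
nodes = {0: "man", 1: "horse", 2: "hat", 3: "grass"}
triplets = [
    (0, "riding", 1),       # man -> horse
    (0, "wearing", 2),      # man -> hat
    (1, "standing on", 3),  # horse -> grass
]


def triplet_counts(triplets):
    """Count how many triplets each node takes part in (as subject or object)."""
    counts = Counter()
    for subj, _pred, obj in triplets:
        counts[subj] += 1
        counts[obj] += 1
    return counts


counts = triplet_counts(triplets)
for node_id, label in nodes.items():
    print(label, counts[node_id])
```

The edges are ordered pairs, so (man, riding, horse) is present while (horse, riding, man) is not. The per-node counts are the quantity the notes tie to node priority.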

Direction Aware Message Passing

  • Global Context Message Passing (GCMP) - adopts softmax for normalization

  • Takes the features of a node as input and outputs refined features by aggregating over the neighborhood of that node

  • The exponential part is defined as the pairwise contextual coefficient between two nodes

  • However, GCMP generates the same contextual information for all nodes - why?

  • Since GCMP treats all nodes as equal, it can be simplified as in Figure 3 (b) - however, this also ignores edge direction information and cannot provide node-specific contextual information

  • DMP - inspired by multi-modal low-rank bilinear pooling

  • Contextual information is formulated as a tri-linear model based on Tucker decomposition

  • Advantages - a) union box features to expand receptive field b) tri-linear model is better c) Hadamard product jointly affects context modeling d) position of subject and object is specified - considers edge direction information

  • Consider both forward and backward relations, stack them, and then take a Kronecker product

  • Transformer layer - refines the contextual information
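
One possible reading of why GCMP ends up with the same context for every node, sketched under stated assumptions: the pairwise coefficient is the exponential of a linear score on the concatenated pair of features, the softmax runs over all nodes, and projections and nonlinearities are left out. Under those assumptions the part of the score that depends on the receiving node is constant inside the softmax and cancels. The names `gcmp_contexts`, `shared_context`, `w_query` and `w_key` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(coeffs, features):
    dim = len(features[0])
    return [sum(c * x[d] for c, x in zip(coeffs, features)) for d in range(dim)]


def gcmp_contexts(features, w_query, w_key):
    """One context vector per node, with pairwise softmax coefficients.

    Assumed score for the pair (i, j): w_query . x_i + w_key . x_j,
    i.e. a linear map applied to the concatenation [x_i, x_j].
    """
    contexts = []
    for x_i in features:
        scores = [dot(w_query, x_i) + dot(w_key, x_j) for x_j in features]
        contexts.append(weighted_sum(softmax(scores), features))
    return contexts


def shared_context(features, w_key):
    """Simplified version: a single context computed once for all nodes."""
    return weighted_sum(softmax([dot(w_key, x_j) for x_j in features]), features)


features = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
w_query, w_key = [0.5, -1.0], [0.2, 0.3]
per_node = gcmp_contexts(features, w_query, w_key)
shared = shared_context(features, w_key)
```

Here every entry of `per_node` matches `shared` up to floating-point error, which is the sense in which the message passing can be simplified to one shared context. A coefficient that truly depends on the ordered pair (subject, object) would not cancel this way.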

Node Priority Sensitive Loss

  • Cross-entropy loss - doesn't account for importance of nodes

  • Priority of a node is proportional to the number of triplets it is involved in

  • Inspired by focal loss - key differences - a) mainly used to solve the node priority problem in Scene Graph Generation (SGG) b) focusing parameter is a function of node priority

  • Loss function formula similar to focal loss
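
The bullets above describe a focal-loss-style objective whose focusing parameter depends on node priority. A minimal sketch follows, under stated assumptions: priority `theta` is taken as the share of triplets that contain the node, and the mapping in `focusing_parameter` (including its `mu` and `cap` arguments) is an illustrative choice rather than a formula given in these notes.

```python
import math


def node_priority(triplets_with_node, total_triplets):
    """Assumed priority: share of triplets in the graph that include the node."""
    return triplets_with_node / total_triplets


def focusing_parameter(theta, mu=4.0, cap=2.0):
    """Assumed mapping from priority to the focal-loss exponent.

    It shrinks as theta grows, so high-priority nodes are down-weighted less.
    """
    if theta <= 0.0:
        return cap  # node in no triplet: strongest down-weighting
    return min(cap, -((1.0 - theta) ** mu) * math.log(theta))


def nps_loss(p_correct, theta, mu=4.0):
    """Focal-style loss on the probability assigned to the correct class."""
    gamma = focusing_parameter(theta, mu)
    return -((1.0 - p_correct) ** gamma) * math.log(p_correct)


high = nps_loss(0.6, node_priority(5, 10))  # node in half of the triplets
low = nps_loss(0.6, node_priority(1, 20))   # node in a single triplet
```

With the same predicted probability, the high-priority node contributes the larger loss. At `theta = 1` the exponent is zero and the expression reduces to plain cross-entropy.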


Adaptive Reasoning Module

  • To classify relationships

  • Provides a prior for classification in 2 steps - a) frequency softening b) bias adaptation

  • Frequency Softening -

  • Bias Adaptation -
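
The notes do not record the formulas for these two steps, so the following is only a rough sketch of how such a prior could work. It assumes that frequency softening is a log-softmax over the relationship frequencies of a subject-object category pair, and that bias adaptation scales the softened prior by a gate before adding it to the classifier logits. The function names and the scalar `gate` are hypothetical.

```python
import math


def log_softmax(values):
    m = max(values)
    log_total = m + math.log(sum(math.exp(v - m) for v in values))
    return [v - log_total for v in values]


def softmax(values):
    return [math.exp(v) for v in log_softmax(values)]


def frequency_softening(frequencies):
    """Assumed softening: log-softmax flattens a long-tailed frequency vector."""
    return log_softmax(frequencies)


def classify_relationship(logits, frequencies, gate):
    """Assumed bias adaptation: add the softened prior, scaled by a gate."""
    prior = frequency_softening(frequencies)
    return softmax([l + gate * p for l, p in zip(logits, prior)])


freq = [0.9, 0.09, 0.01]  # made-up long-tailed relationship frequencies
soft = [math.exp(v) for v in frequency_softening(freq)]
probs = classify_relationship([1.0, 2.0, 3.0], freq, gate=0.5)
```

In this toy setup the softened prior keeps the ranking of the raw frequencies but with a much smaller gap between head and tail classes. A gate of 0 switches the prior off entirely.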


Results



 
 
 



©2020 by Akshat Mandloi