
Paper Review - Finding Label Errors in Autonomous Vehicle Data with Learned Observation Assertions

Updated: Feb 8, 2023




Why are label errors an issue?


  • Label errors can lead to downstream safety risks in trained models

  • They are a serious risk when trained models are used in safety-critical applications

  • They impact model training.

Contributions

  • A new abstraction - Learned Observation Assertions - is proposed.

  • Leverages existing labeled datasets and ML models to learn a probabilistic model for finding errors in labels.

  • The proposed system learns priors using user-provided features and existing resources.

  • Learned Observation Assertions (LOA) -

    • three components -> data associations, priors over features, application objective functions (AOF)

    • supports associating observations together - across frames and across time - so that they are jointly considered when finding errors.

  • Methods of leveraging organizational resources

    • users specify features over data

    • these features are used to generate priors and application objective functions that guide the search for errors

    • Priors - the input is a set of observations, and the output is the probability of observing a given feature of that input

    • AOFs - transform prior values for the application at hand (a minimal sketch of this split follows below).
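
To make the prior/AOF split concrete, here is a minimal sketch in Python. The names (FeaturePrior, low_likelihood_aof) and the dict-style observations are my own illustrative assumptions, not the paper's actual API; the prior is fit with scikit-learn's KernelDensity, which the paper's default also uses.

```python
# Minimal sketch of the prior / AOF split: a prior is a density over a
# user-provided feature; an AOF reshapes prior values for the task at hand.
import numpy as np
from sklearn.neighbors import KernelDensity


class FeaturePrior:
    """Fits a density over a user-specified feature of observations."""

    def __init__(self, feature_fn, bandwidth=0.5):
        self.feature_fn = feature_fn  # maps one observation to a scalar feature
        self.kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth)

    def fit(self, observations):
        x = np.array([[self.feature_fn(o)] for o in observations])
        self.kde.fit(x)
        return self

    def log_prob(self, observation):
        # Log-density of this observation's feature value under the prior.
        return float(self.kde.score_samples([[self.feature_fn(observation)]])[0])


def low_likelihood_aof(log_prob):
    """Toy AOF: rank observations so the least likely ones surface first."""
    return -log_prob


# Usage with a hypothetical box-volume feature over dict-style observations.
rng = np.random.default_rng(0)
boxes = [{"l": 4.5 + rng.normal(0, 0.3), "w": 1.8, "h": 1.5} for _ in range(100)]
prior = FeaturePrior(lambda b: b["l"] * b["w"] * b["h"]).fit(boxes)

suspicious_box = {"l": 40.0, "w": 1.8, "h": 1.5}  # implausibly long box label
print(low_likelihood_aof(prior.log_prob(suspicious_box)))  # large -> suspicious
```

In this sketch the AOF is just a negation, but the same slot can hold anything that reshapes prior values for the application, such as surfacing the lowest-scoring tracks first.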

Scene Syntax -

  • A scene is a set of tracks

  • A track consists of observation bundles

  • An observation bundle consists of observations (see the sketch after this list)

  • Features are defined over individual elements, elements across scenes, or whole tracks - they can be almost anything

  • AOFs are defined over these feature distributions - the default is KDEobsDistribution

  • Once fitting is done, a graphical model is generated, with nodes for each observation and each feature distribution.
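
A rough sketch of the scene -> track -> observation-bundle hierarchy, using plain Python dataclasses. The field names here are assumptions for illustration, not the paper's actual schema.

```python
# Scene -> Track -> ObservationBundle -> Observation hierarchy (illustrative).
from dataclasses import dataclass, field
from typing import List


@dataclass
class Observation:
    """A single labeled detection, e.g. one 3D box at one timestamp."""
    timestamp: float
    cls: str
    x: float
    y: float
    z: float
    length: float
    width: float
    height: float


@dataclass
class ObservationBundle:
    """Observations of the same object at the same time, e.g. a LiDAR box plus a 2D box."""
    observations: List[Observation] = field(default_factory=list)


@dataclass
class Track:
    """One object followed through time: an ordered list of bundles."""
    bundles: List[ObservationBundle] = field(default_factory=list)


@dataclass
class Scene:
    """A scene is simply a set of tracks."""
    tracks: List[Track] = field(default_factory=list)
```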

Feature Distributions -

  • Features over single observations -> e.g. box volume

  • Features over observation bundles -> e.g. observations within a bundle should agree on a class

  • Features between observations -> e.g. velocity estimated from the box-center offset

  • Features over tracks -> e.g. can be used to normalize scores over tracks

  • These features act as inputs for learning feature distributions, which in turn feed the application objective functions (example feature functions are sketched below)
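
Illustrative feature functions at each of the granularities above, assuming the dataclass sketch from the previous section; these are examples I made up, not the paper's built-in features.

```python
# Example user-provided features at each granularity.
import numpy as np


def box_volume(obs):
    """Feature over a single observation."""
    return obs.length * obs.width * obs.height


def class_agreement(bundle):
    """Feature over an observation bundle: do all observations share a class?"""
    classes = {o.cls for o in bundle.observations}
    return 1.0 if len(classes) == 1 else 0.0


def center_offset_velocity(obs_a, obs_b):
    """Feature between consecutive observations: speed from the box-center offset.

    Assumes the two observations have distinct timestamps.
    """
    dt = obs_b.timestamp - obs_a.timestamp
    offset = np.array([obs_b.x - obs_a.x, obs_b.y - obs_a.y, obs_b.z - obs_a.z])
    return float(np.linalg.norm(offset) / dt)


def track_length(track):
    """Feature over a whole track, e.g. for normalizing scores by track length."""
    return sum(len(b.observations) for b in track.bundles)
```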

Scoring -


The heart of the process -

The priors and the AOFs are the heart of scoring; by default, the prior is a kernel density estimate (KDE).

The Kernel Density Estimator from scikit-learn is used.

It uses a Ball Tree or a KD Tree for efficient queries.

It can be used to tell whether a data point is unusual (see the sketch below).
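
A minimal example of this default scoring path: fit scikit-learn's KernelDensity on a feature from existing, trusted labels and flag low-likelihood values. The speed feature, the synthetic data, and the bottom-1% threshold are assumptions for illustration only.

```python
# Fit a KDE over a feature and flag observations with unusually low density.
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)

# Hypothetical training data: per-frame speeds (m/s) from trusted tracks.
speeds = rng.normal(loc=12.0, scale=2.0, size=(1000, 1))

# KernelDensity uses a KD Tree or Ball Tree internally for efficient queries.
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(speeds)

# Score new observations: an implausibly high speed should stand out.
candidates = np.array([[11.5], [13.0], [45.0]])
log_density = kde.score_samples(candidates)

# Flag anything less likely than the bottom 1% of the training data.
threshold = np.quantile(kde.score_samples(speeds), 0.01)
for speed, score in zip(candidates.ravel(), log_density):
    flag = "POSSIBLE LABEL ERROR" if score < threshold else "ok"
    print(f"speed={speed:5.1f} m/s  log-density={score:7.2f}  {flag}")
```

Observations whose feature values land in a low-density region of the fitted KDE surface as candidates for human review.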


Testing on COCO to follow. Stay Tuned.



 
 
 
