Researchers from Nvidia, the University of Toronto, and the Vector Institute for Artificial Intelligence in Toronto have devised a way to more precisely detect and predict where an object begins and ends. That knowledge can improve inference for existing computer vision models and help label training data for future models.
In the researchers’ experiments, Semantically Thinned Edge Alignment Learning (STEAL) improved the precision of CASENet, a state-of-the-art semantic boundary prediction model, by 4%. More precise recognition of object boundaries has applications in computer vision tasks ranging from image generation to 3D reconstruction to object detection.
STEAL can be applied to improve existing CNNs and boundary detection models, but the researchers also believe it can help them label or annotate data for computer vision models more efficiently. To demonstrate this, they used the STEAL approach to refine Cityscapes, a data set of urban environments first introduced at the Computer Vision and Pattern Recognition (CVPR) conference in 2016.
Now available on GitHub, the STEAL framework learns and predicts object edges at the pixel level using a method the researchers call “active alignment.” Explicit reasoning about annotation noise during training, together with a level-set formulation that lets networks learn from misaligned labels in an end-to-end fashion, also helps produce the results.
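To give a rough sense of the “thinning” idea behind the framework’s name, the classical operation for reducing thick boundary responses to crisp, one-pixel-wide edges is non-maximum suppression along the boundary-normal direction. The sketch below is an illustrative NumPy implementation of that generic operation only; it is not the authors’ code, and the function name and details are assumptions.

```python
import numpy as np

def thin_edges(prob, eps=1e-8):
    """Keep a pixel only if its boundary probability is a local maximum
    along the normal direction, approximated by the gradient of the
    probability map. Pixels with near-zero gradient (ridge tops, flat
    regions) are kept as-is in this simplified sketch."""
    gy, gx = np.gradient(prob)          # gradients along rows, columns
    H, W = prob.shape
    out = np.zeros_like(prob)
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            norm = np.hypot(gx[i, j], gy[i, j])
            if norm < eps:
                out[i, j] = prob[i, j]  # gradient vanishes: keep value
                continue
            dx, dy = gx[i, j] / norm, gy[i, j] / norm
            # sample nearest neighbors one step along +/- the normal
            n1 = prob[int(round(i + dy)), int(round(j + dx))]
            n2 = prob[int(round(i - dy)), int(round(j - dx))]
            if prob[i, j] >= n1 and prob[i, j] >= n2:
                out[i, j] = prob[i, j]
    return out
```

Applied to a blurry vertical ridge of boundary probabilities, this suppresses everything except the ridge crest, leaving a thin edge.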
“We further show that our predicted boundaries are significantly better than those obtained from the latest DeepLab-v3 segmentation outputs, while using a much more lightweight architecture,” the authors wrote in a paper published on arXiv in April and revised June 9.
“Devil is in the Edges: Learning Semantic Boundaries from Noisy Annotations” will be presented orally this week at the CVPR 2019 conference in Long Beach, California. Nearly a dozen research papers coauthored by Nvidia Research will receive oral presentations at the conference, Nvidia said today in a blog post.
In other recent news, Nvidia said it will support high-performance computing hardware from British chip designer Arm in 2020, and it today open-sourced parsers and plugins for its TensorRT inference software to allow for more customization.