Ken Goldberg at RSS Workshop

13 Jul, 2018

"New Benchmarks, Metrics, and Competitions for Robotic Learning," a workshop at Robotics: Science and Systems (RSS), was held in Pittsburgh on June 29-30, 2018. The workshop set out to discuss and propose new benchmarks, competitions, and performance metrics that address the specific challenges of deploying (deep) learning in robotics.

From the website:

Researchers in robotics currently lack widely-accepted meaningful benchmarks and competitions that inspire the community to work on the critical research challenges for robotic learning, and allow repeatable experiments and quantitative evaluation.

This workshop will therefore bring together experts from the robotics, machine learning, and computer vision communities to identify the shortcomings of existing benchmarks, datasets, and evaluation metrics. We will discuss the critical challenges for learning in robotic perception, planning, and control that are not well covered by the existing benchmarks, and combine the results of these discussions to outline new benchmarks for learning in robotic perception, planning, and control.

Ken Goldberg spoke on June 29 and contributed "Towards An Empirically Reproducible Benchmark for Deep Learning Grasping Algorithms," a paper by Andrey Kurenkov, Roberto Martin-Martin, Animesh Garg, Ken Goldberg, and Silvio Savarese.

From the abstract:

We propose to empirically evaluate to what extent the recently proposed YCB and ACRV benchmarks for robotic manipulation [7, 19] exhibit three essential properties of scientific experiments – repeatability, replicability, and reproducibility – as defined by the Association for Computing Machinery (ACM). For quantifying these properties, we propose to use two standard statistical metrics: the repeatability coefficient and the intraclass correlation measure. We further propose to make use of the lessons learned from this evaluation in specifying a new benchmark for the growing family of deep learning robotic grasping algorithms, and describe key properties of this benchmark.
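
For readers unfamiliar with the two metrics named in the abstract, here is a minimal Python sketch of how they are commonly computed. The abstract does not say which estimators the authors use, so the choices here are assumptions: the Bland-Altman form of the repeatability coefficient (1.96 × √2 × within-subject standard deviation) and the one-way random-effects intraclass correlation, often written ICC(1,1). The function names and the grasp success rates are hypothetical and are not taken from the paper.

```python
import math


def within_subject_variance(data):
    """Pooled within-subject variance (the one-way ANOVA mean square within).

    `data` is a list of rows, one per subject (for example, one per object),
    each holding the same number k >= 2 of repeated measurements.
    """
    n = len(data)
    k = len(data[0])
    ss_within = 0.0
    for row in data:
        mean = sum(row) / k
        ss_within += sum((x - mean) ** 2 for x in row)
    return ss_within / (n * (k - 1))


def repeatability_coefficient(data):
    """Bland-Altman repeatability coefficient: 1.96 * sqrt(2) * s_w."""
    return 1.96 * math.sqrt(2.0 * within_subject_variance(data))


def icc_oneway(data):
    """One-way random-effects intraclass correlation, ICC(1,1)."""
    n = len(data)
    k = len(data[0])
    grand_mean = sum(sum(row) for row in data) / (n * k)
    # Mean square between subjects.
    msb = k * sum((sum(row) / k - grand_mean) ** 2 for row in data) / (n - 1)
    msw = within_subject_variance(data)
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical grasp success rates: three objects, two repeated runs each.
rates = [[0.8, 0.9], [0.5, 0.6], [0.2, 0.3]]
print(repeatability_coefficient(rates))
print(icc_oneway(rates))
```

Under these assumptions, a small repeatability coefficient means that repeated runs on the same object agree closely, and an intraclass correlation near 1 means that most of the observed variance comes from differences between objects rather than from run-to-run noise.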