ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking

Oliver Groth, Fabian B. Fuchs, Ingmar Posner, Andrea Vedaldi; The European Conference on Computer Vision (ECCV), 2018, pp. 702-717


Physical intuition is pivotal for intelligent agents to perform complex tasks. In this paper we investigate the passive acquisition of an intuitive understanding of physical principles as well as the active utilisation of this intuition in the context of generalised object stacking. To this end, we provide ShapeStacks: a simulation-based dataset featuring 20,000 stack configurations composed of a variety of elementary geometric primitives richly annotated regarding semantics and structural stability. We train visual classifiers for binary stability prediction on the ShapeStacks data and scrutinise their learned physical intuition. Due to the richness of the training data our approach also generalises favourably to real-world scenarios achieving state-of-the-art stability prediction on a publicly available benchmark of block towers. We then leverage the physical intuition learned by our model to actively construct stable stacks and observe the emergence of an intuitive notion of stackability - an inherent object affordance - induced by the active stacking task. Our approach performs well exceeding the stack height observed during training and even manages to counterbalance initially unstable structures.

Related Material

author = {Groth, Oliver and Fuchs, Fabian B. and Posner, Ingmar and Vedaldi, Andrea},
title = {ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}