The Curated Demonstration

Imitation learning assumes that more demonstrations improve performance. But demonstrations vary in quality, style, and coverage. A random selection of fifty demonstrations might contain forty that traverse the same trajectory and ten that show genuinely different strategies. The random sample overweights the common case and underweights the informative variation.

FAKTUAL selects demonstrations by maximizing signature kernel entropy — a measure of how much the selected set covers the space of possible trajectories. The algorithm is model-free: it doesn’t need to know the task, the robot, or the policy architecture. It measures trajectory diversity directly and selects the subset with maximum coverage.
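A minimal sketch of this kind of diversity-driven selection, under stated assumptions: the true signature kernel is replaced here by an RBF kernel on flattened, equal-length trajectories (a placeholder, not FAKTUAL's actual kernel), and "kernel entropy" is proxied by the log-determinant of the Gram matrix, a standard coverage measure that a greedy loop can maximize one demonstration at a time.

```python
import numpy as np

def traj_kernel(a, b, sigma=1.0):
    """Similarity between two trajectories (arrays of shape [T, d]).
    Placeholder: RBF on flattened trajectories; FAKTUAL uses the
    signature kernel proper."""
    dist = np.linalg.norm(a.ravel() - b.ravel())
    return np.exp(-dist**2 / (2 * sigma**2))

def select_diverse(trajs, k, sigma=1.0, jitter=1e-6):
    """Greedily pick k trajectories maximizing the log-determinant of
    their Gram matrix -- a proxy for the entropy/coverage objective.
    Model-free: only pairwise trajectory similarities are used."""
    n = len(trajs)
    K = np.array([[traj_kernel(a, b, sigma) for b in trajs] for a in trajs])
    K += jitter * np.eye(n)  # numerical stability for the determinant
    selected = []
    for _ in range(k):
        best_i, best_val = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            _, logdet = np.linalg.slogdet(K[np.ix_(idx, idx)])
            if logdet > best_val:
                best_i, best_val = i, logdet
        selected.append(best_i)
    return selected
```

On a set of three near-identical demonstrations plus one distinct one, asking for two picks returns one of each: adding a near-duplicate barely changes the determinant, so the greedy step prefers the outlier. That is exactly the behavior random sampling lacks.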

The result: consistently better imitation learning success rates across RoboMimic, MetaWorld, and real-world tasks compared to random selection, with negligible computational overhead. The improvement comes entirely from which demonstrations the model trains on, not from any change to the model itself.

The central claim: the bottleneck in imitation learning is not data quantity but data geometry. Random sampling treats demonstrations as interchangeable units. They are not: they are points in a trajectory space, and their informativeness depends on their position relative to each other, not just their individual quality. A demonstration that is excellent in isolation contributes nothing if five nearly identical demonstrations are already in the training set. The curation algorithm sees this; random sampling cannot. The same principle applies anywhere training data has geometric structure: the value of each datum depends on what else is in the set.
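The set-dependence of a demonstration's value can be made concrete in a few lines. In this hypothetical example (same RBF-on-vectors placeholder kernel and log-determinant entropy proxy as above, not FAKTUAL's actual machinery), the very same entropy objective assigns a near-duplicate almost no marginal gain, while a genuinely different trajectory adds much more:

```python
import numpy as np

def rbf(a, b, sigma=1.0):
    # Placeholder similarity between two flattened trajectories.
    return np.exp(-np.linalg.norm(a - b)**2 / (2 * sigma**2))

def logdet_entropy(points, jitter=1e-6):
    # Log-determinant of the Gram matrix: the coverage proxy.
    K = np.array([[rbf(p, q) for q in points] for p in points])
    return np.linalg.slogdet(K + jitter * np.eye(len(points)))[1]

base = [np.zeros(4)]                   # what's already in the set
duplicate = np.zeros(4) + 1e-3         # nearly identical to the existing demo
novel = np.ones(4)                     # a genuinely different trajectory

gain_dup = logdet_entropy(base + [duplicate]) - logdet_entropy(base)
gain_new = logdet_entropy(base + [novel]) - logdet_entropy(base)
# gain_new is far larger than gain_dup: the candidate's value
# depends on what is already selected, not on the candidate alone.
```

Swap the contents of `base` and the gains change, even though the candidates do not: that is the geometric structure random sampling ignores.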
