Technical Note

What sparse rewards teach us about system design

Reward shaping is not a shortcut. It is an interface-design problem between an objective and a learning algorithm.

June 2026 · 5 min read

The learner only sees the interface

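A sparse reward can make a valid objective almost unusable as a learning signal. MountainCar exposes the distinction between the two clearly. In the standard version of the task the reward is -1 on every step until the car reaches the flag, so every episode that fails looks identical to the learner. Success is still what matters, but the learner may need a more informative signal along the way to discover it.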
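To see what the learner actually receives, here is a minimal sketch of that interface in plain Python. The constants are assumed to match the classic MountainCar-v0 dynamics (force 0.001, gravity 0.0025, goal at position 0.5, 200-step limit), and `step` and `random_policy_return` are illustrative names, not part of any library.

```python
import math
import random

GOAL_POSITION = 0.5   # assumed: flag position in the classic task
MAX_STEPS = 200       # assumed: episode step limit


def step(position, velocity, action):
    """One transition. action is 0 (push left), 1 (no push) or 2 (push right)."""
    velocity += (action - 1) * 0.001 - 0.0025 * math.cos(3.0 * position)
    velocity = max(-0.07, min(0.07, velocity))
    position = max(-1.2, min(0.6, position + velocity))
    if position <= -1.2 and velocity < 0.0:
        velocity = 0.0  # the car stops when it hits the left wall
    reached_goal = position >= GOAL_POSITION
    # The reward is -1 on every step and carries no information about progress.
    return position, velocity, -1.0, reached_goal


def random_policy_return(rng):
    """Undiscounted return of one episode under uniformly random actions."""
    position, velocity = rng.uniform(-0.6, -0.4), 0.0
    total = 0.0
    for _ in range(MAX_STEPS):
        position, velocity, reward, done = step(position, velocity, rng.randrange(3))
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    print([random_policy_return(rng) for _ in range(5)])
```

Under a random policy, episodes are expected to run to the step limit almost every time, so the returns carry little or no information about whether the car came close to the flag or never left the valley.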

Reward shaping is a design decision

A shaped reward should preserve the direction of the real objective while making useful progress observable. In MountainCar, momentum and position can supply that information, provided success is still judged by the original goal condition and the shaped signal is not mistaken for a new benchmark.
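One well-known way to make this concrete is potential-based shaping, which adds gamma * phi(s') - phi(s) to the environment reward and is known to leave the optimal policy unchanged. This note does not prescribe that method, so the sketch below is only an illustration. The potential function (hill height plus a scaled kinetic term), the scale factor 100, and the discount 0.99 are all assumptions chosen for readability.

```python
import math

GAMMA = 0.99  # assumed discount factor


def potential(position, velocity):
    """Hypothetical potential: height on the hill plus a scaled kinetic term."""
    height = math.sin(3.0 * position)
    kinetic = 100.0 * velocity ** 2  # assumed scale so both terms are comparable
    return height + kinetic


def shaped_reward(env_reward, state, next_state, terminated, gamma=GAMMA):
    """Environment reward plus the potential-based shaping term.

    state and next_state are (position, velocity) pairs. The potential of a
    terminal state is taken to be zero, a common convention for episodic tasks.
    """
    phi_s = potential(*state)
    phi_next = 0.0 if terminated else potential(*next_state)
    shaping = gamma * phi_next - phi_s
    return env_reward + shaping
```

The environment reward passes through untouched, so the success condition is unchanged. Only the extra term differs between a step that gains height or speed and one that does not.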

  • Define the real success condition first.
  • Add informative signals carefully, and check that they do not reward behavior the real objective would not.
  • Compare designs under the same evaluation protocol.
  • Publish learning curves, not only endpoints.
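The checklist above can be sketched as a small evaluation harness. It assumes each reward design is wrapped in a hypothetical callback `solved(training_steps, seed)` that trains with that design for the given budget, then runs one episode on the unshaped task and reports whether the real success condition was met. The callback, the seed set, and the function name are placeholders, not an existing API.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

EVAL_SEEDS = tuple(range(10))  # assumed: the same fixed seeds for every design


def learning_curves(
    designs: Dict[str, Callable[[int, int], bool]],
    checkpoints: Sequence[int],
    seeds: Sequence[int] = EVAL_SEEDS,
) -> Dict[str, List[float]]:
    """Success rate per design at each training checkpoint.

    Every design is scored on the same seeds and the same success condition,
    regardless of which reward it was trained with.
    """
    curves: Dict[str, List[float]] = {}
    for name, solved in designs.items():
        curves[name] = [
            mean(1.0 if solved(steps, seed) else 0.0 for seed in seeds)
            for steps in checkpoints
        ]
    return curves
```

Returning the whole curve rather than the final value keeps differences in learning speed visible, even between designs that reach the same endpoint.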

The broader lesson

System design often determines whether an algorithm can use the information available. Observation design, temporal context, and reward structure can matter as much as model choice.