Discussion about this post

Max Iyer

Is 'compounding correctness' mainly observed in the coding and math domains, or in other complex real-world domains too? Especially for open-ended tasks?

Extraordinary claims need extraordinary evidence.

If 'compounding correctness' is reliable, why is Anthropic not using it to solve unsolved math conjectures every hour? How about every week? Every month?

Or why is it not being used to create new breakthrough killer apps, middleware, efficient OSes, new device drivers, or publication-worthy algorithms?

How far out of the training distribution does the task have to get before the situation reverts to compounding error? Heh.
