Good post! I agree with a lot of this, and I say that as somebody who lit money on fire trying gastown just in case Steve was right. But I think we're going to see functional agent swarms a lot sooner, based on the RLM architecture. Agents calling other agents programmatically and passing them variables gets around the problem of all I/O having to go through the model's context window.
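To make the "passing variables" point concrete, here is a toy sketch in Python. It is not how any real RLM implementation works; `fake_llm`, `Agent`, and `count_errors` are all made-up names, and the "model" is just a function that counts a substring. The only thing it shows is that the big input can live in a plain variable and get handed to sub-agents in slices, so the root agent's prompt only ever holds their short answers.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in for a model call: prompt string in, answer string out.
LLM = Callable[[str], str]


def fake_llm(prompt: str) -> str:
    """Toy 'model' (assumption): reports how many times ERROR appears."""
    return str(prompt.count("ERROR"))


@dataclass
class Agent:
    llm: LLM
    context_chars: int = 0  # total characters this agent's model has been shown

    def ask(self, prompt: str) -> str:
        self.context_chars += len(prompt)
        return self.llm(prompt)


def count_errors(log: str, lines_per_chunk: int = 500) -> Tuple[int, int]:
    """Return (total error count, characters that entered the root's context)."""
    root = Agent(fake_llm)

    # The log stays in ordinary program state; it is never put in root's prompt.
    lines = log.split("\n")
    chunks: List[str] = [
        "\n".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]

    # Each sub-agent is called programmatically with one slice as its argument.
    partials = [Agent(fake_llm).ask(chunk) for chunk in chunks]

    # Root only sees the short partial answers. The toy model's reply here is
    # unused; the call just records how much text reached the root.
    root.ask("Sub-agent counts: " + ", ".join(partials))

    total = sum(int(p) for p in partials)
    return total, root.context_chars


log = "\n".join("ERROR disk full" if i % 10 == 0 else "ok" for i in range(5000))
total, root_chars = count_errors(log)
```

In this toy run the root's context holds a few dozen characters while the log is tens of thousands, which is the whole appeal: the parent coordinates without ever reading the raw data.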
Yeah, excited about RLMs as well. I think it will probably be a few months (~6?) before they really come through, though. I don’t think the context rot issue will really be solved until the big labs can run an RL training loop with these new techniques. Notably, we already get a lot of the juice of RLMs by just farming out the ‘research’ phase of a standard Claude Code session to a set of subagents (unless I’m missing something about RLMs, I think these are ~equivalent?)