Apenwarr SWE Simulator
Apenwarr wrote a “SWE Simulator” to understand the effect of different project planning approaches on productivity.
In the first experiment, he measured the effect of PMs changing which features the developers are building for the next milestone.

His big takeaway here:
If you want to know what Tesla does right and most of us do wrong, it’s this: they ship something small, as fast as they can. Then they listen. Then they make a decision. Then they stick to it. And repeat.
They don’t make decisions any better than we do. That’s key. It’s not the quality of the decisions that matters. Well, I mean, all else being equal, higher quality decisions are better. But even if your decisions aren’t optimal, sticking to them, unless you’re completely wrong, usually works better than changing them.
If you only take one thing away from reading this talk, it should be that. Make decisions and stick to them. (Pennarun 2017)
In the second experiment, he measured the effect of the number of features that go into a milestone.

His main insight is that smaller milestones ship earlier and bring improvements to the customer earlier; thus, the business starts accruing value earlier. He calls the resulting measure the “aggregate value delivered”:
So the total value delivered to customers is a function of the number of awesomeness units that have launched, and how long ago they launched. (Pennarun 2017)
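To make that definition concrete, here is a minimal sketch. It assumes that each launched unit accrues value linearly (one unit of value per time step since launch), which is a simplification of mine and not necessarily the formula used in the simulator. The function name `aggregate_value` and the two launch schedules are hypothetical.

```python
def aggregate_value(launches, now):
    """Sum of units * time-since-launch over everything launched by `now`.

    `launches` is a list of (launch_time, units) pairs. Linear accrual is
    an assumption made for illustration only.
    """
    return sum(units * (now - t) for t, units in launches if t <= now)


# Hypothetical scenario: four features, each taking one time step to build.
small_milestones = [(1, 1), (2, 1), (3, 1), (4, 1)]  # ship each one when done
big_milestone = [(4, 4)]                             # ship all four together

for now in (4, 8):
    print(now,
          aggregate_value(small_milestones, now),
          aggregate_value(big_milestone, now))
```

Both schedules have launched the same four units by time 4, but the incremental schedule has already been accruing value from the earlier launches, and the big-milestone schedule never recovers that head start.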
He also points out that work in progress still accumulates bugs over time even though it isn’t released (representing security fixes, migrations, integrations, etc.); if a project is sufficiently large, this leads to the following pathology:
Where we simply never launch at all, because there are too many features with too many integration points and the team just can’t handle it; every time they fix one thing, another thing breaks. (Pennarun 2017)
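The never-launch outcome can be reproduced with a toy model. This is a sketch under my own assumptions, not Pennarun's simulator: the team has a fixed `capacity` per tick, every unit of finished-but-unreleased work generates `bug_rate` units of bug work per tick, and bugs are fixed before any new feature work is done. All names and default values are hypothetical.

```python
def ticks_to_launch(features, capacity=1.0, bug_rate=0.05, max_ticks=1000):
    """Return the tick at which the milestone launches, or None if it never does."""
    done = 0.0  # feature work finished but not yet released
    for tick in range(1, max_ticks + 1):
        # Unreleased work rots: some of this tick's capacity goes to bug fixing.
        bug_work = min(capacity, done * bug_rate)
        # Whatever capacity is left over goes to new feature work.
        done += capacity - bug_work
        if done >= features:
            return tick
    return None  # gave up: bug fixing absorbs the team's whole capacity


for size in (5, 10, 30):
    print(size, ticks_to_launch(size))
```

With these defaults, progress stalls once unreleased work approaches `capacity / bug_rate` (20 units here), so a sufficiently small milestone launches while a 30-feature milestone never does within the tick limit.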