Timewarp attack 600 second grace period

I don’t understand how one of the effects can be ignored. Reference [1] switched from using the Erlang distribution for 2015 blocks to using it for 2016 blocks. That seems to imply the attacker gets a double benefit: one from increasing the timespan by 600 seconds, and another from somehow going back 600 seconds in real-world time, so that he has 2016 * 600 seconds to mine instead of being stuck with 2015 * 600.

[ for time taken to mine the last 2015 blocks ] … IBT is 2016/2014 * 600. This is equivalent to 600 * E(2016 * 600/X) where X~ErlangDistribution(k=2015, λ=1/600). In the case of a miner deliberately reducing timestamps by 600 seconds on the difficulty-retargeting block, we are effectively changing the difficulty multiplier to (2016 / (time taken to mine the last 2016 blocks + 600)), or 600 * E(2016 * 600/(X + 600)) where X~ErlangDistribution(k=2016, λ=1/600), which is effectively targeting an inter-block time of ~599.9999 seconds.
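For anyone who wants to check the two expectations in the quote, here is a rough numerical sketch. It is my own check, not code from reference [1]. It rescales time to units of 600 seconds, so X/600 becomes a Gamma(k, 1) variable and the extra 600 seconds becomes a shift of 1, then integrates with a plain trapezoid rule. The function name `expected_inverse` and the integration settings (`steps`, `width`) are arbitrary choices of mine.

```python
import math

def expected_inverse(k, shift=0.0, steps=20000, width=12.0):
    """Approximate E[1/(T + shift)] for T ~ Gamma(shape=k, scale=1).

    Trapezoid rule over k +/- width standard deviations; the density
    outside that window is negligible for k around 2000.
    """
    sigma = math.sqrt(k)
    lo = max(k - width * sigma, 1e-9)
    hi = k + width * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        t = lo + i * h
        # log of the Gamma(k, 1) density, computed in log space to avoid overflow
        log_pdf = (k - 1) * math.log(t) - t - math.lgamma(k)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(log_pdf) / (t + shift)
    return total * h

# Honest case from the quote: X ~ Erlang(k=2015), no shift.
# Closed form is E[1/T] = 1/(k-1), so this should equal 600 * 2016/2014.
honest_ibt = 600 * 2016 * expected_inverse(2015)

# Timestamp-reduced case from the quote: X ~ Erlang(k=2016), plus 600 s (shift of 1 unit).
warped_ibt = 600 * 2016 * expected_inverse(2016, shift=1.0)

print(honest_ibt, warped_ibt)
```

If the integration is right, the first number matches the closed form 2016/2014 * 600 and the second comes out just under 600, in line with the quoted figure. It does not settle whether moving from k=2015 to k=2016 is justified, which is what I am asking about.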

As for whether this means the “difficulty drop can accumulate”: yes. For the last block in the sequence he can use real time for the final timestamp, so the next period will have a lower difficulty, but the period after that would be back to normal. With a 2-hour limit he would have to do this for 168 periods to drop difficulty to, I think, 1/2.
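The 168 figure comes from simple arithmetic, sketched below under my own reading: the 2 hours gained per period add up linearly, and the accumulated offset is then counted in a single retarget timespan. The constant names are mine, and this ignores compounding and any other consensus limits.

```python
TARGET_SPACING = 600                                    # seconds per block
BLOCKS_PER_PERIOD = 2016
NOMINAL_TIMESPAN = TARGET_SPACING * BLOCKS_PER_PERIOD   # one retarget period, in seconds
LIMIT = 2 * 60 * 60                                     # assumed 2-hour limit, in seconds

# Number of periods until the accumulated offset equals one full nominal period.
periods_needed = NOMINAL_TIMESPAN // LIMIT
accumulated = periods_needed * LIMIT

# If that whole offset is added to one measured timespan, the timespan doubles
# and the difficulty multiplier (nominal / measured) halves.
difficulty_multiplier = NOMINAL_TIMESPAN / (NOMINAL_TIMESPAN + accumulated)

print(periods_needed, difficulty_multiplier)
```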

I think there’s a lot more risk in making it strict. I can see only the smallest benefit in making it strict, but I can’t estimate how much the risk increases.