Timewarp attack 600 second grace period

@AntoineP exponential growth is always a problem if it goes on for long enough. Is there some upper bound to the speedup or can blocks come in every second if the 51% attack persists?

Wait, this is not a "bug in Bitcoin Core": this is a fix to the PR which changed block validation rules without adapting the block creation logic…

I’m not sure what exponential growth you are referring to. Are you asking whether, in theory, it’s possible to drastically increase the block rate with any grace period value if the attack persists long enough? If so, then yes.

In theory the difficulty can always be brought down to 1 if you can constantly claim that what took you 2 weeks actually took you 2 weeks + x seconds. See the numbers I shared above. But I’m not sure how relevant it is: for instance, with a 600-second grace period it would take hundreds of years to bring the difficulty down to 1.
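To make the timescale concrete, one simple model (my own sketch, not necessarily the exact calculation referred to above): if an attacker can inflate each measured retarget timespan by some slack of S seconds, the difficulty shrinks by a factor of roughly P/(P+S) per period (with P = 2016 \cdot 600), so going from a starting difficulty D_0 down to 1 takes on the order of

n \approx \frac{\ln D_0}{\ln(1 + S/P)}

retarget periods of two weeks each. The exact figure depends on the starting difficulty and on how much slack per period the assumed attack pattern actually yields.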

Which is a bug in Bitcoin Core. It was caught before release. The second bug wasn’t; it’s fixed by 31600. It just illustrates how easy it is to screw things up when the grace period is only 600 seconds.

Update: it was a known omission before the original PR was merged, so maybe it shouldn’t count as a bug: Move maximum timewarp attack threshold back to 600s from 7200s by TheBlueMatt · Pull Request #30647 · bitcoin/bitcoin · GitHub

Exactly. I’m not really worried about a one-off 10% increase in block speed. But compounded over time it’s more worrying.

Hundreds of years is enough time to come up with a plan, but serious problems could happen long before difficulty reaches 1. E.g. deep reorgs would be easier to execute.

If we take a difficulty reduction of 10x as the (arbitrary) definition of "bad", and if we assume a 10% difficulty decrease every ~10 retarget periods, it takes about 240 retarget periods to reach this "bad" situation. That’s several years, probably enough to discuss and deploy a soft-fork to make it stricter. But not super comfortable either.

Whereas with a grace period of 600, reaching "bad" takes 20 times longer, which seems very safe.
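Spelling out the arithmetic behind these figures (my own check of the numbers above): a 10x reduction at 10% per step needs \log(10)/\log(1/0.9) \approx 22 steps; at one step per ~10 retarget periods that is on the order of 220 to 240 retarget periods, i.e. roughly 8 to 9 years at two weeks per period.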

So this may be a good argument for keeping the number at 600 and insisting that pools either use the min_time field provided by Bitcoin Core, or implement their own code for the minimum time that takes both MTP and the new timewarp rule into account.

Not if you want to win a 51% race. After claiming you took 2 wks + x, you’re going to solve the next 2016 blocks in 2 wks - x if you maintain your hashrate. If you don’t, you’re not going to win chain work. To claim it took 2 wks + x in that round, you have to add 2x to your 2 wks - x real time. You only have one x to manipulate without penalty; the other x you have to add to real time. As those accumulate past 7200 seconds, no nodes will accept your blocks.

Yes, that’s why I started with "in theory". In practice I think any of the discussed values for the grace period make extreme drops in difficulty completely unlikely.

There’s no theory that allows an increase in the block production rate. A difficulty drop won’t accumulate for more than 1 round unless hashrate drops, and you want it to drop in that case. Difficulty can’t drop more than 1.2%, and in only 1 round, if the reverse limit is 7200 and the attacker sets the last block to 7200 in the future. There’s no average decrease in difficulty because it will cause an increase in difficulty in the next round.
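A plausible reading of the 1.2% figure (my interpretation; the post doesn’t spell it out): the attacker pulls the period’s first timestamp back by the 7200 s limit and pushes its last timestamp 7200 s into the future, inflating the measured timespan by 2 \cdot 7200 / (2016 \cdot 600) \approx 1.2\%.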

If the difficulty drop doesn’t accumulate then I flip back to my preference for a grace period high enough to avoid bugs, and which maintains the rule that the minimum nTime of a block is given by MTP + 1, without the need for a clock.

Update 2025-01-09: you don’t need a clock to honor the timewarp rule.

@zawy Imagine the proposed rule: the first block in a period (i.e., one where n \equiv 0 \pmod{2016}) must have a timestamp no more than 600s before its immediate predecessor (the last block of the previous period).

Call this constant G = 600 (grace period). Also introduce P = 2016 \cdot 600 (the period, 2 weeks).

For simplicity, let’s replace the "timestamp must be strictly larger than MTP" rule with a "timestamp must not be below MTP" rule (i.e., we allow timestamp = MTP). This doesn’t materially affect the attack, but means we don’t need to increment by 1s every 6 blocks.

The following sequence of timestamps for periods is valid, starting at time 0:

  • Period 0: [0, 0, 0, …, 0, 0, P].
  • Period 1: [P-G, P-G, …, P-G, 2P-G].
  • Period 2: [2P-2G, …, 3P-2G].
  • Period 3: [3P-3G, …, 4P-3G],
  • …
  • Period k: [kP-kG, …, (1+k)P-kG].

Within each period, the difference between the first and last block timestamp is exactly P, so the difficulty will not adjust. However, the net timestamp increase per period is only P-G, i.e., 10 minutes less than 2 weeks.

So this allows an attacker to permanently keep the block rate higher than intended, without incurring a difficulty increase cost.
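A small script makes these two properties of the sequence easy to check (an illustrative sketch of my own, not code from the thread; `period_timestamps` is just a helper I introduce here): within each period the first-to-last timestamp difference is exactly P, so the difficulty never adjusts, while the timestamps only advance by P - G per period.

```python
# Sketch: reproduce the timestamp sequence above and check its two properties.
G = 600            # grace period (seconds)
P = 2016 * 600     # target timespan per retarget period (two weeks)

def period_timestamps(k):
    """Timestamps of the 2016 blocks in period k, per the scheme above."""
    first = k * P - k * G            # period 0 starts at time 0
    last = (k + 1) * P - k * G
    return [first] * 2015 + [last]

prev_last = None
for k in range(5):
    ts = period_timestamps(k)
    assert ts[-1] - ts[0] == P                 # measured timespan: no difficulty change
    if prev_last is not None:
        assert ts[0] == prev_last - G          # first block goes back by exactly G
        assert ts[-1] - prev_last == P - G     # net timestamp advance per period
    prev_last = ts[-1]
    print(k, ts[0], ts[-1])
```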

The reason for the 600s grace period is that it compensates the effect of the off-by-one in the difficulty calculation (it only looks at how long 2015 blocks take, not 2016), which would otherwise result in (under constant hashrate) periods of 2 weeks plus 10 minutes. So with the proposed rule, a timewarp attacker cannot cause a steady constant-difficulty block production rate of more than 2016 per two weeks.
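To spell out the off-by-one for readers who haven’t seen it (standard arithmetic, not specific to this thread): the retarget divides by the target timespan 2016 \cdot 600 but measures the span between the first and last timestamp of the window, which covers only 2015 block intervals. Under constant hashrate the steady-state block interval therefore settles at

600 \cdot \frac{2016}{2015} \approx 600.3 \text{ s},

i.e. roughly 10 extra minutes per 2016 blocks, which is what the 600 s allowance compensates for.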

There is another effect due to the asymmetry of the Erlang distribution that results in another 10-minute lengthening under constant hashrate. I forget why this isn’t incorporated for compensation in the proposed GCC rule.

EDIT: ah, it’s explained in footnote [1] in bips/bip-XXXX.mediawiki at 7f9670b643b7c943a0cc6d2197d3eabe661050c2 · TheBlueMatt/bips · GitHub. The attacker can ignore the first effect but not the second.


I don’t understand how one of the effects can be ignored. Reference [1] switched from saying we’re looking at the Erlang of 2015 blocks to 2016 blocks. It seems to imply the attacker received a double benefit, i.e. one by increasing timespan by 600, but then another by somehow going back in real-world time by 600 seconds to get 2016 * 600 seconds to mine instead of being stuck with 2015 * 600.

[ for time taken to mine the last 2015 blocks ] … IBT is 2016/2014 * 600. This is equivalent to 600 * E(2016 * 600/X) where X~ErlangDistribution(k=2015, λ=1/600). In the case of a miner deliberately reducing timestamps by 600 seconds on the difficulty-retargeting block, we are effectively changing the difficulty multiplier to (2016 / (time taken to mine the last 2016 blocks + 600)), or 600 * E(2016 * 600/(X + 600)) where X~ErlangDistribution(k=2016, λ=1/600), which is effectively targeting an inter-block time of ~599.9999 seconds.
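For reference, the piece of Erlang machinery the footnote seems to lean on (my own derivation, added for illustration): for X \sim \operatorname{Erlang}(k, \lambda) we have \operatorname{E}[1/X] = \lambda/(k-1). Plugging in k=2015, \lambda=1/600 gives

600 \cdot \operatorname{E}\!\left[\frac{2016 \cdot 600}{X}\right] = 600 \cdot 2016 \cdot 600 \cdot \frac{1/600}{2014} = 600 \cdot \frac{2016}{2014},

which is where the 2016/2014 inter-block time comes from. For the modified rule with X + 600 in the denominator and k = 2016, the analogous calculation lands very close to 600 seconds, matching the ~599.9999 figure quoted.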

Concerning whether this means "difficulty drop can accumulate": yes, for the last block in this sequence he can use real time for the last timestamp, then the next period will have a lower difficulty, but then the next period would be back to normal. He would have to do 168 periods with a 2-hour limit to drop the difficulty to 1/2, I think.

I think there’s a lot more risk in making it strict. I can see only the smallest of benefits, but I can’t estimate the risk that’s being increased.

Just so we’re talking about the same thing, my points are:

  • I believe a majority-hashrate attacker can use the timewarp attack to permanently increase the block rate beyond 1 per 600s on average, while keeping the hashrate constant, as long as they keep up the timewarping scheme. They can’t keep increasing exponentially or anything like that.
  • With a grace period limiting how much time can go backwards between the last block of a period and the first block of the next period, we limit how large that increased rate is. If the grace period is G seconds, then the attacker can maintain a block rate of roughly one per \frac{2017 \cdot 600 - G}{2016} seconds, at constant hashrate (a quick numeric check of this formula follows the list below). I don’t remember why it’s 2017 and not 2018 there (as would be expected if both the off-by-one and Erlang asymmetry are relevant), but I will try to simulate it.
  • I have no opinion about when/where to enforce timestamp rules beyond bounding the amount of time the timestamp can go backwards between the last block of a period and the first block of the next period.
  • I understand and agree with your example.
  • Right, I don’t see why the BIP thinks it’s 2017 instead of 2018. I bolded the unexplained jump in logic in the BIP that I didn’t follow.
  • You agree that making the limit 600 seconds is at least as safe as 7200?
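As a quick numeric check of the \frac{2017 \cdot 600 - G}{2016} formula above (my own arithmetic, added for illustration):

\frac{2017 \cdot 600 - 600}{2016} = 600.00 \text{ s}, \qquad \frac{2017 \cdot 600 - 7200}{2016} \approx 596.73 \text{ s},

which lines up with the simulated 599.9997 s and 596.7211 s figures reported further down the thread.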

Rather than 2017 / 2018, we’re measuring a timespan across 2015 blocks, so via Erlang we expected 2014 blocks. Therefore the adjustment is supposed to be "2014_blocks / 2015_expected_solvetimes". But the code calculates "2016_blocks / 2015_times". Allowing +600 in the denominator makes it 2016 / 2016. But we should add 1 to the numerator if we’re going to add one to the denominator: (2014+1) / (2015 + 1) = 2015 / 2016. So 2016 / 2016 is too large.

For your example of subtracting 600 at the beginning and end, the difficulty adjustment is too much, as always: 2016 / 2015 instead of "staying constant" with 2014 / 2015, i.e. it’s taking 600.6 seconds per block (2 weeks + 20 minutes). So G = 3 * 600 in order to solve in 2 weeks minus 10 minutes?

Let’s say c_i is the actual time at which block i is found, and t_i is its timestamp.

If the attacker (who is assumed to have 100% hashrate) follows this strategy:

  • The first block in every period p uses as timestamp the previous block’s timestamp minus the grace period: t_{2016p} = t_{2016p-1} - G (the minimum legal time according to the proposed rule).
  • The 2014 middle blocks use the minimum legal time, which doesn’t really matter as long as it’s low enough: t_{2016p+k} = 0 for 0 < k < 2015.
  • The final block in the period uses the current time t_{2016p+2015} = c_{2016p+2015}.

Then the observed duration of the 2015 blocks in period p (relevant for difficulty adjustment) is t_{2016p+2015} - t_{2016p} = c_{2016p+2015} - c_{2016p-1} + G, i.e. the time it took to mine 2016 blocks plus G.

The difficulty multiplier m will thus be m = \frac{P}{X + G}, where X is Erlang distributed with k=2016 and \lambda=\operatorname{hashrate} \cdot \frac{\operatorname{target}}{2^{256}}.

In a simulation with G=600 I get an effective average block interval under constant hashrate of 599.9997… seconds. With G=7200 I get 596.7211… seconds.
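For readers who want to reproduce numbers in this ballpark, here is a rough simulation sketch (my own, not Pieter’s actual code; it assumes the simplified retarget m = P/(X+G) with Bitcoin’s usual 4x clamping, constant hashrate, and numpy, with `simulate` being a helper name I made up):

```python
# Rough sketch: long-run block interval under the timewarp strategy above,
# assuming the difficulty multiplier is m = P / (X + G), clamped to [1/4, 4],
# where X is the real time the attacker took to mine the period's 2016 blocks.
import numpy as np

def simulate(G, periods=100_000, seed=1):
    rng = np.random.default_rng(seed)
    P = 2016 * 600
    t = 600.0                                 # expected seconds per block at current difficulty
    total_time, total_blocks = 0.0, 0
    for _ in range(periods):
        X = rng.gamma(shape=2016, scale=t)    # real time to mine the period's 2016 blocks (Erlang)
        total_time += X
        total_blocks += 2016
        m = min(max(P / (X + G), 0.25), 4.0)  # difficulty multiplier, clamped like Bitcoin's 4x rule
        t = t * m                             # at constant hashrate, block time scales with difficulty
    return total_time / total_blocks          # long-run average real block interval

for G in (600, 7200):
    print(G, simulate(G))
```

With these assumptions the output comes out close to 600 s for G=600 and close to 596.7 s for G=7200, in line with the figures quoted above.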


OK, now I see. It’s not that an adjustment for both the Erlang effect and the "2015 hole" isn’t needed, it’s that 600 seconds before the previous block isn’t a 600-second lie, it’s a 1200-second lie because we expected a timestamp 600 seconds after it.


That’s a great way to put it.


See this comment: Great Consensus Cleanup Revival - #66 by AntoineP

Although @AntoineP now agrees with increasing the grace period, for reasons explained in the above comment, it might still be useful to investigate the following:

Does anyone have a log of when their node saw blocks arrive over e.g. the past 10 years? Perhaps we can determine, or rule out, the presence of such a bug in the wild.

I don’t know how though.

One problem with such analysis is that any block violating the MTP rule would not get propagated, so you wouldn’t observe it. Even compact block relay checks the header: bips/bip-0152.mediawiki at 5333e5e9514aa9f92810cfbde830da79c44051bf · bitcoin/bips · GitHub

Another problem is that you don’t know the clock of the (pool) node that generated the block template.

A bug that ignores the template timestamp can be present in the pool software, the ASIC (stock or alternative) firmware or a proxy run by the miner.

If the bug is in pool software, we might be able to observe it live by studying live stratum jobs. This is easy on testnet4 today because MTP is ahead of wall clock time there. But not on mainnet.

Perhaps long term logs of such jobs exist, though again I’m not sure what to look for.

It seems a bit less likely to me that ASIC firmware ignores the timestamp provided in its stratum job. My own experience with old Linux machines is that NTP doesn’t always work out of the box, so someone would have quickly noticed such a bug the hard way.

I have no idea how stratum proxies work. Those might exist at the scale of a small mining farm that rarely finds an actual block, so it seems undetectable.

Systemic use of NTP breaks security.

I’m still trying to understand whether the difficulty decrease compounds. Take the following (same MTP simplification as above):

  • Period 0: [0, 0, 0, …, 0, 0, P].
  • Period 1: [P-G, P-G, …, P-G, 2P].
  • Period 2: [2P-G, …, 3P].
  • Period 3: [3P-G, …, 4P],
  • …
  • Period k: [kP-G, …, (1+k)P].

So each period is stretched by G, causing the difficulty to drop every time. For G = 2 hours, this would be 0.6% per retarget period. That translates to about 15% per year.
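Checking the arithmetic (my own numbers): each period’s measured timespan becomes P + G, so the per-period difficulty factor is

\frac{P}{P + G} = \frac{1209600}{1216800} \approx 0.994,

i.e. about a 0.6% drop per retarget period, and over the ~26 periods in a year 0.994^{26} \approx 0.86, i.e. roughly a 15% drop per year, consistent with the figure above.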


Note that any difficulty decrease accumulation can be reset by an honest miner finding the last block of any retarget period.


That’s fair, I think this may work, but even if it does, I don’t believe this is a problem.

Given that the end-of-window time (which is bounded by the current time) goes up by P per window, it means the real-time block production rate in your scheme is exactly 2016 blocks per P, exactly the intended rate. So, from the perspective of validating nodes, there is no resource consumption attack.

And beyond that, indeed, this allows miners to reduce the difficulty while keeping the block rate constant. But a colluding group of miners is always able to do that: they can just agree to all reduce their hashrate, for any difficulty reduction they’d like. But, all that achieves is making it easier/cheaper for a competing miner who doesn’t follow the scheme to take over.


What Pieter said. As soon as you have a restriction on the timestamp of the first block in a period, miners can only "carry over" a fixed block rate increase. There is no compounding.

Right. Setting the timestamps exactly isn’t something a computationally-bounded miner can do, because block finding is still a Poisson process.

If you try to adapt Sjors’ scheme to a real algorithm for deciding timestamps, I think you exactly get the scheme I posted above. And we simulated that: it has no compounding effect because the attackers’ block production rate is affected by the difficulty adjustment they caused, providing negative feedback. But, again, even if there was a compounding effect it would not be a concern, as mentioned above, because it’s at worst equivalent to attackers just choosing to reduce their hashrate.
