Correcting the error in getnetworkhashrateps

TL;DR: I think @sipa used Beta where he should have used Gamma.

It’s suspicious that the variance denominator jumps from k-2 to k when I “increase k by 1”.

That is, we have

W = (\text{#S}-1) \times \frac{2^{256}}{H}

and I say “let’s increase #S by 1” by pretending T was a hash. It’s just a little higher than H by definition. I believe this is legitimate because I believe the mean interval from H to T is equal to the mean interval between the elements in #S. So I’ll add T as an extra “hash” to the set S to make S’.

W = (\text{#S}'-1) \times \frac{2^{256}}{T} = \text{#S} \times \frac{2^{256}}{T}
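As a numerical sanity check on that mean-interval assumption, the two W expressions can be compared by simulation. This sketch is mine, not from the thread; it normalizes 2^{256} to 1.0, and the hash count, target, and trial count are arbitrary:

```python
import random

random.seed(1)

n = 2000        # true number of hashes per trial (the quantity W estimates)
t = 0.025       # difficulty target, with 2^256 normalized to 1.0
trials = 3000

est_from_H, est_from_T = [], []
for _ in range(trials):
    # S = the winning hashes, i.e. those below the target t
    S = [h for h in (random.random() for _ in range(n)) if h < t]
    if len(S) < 2:
        continue  # need at least 2 winners for the (#S - 1)/H form
    est_from_H.append((len(S) - 1) / max(S))  # W = (#S - 1) * 2^256 / H
    est_from_T.append(len(S) / t)             # W = #S * 2^256 / T
mean_from_H = sum(est_from_H) / len(est_from_H)
mean_from_T = sum(est_from_T) / len(est_from_T)
```

Both means should land near n, which is consistent with treating T as one extra “hash”.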

If difficulty is a constant at T, then b = #S’-1 and the W equation for the Beta distribution is equal to the Poisson’s. But the variance denominators are supposedly different, i.e. 1/b versus 1/(#S’ - 2).

So I wonder if we’re back to the original point of this thread, i.e. an (N-1)/N correction is needed when we try to apply a distribution that counts events in a given amount of time in situations where we make the mistake of starting with a given number of events. Or there could be a vice versa mistake. In this case, the Poisson counts events in a fixed amount of time and the Beta distribution counts time for a fixed number of blocks, so there’s a probable mismatch.

In asking Grok about this, it appears the Erlang (or Gamma) should be used in place of the Poisson if we look back a fixed number of blocks, or the Binomial used in place of the Beta if we’re looking back a fixed amount of time. (Poisson is for fixed time and Beta is for fixed blocks.) I believe we’re doing the latter, and if I knew how to apply the Binomial, I would get the “lowest hashes” method to have n^2 / (#S’-1) variance, same as Poisson’s n^2 / b. In other words, the Nth lowest hash is precisely like a difficulty target for N-1 “blocks found”, so that we can just use Poisson in both cases. This is a given if my supposition above about the mean interval is correct.

If the difficulty isn’t constant, it still looks correct. In that situation b > #S’ so that the W from b has a lower variance while still using the same Poisson equations for W and Var for both b and #S’.

To be clear, my claim of \mathrm{Var}[\frac{k-1}{H_{(k)}}] \approx \frac{n^2}{k-2} is for the case where k is fixed, before the experiment is conducted. I don’t know if it remains when k becomes dependent on the experiment (as is the case in the “use largest-hash block among those beating the lowest target” approach).
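The fixed-k case is easy to check numerically without looping over individual hashes: the k-th smallest of n iid Uniform(0,1) draws has a Beta(k, n-k+1) distribution, so H_{(k)} (with 2^{256} normalized to 1.0) can be sampled directly. A sketch, with arbitrary n, k, and trial count:

```python
import random

random.seed(2)

n = 10000       # hashes performed per trial, with 2^256 normalized to 1.0
k = 12          # fixed before the experiment: we use the k-th lowest hash
trials = 50000

# k-th order statistic of n Uniform(0,1) draws ~ Beta(k, n - k + 1)
ests = [(k - 1) / random.betavariate(k, n - k + 1) for _ in range(trials)]

mean_est = sum(ests) / trials
var_est = sum((e - mean_est) ** 2 for e in ests) / (trials - 1)
predicted_var = n * n / (k - 2)   # the claimed Var[(k-1)/H_(k)] ~ n^2/(k-2)
```

In my runs the mean comes out at n and the variance lands close to n^2/(k-2), supporting the fixed-k claim.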

Correction: I meant to say use Binomial (not Gamma) in place of Beta. Yes, Beta is the distribution of a normalized time for a fixed number of events, and Poisson is the distribution of events for a fixed time, so there’s a probable mismatch.

Since Beta is for fixed blocks and Poisson is for fixed time, I would guess we have to make the (N-1)/N correction like this to your Poisson equations to convert it to fixed blocks:

E[bw] = n \frac{b-1}{b}

Var[bw] = \frac{n^2}{b} \frac{b}{b-1}

In other words, if Beta is used (fixed blocks), then make the above correction to Poisson, which should be the same as using Erlang (or Gamma). If Poisson is used (fixed time interval), use Binomial in place of Beta.

Experimentally, given b blocks seen, a given difficulty, and a uniformly random number of hashes, the Poisson equation seems to need a (b+1)/b correction, which is in the opposite direction from what I had guessed. It’s accurate down to b = 1. This makes me wonder whether my OP claim that the hashrate needs a correction is right. In the hashrate case, we’re specifying a number of blocks to look at, and the (N-1)/N correction is, in my intuition, based on converting that to a fixed amount of time, as in this case. But this shows that if the hashes in that fixed time are uniformly random, an (N+1)/N correction is also needed, giving a net (N-1)(N+1)/N^2 correction, which is nearly none, and exactly no net correction if our (N-1)/N is really supposed to be N/(N+1). This makes more sense to my intuitive interpretation.

This correction is if work and therefore λ are uniformly random, which may not be an assumption we should use when adjusting difficulty or checking hashrate.

Output:

target/max_target = 0.0100
b = 5, runs = 100000
runs that had b blocks = 6156
Mean work when b blocks were seen = 600
Poisson_work = b * max_target/target = 500
Poisson_work * (b+1)/b = 600
Stdev work / mean_work when b blocks were seen = 0.409 
StdDev 1/SQRT(b+1)= 0.408

Code:

import random
import statistics

# Mean work for a given difficulty and given number of blocks found if work is uniformly random in a fixed amount of time. 

runs = 100000 # number of times to do the experiment  
work_if_b_blocks_found = []
b = 5 # a given number of found blocks
max_target = pow(2,256)
target = 0.01*max_target # difficulty target
max_work = int((1+5/pow(b,0.5))*b*max_target/target) # get 5 StdDev's more than estimated mean hashes needed
Poisson_work = b * max_target / target
for _ in range(runs):
    work = int(random.random()*max_work)  # perform random # of hashes
    if work < b: continue
    hash_list = [random.random()*max_target for _ in range(work)] 
    wins_list = [x for x in hash_list if x < target] # get winning hashes
    if len(wins_list) == b:
        work_if_b_blocks_found.append(work)
print(f"\ntarget/max_target = {target/max_target:,.4f}")
print("b = " + str(b) + ", runs = " + str(runs) )
print("runs that had b blocks = " + str(len(work_if_b_blocks_found)) )
print("Mean work when b blocks were seen = " + str(int(statistics.mean(work_if_b_blocks_found))))
print("Poisson_work = b * max_target/target = " + str(int(Poisson_work)))
print("Poisson_work * (b+1)/b = " + str(int(Poisson_work*(b+1)/b)))
print(f"Stdev work / mean_work when b blocks were seen = {statistics.stdev(work_if_b_blocks_found)/statistics.mean(work_if_b_blocks_found):.3f} ")
print(f"StdDev 1/SQRT(b+1)= {1/pow(b+1,0.5):,.3f}")

Why do you only look at cases where a specific number (b = 5) of blocks was found during the fixed interval? That must create a distortion.

If you fix the time window, the number of blocks found within it will be variable, and estimations should account for all of the possibilities observed.

In my simulation, even with a variable hashrate during the window, and variable difficulties during the window, the estimator sum(block.work for block in window) / window.duration gives an unbiased estimate for the (average) hashrate during the window, without any correction factors.

I fixed b because you fixed k and I wanted to see if the equations for b and k-1 were the same in the limit, i.e. if the kth lowest hash is exactly like a target for b. Also, when someone runs getnetworkhashps, they’re selecting a number b of blocks. Difficulty adjustment also uses a fixed number of blocks.

My simulation doesn’t specify a timespan, so I guess I could be wrong in saying it’s for a fixed amount of time, and that it thereby doesn’t say or reveal anything about hashrate. But if I had said it was for a certain timespan before I ran it, I don’t see what I would need to change.

I fixed b because you fixed k and I wanted to see if the equations for b and k-1 were the same in the limit, i.e. if the kth lowest hash is exactly like a target for b

Fair enough. However, I’ve also come to the conclusion that using the k’th lowest hash for estimating hashrate is inferior to using (n-1)/n * sum(work)/sum(time), so while interesting, I don’t think it’s useful for actually measuring hashrate.

Also, when someone runs getnetworkhashps, they’re selecting a number b of blocks.

Sure, but then you’re talking about a random process where the number of blocks is fixed and the time isn’t. If you fix the time, the number of blocks isn’t fixed anymore, and then indeed it isn’t directly applicable in the current getnetworkhashps RPC. But intuitively, because in that case sum(work in window)/(window length) is such a simple and unbiased estimator, it seems that “fixed time window, variable number of blocks” is actually the better way of measuring hashrate if, say, you want to know the hashrate in the last week (as opposed to first checking how many blocks there are in the window and then running a fixed-number-of-blocks estimator on that, which - whether you want it or not - actually brings you back to the fixed-blocks-variable-time case).

To simulate:

  • Pick a scenario: a fixed amount of time, which you subdivide into a number of segments, each having a fixed hashrate and a fixed difficulty. I think that roughly matches reality, allowing the hashrate and difficulty to change as a function of time. Further down I’ll relax this to allow them to be functions of the previous blocks’ timestamps too.
  • For each segment i, sample \mathrm{Poisson}(\mathrm{duration}_i \times \mathrm{hashrate}_i / \mathrm{difficulty}_i), representing the number of blocks b_i found in that segment.
  • The hashrate estimate is sum(work)/(window duration), so in this simulation that corresponds to (\sum_{i}b_i \times \mathrm{difficulty}_i) / \sum_{i}\mathrm{duration}_i.
  • The real average hashrate through the window is (\sum_{i} \mathrm{duration}_i \times \mathrm{hashrate}_i) / \sum_{i} \mathrm{duration}_i

My finding is that the expected value of the estimate exactly matches the real average hashrate, without any correction factor to make it unbiased. I also think this is easy to prove: the expected value of each (b_i \times \mathrm{difficulty}_i) is (\mathrm{duration}_i \times \mathrm{hashrate}_i), and the expectation of a sum is the sum of the expectations. Lastly, dividing by \sum_{i} \mathrm{duration}_i is just dividing the expectation by a constant (it’s this last step that is much harder in the variable-time case, because it’s not a constant anymore then).
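The simulation described by the steps above can be sketched in a few lines. The segment values below are arbitrary examples, and the Poisson draw uses Knuth’s multiplication method (fine for small \lambda):

```python
import math
import random

random.seed(3)

# (duration, hashrate, difficulty) per segment; arbitrary example values
segments = [(10.0, 100.0, 200.0), (20.0, 300.0, 500.0), (5.0, 50.0, 100.0)]
total_dur = sum(d for d, _, _ in segments)
true_hashrate = sum(d * hr for d, hr, _ in segments) / total_dur

def sample_poisson(lam):
    # Knuth's method: multiply uniforms until the product drops below e^-lam
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

trials = 20000
total_est = 0.0
for _ in range(trials):
    work = 0.0
    for dur, hr, diff in segments:
        b_i = sample_poisson(dur * hr / diff)  # blocks found in this segment
        work += b_i * diff                     # each block represents `difficulty` expected work
    total_est += work / total_dur              # the estimator: sum(work) / (window duration)
mean_est = total_est / trials
```

The averaged estimate matches true_hashrate with no correction factor, as claimed.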

We can generalize this to making the difficulty a function of the prior blocks’ timestamps without losing the result. First, observe that we can arbitrarily split each segment into multiple smaller segments with all the same hashrate and difficulty. In the limit, we can think of there being an infinite number of tiny segments, which can each have their own hashrate and difficulty, i.e., this approaches making hashrate and difficulty a function of time.

Secondly, observe that the difficulty just does not matter. It affects the number b_i of blocks within a segment, but is cancelled out in the expression of the expected value of \sum_{i} b_i \times \mathrm{difficulty}_i. So the \mathrm{difficulty}_i can be made a function of the prior b_j for j < i.

Combining these, we can treat the difficulty as an arbitrary function of the previous blocks’ timestamps (which are encoded in the b_i values, if the segments become infinitesimal), changing every time N blocks are found (well, infinitesimally after that block is found, during the next segment).

Making the hashrate a function of previous blocks’ timestamps is a bit more complicated, but the result still holds. This does not cancel out in the expression for the expectation of b_i, and thus these are no longer independent. However, since the expectation of a sum is still the sum of the expectations, even when the terms being added are not independent, this does not affect the result.
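A sketch of this generalized case (all numbers illustrative): time is diced into small steps, the hashrate is a function of time, and the difficulty is recomputed from the number of blocks found so far, so it depends only on prior history. Treating each small step as at-most-one-block keeps the expected work contribution of step i equal to dt \times \mathrm{hashrate}_i exactly:

```python
import math
import random

random.seed(5)

T = 100.0                 # window duration
dt = 0.05                 # small time step
steps = int(T / dt)
trials = 2000

# hashrate(t) = 100 + 50*sin(2*pi*t/T) averages to exactly 100 over the window
true_avg_hashrate = 100.0

total_est = 0.0
for _ in range(trials):
    blocks = 0
    work = 0.0
    for i in range(steps):
        t = i * dt
        hashrate = 100.0 + 50.0 * math.sin(2 * math.pi * t / T)
        diff = 100.0 * (1.0 + blocks / 50.0)        # difficulty depends only on blocks already found
        if random.random() < dt * hashrate / diff:  # at most one block per small step
            blocks += 1
            work += diff
    total_est += work / T
mean_est = total_est / trials
```

Even with the difficulty ratcheting up as blocks arrive, the mean of sum(work)/T stays on the true average hashrate.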

So, I conclude that sum(work in window) / (window duration) is an unbiased estimator for the hashrate during a fixed window, even when the hashrate and difficulty can change as a function of both time and prior blocks’ timestamps.

I had mentioned my preference for sum(work in window) / (window duration) over a given time window. It’s good, interesting, and useful to know it’s correct even for difficulty and hashrate changes. Maybe an update to getnetworkhashps could include it (by choosing a time range in addition to a block range) as an option.

I believe this is the best-possible estimate of current hashrate for an arbitrarily-small fixed time period t when given only the lowest hash seen in t.

L = \text{lowest hash seen in time t}

W = \text{work in t}

\lambda = \frac{W}{2^{256}}

\text{exponential CDF}(L) = 1 - e^{-\lambda L}

E = \text{error signal} = \text{CDF}(L) - 0.5

The error signal is the probability that the prior estimate of W was wrong. 0.5 is the median observation if the prior estimate was correct, so there would be no error. Use the error signal in the EMA equation.

h = \text{height of t time segments}

W_{h} = W_{h-1} \cdot e^{-\frac{E}{N}}

\text{Stdev} \approx \frac{W}{\sqrt{2N}}

The W and L in my \lambda are at h-1.

N = “mean lifetime” of the EMA estimate, in units of t. You choose N to get your desired stability / slowness of the estimate. Divide W by t to get the hashrate.

To simplify the equation, use e^{-x} \approx 1-x for small x = \frac{E}{N}:

W_{h} = W_{h-1} \cdot ( 1 + \frac{e^{-L \cdot W_{h-1} / 2^{256}}}{N} - \frac{1}{2N})

The median L is expected to be \ln(2) \cdot \frac{2^{256}}{W_{h-1}}, which would give no correction. The smallest-possible L makes the largest-possible correction:

W_{h} = W_{h-1} \cdot ( 1 + \frac{1}{2N})

A large L can have a large correction, but an accidentally-large L isn’t possible the way an accidentally-small L is. This is a spot check on my math and the legitimacy of the idea. The idea comes from my search for the mathematically-perfect difficulty algorithm, which similarly uses the exponential CDF of solvetimes to adjust difficulty every block; experiments have shown it is the best-known (fastest response time to changes in hashrate with the least variation).
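Here is a sketch of the tracker that exercises the update rule; the constants are illustrative, 2^{256} is normalized to 1.0, and the lowest hash in a segment is modeled as Exponential with rate equal to the true work (a good approximation when the work per segment is large):

```python
import math
import random

random.seed(4)

true_W = 1000.0   # true work per time segment t (hashes, with 2^256 -> 1.0)
N = 20            # EMA "mean lifetime" in units of t
steps = 3000

W_est = 500.0     # deliberately wrong starting estimate
history = []
for _ in range(steps):
    # lowest of ~true_W uniform hashes is approximately Exponential(rate = true_W)
    L = random.expovariate(true_W)
    E = (1.0 - math.exp(-W_est * L)) - 0.5   # error signal: CDF(L) - 0.5
    W_est *= math.exp(-E / N)                # EMA update
    history.append(W_est)

settled = history[steps // 2:]               # discard the convergence transient
mean_settled = sum(settled) / len(settled)
var = sum((w - mean_settled) ** 2 for w in settled) / len(settled)
rel_stdev = var ** 0.5 / mean_settled
```

In my runs the estimate converges from the wrong start and settles around true_W; the observed relative spread can then be compared against the claimed \mathrm{Stdev} \approx W/\sqrt{2N}.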

hashrate = sum(work)/(time duration) from h - N to h gives a better estimate of the hashrate at h - N/2. The EMA needs a starting W. If it starts at h - N, the starting W_{h-N} could be obtained by taking sum(work) from h-\frac{3N}{2} to h-\frac{N}{2} and dividing by N.