Interesting. This works with any “k’th lowest hash”, actually.
If H_{(k)} is the k’th lowest hash seen (the lowest one having k=1), normalized to [0 \ldots 1], then \frac{k-1}{H_{(k)}} is an estimate for the amount of work.
If we model this as n uniformly random samples being taken from [0 \ldots 1] (which is what each hash attempt is), then H_{(k)} follows a \mathrm{Beta}(k, n-k+1) distribution. For such a distribution

\mathbb{E}\!\left[\frac{k-1}{H_{(k)}}\right] = n \qquad \text{and} \qquad \mathrm{Var}\!\left[\frac{k-1}{H_{(k)}}\right] = \frac{n(n-k+1)}{k-2} \approx \frac{n^2}{k-2},

which means that using higher k gives a better estimate.
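As a sanity check, the order-statistic estimator is easy to simulate. A minimal Python sketch, assuming the uniform-sample model above (the function name kth_hash_estimate is mine, not from any real client):

```python
import random

def kth_hash_estimate(hashes, k):
    """Estimate the number of hash attempts n from the k'th lowest
    normalized hash, using the estimator (k-1) / H_(k)."""
    h_k = sorted(hashes)[k - 1]  # k'th lowest; k=1 is the minimum
    return (k - 1) / h_k

random.seed(0)
n, k = 100_000, 10
hashes = [random.random() for _ in range(n)]  # each attempt ~ Uniform[0, 1]
print(kth_hash_estimate(hashes, k))  # an estimate of n; spread shrinks as k grows
```

Averaged over many runs this comes out at n, matching the unbiasedness claim above.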
However, for any small constant k, this variance is still \mathcal{O}(n^2). Using the number of blocks is a much better estimator. Let b be the number of blocks seen, and w the expected amount of work per block. Then

\mathbb{E}[b] = \frac{n}{w} \qquad \text{and} \qquad \mathrm{Var}[b] = \frac{n}{w}\left(1 - \frac{1}{w}\right) \approx \frac{n}{w},

and we can estimate the amount of work as bw:

\mathbb{E}[bw] = n \qquad \text{and} \qquad \mathrm{Var}[bw] \approx nw,

which has just \mathcal{O}(nw) variance.
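The block-counting estimator can be sketched the same way: model each hash attempt as a "block" when it falls below the threshold 1/w. A hedged Python sketch (block_count_estimate is a made-up name for illustration):

```python
import random

def block_count_estimate(hashes, w):
    """Estimate total work as b*w, where b counts hashes below the
    block threshold 1/w (each attempt is a block with probability 1/w)."""
    b = sum(h < 1 / w for h in hashes)
    return b * w

random.seed(0)
n, w = 1_000_000, 1000  # expect about n/w = 1000 blocks
hashes = [random.random() for _ in range(n)]
print(block_count_estimate(hashes, w))  # close to n; spread is about sqrt(n*w)
```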
EDIT: corrected the variance of the bw estimator above (it’s nw, not n). The two approaches do seem similar in terms of variance:
- \frac{k-1}{H_{(k)}} has standard deviation roughly equal to \frac{n}{\sqrt{k-2}}
- bw has standard deviation roughly equal to \frac{n}{\sqrt{b}}.
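Under the same uniform-sample model, a quick Monte Carlo supports this comparison. The helper name estimator_stdevs is hypothetical, and the parameters are chosen so the two predictions n/\sqrt{k-2} and n/\sqrt{b} coincide:

```python
import random
import statistics

def estimator_stdevs(n, w, k, trials=1000):
    """Monte Carlo spread of the two work estimators:
    (k-1)/H_(k) from the k'th lowest hash, vs. b*w from block counting."""
    kth_ests, block_ests = [], []
    for _ in range(trials):
        hashes = [random.random() for _ in range(n)]
        kth_ests.append((k - 1) / sorted(hashes)[k - 1])
        b = sum(h < 1 / w for h in hashes)  # each attempt is a block w.p. 1/w
        block_ests.append(b * w)
    return statistics.stdev(kth_ests), statistics.stdev(block_ests)

random.seed(0)
# With k - 2 = n/w = E[b], both standard deviations should be roughly
# n / sqrt(k - 2) = 5000 / sqrt(10), i.e. on the order of 1600.
s_kth, s_block = estimator_stdevs(n=5000, w=500, k=12)
print(s_kth, s_block)
```

Both printed values land near the predicted n/\sqrt{k-2} \approx n/\sqrt{b}, consistent with the two bullet points above.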