How fair is Bitcoin mining?
Aim: During the last few years, Bitcoin mining has become a massively collaborative enterprise. Miners invest heavily in ASICs/FPGAs/GPUs and tend to pool their resources as Bitcoin mining is assumed to be 'fair' in the following sense [3]: The probability that a pool mines the next block is
(i) equal, or at least directly proportional, to its hashrate share of the Bitcoin network, and
(ii) independent of who mined the previous block.
Both (i) and (ii) stem from the SHA-256 proof-of-work employed by the Bitcoin protocol. Some researchers have challenged the second assumption arguing the possibility of 'selfish mining' in which a pool delays announcing a successfully mined block to get a head start on mining the next block [4, 7]. However, we are not aware of any empirical study investigating assumption (i) or (ii). Our research aims to validate/challenge both the assumptions based on real-life Bitcoin mining data.
Method: We use publicly available data [2] on block numbers and the identities of the mining pools generating these blocks for the Bitcoin blockchain. We extracted this data for the six-month period from November 6, 2017 - April 30, 2018, constituting 31,290 blocks. The actual hashrates of Bitcoin mining pools over the same period were obtained from [1]. The six-month time window is chosen as larger time intervals lead to the inclusion of now defunct mining pools.
We partition the six-month period into 26 weeks and compute the following two ratios for each mining pool $P$ over each week: the block ratio $b_P$ defined as the ratio of blocks generated by the pool to the total number of blocks generated, and the hash ratio $h_P$ defined as the ratio of pool hashrate to the total hashrate of the Bitcoin mining network. Note that $b_P$ equals the experimental probability of pool $P$ mining the next block and assumption (i) states that $b_P$ is equal, or at least proportional, to $h_P$ . We say that a pool $P$ overachieves if $h_P < b_P$ or underachieves if $h_P > b_P$ in a given week. For each mining pool $P$, we compute the relative difference $\Delta_P = \frac{b_P - h_P}{b_P}\times 100$, and the Pearson correlation coefficient $r_{_P}$ between the data series $b_P$ and $h_P$.
Results: The data indicates that although they agree for most of the pools, the two ratios could differ significantly for a subset of mining pools. For instance, we identify a 10 week window in which BTC.TOP, one of the largest mining pools, satisfies $-14.794\% \ge \Delta_P \ge -58.522\%$. To get an idea of how this translates into potential revenue loss, we note that in the week of April 23-30, 2018, BTC.TOP mined 11 blocks, whereas based on the hash ratio they were expected to mine 17-18 blocks. Currently, each block is worth at least 12.5 bitcoins. On April 30th, the price of a bitcoin was USD 9,075. Therefore, BTC.TOP potentially lost at least USD 680,625 during that week.
The numbers also show that $b_P$ and $h_P$ are strongly correlated with $0.87653 \le r_{_P} \le 0.99113$ over all mining pools. The two pools with $r_{_P} <0.9$ are BTC.TOP and BitClub Network.
We observe that BTC.TOP underachieved through most of the period under consideration. On the other hand, there are pools like BTC.COM that mostly overachieved. This motivates a closer examination of assumption (ii) focusing on broadcast delays and selfish mining. The next section outlines our ongoing work in this direction.
Conclusion and Ongoing Work: Our results provide strong evidence for direct proportionality but counter evidence for equality in assumption (i). For a single mining pool, block generation can be modeled as a Bernoulli trial [3]. Theoretical estimates are known for the number of success runs of length $k$ over $n$ Bernoulli trials [5, 6]. We have calculated the number of success runs of varying lengths for different pools from our data. Currently, we are comparing these empirical stats with the theoretical results. Any significant deviation from theory will imply that assumption (ii) should be revisited to account for factors such as network delays or selfish mining. We plan to report our findings in BTS'18.
This is joint work with Robert Benkoczi, Daya Ram Gaur, Shahadat Hossain, and Arshdeep Singh (supported, in part, by NSERC USRA).
References
[1] Bitcoinity, https://data.bitcoinity.org/bitcoin/hashrate/, (accessed on May 5, 2018).
[2] BLOCKCHAIR, https://blockchair.com/bitcoin/blocks, (accessed on May 1, 2018).
[3] I. Eyal, The miner's dilemma. In Proc. IEEE Symposium on Security and Privacy (2015), 89-103.
[4] I. Eyal and E. G. Sirer, Majority is not enough: Bitcoin mining is vulnerable, In Proc. 18th Int. Conf. Financial Cryptogr. Data Secur. (2014), 436-454.
[5] F. S. Makri and Z. M. Psillakis, On success runs of a fixed length in Bernoulli sequences: Exact and asymptotic results, Comput. Math. Appl. 61 (2011), no. 4, 761-772.
[6] M. Muselli, Simple expressions for success run distributions in Bernoulli trials, Statist. Probability Lett. 31 (1996), no. 2, 121-128.
[7] K. Nayak, S. Kumar, A. Miller and E. Shi, Stubborn mining: generalizing selfish mining and combining with an eclipse attack, In Proc. IEEE European Symposium on Security and Privacy (EuroS&P) (2016), 305-320.