Matthew Wampler-Doty
TL;DR: In this paper, I investigate the effects of issuance policies on wealth inequality in Ethereum by looking at how the Gini coefficient changes over time. Two simulated models of wealth under a Proof of Work issuance protocol are given, as well as two simulations of the "Slasher" Proof of Stake protocol under two different reward schedules. With the exception of the first proposed mining model, none of the simulations forecast a significant change in the Gini coefficient from the initial genesis block, which appears to be roughly at steady state already. This contradicts claims made by some (including myself) that Proof of Stake will lead to runaway income inequality.
There are two popular models for issuance of distributed currency: Proof of Work and Proof of Stake.
The prevailing model, used by BitCoin, is Proof of Work (PoW). Proof of Work involves computing a solution to some computationally difficult problem. This has some known problems:
Proposed alternatives include Proof of Stake (PoS), as well as social-network consensus mechanisms such as the one used in Ripple. In PoS, a number of stakeholders cooperate to ratify a new unit of currency. Proof of Stake presents the tantalizing possibility of very rapid blockchain progression, which is strongly desired for an e-commerce platform.
One criticism of PoS is that it leads to wealth inequality. The ratifiers of newly issued blocks will tend to already possess a large balance, and provided they are compensated they will command even more wealth. To date, no analysis of the severity of this inequality has been made.
It is my view that the design of a cryptocurrency is social engineering, and this aspect of protocol design must be considered. Apart from analyzing capital lockup, I have not endeavored to model any other economic implications of the choice of issuance model.
A standard measure of wealth and income inequality is the Gini coefficient. It is based on the Lorenz curve, which plots the cumulative $Y\%$ of wealth or income held by the bottom $X\%$ of households. The ideal Lorenz curve, for when every household has exactly the same stake as every other, is the line $y = x$; this is the Line of Equality. Let $A$ be the area between the true Lorenz curve and the ideal; $2A$ is the Gini coefficient.
from IPython.display import SVG
SVG(filename='Gini_Coefficient.svg')
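Since the Lorenz curve lives in the unit square, the area under the Line of Equality is $1/2$, which is where the factor of two comes from. Writing $L(x)$ for the Lorenz curve (a symbol introduced here for illustration), the definition above can be restated as:

$$G = \frac{A}{1/2} = 2A = 2\int_0^1 \bigl(x - L(x)\bigr)\,dx = 1 - 2\int_0^1 L(x)\,dx$$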
Calculating the Gini coefficient is straightforward: since $A$ is the difference of two sums, we can calculate it as a sum of differences, normalized appropriately.
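def gini_coefficient(data, sort=True):
    "Compute the gini coefficient for some distribution"
    # Height gained by the Line of Equality per household (the mean wealth)
    slope = float(sum(data)) / len(data)
    running_sum = 0
    i = 0
    area = 0
    if sort:
        data = sorted(data)
    for d in data:
        i += 1
        running_sum += d
        # Gap between the Line of Equality and the (unnormalized) Lorenz curve
        area += slope*i - running_sum
    # Normalize by total wealth and population size
    return (2. * area) / (running_sum * len(data))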
In the case of equality, where every household has exactly the same wealth as every other, then $A$ is zero, so we expect the Gini coefficient to be zero:
import numpy as np
gini_coefficient(np.ones(100))
0.0
The Gini coefficients have been worked out for a number of distributions; for instance in a uniform distribution we expect the Gini coefficient is $1/3$:
gini_coefficient(xrange(10**5))
0.33333666666666667
Note that the result isn't exact because xrange(10**5) is only a discrete approximation of a uniform distribution.
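As a sanity check on the two boundary cases, here is a small self-contained sketch. It uses a vectorized restatement of the same sum-of-differences formula; the name `gini_np` is introduced here for illustration and is not part of the original notebook. For the integers $0, \dots, n-1$ the formula works out to $(n+1)/(3n)$, which tends to $1/3$, and when a single household holds everything it gives $(n-1)/n$, which tends to $1$.

```python
import numpy as np

def gini_np(data):
    """Vectorized sketch of the sum-of-differences Gini formula (illustrative helper)."""
    data = np.sort(np.asarray(data, dtype=float))
    n = len(data)
    lorenz = np.cumsum(data)                       # unnormalized Lorenz curve
    equality = data.mean() * np.arange(1, n + 1)   # unnormalized Line of Equality
    return 2.0 * np.sum(equality - lorenz) / (lorenz[-1] * n)

n = 10**5
uniform = gini_np(np.arange(n))             # discrete uniform on 0..n-1
one_holder = gini_np([0.0] * 99 + [1.0])    # one household out of 100 holds everything
```

The discrete uniform case therefore overshoots $1/3$ by exactly $1/(3n)$, which accounts for the trailing digits in the output above.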
We next turn to the results of the Ethersale, which give the wealth distribution of the genesis block and so provide a starting point for any simulation.
Since the results of the Ethersale are a matter of public record (see Vitalik's post), we can think of the distribution it gives as providing an initial state for simulations. The original data can be found in this spreadsheet. This data does not include the endowment.
ethersale_data = np.load('ethersale.npz')['data']
ethersale_data.sort()
It is edifying to see the Lorenz curve of the Ethersale, which gives us a sense of wealth inequality in the genesis block:
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt
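def Lorenz_Curve(data):
    "Calculates the Lorenz Curve from a wealth distribution"
    total = float(sum(data))  # float, so the shares below are not truncated
    out = [0]
    for d in sorted(data):
        # Each point is the cumulative share held by the poorest households so far
        out.append(out[-1] + d / total)
    return out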
X = [i / float(len(ethersale_data)) for i in range(len(ethersale_data)+1)]
Y = Lorenz_Curve(ethersale_data)
f = plt.figure()
ax = f.add_subplot(111, aspect='equal')
ax.yaxis.tick_right()
ax.yaxis.set_label_position("right")
ax.yaxis.set_ticks_position('both')
plt.title('Ethereum Genesis Lorenz Curve')
plt.plot(X, X, label='Line of Equality')
plt.plot(X, Y, label='Lorenz Curve')
plt.xlabel('Percentage of Stakeholders')
plt.ylabel('Percentage of Stake')
plt.legend(loc='upper left')
plt.show()
print "Gini Coefficient:", gini_coefficient(ethersale_data)
Gini Coefficient: 0.820282261829
As a point of comparison, in 2009 the country with the greatest wealth inequality was South Africa, with a Gini coefficient of $\approx 0.7$ (according to this KPMG report). As another point of comparison, some estimates of BitCoin's Gini coefficient have it exceeding $0.88$ (see Izabella Kaminska's blog post from last January).
The distribution of wealth in the genesis block should be thought of as a baseline for comparison. Different issuance protocols and models will evolve this underlying distribution over time, changing the Gini coefficient in turn.
We start with the simplest model: a BitCoin style Proof of Work system, where every stakeholder in Ethereum has the same mining power. I do not think of this model as realistic, although some miners might find it appealing.
In this basic model, every stakeholder in Ethereum is equally likely to be the next miner. For simplicity, let's assume the miner will receive a reward of 7 ETH per block, with one block every 12 seconds. This implies that $7 \times 60 \times 60 \times 24\times 365 / 12 = 18396000$ ETH will be issued each year, which is close to the $\approx 18000000$ ETH planned as per Vitalik's announcement in April. This leads to a simple simulation, which reflects how the Gini coefficient of Ethereum will evolve over time given our assumptions:
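The issuance arithmetic can be checked in a few lines. This is only a sketch of the calculation in the text; the constant names are introduced here for illustration, and the 12 second block interval is the same assumption used for the time axis of the plots.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365
BLOCK_TIME_SECONDS = 12   # assumed block interval (12 second timesteps)
REWARD_ETH = 7            # per-block reward assumed in the text

blocks_per_year = SECONDS_PER_YEAR // BLOCK_TIME_SECONDS   # 2628000 blocks
annual_issuance = REWARD_ETH * blocks_per_year             # 18396000 ETH
```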
ginis = [gini_coefficient(ethersale_data)]
data = np.copy(ethersale_data)
sample_frequency = 200000
for t in xrange(1,20000001):
    # Every stakeholder is equally likely to mine the next block
    idx = np.random.randint(len(data))
    data[idx] += 7
    # Swap with the next-richer neighbor to keep data (approximately) sorted
    if idx != len(data)-1 and data[idx] > data[idx+1]:
        data[idx], data[idx+1] = (data[idx+1], data[idx])
    if t % sample_frequency == 0:
        ginis.append(gini_coefficient(data,sort=False))
The following convergence model fits this simulation well:
$$\text{model}(t) = \frac{\text{Gini Coefficient}_{genesis}}{t / \alpha + 1}$$
Here $t$ is the timestep, and $\alpha$ is an empirical parameter dependent on the reward amount.
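One heuristic way to see why a model of this shape is plausible (a sketch in expectation, not something derived from the simulation itself): let $n$ be the number of stakeholders, $W_0$ the total genesis wealth, $G_0$ the genesis Gini coefficient and $r$ the reward per timestep, all symbols introduced here for illustration. After $t$ timesteps every stakeholder has gained $rt/n$ in expectation. Adding the same constant to every balance leaves all pairwise differences unchanged and only raises the mean, so the Gini coefficient shrinks by the ratio of the old mean to the new one:

$$G(t) \approx G_0 \cdot \frac{W_0/n}{W_0/n + rt/n} = \frac{G_0}{t / (W_0/r) + 1}$$

Under this approximation $\alpha \approx W_0 / r$. The argument ignores the random spread of rewards around their expectation, so a fitted value need not match it exactly.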
X = sample_frequency * np.arange(len(ginis))
X_in_Years = X * 12. / (60*60*24*365) # Assuming 12 second timesteps
from scipy.optimize import curve_fit
def model(x, alpha):
    return ginis[0] / (x / alpha + 1)
(alpha,),_ = curve_fit(model,X,ginis)
Y = model(X, alpha)
import pylab
pylab.ylim([0,1])
plt.plot(X_in_Years, ginis, color='blue', label='Simulation')
plt.plot(X_in_Years, Y, color='red', linestyle='--', linewidth=2, label='Model Fit')
plt.title('Egalitarian Mining')
plt.ylabel('Gini Coefficient')
plt.xlabel('Years')
plt.legend(loc='upper right')
plt.show()
print "⍺ ≈", alpha
⍺ ≈ 6641953.91481
As we can see from the above model, this simulation converges to an equilibrium state, where wealth is evenly distributed and the Gini coefficient is therefore zero. The decay is hyperbolic, on the order of $1/t$, rather than exponential. A little experimentation shows that $\alpha$ is inversely related to reward amount, and that larger rewards lead to more rapid convergence to equilibrium.
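To get a feel for the fitted value: in the model above, the Gini coefficient falls to half its genesis value at $t = \alpha$. A quick conversion, assuming the same 12 second timesteps used for the plot and taking the value of $\alpha$ printed above (the variable names are illustrative):

```python
alpha = 6641953.91481            # fitted value reported above
SECONDS_PER_TIMESTEP = 12.0      # same assumption as the plot's time axis
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

# Time for the modelled Gini coefficient to halve, in years (roughly 2.5)
halving_time_years = alpha * SECONDS_PER_TIMESTEP / SECONDS_PER_YEAR
```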
It may be unrealistic to assume that mining power will be evenly distributed. It is well known that BitCoin mining is currently dominated by a handful of players with massive ASIC compute resources. To the Ethereum team's immense credit, a considerable amount of effort has been devoted to developing mining protocols that are intended to resist implementation in ASICs in order to avoid this scenario.
Even in the ideal situation where ASICs, FPGAs and GPUs cannot be leveraged for mining, I expect that mining power will be proportional to stakeholder wealth. This could be simulated by making the simplifying assumption that:
$$\text{Mining Power of $X$} = \frac{\text{Wealth of $X$}}{\text{Total Wealth}}$$
Without further ado, here is a simulation of this assumption:
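ginis = [gini_coefficient(ethersale_data)]
data = np.copy(ethersale_data)
csum = np.cumsum(ethersale_data)
sample_frequency = 200000
for t in xrange(1,20000001):
    # Pick a miner with probability proportional to wealth: draw a point in
    # [0, total wealth) and take the first stakeholder whose cumulative sum exceeds it
    top_stakes = csum > np.random.randint(csum[-1])
    csum += top_stakes * 7
    idx = np.argwhere(top_stakes)[0,0]
    data[idx] += 7
    # Swap with the next-richer neighbor to keep data (approximately) sorted,
    # and patch the two affected cumulative sums to match
    if idx != len(data)-1 and data[idx] > data[idx+1]:
        csum[idx], csum[idx+1] = (csum[idx+1] - data[idx], csum[idx] + data[idx+1])
        data[idx], data[idx+1] = (data[idx+1], data[idx])
    if t % sample_frequency == 0:
        ginis.append(gini_coefficient(data,sort=False))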
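The weighted draw in the loop above is inverse-CDF sampling over the cumulative sums. As a minimal standalone sketch of just that step, the same index can be found with a binary search via `np.searchsorted` instead of building a boolean mask over every stakeholder. The helper name `weighted_pick` and the toy balances are assumptions for illustration; this is not the code that produced the results here.

```python
import numpy as np

def weighted_pick(csum, rng):
    """Return the first index whose cumulative stake exceeds a random draw."""
    r = rng.randint(int(csum[-1]))            # integer draw in [0, total wealth)
    return int(np.searchsorted(csum, r, side='right'))

rng = np.random.RandomState(0)                # fixed seed so the sketch is repeatable
wealth = np.array([1.0, 1.0, 8.0])            # toy balances, not Ethersale data
csum = np.cumsum(wealth)
picks = np.bincount([weighted_pick(csum, rng) for _ in range(10000)], minlength=3)
share_of_richest = picks[2] / 10000.0         # should be close to 8/10
```

Note that the simulation also adds the reward to every cumulative sum from the winner onwards, which is still linear in the number of stakeholders, so this sketch only speeds up the selection itself.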
I cannot think of a nice model, suitable for forecasting, to propose for this particular simulation: it leaves wealth inequality more or less fixed, with minor stochastic variation. Different runs have different outcomes. The fluctuations appear to taper off over time, presumably because, as the total supply grows, each newly issued reward has less impact on everybody's relative wealth.
X = sample_frequency * np.arange(len(ginis))
X_in_Years = X * 12. / (60*60*24*365)
import pylab
plt.plot(X_in_Years, ginis)
plt.title('Weighted Mining')
plt.ylabel('Gini Coefficient')
plt.xlabel('Years')
plt.show()
As we can see, in this simulation there is only slight variation in the Gini coefficient over time, more or less at the 4th decimal place.
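A heuristic reason for this stability (an expectation argument, not part of the simulation): under the weighted assumption, a stakeholder with balance $w_i$ wins the reward $r$ with probability $w_i / W$, where $W$ is the total wealth (symbols introduced here for illustration). The expected balance after one timestep is therefore

$$E[w_i'] = w_i + r \cdot \frac{w_i}{W} = w_i \left(1 + \frac{r}{W}\right)$$

Every balance is scaled by the same factor in expectation, and the Gini coefficient does not change when all balances are multiplied by a common positive constant, because the Lorenz curve depends only on the shares $w_i / W$. What remains is the random fluctuation around this expectation.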
If I were to hazard a guess which model is closer to reality, egalitarian or weighted mining, I would guess weighted mining. This may be disappointing to people who are hoping for Ethereum to be a vehicle for social mobility. Even if ASIC mining is impossible, I would expect ASIC mining farms to be replaced by botnets or more conventional compute farms held by the wealthiest members on the platform.
Slasher Ghost is one of Vitalik's more recent proposals, presented in this blog post in October. Here are the details I consider relevant for simulation purposes:
- Blocks are produced by miners; in order for a block to be valid it must satisfy a proof-of-work condition. However, this condition is relatively weak (eg. we can target the mining reward to something like $0.02\times$ the genesis supply every year)
- The set of potential signers of block N + 3000 is the set of addresses such that sha3(address + block[N].hash) < block[N].balance(address) * D2, where D2 is a difficulty parameter targeting 15 signers per block (ie. if block N has less than 15 signers it goes down otherwise it goes up). Note that the set of potential signers is very computationally intensive to fully enumerate, and we don’t try to do so; instead we rely on signers to self-declare.
Vitalik goes on to talk about how the system can fall back to a mining protocol in the absence of block signers. Since sha3 is unbiased, this is quite similar to the weighted mining simulation I presented in the previous section. Instead of picking one miner per round, fifteen are selected. However, while weighted mining was based on the hypothesis that mining power was proportional to wealth, I can be far more confident in simulating Slasher Ghost, as the random selection process is built into the protocol.
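The signer condition can be sketched as follows, written for Python 3.6+. Everything here is an assumption for illustration: the function name, the byte encodings of the address and block hash, and the use of `hashlib.sha3_256`, which is the standardized SHA-3 and is only a stand-in for whatever hash the protocol would actually use. The point is that, for an unbiased 256-bit hash, the chance of passing the test is roughly balance * D2 / 2**256, i.e. proportional to balance.

```python
import hashlib

def is_potential_signer(address, block_hash, balance, d2):
    """Check sha3(address + block_hash) < balance * D2, reading the hash as a 256-bit integer."""
    digest = hashlib.sha3_256(address + block_hash).digest()
    return int.from_bytes(digest, 'big') < balance * d2

# Hypothetical inputs for illustration only
address = b'\x01' * 20
block_hash = b'\x02' * 32
always = is_potential_signer(address, block_hash, balance=1, d2=2**256)  # threshold above any hash
never = is_potential_signer(address, block_hash, balance=0, d2=2**256)   # zero balance never qualifies
```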
I am not entirely sure how Vitalik arrived at the "$0.02\times$ the genesis supply every year" mining reward target. For simplicity (and numerical stability), I will round this to 0.03 ETH per signer per block. For comparison, I will also simulate the system with the usual reward of roughly 7 ETH, split amongst the 15 signers, rounded up to 0.47 ETH each.
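One plausible reading of these two numbers is sketched below. The genesis supply of roughly 60 million ETH is an assumption (an approximate presale total, not a figure taken from this notebook), as is the 12 second block time; the constant names are illustrative. This is a guess at the arithmetic rather than a confirmed derivation.

```python
BLOCKS_PER_YEAR = 60 * 60 * 24 * 365 // 12   # assumes 12 second blocks
SIGNERS_PER_BLOCK = 15
GENESIS_SUPPLY_ETH = 60e6    # ASSUMPTION: rough presale total, not taken from this notebook

# "0.02x the genesis supply every year", spread over blocks and then signers
small_reward = 0.02 * GENESIS_SUPPLY_ETH / BLOCKS_PER_YEAR / SIGNERS_PER_BLOCK

# The usual 7 ETH block reward split amongst the signers
large_reward = 7.0 / SIGNERS_PER_BLOCK
```

Rounded to two decimal places, these come out at 0.03 and 0.47 ETH per signer.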
def slasher_sim(reward=0.03, sample_frequency=1000, steps=500000):
    ginis = [gini_coefficient(ethersale_data)]
    data = np.copy(ethersale_data)
    csum = np.cumsum(ethersale_data)
    for t in xrange(1,steps+1):
        new_csum = np.copy(csum)
        # Select 15 signers per block, each with probability proportional to stake
        for _ in xrange(15):
            top_stakes = csum > np.random.randint(csum[-1])
            new_csum += top_stakes * reward
            idx = np.argwhere(top_stakes)[0,0]
            data[idx] += reward
            # Keep data (approximately) sorted and the cumulative sums in step
            if idx != len(data)-1 and data[idx] > data[idx+1]:
                new_csum[idx], new_csum[idx+1] = (new_csum[idx+1] - data[idx], new_csum[idx] + data[idx+1])
                data[idx], data[idx+1] = (data[idx+1], data[idx])
        csum = new_csum
        if t % sample_frequency == 0:
            ginis.append(gini_coefficient(data,sort=False))
    return ginis
small_ginis = slasher_sim(reward=0.03)
large_ginis = slasher_sim(reward=0.47)
X = 1000 * np.arange(len(small_ginis))
X_in_Days = X * 12 / (60.*60*24)
import pylab
plt.plot(X_in_Days, small_ginis, label="0.02x Genesis Supply")
plt.plot(X_in_Days, large_ginis, label="Regular Issuance")
plt.title('Slasher Ghost')
plt.ylabel('Gini Coefficient')
plt.xlabel('Days')
plt.legend(loc='upper left')
plt.show()
As we can see, in either schedule, the Gini coefficients fluctuate at the 7th decimal place.
The analysis above leads to the following counter-intuitive conclusion:
The Matthew Effect, where the rich get richer, is not much of a concern for Ethereum on a Proof of Stake issuance schedule, since the platform is more or less in a steady state of relatively extreme wealth inequality.
That being said, I see no reason at all for changing the issuance schedule, since it doesn't seem to have any effect on the overall wealth distribution either way.