# The Art of Slashing

Consensys’s glossary of Ethereum 2.0 terms eloquently describes the act of “slashing” as one of Ethereum’s new processes created to maintain blockchain security. They state that Ethereum 2.0’s consensus mechanism has a couple of rules that are designed to prevent attacks on the network. Any validator, an actor that proposes and attests new blocks, found to have broken these rules will be slashed and ejected from the network. According to Codefi, there are three ways a validator can gain the slashed condition:

1. By being a proposer and signing two different beacon blocks for the same slot.
2. By being an attester and signing an attestation that “surrounds” another one.
3. By being an attester and signing two different attestations having the same target.
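As a rough illustration, the three conditions above can be written as small predicates. This is a minimal sketch rather than client or specification code: the `attestation()` helper, its `source`, `target`, and `root` fields, and all function names are assumptions made for this example, with `source` and `target` standing for the epochs an attestation links.

```r
# An attestation reduced to the fields the rules look at (hypothetical layout).
# `source` and `target` are epoch numbers; `root` stands in for the voted data.
attestation <- function(source, target, root) {
  list(source = source, target = target, root = root)
}

# Condition 1: two different blocks signed for the same slot
is_double_proposal <- function(slot_a, root_a, slot_b, root_b) {
  slot_a == slot_b && root_a != root_b
}

# Condition 2: attestation `a` surrounds `b` (starts earlier, ends later)
is_surround_vote <- function(a, b) {
  a$source < b$source && b$target < a$target
}

# Condition 3: two different attestations with the same target epoch
is_double_vote <- function(a, b) {
  a$target == b$target && !identical(a, b)
}

a <- attestation(source = 2, target = 5, root = "0xaa")
b <- attestation(source = 3, target = 4, root = "0xbb")
is_surround_vote(a, b)   # TRUE: a spans epochs 2..5 and b sits inside it
is_double_vote(a, b)     # FALSE: the targets differ
```

Keeping each rule as its own predicate mirrors the way the `reason` column in the data separates proposer offenses from attestation offenses.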

Slashing means that a significant part of the validator’s stake is removed, potentially up to the whole stake of 32 ETH in the worst case. Validator software and staking providers will have built-in protection against getting slashed accidentally so that slashing should only affect validators who misbehave deliberately.

## The Data

The data we used for this analysis was a web scrape of the BeaconScan block explorer’s slashed slots repository. The page can be found here. Our dataset contains information on the 1,751 slashings that occurred between the genesis epoch and epoch 15087, which corresponds to about 64 days’ worth of data. This analysis was performed using R and associated packages, including tidyverse and rmarkdown. The first 5 rows of the data are shown below:

See Code
# Load libraries needed
library(knitr)
library(tidyverse)
library(stringi)
library(ggfortify)
library(ggnewscale)
library(networkD3)
library(igraph)



The 7 variables provided are:

• X – The row index of the validator
• epoch – The epoch number that the validator was slashed
• slot – The slot number that the validator was slashed
• age – The amount of time passed since the validator was slashed
• validatorSlashed – The index of the validator who was slashed
• slashedBy – The index of the validator who was doing the slashing
• reason – The reason why the validator was slashed

Due to the sink-source structure of the slashedBy and validatorSlashed columns, this data naturally fits the paradigm of a graph network, consisting of validators (as nodes), and slashes (as directed edges). Because we know some attributes including the epoch, slot, and the reason for the slashing, we can easily apply some standard exploratory analysis techniques in conjunction with other approaches for analyzing networks. We explore the data with this in mind in the rest of the analysis.
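To make the graph framing concrete, here is a minimal sketch of how rows of this shape turn into nodes, directed edges, and per-validator degrees. The `df_toy` data frame and its values are made up for illustration; only the column names follow the variables listed above.

```r
# Toy slashing records (hypothetical validator indices, not the real data)
df_toy <- data.frame(
  epoch            = c(5, 5, 9, 12),
  slashedBy        = c(101, 101, 202, 303),
  validatorSlashed = c(11, 12, 13, 101),
  reason           = c("Attestation rule offense", "Attestation rule offense",
                       "Proposer rule offense", "Attestation rule offense")
)

# Each row is one directed edge: slasher -> slashed validator
edges <- df_toy[, c("slashedBy", "validatorSlashed")]

# Nodes are every validator that appears at either end of an edge
nodes <- sort(unique(c(edges$slashedBy, edges$validatorSlashed)))

# Out-degree = slashings performed, in-degree = slashings received
out_degree <- table(factor(edges$slashedBy, levels = nodes))
in_degree  <- table(factor(edges$validatorSlashed, levels = nodes))

degree_table <- data.frame(validator = nodes,
                           performed = as.vector(out_degree),
                           received  = as.vector(in_degree))
degree_table
```

In this toy example validator 101 appears on both sides: it performs two slashings and receives one, which is the kind of overlap examined later in the analysis.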

## Slashing 101

We begin our analysis with some high-level statistics on the prevalence of slashing on the network:

• Number of Unique Validators: 80,392
• Total Number of Slashings: 1,751
• Number of Slashed Validators: 1,647
• Number of Slashers: 771

Given that the vast majority of the validators on the network have not been slashed, nor have performed a slashing, the likelihood of a slashing in any one slot is actually quite small. However, when we analyze the number of slashings over time we observe spikes where a large number of consensus violations are committed in relatively quick succession. Analyzing the per-epoch count of slashings over time highlights these aforementioned spikes.

See Code
ggplot(data=num_slashed_over_epoch, aes(x=Var1, y=Freq, group=1)) +
scale_x_continuous(breaks = seq(0, 15000, by = 1000))+
scale_y_continuous(breaks = seq(0, 150, by = 10))+
geom_line()+
labs(title="Number of slashed over epoch",x="Epoch", y = "Frequency")


To better assess the impact of these spikes in slashings, we produced a cumulative count plot that tracks the total number of slashings across epochs. The first large spike in slashings occurs around epoch 3000, and another, smaller spike occurs around epoch 12500. Although these jumps are significant, the rate of change shows that the number of rule violations is quite stable the majority of the time. Globally, the rate of slashing is approximately 117 slashes per 1000 epochs. When we exclude the spikes, the rate is approximately 63 slashes per 1000 epochs.
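The cumulative count and the per-1000-epoch rate can be sketched on a small made-up log. The `slashings` data frame, its values, and the `rate_per_1000()` helper are assumptions for illustration; they are not the computation behind the figures quoted above.

```r
# Toy slashing log (hypothetical epochs, not the BeaconScan data)
slashings <- data.frame(
  epoch            = c(10, 10, 250, 400, 400, 400, 990),
  validatorSlashed = c(1, 2, 3, 4, 5, 6, 7)
)

# Count slashings per epoch, then accumulate the counts in epoch order
per_epoch <- as.data.frame(table(epoch = slashings$epoch), responseName = "n")
per_epoch$epoch <- as.numeric(as.character(per_epoch$epoch))
per_epoch <- per_epoch[order(per_epoch$epoch), ]
per_epoch$cumul <- cumsum(per_epoch$n)
per_epoch

# Average rate, expressed per 1000 epochs of observation
rate_per_1000 <- function(total_slashings, epochs_observed) {
  total_slashings / epochs_observed * 1000
}
toy_rate <- rate_per_1000(sum(per_epoch$n), epochs_observed = 1000)
toy_rate
```

Applied to the totals reported earlier, `rate_per_1000(1751, 15087)` comes to roughly 116, in line with the approximate global figure quoted above.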

See Code
df_validator=read.csv('validator_data.csv')
num_slashed_over_epoch=as.data.frame(table(df_slashed$epoch))
num_slashed_over_epoch$Var1=as.numeric(as.character(num_slashed_over_epoch$Var1))
cumul=cumsum(num_slashed_over_epoch$Freq)  # running total of the per-epoch counts

On average, a validator performs its first slashing 3409 epochs after activation. The fastest first slashing occurred only 4 epochs after activation, while the slowest occurred 14892 epochs after activation.

Within an epoch, approximately 2.1 distinct slots are slashed on average, while approximately 4.6 slashings occur overall. This implies that many slashings within an epoch fall in the same slots. If we exclude slashings that occur in the same epoch, on average, 40 epochs elapse between slashings.

The following, fairly skewed, histogram shows the distribution of the time elapsed between slashings. We can see that it is very common for fewer than 50 epochs to elapse between slashings. In fact, about 41% of the time only 1 epoch without a slashing occurs between two epochs with at least 1 slashing. The longest period without a slashing lasted 900 epochs, which is 93 hours.

See Code
df_slashed$temp=df_slashed$epoch
df_slashed = df_slashed %>% mutate(temp = lead(temp, n = 1))  # epoch of the next row
df_slashed$diff_epoch = df_slashed$epoch-df_slashed$temp
df_slashed_diff_epoch = filter(df_slashed,diff_epoch != 0 & !is.na(diff_epoch))
mean(df_slashed_diff_epoch$diff_epoch)
ggplot(df_slashed_diff_epoch, aes(x=diff_epoch)) +
geom_histogram(color="darkblue", fill="lightblue", boundary=0)+
labs(title="A Distribution of the Number of Epochs lapsed between Slashings", x="Epoch elapsed", y="Frequency")+
scale_y_continuous(breaks = seq(0, 300, by = 25))+
scale_x_continuous(breaks = seq(0, 900, by = 100))

See Code
first_slash=aggregate(df_slashed$epoch, by=list(df_slashed$slashedBy), FUN=min)
df_first_slash = df_validator %>% inner_join(first_slash, by = c("index"="Group.1"))
df_first_slash$activationEpoch[df_first_slash$activationEpoch=='genesis']<-0
df_first_slash$timebeforefirstslash = df_first_slash$x-as.numeric(df_first_slash$activationEpoch)
mean(df_first_slash$timebeforefirstslash)
min(df_first_slash$timebeforefirstslash)
max(df_first_slash$timebeforefirstslash)

## Why are people being slashed?

The three ways a validator can violate consensus rules fall into only two categories of offense: attestation rule violations and proposer rule violations. Though a slashing of a validator's stake could occur for either of these two reasons, the distribution is skewed heavily towards attestation rule violations, which account for nearly 97% of the slashes in our data. The remaining 3% of slashes can be attributed to proposer rule offenses.

See Code
ggplot(df_slashed, aes(x= reason)) +
geom_bar(aes(y = after_stat(count)), stat="count",width=0.5, fill="steelblue") +
geom_text(aes(label= scales::percent(after_stat(count)/sum(after_stat(count)))), stat= "count")+
scale_y_continuous(breaks = scales::pretty_breaks(n=10))+
labs(title="Number of slashes per reason")

Interestingly, this distribution has not been constant over time. Although proposer rule offenses comprise only 3% of total slashings, about 67% of those occurred after epoch 13500. Furthermore, although proposer rule offenses are rare across all epochs, a proposer rule offense was, interestingly enough, the very first offense committed by a validator on the network. Over time, proposer violations have become more frequent, as shown in the subsequent time series graphs.
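The two kinds of share discussed here (the split by reason, and the fraction of proposer offenses falling after a cutoff epoch) can be sketched as follows. The `toy` data frame and its values are invented for illustration; only the `epoch` and `reason` column names and the reason labels follow the data described earlier.

```r
# Toy slashing log (hypothetical epochs and reasons, not the BeaconScan data)
toy <- data.frame(
  epoch  = c(1, 3000, 3001, 3002, 9000, 13600, 14000, 14500),
  reason = c("Proposer rule offense",
             rep("Attestation rule offense", 4),
             rep("Proposer rule offense", 2),
             "Attestation rule offense")
)

# Share of slashings attributed to each reason
reason_share <- prop.table(table(toy$reason))
round(100 * reason_share, 1)

# Among proposer offenses only, the share that happened after a cutoff epoch
cutoff <- 13500
proposer <- toy[toy$reason == "Proposer rule offense", ]
late_share <- mean(proposer$epoch > cutoff)
late_share
```

Taking the mean of a logical vector is a compact way to get a proportion, so the same pattern works for any cutoff or any subset of the data.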

See Code
cumul_num_slashed_over_epoch_reason <- df_slashed %>%
mutate(epoch = factor(epoch), reason = factor(reason)) %>%
group_by(epoch, reason, .drop = FALSE) %>%
tally() %>%
group_by(reason) %>%
arrange(epoch) %>%
mutate(cumul = cumsum(n), epoch = as.numeric(as.character(epoch)))
blank_data_1 <- data.frame(reason = c("Attestation rule offense","Attestation rule offense","Proposer rule offense","Proposer rule offense"), x = 0, y = c(0, 1800, 0, 55))
ggplot(data=cumul_num_slashed_over_epoch_reason, aes(x=epoch, y=cumul, group = reason, color=reason)) +
geom_line()+
geom_blank(data = blank_data_1, aes(x = x, y = y))+
scale_x_continuous(breaks = seq(0, 15000, by = 3000))+
facet_wrap(~reason,scales="free_y") +
labs(title="Reasons for slashes",x="epoch", y = "cumulative frequency")+
theme(legend.position = "none")+
expand_limits(y = 0) +
scale_y_continuous(expand = c(0, 0), breaks = scales::pretty_breaks(n=20))  # one y scale, so expand is not overridden

num_slashed_over_epoch_reason <- df_slashed %>%
mutate(reason = factor(reason)) %>%
group_by(epoch, reason, .drop = FALSE) %>%
tally() %>%
group_by(reason) %>%
arrange(epoch)
blank_data_2 <- data.frame(reason = c("Attestation rule offense","Attestation rule offense","Proposer rule offense","Proposer rule offense"), x = 0, y = c(0, 135, 0, 2.5))
ggplot(data=num_slashed_over_epoch_reason, aes(x=epoch, y=n, group = reason, color =reason)) +
geom_line()+
geom_blank(data = blank_data_2, aes(x = x, y = y))+
scale_x_continuous(breaks = seq(0, 15000, by = 3000))+
facet_wrap(~reason, scales="free_y") +
labs(title="Reasons for slashes",x="epoch", y = "frequency")+
theme(legend.position = "none")+
expand_limits(y = 0) +
scale_y_continuous(expand = c(0, 0), breaks = scales::pretty_breaks(n=15))  # one y scale, so expand is not overridden

## Slash or Be Slashed

Despite sounding scary, slashings on the Medalla testnet are actually quite rare.
Of the 80,392 validators on the network, only 771 have performed a slashing, which is less than 1% of all validators. Among these 771 validators, 59 were slashed at least once themselves.

See Code
df_validator %>% summarise_all(n_distinct)
df_slashed %>% summarise_all(n_distinct)
771/80392*100
num_slasher=as.data.frame(table(df_slashed$slashedBy))
num_slasher$Var1=as.numeric(as.character(num_slasher$Var1))
num_slashed=as.data.frame(table(df_slashed$validatorSlashed))
num_slashed$Var1=as.numeric(as.character(num_slashed$Var1))
df_validator_slasher = df_validator %>% inner_join(num_slasher, by = c("index"="Var1"))
frequent_slasher = head(df_validator_slasher[order(df_validator_slasher$Freq, decreasing=TRUE),],10)
sum(df_validator_slasher$slashed=='true')

Below we illustrate the distribution of the number of slashings received and the number of slashings performed. We can see that, of the validators that have slashed, most have done so only once or twice. Similarly, most validators who have been slashed have received only one or two slashings, and only a handful of them have been slashed more than 2 times.

See Code
ggplot(num_slasher, aes(x=Freq)) +
geom_histogram(color="darkblue", fill="lightblue",boundary=0)+
labs(title="Distribution of the Number of Slashings Performed by a Validator", x="Number of slashings", y="Frequency")+
scale_x_continuous(breaks = seq(0, 100, by = 5))+
scale_y_continuous(limits=c(0,750), breaks = seq(0, 750, by = 50))
ggplot(num_slashed, aes(x=Freq)) +
geom_bar(color="darkblue", fill="lightblue")+
labs(title="Distribution of the Number of Slashings Received by a Validator", y="Frequency")+
scale_x_continuous(breaks = seq(1, 7, by = 1))+
scale_y_continuous(breaks = seq(0, 1600, by = 100))

## Top Slashers

We collected statistics on validators in order to rank them and define tiers, ranging from a top tier (Tier 1) with typically ideal characteristics (lots of executed blocks, never slashed, etc.) to Tier 7, whose members have not performed as expected in the role of validator. (See our article here for more details.) The table below shows the top 10 validators that have done the most slashings, and their respective tiers.

These slashers have similar current and effective balances. Most of them were also active for a long period of time. It is interesting that 8 of the top 10 slashers reside in tier 3, where validators' performance becomes noticeably worse. Note that we have a great validator who is in tier 1 and a bad validator who is in tier 4. This shows that not all of the frequent slashers have the same track record.
We've reproduced our definitions of tiers 3 and 4 below, which highlight some of the reasons that validator 11806 was the only one of the top slashers to be classified into tier 4. As it turns out, 11806 has itself been slashed, in contrast with most of the others, particularly 36677, which achieved tier 1 in our rankings and had many executed blocks without being slashed.

Tier 3 (Ranks 6943 – 38396): While validators in this tier are still healthy overall, they do have more skipped blocks and slightly fewer successful block proposals. This group has a lower average active time than tiers 1 and 2. It is in this tier that we observe the first set of inactive validators.

Tier 4 (Ranks 38397 – 56534): This is the tier where the prevalence of validators with more serious performance issues begins to rise. The majority of actors are active and have not been slashed, though some have. This tier is unique because it also houses many of the newer validation nodes that are trying to move up the ranks, many of which have not even had their first assignment.

## Visualizing the Slashings

As previously mentioned, the nature of the slashing data allows us to treat the various slashes as edges in a graph. A graph consists of a set of nodes and edges, where the edges represent some relationship between the nodes. The nodes in this instance are the individual validators, and an edge exists between two nodes if one node has slashed the other. Since the entire network comprises many vertices, we decomposed it into its connected subgraphs to gain a better understanding.

The following graph is a simple 3-node network that shows a validator in the center who is responsible for slashing the two nodes on its sides. It is intuitive that, as a particular validator performs more slashes, the network around that validator node grows into a star pattern, with many edges running from the central validator node performing the slashes to the slashed validators.
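The decomposition into connected subgraphs and the idea of a star centre can be sketched in base R on a toy edge list. This is an illustrative label-propagation routine under assumed names (`weak_components`, `from`, `to`), not the procedure used to produce the figures in this article.

```r
# Toy edge list: slasher -> slashed (hypothetical validator indices)
from <- c(1, 1, 1, 4, 6)
to   <- c(2, 3, 7, 5, 3)

# Give every node the smallest label reachable through its edges, ignoring
# edge direction; nodes that end up sharing a label form one connected subgraph
weak_components <- function(from, to) {
  nodes <- as.character(unique(c(from, to)))
  label <- setNames(seq_along(nodes), nodes)
  repeat {
    changed <- FALSE
    for (i in seq_along(from)) {
      ends <- as.character(c(from[i], to[i]))
      lowest <- min(label[ends])
      if (any(label[ends] != lowest)) {
        label[ends] <- lowest
        changed <- TRUE
      }
    }
    if (!changed) break
  }
  label
}

membership <- weak_components(from, to)
split(names(membership), membership)   # one group of validators per subgraph

# The centre of a star is a slasher with many outgoing edges
out_degree <- table(from)
hub <- names(out_degree)[which.max(out_degree)]
hub
```

Here validators 1, 2, 3, 6, and 7 form one subgraph (validator 6 joins the star around validator 1 because both slashed validator 3), while 4 and 5 form a second one. With the igraph package already loaded, `components()` or `decompose()` on the graph object should give the same grouping with far less code.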
There were two interesting observations about the slashing behavior that were particularly important to understanding the nature of the network visualizations. The first was that no validator had been slashed by the same validator twice. The second was that there were no instances of "revenge slashing", in which a validator slashes a second validator and the second validator eventually slashes the first in return. Together, these two facts explain why all of the networks we produced were simple graphs (i.e., they have no loops or multiple edges). Again, note that these are directed graphs, where the slasher is the vertex at the beginning of the arrow (edge) and the slashed validator is the vertex at the end of the arrow.

To further illustrate the "star" pattern forming, consider the animation below. This animation follows a particular validator as its slashings are performed over time. Initially just a few other validators are slashed, but as the slashings become more prolific, we see dozens of edges form.

On the other hand, this gif shows different validators that are being slashed by others. As a node accumulates more slashes, it also develops a star structure. However, note that all the arrows point towards the center. The most slashed validator was slashed a total of 7 times, which is extremely surprising, as many validators exited after 1 or 2 slashes.

This gif shows a possible progression of one of our most connected subgraphs: two validators become connected when they slash the same validator, then proceed to slash more, forming their own star structures, and eventually become connected with other validators who have slashed many others.

One other interesting visualization of the graph of slashings is to observe pairs of slashers that happen to only be slashing the same validators.
In the animation below, the slashed nodes were slashed by the same two validators each time, which resembles coordination between the two.

With our understanding of the data structure, how the slashing patterns are represented in graphs, and our insights into the frequency of their occurrence, we can now comfortably present the graph network of slashings in its entirety.

See Code
networkData <- data.frame(df_slashed$slashedBy,df_slashed$validatorSlashed)
simpleNetwork(networkData)
network <- graph_from_data_frame(d=networkData)
plot(network,layout=layout_on_sphere(network),vertex.size=2, edge.arrow.size=0.01, vertex.label=NA, main="Whole network")
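The two observations made earlier (no repeated slasher-slashed pair and no revenge slashing) can be checked directly on an edge list. The sketch below uses a made-up `edges` data frame and a hypothetical helper, `check_slashing_graph()`; the column names follow the data, but the values and the function are assumptions for illustration.

```r
# Toy edge list: slasher -> slashed (hypothetical validator indices)
edges <- data.frame(
  slashedBy        = c(1, 1, 2, 3),
  validatorSlashed = c(2, 3, 4, 4)
)

check_slashing_graph <- function(edges) {
  forward <- paste(edges$slashedBy, edges$validatorSlashed)
  reverse <- paste(edges$validatorSlashed, edges$slashedBy)
  self    <- edges$slashedBy == edges$validatorSlashed
  list(
    # Multiple edges: the same slasher-slashed pair appearing more than once
    repeat_pair = any(duplicated(forward)),
    # Revenge slashing: an edge a -> b whose reverse b -> a is also present
    revenge     = any(forward %in% reverse & !self),
    # Loops: a validator slashing itself
    self_loop   = any(self)
  )
}

result <- check_slashing_graph(edges)
result

# Adding the reverse of an existing edge (2 -> 1) is flagged as revenge slashing
with_revenge <- rbind(edges, data.frame(slashedBy = 2, validatorSlashed = 1))
check_slashing_graph(with_revenge)$revenge
```

When all three flags are `FALSE`, the edge list describes a simple graph in the sense used in this section.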

## Conclusion

Through our analysis of ETH2's security mechanism known as "slashing", we've observed some interesting patterns in how often slashings occur, who performs them, and who receives them. Some key findings include:

• Fewer than 1% of the validators have performed a slashing, and about 2% (1,647 of 80,392) have been slashed.
• The number of attestation offenses vastly outweighed the number of proposer rule violations.
• Outside of the spikes, slashings take place at a rate of only about 6.3 per 100 epochs.
• We identified the presence of "super-slashers" who, despite their propensity for slashing other validators, typically didn't have the best performance themselves.
• There was no evidence of "revenge" slashing, where a validator who was slashed slashes their slasher in return.
• No slasher-slashed pair appeared more than once in the data.
• Slashing patterns in the network induce a simple star-like structure when graphing the nodes and edges.
• Complexity in the graphs comes in the form of single-link or multi-link connections that expand with the number of slashings.

As the network of interconnected validators continues to grow, we expect the number of interesting subgraphs to grow with it, revealing further dynamics in how validators interact through slashing.