STORE is bringing Cloud 2.0 to the world with a zero-fee cryptocurrency and checks and balances governance.
Tendermint is open source software for securely and consistently replicating the application state on many machines. The replication is Byzantine fault-tolerant in that it works even if up to 1/3 of machines fail in arbitrary ways. Tendermint core implements the blockchain consensus engine, which ensures that the same transactions are recorded on every machine in the same order. While highly performant, Tendermint core suffers from the curse of peer-to-peer system, namely, the communication overhead when a large number of machines are involved in the consensus. As the number of peers increases the throughput suffers, which makes Tendermint core unsuitable for large deployments with thousands of peers. In this paper we propose a novel idea to optimize the peer-to-peer communication, which ensures that the throughput is unaffected by the size of the network. Our proposal is an open source idea to potentially improve the core Tendermint consensus engine.
Tendermint consists of two components: a blockchain consensus engine and a generic application interface. The consensus engine, called Tendermint Core, ensures that the same transactions are recorded on every machine in the same order. The application interface, called the Application BlockChain Interface (ABCI), enables the transactions to be processed in any programming language. The Tendermint core is run in a network of peer machines where the application state is replicated. The ABCI app is responsible for implementing any business logic, such as validating and executing transactions. When a transaction is submitted to any of the peers running Tendermint core, the following steps take place.
Separate checks are required because each peer maintains the state machine locally, independent of other peers. However this leads to unnecessary calls to CheckTx() by every peer. It also burdens the ABCI app because all the peers who receive the transactions want to validate them before adding them to their local mempool. This can be optimized.
Secondly, the block proposer is elected on a round-robin basis, who proposes the new block for voting. The transactions included in the proposed block get committed, when the block is committed via a 3-stage process. Since the new blocks are produced one-at-a-time and the peers only focus on the transactions in the proposed block during the consensus rounds, a secure shared mempool can be devised for queuing validated transactions. The elected block producers dequeue the transactions from the shared mempool to create the new blocks.
In this paper, we examine an approach that optimizes the peer-to-peer messages required to validate the transactions and complete the consensus round. Specifically, we propose the following two enhancements to Tendermint core.
This approach is agnostic to the number of peers in the network. Since peer-to-peer communication is eliminated the consensus throughput remains unaffected by the number of peers.
Tendermint core offloads the trust responsibilities to the associated ABCI app. It focuses only on secure and consistent state machine replication among participating peers. For example, it relies on the ABCI app to validate the received transaction via a call to CheckTx() API. It truststhe ABCI interface to do the right things to validate the transaction based on whatever business rules applicable for that app. This reliance on the ABCI app for trust can be extended to avoid relaying transactions among the peers and the resulting unnecessary and repeated transaction validations. In this section, we propose a shared state machine model that the peers can rely on during the consensus rounds. We also demonstrate that the shared state is secure and tamper-proof.
Fig. 1 below shows the traditional interaction between Tendermint core and ABCI app.
When a peer is elected as a proposer, it takes the transactions from its local mempool to build the new block and proposes the new block to its peers for voting. The transactions in the mempool may have been received by the proposing peer as well as from other peers. This means, the proposer somehow needs to grab a list ofvalidatedtransactions to propose the new block for voting. Since transaction validation itself is offloaded to the ABCI app, maintaining the list of validated transactions can also be offloaded to it. So, we need ashared mempool with the following properties.
Tendermint core cannot have blind trust on the ABCI app for total ordering. The peers forming the core consensus engine must do some work to guarantee total ordering without excessive communication overhead. We propose an approach similar to Proof-of-History used by Loom [1] that provides a way to cryptographically verify passage of time between two events. This approach is much simpler than maintaining synchronized clocks using NTP (Network Time Protocol) or vector clocks among the participating peers because these approaches still need peer-to-peer communication to agree on the total order. The ordering protocol uses a cryptographically secure function (CSF) written so that the output cannot be predicted from the input without completely executing the CSF to generate the output. The function is run in a sequence, its previous output as the current input thus forming a series of outputs. Data for an event can be timestamped into this sequence by appending data as part of the input to the function. This guarantees that the data associated with a particular output must have been created prior to the data associated with the next output because the outputs form a sequence. Total order of the associated events is thus ensured.
One example of cryptographic secure function is sha256. Its output cannot be predicted without running the function. We can run the function starting with a random value and feed the output of the function as the input to the same function to form a sequence of outputs. Table 1 shows an output series with an initial random value as a timestamp —"2018-03-30 17:23:19 UTC".
The 4th output “db1893f6caaeddacd29fc008bab127ced6b9ebbf00f4fcc1809ffbdbb9421a40” cannot be predicted without runing the function 4 times starting with the predefined random string. If we have events attached to this sequence as we generate the outputs, we can be assured that the events must have occured in the same sequence.
Table 2 shows a scheme that attaches event data to the input for the function. An event happening at the time of generating the next output can append the event data (such as the hash of the transaction, for example) to the input, in effect timestamping the event. In the following table H(Tx) is any hash function such as sha256 itself that produces the hash of event data.
Notice sequences 2 and 4 appended the hash of the incoming transactions to the outputs of 1 and 3 respectively, so the corresponding outputs not only depend on the previous outputs, but also the transaction data. Given two transactions Tx1 and Tx2, anyone can verify that Tx2 is created after Tx1.
Finally, we need a way to ensure that the peer creating the output sequences can prove that it received the transactions included in the sequence. This can be achieved with cryptographic signature scheme as shown in table 3 below.
Each peer generates the output sequence independent of the other at predetermined intervals. Whenever the peers receive transactions from the clients, the following steps take place.
Notice that there is no relaying of the transactions to the other peers. The unnecessary communication overhead is eliminated altogether. Notice also that only the peers that receive transactions need to call SequenceTx().
Each peer generates the output sequence independent of the other. This leaves one particular attack possible. Some peers may generate the sequence at a higher rate than others, so all the transactions received by them will have higher order (more recent) than other peers. But such malicious behavior is easy to detect because the malicious peer would produce a longer sequence in a given period of time (such as between consensus rounds) compared to the honest peers. Since Tendermint recommends punitive measures (such as slashing, etc.) for misbehaving peers, the same punishment can be extended for this attack also. Since each transaction in the shared mempool is signed by the peer sending it, the malicious peer can be identified by other peers.
We alluded to a new API, SequenceTx()above. This new interface must be added to the ABCI specification to build a shared mempool. The shared mempool stores the validated transactions received by all the peers. This responsibility is handed over to the ABCI app because it can use any suitable technology to implement the shared mempool. But more importantly, Tendermint core leverages the trust it has with the ABCI app, so it doesn’t have to deal with managing the shared mempool. It is assumed that the ABCI app ensures data consistency and availability. The order of transactions in the shared mempool are guaranteed by the approach described above.
In order to fulfill the management of shared mempool, getTxs(), and removeTx() interfaces are also added to the ABCI specification. getTxs()returns the current list of validated transactions in the shared mempool. The elected proposer calls this method to get the transactions before constructing the new block. Similarly, removeTx() is called by the proposer after deliverTx() to remove the finalized transaction from the shared mempool. Fig 2 below shows populating the shared mempool.
Compare fig. 2 with fig. 1. Notice that the peers
Fig. 3 shows the steps that the proposer takes when it is elected to add the new block to the blockchain.
The flow is similar to fig. 1, except for retrieving the proposal transactions. The proposer gets the proposal transactions from the shared mempool, instead of its local mempool. It uses getTxs()to get the proposal transactions. Since these transactions are added by multiple peers, the proposer needs to ensure that the proposal transactions are indeed valid. The proposer takes the following steps before adding them to the new block.
The usual locking mechanisms implemented in the Tendermint core works here also. While getTxs() is being processed, no additional transactions are added to the shared mempool. The peers can continue to accept transactions, sequence them, and store them in their local mempool until the shared mempool is unlocked.
The shared mempool opens the possibility of parallel consensus rounds by multiple block producers. Since the transactions are ordered in the shared mempool, multiple block producers can propose multiple blocks for voting. The prevote and precommit phases of multiple proposed blocks can be pipelined, so the voting happens in parallel. Only the commit phase needs to be serial, so the block proposed by thecurrent block producer is committed first, followed by the block proposed by thenextproducer, and so on. Since the block producers are elected on a round-robin basis, the order of the producers is known to all the peers, so the transactions can be sliced according to this order. This is especially useful when large number of peers are configured in a network, each of which receives large volumes of transactions.
Tendermint core is a high performant consensus engine, but the communication overhead stemming from relaying messages among peers makes it unsuitable for large deployments with thousands of peers. We propose a novel approach to maintain a shared mempool of ordered transactions, which eliminates the need for relaying the incoming transactions to all the peers. The proposal also eliminates the security and trust issues by leveraging the ABCI interface, which Tendermint core trusts inherently. The proposed approach however, adds 3 more APIs to the ABCI interface and offloads the consistency and availability responsibilities to the ABCI app. This should not be perceived as a weakness because the Tendermint core is not a standalone consensus engine and is always used in association with an ABCI app.
References