Scalability is one of the major issues facing blockchain technology today. Blockchain systems are designed to provide a decentralized, immutable ledger that can be used for secure and transparent transactions. However, as more users adopt blockchain technology, the system can become slower and less efficient, making it difficult to scale to meet the needs of large-scale applications.
One of the key reasons why scalability is a challenge for blockchain systems is that every transaction must be verified by every node in the network. As the number of transactions increases, the total work each node must perform increases with it. When demand exceeds what the nodes can process, confirmation times grow and fees rise.
Another key reason is that most blockchain nodes execute transactions on a single thread, one at a time. This is because the blockchain is essentially a single data structure that is continuously updated with new transactions, blocks, and other data, and updates must be applied in a specific order so that every node arrives at the same state.
2. Parallelization overview
There are two main parallelization methods for improving blockchain scalability: inter-chain parallelization and intra-chain parallelization. Both approaches aim to improve the scalability of blockchain networks, but they differ in their focus and implementation.
Inter-chain parallelization: This refers to dividing a large blockchain network into smaller sub-blockchains that can process transactions in parallel. Each sub-blockchain operates independently and processes a subset of all transactions in the network. Distributing transaction processing across multiple blockchains in this way can improve scalability, reduce congestion, and potentially increase transaction speeds.
Intra-chain parallelization: This, on the other hand, involves optimizing how transactions are processed within a single blockchain node, for example by parallelizing parts of the consensus algorithm or by using multi-threading so that a node can process multiple transactions simultaneously. The focus is on the efficiency of a single node, which can increase the overall transaction throughput and processing speed of the blockchain network.
Inter-chain and intra-chain parallelization can be used together to improve the scalability and performance of blockchain networks. Inter-chain parallelization distributes the processing of transactions across multiple chains, while intra-chain parallelization optimizes the processing of transactions within each node.
2.1. Inter-chain parallelization
Sharding, rollups, and sidechains are examples of inter-chain parallelization solutions, which spread transaction processing across several chains or layers that work in parallel.
Sharding involves breaking up a blockchain network into smaller sub-chains, called shards, that can process transactions independently. Each shard processes a subset of all transactions in the network, which can improve scalability and reduce congestion.
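The idea of assigning each transaction to a shard can be sketched as follows. This is a minimal illustration that assumes a hypothetical rule (hash the sender address and take it modulo the shard count); the names `NUM_SHARDS`, `shard_for`, and `partition` are invented for this example, and real sharded networks define their own assignment rules.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen for illustration


def shard_for(address: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an address to a shard by hashing it (an assumed assignment rule)."""
    digest = hashlib.sha256(address.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(transactions: list[dict], num_shards: int = NUM_SHARDS) -> dict[int, list[dict]]:
    """Group transactions by the shard that owns the sender's address."""
    shards: dict[int, list[dict]] = defaultdict(list)
    for tx in transactions:
        shards[shard_for(tx["sender"], num_shards)].append(tx)
    return dict(shards)
```

Each resulting group could then be processed by a different shard. The sketch ignores transactions that touch accounts on more than one shard, which in practice need some form of cross-shard coordination.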
Rollups are a technique that enables multiple transactions to be processed off-chain and then aggregated into a single transaction that is recorded on the blockchain. This can improve scalability and reduce transaction costs.
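The aggregation step can be sketched as below. This is a simplified illustration, not the design of any particular rollup: balances are kept in a plain dictionary, the "state root" is just a hash of the sorted balances rather than a Merkle root, and the function names are hypothetical.

```python
import hashlib
import json


def state_root(balances: dict[str, int]) -> str:
    """Hash of the sorted balances; a stand-in for a real Merkle state root."""
    encoded = json.dumps(sorted(balances.items())).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def apply_batch(balances: dict[str, int], batch: list[tuple[str, str, int]]) -> dict[str, int]:
    """Apply (sender, receiver, amount) transfers off-chain, skipping invalid ones."""
    new_balances = dict(balances)
    for sender, receiver, amount in batch:
        if new_balances.get(sender, 0) >= amount:
            new_balances[sender] = new_balances.get(sender, 0) - amount
            new_balances[receiver] = new_balances.get(receiver, 0) + amount
    return new_balances


def make_rollup_record(balances: dict[str, int], batch: list[tuple[str, str, int]]):
    """Return the new state and the single record that would be posted on-chain."""
    new_balances = apply_batch(balances, batch)
    record = {
        "prev_root": state_root(balances),
        "new_root": state_root(new_balances),
        "tx_count": len(batch),
    }
    return new_balances, record
```

However many transfers the batch contains, only one small record reaches the main chain. Real rollups additionally publish transaction data or proofs so that the new root can be checked; that part is omitted here.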
Sidechains are separate blockchains that are connected to a main blockchain network. Transactions can be processed on the sidechain and then settled on the main blockchain network, which can improve scalability and enable new functionality.
All of these solutions involve parallelizing the processing of transactions across multiple blockchains, which can help to improve scalability and reduce congestion in the main blockchain network.
2.2. Intra-chain parallelization
Intra-chain parallelization focuses on processing multiple transactions in parallel within the same blockchain, increasing the efficiency of transaction processing on a single chain. This can involve optimizing the consensus mechanism or introducing parallel processing techniques so that multiple transactions within a single block are executed simultaneously.
3. Concurrency control strategies
When multiple processes or threads are executing in parallel and accessing shared resources, some form of concurrency control is necessary to ensure that the system operates correctly and consistently. Generally speaking, there are two major concurrency control strategies: pessimistic and optimistic.
Pessimistic: This approach assumes that conflicts between transactions are likely to occur, and takes a conservative approach to prevent conflicts. With pessimistic concurrency control, locks are placed on resources that transactions need to access. This prevents other transactions from accessing those resources until the lock is released, ensuring that only one transaction can access a resource at a time. While this approach can prevent conflicts, it can also lead to decreased performance due to the need for locking and waiting.
Optimistic: This approach assumes that conflicts between transactions are rare, and takes an optimistic approach to allow transactions to proceed without locking resources. With optimistic concurrency control, transactions are allowed to proceed without any locking, and are only checked for conflicts at the end of the transaction. If a conflict is detected, the transaction is rolled back and must be retried. While this approach can improve performance, it can also lead to increased rollbacks and retries due to conflicts.
In summary, pessimistic concurrency control assumes conflicts are likely and uses locking to prevent them, while optimistic concurrency control assumes conflicts are rare and allows transactions to proceed without locking, only checking for conflicts at the end of the transaction.
3.1. Pessimistic concurrency control (PCC)
Pessimistic concurrency control involves locking resources to prevent conflicts between multiple concurrent transactions. This approach assumes that conflicts are likely to occur and tries to prevent them from happening.
Pros:
- Provides strong consistency guarantees
- Prevents conflicts before they occur, reducing the likelihood of rollbacks
- Suitable for systems with high contention, where optimistic retries would be frequent
Cons:
- Can lead to blocking and reduced concurrency if there are many conflicts
- Can lead to deadlock if transactions acquire locks in inconsistent orders or fail to release them
- Can reduce system performance if locks are held for a long time
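A minimal sketch of pessimistic concurrency control is shown below, assuming a toy in-memory ledger with one lock per account. The class `LockingLedger` and its methods are hypothetical names for this illustration. Locks are always acquired in sorted account order, which is one common way to rule out deadlock.

```python
import threading


class LockingLedger:
    """Pessimistic concurrency control: lock every account before touching it."""

    def __init__(self, balances: dict[str, int]):
        self.balances = dict(balances)
        self._locks = {account: threading.Lock() for account in balances}

    def transfer(self, sender: str, receiver: str, amount: int) -> bool:
        # Acquire locks in sorted order so two transfers can never wait on
        # each other in a cycle (a common way to avoid deadlock).
        ordered = sorted({sender, receiver})
        for account in ordered:
            self._locks[account].acquire()
        try:
            if self.balances[sender] < amount:
                return False
            self.balances[sender] -= amount
            self.balances[receiver] += amount
            return True
        finally:
            # Always release, even if the transfer was rejected.
            for account in reversed(ordered):
                self._locks[account].release()
```

Transfers that touch different accounts proceed concurrently, while transfers that share an account wait for each other. That waiting is the performance cost listed above.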
3.2. Optimistic concurrency control (OCC)
Optimistic concurrency control (OCC) is a technique that can be used to manage concurrent access to shared resources in a database or software system.
The basic idea behind OCC is to allow concurrent access to shared resources without locking and to check for conflicts only at commit time. In other words, if multiple threads are accessing the same data, each thread makes its changes without coordinating with the others, on the assumption that conflicts will not occur. When a thread tries to commit its changes, the system checks whether they conflict with changes committed by other threads in the meantime. If there are no conflicts, the changes are committed; otherwise, the thread must roll back its changes and retry. When conflicts are frequent, these rollbacks and retries waste work and reduce performance.
OCC has the following pros and cons:
Pros:
- OCC can provide better concurrency and throughput than traditional locking mechanisms.
- It can help reduce the amount of contention for shared resources.
- Developers do not need to manage locks or reason about lock ordering, because transactions do not hold locks while executing.
Cons:
- OCC is most effective only when contention is low and conflicts are relatively rare.
- OCC can result in higher rollback rates if conflicts occur frequently.
- OCC requires careful tuning to ensure that it provides the desired performance benefits.
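The commit-time check can be sketched with per-key version numbers. This is a minimal, single-threaded illustration under stated assumptions: `VersionedStore`, `try_commit`, and `increment` are hypothetical names, and in a real multi-threaded system the validation and write steps inside `try_commit` would have to run atomically.

```python
class VersionedStore:
    """Optimistic concurrency control with per-key version numbers."""

    def __init__(self):
        self._data: dict[str, tuple[int, int]] = {}  # key -> (value, version)

    def read(self, key: str) -> tuple[int, int]:
        """Return (value, version); unknown keys read as (0, 0)."""
        return self._data.get(key, (0, 0))

    def try_commit(self, read_versions: dict[str, int], writes: dict[str, int]) -> bool:
        """Commit only if every key read is still at the version that was seen."""
        for key, seen_version in read_versions.items():
            if self.read(key)[1] != seen_version:
                return False  # conflict: another commit changed this key
        for key, value in writes.items():
            self._data[key] = (value, self.read(key)[1] + 1)
        return True


def increment(store: VersionedStore, key: str, max_retries: int = 10) -> int:
    """Read-modify-write with retry; returns the number of attempts used."""
    for attempt in range(1, max_retries + 1):
        value, version = store.read(key)
        if store.try_commit({key: version}, {key: value + 1}):
            return attempt
    raise RuntimeError("too many conflicts")
```

A transaction that read a key at version 0 and tries to commit after another transaction has already bumped that key to version 1 is rejected and must retry, which is the rollback-and-retry behavior described above.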
3.3. PCC blockchains
EIP-648 proposed a new transaction type that allows transactions in the EVM to be processed in parallel. Each such transaction contains a list of address prefixes, where a prefix stands for the range of all addresses beginning with it, and any access to an address outside the declared ranges fails. Transactions whose ranges do not overlap can therefore be executed in parallel. The proposal aims to make parallelization easy to exploit while remaining maximally backwards compatible with old-style transactions.
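The range rule can be sketched as follows. This is an illustration only: addresses and prefixes are represented as hex strings rather than the byte encoding used by the proposal, addresses are assumed to have a fixed length, and the function names are hypothetical.

```python
def in_range(address: str, prefixes: list[str]) -> bool:
    """True if the address starts with one of the declared prefixes."""
    return any(address.startswith(p) for p in prefixes)


def ranges_overlap(prefixes_a: list[str], prefixes_b: list[str]) -> bool:
    """Two prefix ranges share an address iff one prefix extends the other."""
    return any(
        a.startswith(b) or b.startswith(a)
        for a in prefixes_a
        for b in prefixes_b
    )


def access(address: str, prefixes: list[str]) -> str:
    """Reject any access outside the declared range, as the proposal describes."""
    if not in_range(address, prefixes):
        raise PermissionError(f"{address} is outside the declared range")
    return address
```

Two transactions for which `ranges_overlap` returns `False` cannot touch a common address, so a client could execute them on different threads.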
EIP-2930 adds a transaction type that carries an access list: a list of addresses and storage keys that the transaction plans to access. Accesses outside the list remain possible but cost more gas.
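The shape of an access list and its upfront cost can be sketched as below. The two gas constants are the per-address and per-storage-key charges given in EIP-2930; the Python representation of the list and the helper names are assumptions made for this illustration and should be checked against the specification before being relied on.

```python
ACCESS_LIST_ADDRESS_COST = 2400      # per listed address (EIP-2930)
ACCESS_LIST_STORAGE_KEY_COST = 1900  # per listed storage key (EIP-2930)

# Illustrative representation: a list of (address, [storage keys]) pairs.
AccessList = list[tuple[str, list[str]]]


def access_list_gas(access_list: AccessList) -> int:
    """Upfront gas charged for declaring the access list."""
    addresses = len(access_list)
    storage_keys = sum(len(keys) for _, keys in access_list)
    return (addresses * ACCESS_LIST_ADDRESS_COST
            + storage_keys * ACCESS_LIST_STORAGE_KEY_COST)


def declared(access_list: AccessList, address: str, key: str) -> bool:
    """True if (address, key) appears in the access list."""
    return any(addr == address and key in keys for addr, keys in access_list)
```

Because the list is declared before execution, a client can read it without running the transaction, which is what makes it usable as a hint for scheduling.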
Solana requires each transaction to list the accounts it will access. Solana uses an account-based model, where accounts hold the state of the network and transactions modify this state. Accounts in Solana can be thought of as similar to smart contract accounts in other blockchain platforms.
In the Fuel blockchain, each transaction likewise specifies the accounts it intends to access so that transactions can be processed in parallel. Only transactions whose account sets do not overlap can be processed in parallel.
The concurrency control strategies used in Solana and Fuel are pessimistic: a transaction must acquire locks on the resources it needs before it can execute, and to make that possible it must declare in advance the accounts it intends to access.
In both Solana and Fuel, the account lists provide coarse-grained concurrency control at the account level, meaning that only transactions accessing different accounts can be processed in parallel without conflicts. Transactions that access the same account, even if they touch different parts of the state within that account, are executed serially to prevent conflicts and ensure state consistency.
As a consequence, neither Solana nor Fuel supports intra-contract parallelization: transactions calling the same contract must be processed sequentially to ensure data consistency.
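Account-level scheduling from declared account lists can be sketched as follows. This is a simplified illustration, not the scheduler of Solana or Fuel: every declared account is treated as written (Solana, for instance, also distinguishes read-only accounts), and `schedule` is a hypothetical name.

```python
def schedule(transactions: list[tuple[str, set[str]]]) -> list[list[str]]:
    """Group transactions into batches whose declared account sets are disjoint.

    Batches run one after another; transactions inside a batch may run in
    parallel. A transaction is placed right after the last batch that touches
    any of its accounts, so conflicting transactions keep their original order.
    """
    batches: list[list[str]] = []
    batch_accounts: list[set[str]] = []
    for tx_id, accounts in transactions:
        # Index of the first batch this transaction is allowed to join.
        target = 0
        for i, used in enumerate(batch_accounts):
            if used & accounts:
                target = i + 1
        if target == len(batches):
            batches.append([])
            batch_accounts.append(set())
        batches[target].append(tx_id)
        batch_accounts[target] |= accounts
    return batches
```

If every transaction declares the same contract account, each one lands in its own batch and the schedule degenerates to fully sequential execution, which is the limitation described above.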
It is a common pattern that when a popular decentralized application (dApp) is released, it attracts a significant number of users and a large portion of the transactions on the blockchain become calls to that application. To scale under this kind of load, a blockchain must therefore be able to parallelize concurrent calls to the same smart contract.
3.4. OCC blockchains
Aptos uses an optimistic concurrency control technology called “Block-STM” to support multi-threaded block execution and to avoid state conflicts. It is based on the concept of “software transactional memory” (STM), which allows multiple threads to execute transactions in parallel on the assumption that the transactions do not conflict. If a conflict is detected, the affected transaction is rolled back and re-executed against the updated state.
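The execute, validate, and re-execute cycle can be illustrated with the heavily simplified sketch below. It is not the Block-STM algorithm itself, which relies on a multi-version data structure and a collaborative scheduler; it only shows how speculative results can be validated in block order so that the outcome matches sequential execution. All names are hypothetical, and the speculative phase is written as a plain loop where a real engine would use threads.

```python
from typing import Callable

State = dict[str, int]
# A transaction is a function that reads via `get` and returns its writes.
Tx = Callable[[Callable[[str], int]], State]


def run_recording(tx: Tx, state: State) -> tuple[State, State]:
    """Execute tx against state, returning (values it read, values it wrote)."""
    reads: State = {}

    def get(key: str) -> int:
        reads[key] = state.get(key, 0)
        return reads[key]

    writes = tx(get)
    return reads, writes


def execute_block(state: State, block: list[Tx]) -> tuple[State, int]:
    """Optimistically execute a block; returns (final state, re-execution count)."""
    snapshot = dict(state)
    # Phase 1: speculative execution against the pre-block snapshot.
    # (A loop here; this is the part that could run on many threads.)
    speculative = [run_recording(tx, snapshot) for tx in block]

    # Phase 2: validate and commit in block order.
    committed = dict(state)
    retries = 0
    for tx, (reads, writes) in zip(block, speculative):
        if any(committed.get(k, 0) != v for k, v in reads.items()):
            # A read is stale: re-execute against the up-to-date state.
            _, writes = run_recording(tx, committed)
            retries += 1
        committed.update(writes)
    return committed, retries
```

Transactions that touch unrelated keys pass validation on the first attempt, so only the conflicting ones pay the cost of a second execution.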
3.5. OCC vs PCC
PCC acquires locks dynamically at runtime, which can lead to nondeterministic behavior because the outcome may depend on the order in which threads obtain the locks. To mitigate this issue, several techniques require transactions to pre-declare the resources they will access. However, these methods often require modifications to the original transaction format, which can impact compatibility.
Additionally, pre-declaring resources can be difficult as it may be challenging to accurately know which states will be accessed during the transaction execution. As a result, resource declarations may be coarse-grained, leading to reduced concurrency in some cases.
In situations where both concurrency and compatibility must be taken into account, an OCC-based concurrency framework is therefore the better choice, since it requires neither pre-declared resources nor a new transaction format.
4. What is EVM?
The Ethereum Virtual Machine (EVM) is the runtime environment for smart contracts on the Ethereum blockchain, responsible for executing smart contracts and processing transactions. It executes bytecode compiled from Solidity or other high-level languages and enables the deployment of decentralized applications (dApps) on the Ethereum network.
4.1. EVM is single-threaded
Traditionally, the EVM has been single-threaded, meaning it can only process one task at a time. This design choice was made to simplify the implementation of the EVM and to avoid race conditions that might arise when multiple threads try to access the same resources simultaneously. However, as the demand for faster and more efficient processing has grown, there has been a need for a more powerful and scalable EVM.
4.2. Why is EVM compatibility important?
EVM compatibility refers to the ability of a blockchain platform to execute smart contracts that were originally written for the Ethereum Virtual Machine (EVM). Because Ethereum is the most widely used smart contract platform, many developers write their smart contracts in Solidity (the most widely used language targeting the EVM) and expect them to run unchanged on any EVM-compatible platform.
Blockchain platforms that support EVM compatibility can also execute these same smart contracts written for the EVM, allowing developers to potentially use the same code across multiple platforms. This makes it easier for developers to migrate their dApps from one blockchain to another or to offer their dApps on multiple blockchains simultaneously.
The EVM has established itself as the de facto standard for smart contract development. EVM compatibility lets developers deploy existing Ethereum dApps on a new blockchain with minimal modifications and reuse the existing Ethereum ecosystem, including its tools, documentation, and community support.
This lowers the barriers to entry for developers, lets them focus on creating new and innovative applications rather than reinventing the wheel, and can accelerate the growth and adoption of the new blockchain.
4.3. What is concurrent EVM?
Multithreading is the ability of a program to perform multiple tasks simultaneously, using multiple threads of execution. In the context of the EVM, a Multithreaded EVM allows for multiple transactions to be processed concurrently, significantly improving the overall processing speed and performance of the network.
By implementing a multithreaded EVM, a blockchain can handle a larger volume of transactions without sacrificing decentralization, which is a key feature of the blockchain. This is because a multithreaded EVM spreads transaction execution across multiple threads and CPU cores within each node, so throughput increases without handing processing over to a single central authority.
Multithreaded EVM is a more powerful and scalable version of the Ethereum Virtual Machine, which allows for faster and more efficient processing of transactions while maintaining the decentralized nature of the blockchain.
4.4. Designing Concurrent EVM
Since the EVM was originally designed as a single-threaded system, adding concurrency support would require significant changes to the original design. It could involve creating a new transaction format, changing the way transactions are validated, and modifying the way gas limits are enforced. It may also require changes to the underlying architecture of the EVM to enable parallel processing of transactions. Several key considerations should be taken into account:
Determinism: This is the property of a system where the same input always produces the same output. In the context of blockchain, determinism is critical because every node must reach exactly the same state after executing the same transactions, regardless of how execution is scheduled across threads.
State consistency: The concurrency protocol should be designed in such a way that it ensures the correctness of the transactions and the final states at all times.
Compatibility: The concurrent EVM should be compatible with existing smart contracts and applications on the Ethereum network. This will make it easier for developers to integrate it into their applications and ensure a smooth transition from the existing EVM to the concurrent EVM.
Performance: The concurrency protocol should be designed to minimize the latency and overhead associated with transaction processing. This will ensure that the concurrent EVM can handle a high volume of transactions with low latency and high throughput.
Gas pricing: The gas pricing mechanism would need to be adjusted to account for the increased complexity of executing multiple transactions in parallel. The current gas pricing model is based on the assumption that transactions are executed sequentially.
Solidity APIs: The Solidity programming language used to write smart contracts would need to be updated to support concurrent programming. The API should also help developers write smart contracts that reduce the likelihood of transaction contention.
Ease of use: The concurrent EVM should be easy to use and developer friendly. It should be easy for developers to adopt concurrency control in their applications and to write smart contracts that work seamlessly with the concurrent EVM.
Backward compatibility: It should support code written in previous versions of Solidity without requiring any modifications or changes to the code.
Forward compatibility: The Ethereum team is currently working on upgrading the original EVM implementation, so making extensive changes to the original implementation could result in increased complexity, maintenance costs, and potential compatibility issues in the long term.
Maintenance costs: Being EVM-compatible requires ongoing maintenance and upgrades to keep up with changes to the Ethereum protocol. This can be time-consuming and costly.
Minimal changes: Introducing concurrency to the EVM should be done with the goal of minimizing changes to the original design and implementation. This will ensure that the concurrent EVM is compatible with existing smart contracts and applications and can be maintained effectively in the long term.
Limited control: By being EVM-compatible, the design will have to adhere to the rules and constraints of the EVM. This can limit the blockchain’s ability to implement new features that the EVM does not support.
4.5. Parallelism types
When it comes to designing a concurrent EVM, there are various concurrency strategies that can be used, each with its own benefits and drawbacks. The choice of concurrency strategy depends on several factors, including the workload characteristics, the available hardware, and the design goals of the EVM.
Thread-level parallelism: Multiple threads are used to execute transactions concurrently. This approach is relatively simple to implement and can provide good performance gains, especially on multi-core systems. However, it may require significant changes to the existing EVM design to enable thread-safe execution of transactions.
Transaction-level parallelism: Transactions are partitioned and executed independently on different processors. This approach can provide even greater scalability than thread-level parallelism, but it may require additional overhead to manage transaction dependencies and ensure consistency across the distributed system.
Task-level parallelism: The execution of individual EVM instructions is split into smaller tasks that can be executed concurrently. This approach can provide good scalability and fine-grained parallelism, but it may require significant changes to the EVM instruction set and runtime environment.
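The partitioning step of transaction-level parallelism can be sketched as follows, assuming simple (sender, receiver, amount) transfers. Transfers that share an account are merged into one group with a union-find structure, and the independent groups are then handed to a thread pool. All names are hypothetical, and in standard CPython the threads illustrate the structure rather than a real speedup because of the global interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor

Transfer = tuple[str, str, int]


def independent_groups(txs: list[Transfer]) -> list[list[Transfer]]:
    """Partition transfers so that no account appears in two groups."""
    parent: dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for sender, receiver, _ in txs:
        parent[find(sender)] = find(receiver)  # union the two accounts

    groups: dict[str, list[Transfer]] = {}
    for tx in txs:  # iteration order keeps each group's original ordering
        groups.setdefault(find(tx[0]), []).append(tx)
    return list(groups.values())


def run_group(balances: dict[str, int], group: list[Transfer]) -> dict[str, int]:
    """Execute one group sequentially and return only the accounts it touched."""
    local: dict[str, int] = {}
    for sender, receiver, amount in group:
        local.setdefault(sender, balances.get(sender, 0))
        local.setdefault(receiver, balances.get(receiver, 0))
        if local[sender] >= amount:
            local[sender] -= amount
            local[receiver] += amount
    return local


def execute_parallel(balances: dict[str, int], txs: list[Transfer]) -> dict[str, int]:
    """Run independent groups on a thread pool, then merge their disjoint results."""
    result = dict(balances)
    with ThreadPoolExecutor() as pool:
        for local in pool.map(lambda g: run_group(balances, g), independent_groups(txs)):
            result.update(local)
    return result
```

Because the groups touch disjoint accounts, merging their results in any order gives the same final state as executing the transfers one by one, which is the dependency-management overhead the description above refers to.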