This article is the second in our series about blockchain technology. If you haven’t already done so, we recommend you read Introduction to Blockchain first.
The essential ingredients of a blockchain are decentralization and immutability. In this article, we will explain both and why you can’t have a blockchain without them.
Let’s say you did that ten times so that you have ten identical chains of glued-together wooden blocks, each spelling the word “blockchain”. Now you distribute those ten chains to ten different people in ten different locations.
Later, you call each of those ten people and ask them to read back to you the word spelled out by their wooden blocks. Nine of the ten read back the word “blockchain”, thereby forming a consensus on the correct value of the chain. But the tenth person has taken out a saw and hacked into his chain so that it now says “block party” instead of “blockchain”.
It is obvious that this chain has been hacked into and had its values changed because his answer does not match that of the rest of the group. Therefore, you can no longer trust this person and won’t be calling him again to read back the value of the wooden blocks. Instead, you add someone new to your group of ten who will hopefully give you the correct answer each time you call.
Believe it or not, this is the essence of how blockchain technology works: each “block” in a chain contains data that proves its connection to the one that came before it. If any block in the chain is altered, the collective message it comprises will be different from all the others when it should be an identical copy.
Let’s break this down into more technical terms.
Unlike centralized data sources such as SQL or Oracle databases, complete and identical versions of a given blockchain are distributed among nodes in a network. Those nodes are the ten people to whom you gave the wooden blocks. In public blockchains such as Bitcoin and Ethereum, a node can be run by anyone, anywhere. Presumably, people who run nodes have some connected interest in the blockchain, such as using it to transfer money. At the time of this writing, the Bitcoin blockchain has 8,417 nodes and the Ethereum blockchain has 23,669 nodes. Each one of these nodes has an entire, current copy of the blockchain.
So instead of a mere group of ten people verifying the value of your blockchain at any given time, you have 8,417 and 23,669 nodes for the Bitcoin and Ethereum blockchains respectively. That means that if the majority of these nodes report that the present value of the chain is “blockchain”, then a consensus has been reached as to the present true value of the chain. Any nodes reporting other values can be dismissed.
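The majority rule described above can be sketched in a few lines of Python. This is only an illustration of the wooden-block analogy, not how any particular network reaches consensus; the function name `consensus` and the list of reports are assumptions made for this example.

```python
from collections import Counter

def consensus(reports):
    """Return the value reported by a strict majority of nodes, or None."""
    value, count = Counter(reports).most_common(1)[0]
    return value if count > len(reports) / 2 else None

# Nine honest people and one who took a saw to his blocks.
reports = ["blockchain"] * 9 + ["block party"]

agreed = consensus(reports)
# Positions of the nodes whose answer differs from the agreed value.
dissenters = [i for i, r in enumerate(reports) if r != agreed]

print(agreed)      # the value most nodes agree on
print(dissenters)  # the nodes you would stop calling
```

If no value is reported by more than half of the nodes, this sketch returns `None` rather than guessing.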
Readers who have spent a career working with traditional centralized relational databases may be shaking their heads at this point, because keeping so much data in near real-time parity across all nodes sounds impractical. If this is the thought you are having, remember that blockchains are not intended to store sprawling relational data as you would in a centralized data store. They are meant to store small amounts of information about a transaction that has occurred. Think of the Bitcoin blockchain as a single table in a database called “Money Transfers”. There would be but four tiny columns:
- ID of user sending money
- ID of user receiving money
- Date of transaction
- Amount of money transferred
This is, of course, a generalization, but at the time of this writing, the entire Bitcoin blockchain of nearly 243 million transactions constitutes only about 125GB of storage space. This would easily fit on most personal computers.
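To put those figures in perspective, here is a rough back-of-the-envelope calculation. It assumes 1GB means one billion bytes and treats both numbers above as exact, which they are not, so read the result as an order of magnitude only.

```python
# Figures quoted in the article (approximate at the time of writing).
total_bytes = 125 * 10**9      # about 125GB, assuming 1GB = 10**9 bytes
transactions = 243_000_000     # nearly 243 million transactions

avg_bytes = total_bytes / transactions
print(f"Roughly {avg_bytes:.0f} bytes per transaction on average")
```

A few hundred bytes per transaction is tiny compared with a typical row in a sprawling relational schema.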
Even with the understanding that blockchains record a vast amount of information in a relatively small storage space, you may be wondering how those instances of the blockchain could be efficiently compared. The answer to this question is uncovered by another interesting aspect of blockchains called “immutability”.
Things that cannot be changed are immutable. If you take a picture of me wearing a silly hat and share that all over the Internet where it becomes a sensationally successful meme, it is now immutable. I can change my hat, but I can’t change the fact that hundreds of thousands of other people now have a copy of the picture in which I was wearing the silly hat.
So how do we make a blockchain immutable and also make it fast and easy to verify its immutable value across thousands of nodes? We use a very cool cryptography trick called a Merkle tree.
While it has some complicated parts, the general concept behind Merkle trees is simple.
You take a series of values, break them into pairs, and sum the pairs. Now you repeat the process until you arrive at a single value.
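The pair-and-sum idea can be sketched as follows. The four starting values (1, 2, 3 and 4) are assumptions chosen for illustration so that the total comes to 10, and the function name `pairwise_reduce` is made up for this example. Keep in mind that a real Merkle tree combines hashes rather than sums.

```python
def pairwise_reduce(values):
    """Sum adjacent pairs repeatedly until a single value remains."""
    level = list(values)
    while len(level) > 1:
        # Each slice holds one pair; an odd leftover value is carried up
        # unchanged (a simplification made for this sketch).
        level = [sum(level[i:i + 2]) for i in range(0, len(level), 2)]
    return level[0]

# (1 + 2) = 3 and (3 + 4) = 7, then 3 + 7 gives the single top value.
root = pairwise_reduce([1, 2, 3, 4])
print(root)  # 10
```

Changing any one of the starting values changes the value at the top, which is the property the nodes rely on.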
For the sake of simplicity, assume that in the example above there are no other mathematical arrangements for coming up with a total of “10”. It is now very easy for all nodes participating in a blockchain to instantly report and reach a consensus that the present value of the chain is “10”. For nodes reporting different values, it means that someone tampered with a block thereby changing the value of that copy of the blockchain.
In the real world, cryptographic hashes are used to “glue” a block in the chain to the block that came before it. A “hashing algorithm” is a mathematical algorithm that can take values of any length and reduce them to a value of fixed length. Using the SHA-1 algorithm produces an output of exactly 40 hexadecimal characters for each of the following:
The Gettysburg Address: a663989e9e45ed022bea82b5a6e8a279dd961370
As you can see, the hashed value of the 9-character word “Polyrific” becomes a 40-character value, as does the hashed value of the 1,269-character Gettysburg Address. Interested in learning more about how this works? Try it yourself: our interactive article on public key encryption provides a hands-on experience for understanding basic encryption principles.
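You can check the fixed-length property yourself with Python’s standard `hashlib` module. In this sketch the helper name `sha1_hex` is our own, and the long text is a repeated stand-in phrase rather than the actual Gettysburg Address, so its hash will not match the one listed above.

```python
import hashlib

def sha1_hex(text):
    """Return the SHA-1 digest of text as a hexadecimal string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

short_text = "Polyrific"
# Stand-in for a long document (not the real Gettysburg Address).
long_text = "Four score and seven years ago " * 40

for sample in (short_text, long_text):
    digest = sha1_hex(sample)
    print(f"{len(sample)} characters in, {len(digest)} characters out")
```

Whatever the input length, the digest printed is always 40 hexadecimal characters long.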
So now substitute the single letter on those wooden blocks with a hash value like the ones above. Here comes an important concept:
The hashed value on the face of each block in your chain is derived from the hashed output of all data in the block plus the hashed value of the block that came before it.
Remember that no matter the length of text (or size of data) stored in a block, the hashing algorithm will take it down to a 40-character representation. You now take that hash and hash it again with the value of the hash representing the block that came before it, and you arrive at a new unique hash value.
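Here is a minimal sketch of that chaining step, using SHA-1 as in the examples above. The names `block_hash`, `chain_hashes` and `GENESIS`, the all-zero starting value for the first block, and the sample transactions are assumptions made for illustration; real blockchains such as Bitcoin use different hash functions and block formats.

```python
import hashlib

def sha1_hex(text):
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def block_hash(data, previous_hash):
    # Hash the block's data, then hash that together with the
    # hash of the block that came before it.
    return sha1_hex(sha1_hex(data) + previous_hash)

# Placeholder "previous hash" for the very first block (assumed).
GENESIS = "0" * 40

def chain_hashes(blocks):
    """Return the hash written on the face of each block, in order."""
    hashes, previous = [], GENESIS
    for data in blocks:
        previous = block_hash(data, previous)
        hashes.append(previous)
    return hashes

hashes = chain_hashes(["Alice pays Bob", "Bob pays Carol", "Carol pays Dave"])
for h in hashes:
    print(h)
```

Each printed hash depends on every block before it, so the last one acts as a fingerprint of the whole chain.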
Back to immutability: the chain has become immutable because if you change any value in the chain, it will change the final hashed value of the most recent block, which is what all nodes report in order to reach a consensus. This means that the majority of participating nodes would have to conspire to change a specific value at a specific place in the chain, and then re-hash the entire chain in order to win a consensus vote that is not a true picture of the historical record.
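To see why a single change is so easy to spot, the sketch below rebuilds the wooden-block chain with one letter per block and compares the final hash of an honest copy with that of a tampered copy. The function name `tip_hash` and the all-zero starting value are assumptions for this example.

```python
import hashlib

def tip_hash(blocks):
    """Return the hash of the most recent block in a simple hash chain."""
    previous = "0" * 40  # assumed starting value for the first block
    for data in blocks:
        data_hash = hashlib.sha1(data.encode("utf-8")).hexdigest()
        previous = hashlib.sha1((data_hash + previous).encode("utf-8")).hexdigest()
    return previous

honest = list("blockchain")   # one letter per wooden block
tampered = list(honest)
tampered[5] = "p"             # someone alters a block in the middle

print(tip_hash(honest))
print(tip_hash(tampered))
print("Copies match:", tip_hash(honest) == tip_hash(tampered))
```

Even though only one block in the middle was altered, the final hash differs, so a node holding the tampered copy would be outvoted.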
Aside from it being impractical to coordinate a large number of nodes in real time to change a specific value, it is far too computationally expensive to recalculate (re-hash) the entire blockchain. In fact, blockchains like Bitcoin and Ethereum require that the hash value of a block be even harder to calculate than simply hashing the block’s value together with the value of the previous block. That’s where the miners come in. We will talk about them more in our article entitled “Miners & Cryptocurrency”.
As you can see, without the immutability assured by the hashing algorithms and the decentralized consensus as to what that immutable value is, you can’t have a blockchain as we presently know it. Of course, you can have private blockchains that generally forgo the decentralized aspect and focus on immutability instead. These are often called distributed ledgers and will be discussed in our article entitled “Private Blockchains and Distributed Ledgers”.
Blockchain technology is changing the way enterprises, and the customers who support them, operate. If you would like technical guidance or implementation of a public or private blockchain, or simply help participating in an existing blockchain, then please call us at 1-833-POLYRIFIC or send us a message to learn more.