What is a Merkle tree?

advanced Blockchain

By Anycoin Direct

18-12-2023

4 min read

Today we're going to tell you all about the Merkle Tree, a crucial structure in blockchain and cryptography. Let's get started right away and discover why it's so important.

Brief summary

✔️ A Merkle tree partitions data, creates hash values from these data and merges them into a tree structure, enabling efficient verification and transfer.

✔️ The Merkle tree was invented by Ralph Merkle and is a tree diagram used for data integrity.

✔️ Merkle trees are widely used in blockchain technologies such as Bitcoin and Ethereum, where they help validate transactions and reduce data burden.

What is a Merkle tree?

The concept of the Merkle tree was invented by American mathematician Ralph Merkle and patented in 1979. So the German ex-chancellor has nothing to do with it. It was described in the paper "A certified digital signature."

In the name Merkle tree, the last part stands for tree. This is not without reason, as this is a tree diagram, or branching model, where more and more branches lead to one main branch. It is similar to a family tree branching model.

Another name for the Merkle tree is the hash-tree. A hash is an algorithmic derivative of an input, that is, you can draw a hash (derivative) from, say, a word (input) by means of a certain system of algorithms.

A very well-known system is SHA (Secure Hash Algorithm) 256. This involves drawing a hash of 256 bits from a given input. The outcome (output) then becomes a string of 64 numbers and letters.

A typical SHA 256 hash derivative:

"Anycoindirect" has as output when you run SHA 256 on it:

98948d20dc70322d424ea5bbf37970e20b6aafa90c498116779ea964042c2c64

"anycoindirect" has as output:

a730a577d61512581afdd9c74e68abb2c43b0f7db3c70a92583464277b7abc81

You can see that a single capital letter already changes everything. From these kinds of hashes, a tree is created. Just like Christmas, isn't it?

How does a Merkle tree work?

Suppose you want to send a message or file to someone else, but you don't want others to be able to see it. This can be secret information, such as a password, but also a file share with important information. Then you can use SHA 256 to encrypt the message.

A message is divided into pieces and turned into a hash. Each hash appears in the tree, with the root at the top of the tree. This is the complete message or file. A big advantage here is that the message can be broken up into smaller pieces, so huge files do not have to be delivered entirely by a single computer and it can be done much faster.

So the tree is built from the bottom up, sort of like what you do with Bittorrent. Everyone then has a small piece of the file (peer to peer networks) and at the end user's end all the small pieces are put back together again, after which the tree is complete again. Then the end user can calculate with his Merkle tree whether all pieces are original (the hashes that can be calculated must then match his table).

The root has a specific outcome, so if you know it, you also know if the message or file is original. As soon as it is not, a renewed attempt will be made to receive the original. Once the root tree and the output match with you you have received the correct information.

Above, we explained how to create one hash. In the tree diagram, some of these outputs are combined and put into a new hash using the same method. So, as you get to the top of the tree, more and more information is put into a higher hash. Until you get to the top and merge all the hashes from the tree into 1 hash, then you are ready to send the information.

The receiver of the message is notified of the root tree and begins downloading the branches, the hashes. These branches can come from multiple sources. After this, he starts reconstructing all the hashes in the tree. Each part of the tree must be correct, if it is not, the root tree is not correct, but through reconstruction he can determine which part is not correct. Then he usually tries to download the whole tree again.

Since these Merkle trees work with binary data, they are also called binary trees.

What are Merkle trees used for in crypto?

Now that you know that to check the correctness of a hash you really only need to know the root tree, you might imagine that there could be systems that take advantage of this.

One important blockchain that uses this is Bitcoin. On this blockchain, a lot of information has to be processed in a block. The less information you have to check and put into such a block, the better.

Since a Merkle tree can aggregate quite a bit of data into a single number, it is very useful when compiling a block.

Suppose you have had 200 transactions to process as a miner in the past ten minutes (approximately every ten minutes a new block arrives on the Bitcoin network). The Merkle tree then goes to work reducing all these transactions to a single root tree.

Validators on the Bitcoin network must track the state of the blockchain, validate transactions and add new coins. If they only have to check 1 sequence for correctness, not only are they much faster, but a block is also much smaller. If 51% of these miners agree that the root tree is correct, the block can be added.

This root tree hash is put in the beginning of the next block, so that all miners know the value of the previous block's hash. So when they create a new block they put the entire state of the network with a single cipher string. Then new transactions are added, going through the same process and so on. If the root tree is not correct, for example because there are rogue records among them, then miners can't agree either and they start looking for a Merkle tree with the correct root tree. Ethereum also uses a Merkle tree, but of course they have invented something special: the Merkle Patricia tree.

Were networks not using Merkle trees, all the data would have to be transmitted, and you can imagine what that would mean for speed, let alone security.