Introduction

For many, cryptography can seem like an intimidating field. While online resources exist, they often lack a beginner-friendly approach. This lack of accessible information can be a barrier to understanding this crucial technology that underpins blockchain.

This book aims to bridge that gap. By focusing on fundamental concepts, it will provide a clear understanding of what cryptography is, how it works, and its role in securing blockchain technology. Blockchain itself is a revolutionary application of cryptography, and this book will delve into its inner workings, including digital signatures, consensus mechanisms, and tokenomics.

More than just understanding blockchain, this book will equip you with the knowledge to engineer your own cryptocurrency. It explores the essential tools and techniques used in cryptocurrency development, giving you a practical foundation to turn your ideas into reality.

We'll explore the revolutionary concept of a peer-to-peer (P2P) digital cash system, as first proposed by Satoshi Nakamoto in the Bitcoin whitepaper. You'll learn how cryptography, digital signatures, and consensus mechanisms work together to make this secure and transparent form of digital money possible.

Money

"The lack of money is the root of all evil."

~ Mark Twain, The Prince and the Pauper

There is something called the coincidence of ones. Suppose that I have a sheep and another person has wheat. I am hungry and I'd like to make bread and the other person would like to make a sweater. My sheep would be useful to the other person and the other person's wheat would be useful to me. We can barter.

What if the other person has vegetables and I don't want the vegetables but the other person still wants the wool from the sheep? How do we execute this trade? We don't have a coincidence of ones. Some theories suggest that money developed as a result of this problem.

The things that represented money had certain characteristics.

They were rare
They were not easily reproducable

Another theory about how many came about is the idea of receipts. I can store something of value somewhere in a depository and get a receipt which entitles me to the good being stored. The idea is money evolved out of trading this receipts. Whoever owns the receipt has access to the good stored in the depository.

Why does money have value?

Money has value because we give it value. It's a collective story that we all tell. If you have a dollar bill and you want to buy something from someone, they are going to take it. And they know they can also make an exchange with it too.

Tokens are essentially digital representations of things that represent money. The reason that they have value is because we give them value.

How Payments Work

In traditional payments, banks keep records on a ledger of who owns what. Assume we have Alice and Bob, and Alice wants to pay Bob. Alice authenticates with a bank to prove she is Alice. The bank updates the ledger, debits the money from Alice's account, and credits Bob's account.

Pros

Digital payments using banks ensure that the two parties don't have to be in the same location.

Cons

The bank must be online.
Privacy concerns.
The bank can censor transactions.

Ecash

Designed to remove the double spending problem. In traditional Ecash, the bank creates a coin, i.e., a digital representation of a coin with a unique serial number on it. It then gives the coin with the serial number to Alice, and Alice can prove the bank produced the coin using something called digital signatures (we will explore them later). Alice gives the coin to Bob and gets the sandwich, and Bob goes back to the bank, and the bank confirms the serial number has not been spent. The problem with this is the bank has to be online.

Chaumian Ecash

In chaumian Ecash, Alice generates the secret number Alice then adds some randomness to the secret number so that the bank cannot see the secret number The bank verifies on that blinded secret number and there is no way it knows that it came from Alice.

Alice requests for a coin b(SN) where b is the blinding factor. The bank gives sig(b(SN)) where sig is the bank's signature. Alice can remove the blinding factor and the coin is sig(SN), SN Alice gives this to bob, Bob checks if this is a valid signature from the bank. The bank also records the secret number and it runs a list of secret numbers to make sure secret numbers aren't used more than once.

What if Alice gave the same to Charlie? This would be a double spend right? Well the way chaumian ecash works is that actually the bank stores more information and incase a double spend is detected, the bank will know that it was Alice who tried to double spend. Initially if no double spend, the bank wouldn't know but if double spend is detected, the bank will know it was alice and this is kind of a punishment and also a motivator for Alice not to do so.

This is a clever method and it has several pros: 1.Peer to peer 2. Offline double spend detection 3. Privacy

It still suffers from a big problem:

The bank can decide that they don't want to play this game with you.

The real question is how can we have a digital payment system that doesn't rely on a central party like a bank

Hash functions and signatures

The idea of a hash function is very simple but powerful.

hash(data)->output

A hash function takes in data of any size and produces an output of a fixed size.

The output of a hash function is ‘random looking’ but it is deterministic because if you provide the same input to the hash, you will get the same output.

Hash functions tend to exhibit particular traits, which you should consider if you are implementing your own hash functions from scratch (although it’s not recommended), one of which is called the “Avalanche effect.”

The Avalanche effect is the idea that changing 1 bit in the input of the hash function should change about half the output bits.

The second trait that a hash function should have is preimage resistance.

I’ll define preimage resistance as below:

Given y, you can’t find any x such that hash(x)==y.

You can find it eventually but that will take 2²⁵⁶ guesses which is not doable with current computers.

Hash functions should also have second preimage resistance. What this means is:

Given x and y such that hash(x)==y, you cannot find x¹ where x¹!=x and hash (x¹)==y.

A hash function should also be collision resistant. What this means is nobody can find any x, z such that x!=z and hash(x)==hash(z).

Again, you can find it theoretically after 2¹²⁸ guesses but practically this would take a huge amount of time with today’s hardware.

Practically speaking, collision resistance is harder to achieve. Some previous hash functions such as sha-1 and md5 have their collision resistance broken.

According to Wikipedia, “In early 2005, Vincent Rijmen and Elisabeth Oswald published an attack on a reduced version of SHA-1–53 out of 80 rounds — which finds collisions with a computational effort of fewer than 280 operations.[39]

In February 2005, an attack by Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu was announced.[5] The attacks can find collisions in the full version of SHA-1, requiring fewer than 269 operations. (A brute-force search would require 280 operations.)”

Hash functions are widely used to create digital signatures in blockchains like bitcoin.

Digital Signatures and Lamport’s signature scheme

Simply put, digital signatures serve as proof that you’re the author of a message. Here’s how it works: You sign a message using a secret key. Then, you share the signature along with a public key. Anyone receiving this can verify that you indeed authored the message and possess the corresponding private key used for signing.

Here is how this is applied in Bitcoin. When you want to spend Bitcoin, you create a transaction. This transaction includes information about the sender, receiver, and the amount of Bitcoin being transferred. To authorize this transaction, you sign it with your private key.

Once you’ve signed the transaction, it’s broadcast to the Bitcoin network. Every node in the network verifies the transaction by checking the digital signature against the public key associated with the sending address. If the signature is valid, it confirms that the sender indeed owns the Bitcoin being spent and has the right to transfer it.

Lamport signatures are a type of cryptographic signature scheme based on a one-time secure hash function. They were proposed by Leslie Lamport in 1979.

Here is how it works:

You have a function that generates a public key and a private key. The private key is generated from random characters, and the public key is the hash of the private key. Below is pseudocode to demonstrate this.

generate_keys():
 private_key = Array[2][n]
 for i from 0 to n-1:
 for j from 0 to n-1:
 private_key[i][j] = randomly_generate_bits()
 public_key = hash_each_element_of_private_key(private_key)
 return private_key, public_key

You then create a function called sign that takes in the message to sign and the private key. You hash the message and represent it in bits. Then you iterate through the message bits, and if the bit in the digest is 0, you select the corresponding bit from the first row of the private key and append it to an array called signature. Otherwise, you select the bit from the second row and append it to the signature array. At the end, you’ll have a signature made up of different parts of your private key.

sign(message, private_key):
 hashed_message = hash(message) // Hash the message
 signature = Array[n] // One-dimensional array to hold the signature
 for i from 0 to n-1:
 if hashed_message[i] == 0:
 signature[i] = private_key[0][i] // Use the first row of the private key if the corresponding bit in the hash is 0
 else:
 signature[i] = private_key[1][i] // Use the second row of the private key if the corresponding bit in the hash is 1
 return signature

Next, you implement a function called verify to check the validity of a signature. This function takes the original message, the signature, and the public key as inputs. Here's how it works:

You first hash the original message using the same hash function used during signing, resulting in a digest.

Then, for each bit in the signature, you reconstruct the original digest by selecting the corresponding bit from the public key. If the bit in the signature is matched with the corresponding bit from the public key’s first row, it’s considered as a 0-bit; otherwise, it’s considered as a 1-bit.

After reconstructing the digest, you compare it with the hashed message. If they match, the signature is deemed valid; otherwise, it’s considered invalid.

verify(message, signature, public_key):
    hashed_message = hash(message)
    reconstructed_hash = Array[n]  // One-dimensional array to hold the reconstructed hash
    for i from 0 to n-1:
        if signature[i] == public_key[0][i]:
            reconstructed_hash[i] = 0  // Reconstruct the hash using the first row of the public key if the signature matches
        else if signature[i] == public_key[1][i]:
            reconstructed_hash[i] = 1  // Reconstruct the hash using the second row of the public key if the signature matches
        else:
            return false  // If neither matches, the signature is invalid
    return reconstructed_hash == hashed_message  // Return true if the reconstructed hash matches the hashed message, false otherwise

I have written an implementation of this in Rust and you can check it out

https://github.com/amschel99/lsig

While the Lamport signature algorithm offers a basic method for generating and verifying digital signatures, its direct application to blockchain systems may be considered somewhat trivial or naive. This approach has some inherent limitations and challenges that need to be addressed for practical implementation in real-world blockchain applications. In a subsequent article, we will delve deeper into these limitations and explore more sophisticated methods for ensuring the security and efficiency of digital signatures on blockchains.

Distributed consensus

We should begin with the Byzantine Generals Problem. It’s a game theory problem that describes the difficulty decentralized parties encounter in arriving at a consensus without relying on a central authority. A blockchain network should be permissionless, meaning anyone can join without needing to provide their identity. If we’re building a cryptocurrency, we need a globally ordered log, or you can think of it as a distributed database. You can compare this to a traditional bank database that records transaction information. However, this database should be replicated across all nodes in the network. This process is known as state machine replication. In such a network, multiple participants are involved, and they all need to agree on the system’s state. This issue is commonly referred to as distributed consensus.

In such a system, one critical thing needs to be addressed. What if some participants in the network decide to subvert the protocol? Since the system is permissionless, an attacker can decides to create multiple nodes, essentially for free and take over the network? This is what’s known as The Sybil Attack.

Proof of work is one of the most well-known solutions to the Sybil attack. In a nutshell, to create a new block on the blockchain requires a significant amount of computing resources such as electricity.

The difficulty of PoW ensures that adding new blocks to the blockchain requires substantial computational effort. This helps protect the network against Sybil attacks by making it difficult for attackers to overwhelm the network with a flood of fake transactions or blocks.

So how does this work technically?

Each block in the blockchain contains several pieces of information:

The hash of the previous block. A set of transactions that represent changes to the Bitcoin ledger. A nonce, which is a random number used to create a unique hash for the block. The target, which is a specific value that the hash of the block must be below to be considered valid. Miners compete to find a nonce that, when combined with the other block data, produces a hash that is below the target value. This process involves repeatedly hashing the block data with different nonce values until a suitable hash is found. Since the hash function used in Bitcoin (SHA-256) is deterministic and unpredictable, finding the correct nonce requires a significant amount of computational work.

The difficulty of finding the nonce is adjusted regularly to ensure that new blocks are added to the blockchain at a relatively constant rate, approximately every 10 minutes. The difficulty is determined by the target value, which is inversely proportional to the difficulty level. A lower target value means a higher difficulty, requiring more computational work to find a valid nonce.

Miners who successfully find a valid nonce and create a new block are rewarded with a predetermined number of bitcoins, along with any transaction fees included in the block. This incentive encourages miners to participate in the network and secure the blockchain.

Basically, miners use a ton of computer power to solve puzzles and create new blocks in the Bitcoin network. It’s like they’re competing to solve a really hard math problem, and whoever solves it first gets to add a block to the blockchain and earn some bitcoins.

The problem is, this process uses up a crazy amount of electricity and resources.

Ofcourse there are other consensus mechanisms such as Proof of stake used by etheureum but we will dive into thos later on.

Signatures indepth

Apart from hash based Signatures(lamport signature scheme) there are other cool signatures. Hash based signatures are not used in bitcoin.

Disadvantages of lamport signatures

One time use
They are kind of huge (8kilobytes for a signature and keys are 16 kilobytes)

Ellptic curve signatures

Curves of the form y^2= x^3+ax+b Bitcoin curve's is y^2= x^3+7

The Nitty Gritties of Blockchain and Cryptocurrency Engineering