Podcast

You can listen to MyDistributed.Systems podcast on podcast apps.

Spotify

Apple Podcasts

RSS

Comments

The Paxos algorithm, when presented in plain English, is very simple

Consensus is one of the fundamental problems in distributed systems. In this post, we want to see what is consensus and review the most famous consensus algorithm—Paxos. we will see that despite exaggeration about its complexities, Paxos, at least the singe-decree Paxos that aims to achieve consensus on a single value, is actually very intuitive and easy to understand.

P is the name of a domain-specific language for asynchronous event-driven programming. With P, we can specify a system as a set of interacting state machines which talk to each other by sending events. That makes P a suitable language for modeling distributed systems where nodes talk to each other asynchronously via messages. P originally developed by Microsoft and according to its Github repository, it is currently being used extensively inside Amazon AWS for model-checking complex distributed systems. In this post, we want to see how we can use P to catch bugs in our protocols.

Figure 1 shows the main difference between in-memory and on-disk databases: An in-memory database stores the data in memory and uses disk for backup, while an on-disk database stores the data on disk and uses memory for caching. Figure 1. In-memory vs On-disk storage engines

Log-structured storage is a powerful approach to organizing data that prioritizes sequential writes for efficiency and durability. In this post, I delve into the inner workings of log-structured storage, focusing on LSM-trees. We'll explore their write and read paths, the mechanics of compaction, and compare leveled and tiered compaction strategies. Additionally, I'll discuss the differences between LSM-trees and B-trees, and talk about variations like Jungle and Bitcask, highlighting their unique optimizations and trade-offs. Basics Write Path Write are first appended to a Write-Ahead-Log (WAL) for durability, then they are written to a MemTable which is basically an in-memory sorted data structure such as a balanced tree (e.g. red-black or AVL tree) or a skip list (I have a detailed post about skip list on this blog). After writing a write to WAL and the MemTable, we can return success to the client. We can have multiple MemTables, but at any given time, ...

Generative AI is a powerful tool that can significantly enhance developer productivity by generating code quickly and efficiently. However, a major challenge when using generative AI is ensuring the correctness of the generated code, especially in critical systems where errors can have significant consequences. I have heard of companies dealing with critical systems have banned the use of any code generation technologies, including generative AI, to minimize the risk of introducing bugs. In addition, there are reports suggesting using AI coding assistants has led to a 40% increase in the bug rate. A promising way to address the correctness challenges of AI-generated code, particularly in the context of high-level distributed system design, is by combining generative AI with formal verification. This approach could be referred to a...

Search This Blog

MyDistributed.Systems

Podcast

Comments

Post a Comment

Popular posts from this blog

The Paxos algorithm, when presented in plain English, is very simple

Model-based Testing Distributed Systems with P Language

In-memory vs. On-disk Databases

Log Structured Storage

Genefication: Generative AI + Formal Verification