Let’s study distributed systems — 2. Clock

From: Flickr

Did you read my previous post — Let’s study distributed systems 1. Introduction? I tried to tell you some introduction to distributed systems.

In this article, I want to talk about clock. In our daily life, many things are based on time. When we meet our friends, when a train starts… but it assumes that we are sharing the same clock.

Clocks are not accurate

However, in our real world, 1 ms late doesn’t matter. Even if a teacher says that tomorrow’s class will be started from 09:00, it might be 09:01 or 09:00:30. Our clock is not accurate, but we have few chances which introduces us problem caused by it.

Clocks in computers

Why does accurate time matter?

If we have accurate clocks in both client and server, client can send timestamp with ticket ordering request. However, because computers cannot have accurate clock, there is no way to know which request is earlier exactly.

Like above example, we sometimes need to know the event’s happened-before relations, especially in distributed systems. If above example happened in just one process, inaccurate clock won’t be a problem.

Heppened-before relations

There are 3 basic theory about happened-before; transitivity, irreflexivity and antisymmetry. (Note: “a ↛ b” means that a is not before b)

  • if ab and b → c then a → c

If a is before b and b is before c then a is before c. (transitivity)

  • a ↛ a

No event happen before itself. (irreflexivity)

  • a → b then b ↛ a

If a is before b, b is not before a (antisymmetry)

Parallel

  • a ↛ b and b ↛ a

a is not before b and b is not before a.

In this case, event a and event b run in parallel. Parallel doesn’t mean that these 2 events are run actually in parallel. It just mentions that nobody can guarantee which event is earlier, and regarding that they are run in parallel doesn’t have any contradictions.

For example, suppose event a happened on process A, and event b on process B. In this case, if process A and B never connected, we cannot assume which event is before the other one.

Logical Clock

In logical clock algorithm, set t(e) as the virtual time of an event e. t represents the time, but it’s not the one in our real world but just a sequential number. It can define order of events. It works like this;

  • Initially, t = 0 in each process
  • When an event e which is not a message receiving event happens, increment t. Then, set t as t(e)
  • When a message sending event e happens, attach t(e) to the message.
  • When a message receiving event e happens, set max(current time, attached time to e) as t(e)

Using these rules, it can guarantee that if e → e’ then t(e) < t(e’).

How logical clock works

Let’s see above image as an example. Because of if a → b and b → c then a → c, we can know each happend-before relation. Red character represents t(e).

A problem of logical clock, and Vector clock

Vector clock can resolve the problem; in other words, it can guarantee both of above.

How it works?

  • Initially, t = 0 in each process. Therefore, each vector clock V = (0, 0, …, 0).
  • When an event e which is not message receiving event happens, process Pi defines V’ from current vector clock V = (v1, v2, …, vn) by changing it as vi ← vi+1. Then, V’ will be the time V(e) which e happened
  • When an event e which is message sending event happens, attach V(e) to the message
  • When an event e which is message receiving event happens, set vj ← max(vj, vj’) (1 ≤ j ≤ n) and vi ← vi + 1 to attached time then set it as V(e)
How vector clock works

This is how vector clock works in above example. In vector clock, when there are 2 vector clock V = (v1, v2, … vn) and V’ = (v’1, v’2, …, v’n), if arbitary i which is i ∈ {1, 2, …, n} can be vi ≤ v’i and at least one i can be vi < v’i then V < V’.

Conclusion and What’s next?

Next, I’m planning to write something about distributed snapshot. If you like this article please clap, follow me, and share to others!

Hidetatsu loves infrastructures, database, concurrent programming, transactions, distributed systems… https://github.com/dty1er https://dtyler.io