2. Time
• What does time mean to you?
• What does time mean to “software”?
• What does time mean to “Operating Systems”?
• What does time mean to “Distributed Systems”?
Discuss!
3. Clocks
• What does a clock mean to you?
• What does a clock mean to “software”?
• What does a clock mean to “Operating Systems”?
• What does a clock mean to “Distributed Systems”?
Discuss!
4. Use of time in Distributed Systems
• Distributed systems rely on time for:
• Scheduling (also in operating systems)
Schedulers, timeouts, failure detectors, retry timers, etc.
• Performance measurement, statistics, profiling
Time a process has been running, CPU usage, etc.
• Log files and databases
Record when an event occurred
• Data with time-limited validity
Cache entries (DNS, TLS, etc.)
7. Clocks
Types of Clocks
Physical clocks: count the number of seconds elapsed
Logical clocks: count events (messages sent, etc.)
A clock in distributed systems is not an oscillator (as in digital electronics)
In distributed systems, a clock means a source of timestamps
8. Physical Clocks
• Physical clocks are needed to adjust time of nodes
• All nodes in the system can share their local time with all other
nodes in the system
• In physical synchronization, physical clocks are used to timestamp
events on each computer
• Keep the time of day
• Consistent across systems
9. Implementations of Clocks
• Quartz crystal clocks
• The quartz oscillator is part of a feedback loop
• It typically oscillates at about 32 kHz (32,768 Hz)
• Lower frequencies (e.g. a 1 Hz tick) are generated by dividing the clock
• The clock drift is about ±15 seconds per month (6 ppm)
• A regular quartz clock is not suitable for large distributed systems
• The resonator is shaped like a tuning fork
• A good resonator can have an accuracy of 1 second in 10 years
(Frequency changes with age, temperature and acceleration)
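The relation between "seconds per month" and ppm is plain arithmetic, sketched below. The class and method names are hypothetical, and a 30-day month is assumed.

```java
final class ClockDrift {
    /** Accumulated error in seconds for a drift rate (in ppm) over an interval. */
    static double driftSeconds(double ppm, double intervalSeconds) {
        return ppm * 1e-6 * intervalSeconds;
    }

    /** Drift rate in ppm implied by an observed error over an interval. */
    static double ppm(double errorSeconds, double intervalSeconds) {
        return errorSeconds / intervalSeconds * 1e6;
    }

    public static void main(String[] args) {
        double month = 30 * 24 * 60 * 60; // assumption: a 30-day month
        System.out.printf("15 s per month = %.2f ppm%n", ppm(15, month));
        System.out.printf("6 ppm over one day = %.4f s%n", driftSeconds(6, 86_400));
    }
}
```

With these assumptions, 15 seconds per month comes out just under 6 ppm, and a 6 ppm clock gains or loses roughly half a second per day.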
11. Implementations of Clocks
• Atomic Clocks
• Caesium-133 is used as the oscillator
• Uses a feedback-based circuit similar to that of the quartz clock
• Accuracy: about 10^-8 ppm (1 part in 10^14)
GPS
• 31 satellites, each carrying an atomic clock
• Each satellite broadcasts the current time and its location
• The receiver calculates its position from the speed-of-light delay between the satellite and the receiver
• Corrections for atmospheric effects, relativity, etc.
• In data centers, the antenna may need to be mounted outside the building
12. Uses of Atomic Clocks
GPS
• Each satellite i broadcasts its position (x_i, y_i, z_i) and the time t_i
obtained from its atomic clock
Finding the Position through GPS
• Current position of the receiver: (x, y, z)
• Offset between the receiver clock and the atomic clocks: d
• Time at which the receiver receives the message: t_r
• Set up one equation per satellite i:
$$\sqrt{(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2} = (t_r - t_i + d) \times c$$
[where c = speed of light]
• There are four unknowns (x, y, z, d), so signals from at least four satellites are needed
13. Sync Problems of clocks
• Getting two systems to agree on the time
• Two clocks hardly ever agree
• Quartz oscillators oscillate at slightly different frequencies
• This creates an ever-widening gap in perceived time,
called clock drift
• The difference between two clocks at one point in time is called
clock skew
15. Standard Time Systems
Coordinated Universal Time (UTC)
Based on Greenwich Mean Time (GMT)
GMT -> solar time
** It is noon when the sun is due south, as seen from the
Greenwich meridian
International Atomic Time (TAI, from the French temps atomique international)
1 day is 24 * 60 * 60 * 9,192,631,770 periods of caesium-133’s
resonant frequency
16. Problem and Compromise
• The speed of Earth’s rotation is not constant
• UTC is TAI with corrections to account for Earth’s rotation
• Time zones and daylight saving time are offsets to UTC
17. Correction to TAI to UTC
• Makes use of leap seconds (similar to leap years)
• Every year, on 30th June and 31st December at 23:59:59 UTC, one of three things happens:
• Option 1 : The clock immediately jumps forward to 00:00:00, skipping one second
(negative leap second)
• Option 2 : The clock moves to 00:00:00 after one second, as usual
(no leap second)
• Option 3 : The clock moves to 23:59:60 after one second and then moves to
00:00:00 after one further second (positive leap second)
This is announced beforehand (a few months in advance)
18. Positive Leap Second
• Case Study : Leap Second Glitch (Read : URL)
Screenshot of the UTC clock from time.gov during the leap second on 31 December 2016.
19. Computers Representation of Timestamps
• Unix Time
Number of seconds since a given point in time:
1st January 1970 00:00:00 UTC (the “epoch”),
not counting leap seconds
• ISO 8601
Year, month, day, hour, minute, second, and time zone offset relative to UTC
e.g. 2024-03-19T10:00:00+05:30
https://www.unixtimestamp.com/
20. Conversion Unix to ISO 8601
• Gregorian Calendar
365 days in a year, except for leap years (366 days)
Leap year: (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
Knowledge of past and future leap seconds?
(Is it taken into consideration?)
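The conversion between the two representations can be sketched with the standard java.time API, which, like Unix time, does not count leap seconds. The class and method names below are hypothetical; the timestamp is the ISO 8601 example used earlier.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class TimestampDemo {
    /** Gregorian leap-year rule. */
    static boolean isLeapYear(int year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    /** Formats Unix seconds as an ISO 8601 string at the given UTC offset. */
    static String toIso8601(long unixSeconds, ZoneOffset offset) {
        OffsetDateTime t = Instant.ofEpochSecond(unixSeconds).atOffset(offset);
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(t);
    }

    /** Parses an ISO 8601 string (with offset) back into Unix seconds. */
    static long toUnixSeconds(String iso8601) {
        return OffsetDateTime.parse(iso8601).toEpochSecond();
    }

    public static void main(String[] args) {
        ZoneOffset offset = ZoneOffset.ofHoursMinutes(5, 30);
        System.out.println(toIso8601(1_710_822_600L, offset));
        System.out.println(toUnixSeconds("2024-03-19T10:00:00+05:30"));
        System.out.println(isLeapYear(1900) + " " + isLeapYear(2000));
    }
}
```

Because every day is treated as exactly 86,400 seconds, the conversion needs only the calendar rules and no table of leap seconds.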
21. Software and Leap Seconds
• Most software ignores leap seconds
• Most applications are not sensitive to time accuracy
• Operating systems and distributed systems are different!
• Timestamps and their accuracy are crucial in OS and DS
22. Example Case Study
30th June 2012 incident (similar events occurred in 2016)
• In the night from 30th June to 1st July 2012, many online services and
systems around the globe crashed simultaneously
• Rebooting didn’t help the situation
• A bug in the Linux kernel caused a livelock on the leap second, causing many internet
services to go down
Servers locked up and stopped responding
• Workaround : Smear (spread out)
the leap second over the course of a period (likely a day)
More of a hack than a solid solution
https://www.wired.com/2012/07/leap-second-glitch-explained/
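The idea of smearing can be sketched as follows. The linear shape and the 24-hour window are assumptions for illustration, and the class and method names are hypothetical; real deployments may choose a different window or curve.

```java
final class LeapSmear {
    // Assumption: the extra second is spread evenly over 24 hours.
    static final double WINDOW_SECONDS = 86_400.0;

    /** Fraction of the leap second already absorbed, given seconds elapsed in the window. */
    static double appliedFraction(double secondsIntoWindow) {
        if (secondsIntoWindow <= 0) return 0.0;
        if (secondsIntoWindow >= WINDOW_SECONDS) return 1.0;
        return secondsIntoWindow / WINDOW_SECONDS;
    }

    /** Rate change (in ppm) needed to absorb one second over the window. */
    static double rateChangePpm() {
        return 1.0 / WINDOW_SECONDS * 1e6;
    }

    public static void main(String[] args) {
        System.out.printf("Halfway through: %.2f s absorbed%n", appliedFraction(43_200));
        System.out.printf("Rate change: %.2f ppm%n", rateChangePpm());
    }
}
```

Under these assumptions the clock rate changes by only about 12 ppm during the window, so no timestamp such as 23:59:60 ever has to be produced.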
23. Clock Synchronization (Sync)
• Computers track physical time / UTC with a quartz clock
(battery-backed, so it continues running when the power is off)
Due to clock drift, the clock error gradually increases
(clock skew → difference between two clocks at a point in time)
Reduce the skew as much as possible; it cannot be eliminated entirely
24. Solution
• Periodically get the current time from a reliable server that has a
more accurate time source (atomic clock / GPS receiver)
• Protocols :
Network Time Protocol (NTP)
Precision Time Protocol (PTP)
27. Network Time Protocol (NTP) Basics
• Many operating system vendors run NTP servers
• The OS is usually configured to use them by default
• Hierarchy of clock servers arranged into strata
Stratum 0 : atomic clock or GPS receiver
Stratum 1 : synced directly with a stratum 0 device
Stratum 2 : servers that sync with stratum 1, and so on as required
https://ntp.org/
28. Network Time Protocol (NTP)
• May contact multiple servers, discard the outliers, and average the rest
• Makes multiple requests to the same server and uses statistics to reduce random error due to variations in network latency
• Reduces clock skew to a few milliseconds in good network conditions, but it can be much worse!
31. Precision Time Protocol (PTP) – Used in Networks
• Compared with NTP, PTP allows hosts to be synchronized to one
common source of time with much higher precision.
Additional Material
32. Dealing with Clock Skew / Drift
• Go for gradual clock correction when possible
• If the clock is fast
Make the clock run slower until synchronized
• If the clock is slow
Make the clock run faster until synchronized
33. Correct the Clock Skew
• When the client has estimated the clock skew 𝜃,
it needs to apply a clock correction
If |𝜃| < 125 ms, slew
Slightly speed up or slow down the clock by up to 500 ppm (brings the clocks in sync within ~5 minutes)
If 125 ms ≤ |𝜃| < 1000 ms, step
Suddenly reset the client clock to the estimated server timestamp
If |𝜃| ≥ 1000 ms, panic (DO NOTHING)
A human operator should intervene
Systems relying on clock sync should monitor clock skew!
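The three-way decision can be sketched as a small function. The names are hypothetical, and the sketch assumes the thresholds apply to the magnitude of the skew, so that a clock that is ahead and one that is behind are treated alike.

```java
final class ClockCorrection {
    enum Action { SLEW, STEP, PANIC }

    /** Chooses a correction from the estimated skew in milliseconds. */
    static Action choose(double skewMillis) {
        double magnitude = Math.abs(skewMillis); // assumption: the sign does not matter
        if (magnitude < 125) return Action.SLEW;
        if (magnitude < 1000) return Action.STEP;
        return Action.PANIC;
    }

    /** Seconds needed to remove a skew by slewing at the given rate in ppm. */
    static double slewSeconds(double skewMillis, double ratePpm) {
        return Math.abs(skewMillis) / 1000.0 / (ratePpm * 1e-6);
    }

    public static void main(String[] args) {
        System.out.println(choose(50) + " " + choose(-300) + " " + choose(2500));
        System.out.printf("Worst-case slew time: %.0f s%n", slewSeconds(125, 500));
    }
}
```

At 500 ppm, the largest skew that is still slewed (just under 125 ms) takes about 250 seconds to remove, which is where the figure of roughly five minutes comes from.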
35. Monotonic and time-of-day Clocks
• Usual procedure (Java Example)
• Get the time stamp at the beginning and the end of the process to measure
the time it takes
long startTime = System.currentTimeMillis();
doSomething();
long endTime = System.currentTimeMillis();
long elapsedMillis = endTime - startTime;
36. Elapsed Time of a Programme
• Usual procedure (Java Example)
• Get the time stamp at the beginning and the end of the process to measure
the time it takes
long startTime = System.currentTimeMillis();
doSomething(); // what if NTP steps the clock here?
long endTime = System.currentTimeMillis();
long elapsedMillis = endTime - startTime;
What if the value is negative or far too large?
37. Appropriate Mode
long startTime = System.nanoTime();
doSomething();
long endTime = System.nanoTime();
long elapsedNanos = endTime - startTime;
The result is never negative, because the monotonic clock is not affected by NTP stepping
C example : https://www.tutorialspoint.com/c_standard_library/c_function_time.htm
39. Monotonic and Time-of-day Clocks
• Time-of-day Clocks
• Time since a fixed date (Unix : 1st January 1970 epoch)
• May suddenly move forward or backward (NTP stepping)
• Subject to leap second adjustments
• Monotonic Clock
• Time since arbitrary point (eg: since the machine / server booted up)
• Always moves forward at near constant rate
40. Usage of Monotonic and Time-of-day Clocks
• Time-of-day
• Comparison across nodes (if synced)
• Linux : clock_gettime(CLOCK_REALTIME)
• Java : System.currentTimeMillis()
• C : time() – returns seconds since 1st January 1970, 00:00:00 UTC
• Monotonic
• Good for measuring elapsed time on a single node
• Linux : clock_gettime(CLOCK_MONOTONIC)
• Java : System.nanoTime()
• C : clock() - approximate processor time consumed by the program
(declared in <time.h>, or <ctime> in C++; note that this is CPU time, not elapsed wall-clock time)
41. Ordering
• Ordering of messages
C sees m2 first and m1 second
though the logical order is m1 happened before m2
42. Formalizing Ordering of Events
m1 = (t1, message of A)
m2 = (t2, message of B)
The problem is still not resolved if the clocks that produce t1 and t2 are not synchronized
43. Happens-before Relation
• An event is something happening at one node:
sending or receiving a message, or a local execution step
• Event a happens before event b (a → b) iff :
• a and b occurred at the same node , and a occurred before b in that node’s
local execution order; or
• event a is the sending of some message m, and event b is the receipt of that
same message m (assuming sent messages are unique ); or
• There exists an event c such that a → c and c → b
The happens-before relation is a partial order:
It is possible that neither a → b nor b → a. In that case, a and b are
concurrent (a ǁ b)
44. Happens-before Relationship
a → b, c → d, and e → f due to process order
b → c and d → f due to messages m1 and m2
a → c, a → d, a → f, b → d, b → f, and c → f due to transitivity
a ǁ e , b ǁ e, c ǁ e, and d ǁ e (independent)
45. Causality
• When a → b, then a might have caused b
• When a ǁ b, we know that a cannot have caused b
The happens-before relation encodes potential causality
(The notion of causality is taken from relativity in physics)
46. Formalization of Causality
Let ≺ be a strict total order on events
If (a → b) ⟹ (a ≺ b), then ≺ is a causal order
(≺ is “consistent with causality”)
In other words, ≺ preserves every causal relationship
47. Logical Time
• Physical clocks (recap): count seconds elapsed
• Physical timestamps: useful, but may be inconsistent with causality
• Logical clocks: count the number of events that have occurred
• Designed to capture causality (dependencies): a → b ⟹ T(a) < T(b)
Logical Clocks
Lamport clocks
Vector clocks
48. Lamport Clock
• Each node maintains a counter (t)
The counter is incremented on every local event (e)
• Let L(e) be the value of t after the increment
• Every message sent over the network carries the sender’s current t
• The recipient advances its internal t to the received t (if it is greater) and then
increments it
If a → b, then L(a) < L(b)
But L(a) < L(b) does not imply a → b
L(a) = L(b) is possible even when a ≠ b
49. Lamport Clock Algorithm
on initialization do
t := 0 (each node maintains a local variable t)
end initialization
on any event occurring at the local node do
t := t+1
end any event
on request to send message m do
t := t+1; send (t,m) via network link
end request to send message
on receiving (t’,m) via network link do
t := max(t’,t) +1
deliver m to the application
end receiving
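The pseudocode above maps almost line for line onto a small class. This is a minimal sketch: the class and method names are hypothetical, and the methods are synchronized only on the assumption that several threads on one node may share the clock.

```java
final class LamportClock {
    private long t = 0; // local counter, initialised to 0

    /** Local event: increment the counter and return the new timestamp. */
    synchronized long tick() {
        return ++t;
    }

    /** Send: increment the counter and return the timestamp to attach to the message. */
    synchronized long send() {
        return ++t;
    }

    /** Receive: move past the sender's timestamp, then count the receive event. */
    synchronized long receive(long received) {
        t = Math.max(t, received) + 1;
        return t;
    }

    synchronized long current() {
        return t;
    }

    public static void main(String[] args) {
        LamportClock a = new LamportClock();
        LamportClock b = new LamportClock();
        LamportClock c = new LamportClock();
        a.tick();           // A: 1
        b.tick();           // B: 1
        c.tick();           // C: 1
        long m1 = a.send(); // A: 2, m1 carries 2
        b.receive(m1);      // B: max(1, 2) + 1 = 3
        long m2 = b.send(); // B: 4, m2 carries 4
        c.receive(m2);      // C: max(1, 4) + 1 = 5
        a.tick();           // A: 3
        System.out.println(a.current() + " " + b.current() + " " + c.current()); // prints "3 4 5"
    }
}
```

Note how C jumps from 1 to 5 on receiving m2: the receive rule guarantees that the timestamp of a receive event is greater than that of the matching send.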
50. Lamport Clock Example
• Let N(e) be node at which event e occurred
• Then the pair (L(e), N(e)) uniquely identifies event e
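[Figure: Lamport timestamps on nodes A, B and C]
A: (1, A), (2, A), (3, A)
B: (1, B), (3, B), (4, B)
C: (1, C), (5, C)
Messages: (2, m1) from A to B; (4, m2) from B to C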
51. Total order using Lamport timestamps:
a ≺ b ⟺ (L(a) < L(b)) ∨ (L(a) = L(b) ∧ N(a) < N(b))
(N(e) is the name of the node at which e occurred; node names are ordered)
This order is causal (consistent with causality):
(a → b) ⟹ (a ≺ b)
[Figure: the Lamport clock example from the previous slide, with the events linked in ≺ order]
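The total order amounts to a two-key comparison: first by Lamport timestamp, then by node name. A minimal sketch follows; the Stamp class and the choice of strings for node names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class LamportOrder {
    /** An event identified by the pair (L(e), N(e)). */
    static final class Stamp {
        final long time;
        final String node;

        Stamp(long time, String node) {
            this.time = time;
            this.node = node;
        }

        @Override
        public String toString() {
            return "(" + time + "," + node + ")";
        }
    }

    /** Compare timestamps first; break ties using the node name. */
    static final Comparator<Stamp> TOTAL_ORDER = (x, y) -> {
        if (x.time != y.time) return Long.compare(x.time, y.time);
        return x.node.compareTo(y.node);
    };

    /** Returns a sorted copy, leaving the input list untouched. */
    static List<Stamp> sorted(List<Stamp> events) {
        List<Stamp> copy = new ArrayList<>(events);
        copy.sort(TOTAL_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<Stamp> events = Arrays.asList(
            new Stamp(3, "B"), new Stamp(1, "C"), new Stamp(1, "A"),
            new Stamp(2, "A"), new Stamp(1, "B"));
        System.out.println(sorted(events)); // prints "[(1,A), (1,B), (1,C), (2,A), (3,B)]"
    }
}
```

The three events with timestamp 1 are concurrent, and the tie-break on node names orders them arbitrarily but deterministically, so every node that sorts the same events obtains the same sequence.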
52. Limitations of Lamport Clocks
• Lamport clocks
Given Lamport timestamps L(a) and L(b) with L(a) < L(b),
we cannot tell whether a → b or a ǁ b
Vector clocks address this limitation
53. Vector Clocks
• Nodes as a vector:
N = <𝑁1, 𝑁2, …, 𝑁𝑛>
• Vector timestamp of event a:
V(a) = <𝑡1, 𝑡2, …, 𝑡𝑛>
• 𝑡𝑖 is the number of events observed by node 𝑁𝑖
• Each node has a current vector timestamp T
• On an event at node 𝑁𝑖, increment vector element T[i]
• Attach the current vector timestamp to each message
• The recipient merges the message vector into its local vector
54. Vector Clock Algorithm
on initialization at node 𝑁𝑖 do
T := <0, 0, …, 0> (local variable at 𝑁𝑖)
end initialization
on any event occurring at node 𝑁𝑖 do
T[i] := T[i] + 1
end any event
on request to send message m at node 𝑁𝑖 do
T[i] := T[i] + 1; send (T, m) via network
end request to send message
on receiving (T’, m) at node 𝑁𝑖 via network do
T[j] := max(T[j], T’[j]) for every j ϵ {1, 2, …, n}
T[i] := T[i] + 1; deliver m to the application
end receiving
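The vector clock algorithm above can be sketched in the same style. The class and method names are hypothetical, node indices are 0-based here (the pseudocode counts from 1), and the number of nodes is assumed to be fixed and known in advance.

```java
import java.util.Arrays;

final class VectorClock {
    private final int[] t;  // T, one entry per node
    private final int self; // index i of this node (0-based)

    VectorClock(int nodes, int self) {
        this.t = new int[nodes]; // initialised to <0, 0, ..., 0>
        this.self = self;
    }

    /** Local event: increment this node's own entry. */
    synchronized int[] tick() {
        t[self]++;
        return t.clone();
    }

    /** Send: increment the own entry and return a copy of T to attach to the message. */
    synchronized int[] send() {
        t[self]++;
        return t.clone();
    }

    /** Receive: element-wise maximum with the received vector, then count the receive event. */
    synchronized int[] receive(int[] received) {
        for (int j = 0; j < t.length; j++) {
            t[j] = Math.max(t[j], received[j]);
        }
        t[self]++;
        return t.clone();
    }

    synchronized int[] current() {
        return t.clone();
    }

    public static void main(String[] args) {
        VectorClock a = new VectorClock(3, 0);
        VectorClock b = new VectorClock(3, 1);
        VectorClock c = new VectorClock(3, 2);
        a.tick();            // A: <1,0,0>
        b.tick();            // B: <0,1,0>
        c.tick();            // C: <0,0,1>
        int[] m1 = a.send(); // A: <2,0,0>
        b.receive(m1);       // B: <2,2,0>
        int[] m2 = b.send(); // B: <2,3,0>
        c.receive(m2);       // C: <2,3,2>
        a.tick();            // A: <3,0,0>
        System.out.println(Arrays.toString(c.current())); // prints "[2, 3, 2]"
    }
}
```

Each method returns a copy of the vector so that a timestamp attached to a message is not changed by later events on the sending node.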
55. Vector Clock Example
[Figure: vector timestamps on nodes A, B and C]
A: <1,0,0>, <2,0,0>, <3,0,0>
B: <0,1,0>, <2,2,0>, <2,3,0>
C: <0,0,1>, <2,3,2>
Messages: (<2,0,0>, m1) from A to B; (<2,3,0>, m2) from B to C
• The vector timestamp of an event e represents a set of events:
e and its causal dependencies : {e} ∪ {a | a → e}
56. Vector clock ordering
Define following order on vector timestamps
(in a system with n nodes )
T = T’ iff T[i] = T’[i] for all i ϵ {1, …, n}
T ≤ T’ iff T[i] ≤ T’[i] for all i ϵ {1, …, n}
T < T’ iff T ≤ T’ and T ≠ T’
T ǁ T’ iff T ≰ T’ and T’ ≰ T
V(a) ≤ V(b) iff ({a} ∪ {e | e → a}) ⊆ ({b} ∪ {e | e → b})
(V(a) < V(b)) ⟺ (a → b)
(V(a) = V(b)) ⟺ (a = b)
(V(a) ǁ V(b)) ⟺ (a ǁ b)
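These definitions translate directly into an element-wise comparison. A minimal sketch, with hypothetical names, that classifies two vector timestamps of equal length:

```java
final class VectorOrder {
    enum Relation { EQUAL, BEFORE, AFTER, CONCURRENT }

    /** T <= T' iff every element of T is <= the corresponding element of T'. */
    static boolean lessOrEqual(int[] t, int[] u) {
        for (int i = 0; i < t.length; i++) {
            if (t[i] > u[i]) return false;
        }
        return true;
    }

    /** Classifies the relation between two vector timestamps. */
    static Relation compare(int[] t, int[] u) {
        if (t.length != u.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        boolean le = lessOrEqual(t, u);
        boolean ge = lessOrEqual(u, t);
        if (le && ge) return Relation.EQUAL;  // T = T'
        if (le) return Relation.BEFORE;       // T < T'
        if (ge) return Relation.AFTER;        // T' < T
        return Relation.CONCURRENT;           // T ǁ T'
    }

    public static void main(String[] args) {
        System.out.println(compare(new int[]{2, 0, 0}, new int[]{2, 2, 0})); // prints "BEFORE"
        System.out.println(compare(new int[]{3, 0, 0}, new int[]{2, 3, 2})); // prints "CONCURRENT"
    }
}
```

Unlike Lamport timestamps, the comparison can return CONCURRENT, which is exactly the case in which neither event happened before the other.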