A discussion of Erlang/OTP based on the paper "A History of Erlang", with additional coverage of the methods that let Riak be a highly reliable, distributed datastore while building on the work of giants.
6. Requirements
• Handling a very large number of concurrent activities
• Actions to be performed at a certain point of time or within a certain time
• Systems distributed over several computers
• Interaction with hardware must be abstracted
• Very large software systems
• Complex functionality such as feature interaction
• Continuous operation over several years
• Software maintenance (reconfiguration, etc.) without stopping the system
• Stringent quality and reliability requirements
• Fault tolerance both to hardware failures and software errors
8. COP: Concurrency Oriented Programming
• Systems are built from processes.
• Processes share nothing.
• Processes interact by asynchronous message passing.
• Processes are isolated.
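The four properties above can be sketched in a few lines of Erlang. This is a minimal illustration, not code from the talk; the module name `cop_sketch` and its functions are hypothetical.

```erlang
%% Hypothetical sketch: two isolated processes that interact only by messages.
-module(cop_sketch).
-export([run/0]).

run() ->
    Parent = self(),
    %% The child gets its own heap; it only ever sees copies of the
    %% terms it is sent, never the parent's memory.
    Child = spawn(fun() -> echo_once() end),
    %% Asynchronous send: returns immediately, without waiting for a reply.
    Child ! {Parent, hello},
    receive
        {Child, Reply} -> Reply
    after 1000 ->
        timeout
    end.

%% Handle exactly one message, reply to the sender, then terminate.
echo_once() ->
    receive
        {From, Msg} -> From ! {self(), {echo, Msg}}
    end.
```

Calling `cop_sketch:run()` returns the child's reply, or `timeout` if no reply arrives within a second.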
9. AXD301: Success
• Came out in 1998
• ATM switch, control-plane written in Erlang
• 2.6 Million lines of Erlang code
• Nine-nines of uptime: 99.9999999%
• Built highly reliable systems from unreliable components
• Scale-out
10. Erlang
• Available to the community as open source since December 1998
• Designed for telephony control planes
• Actor Model
• Functional
20. Turning It Into A Server
9> FServer = fun FactorialServer() ->
9>     receive
9>         {From, N} ->
9>             From ! FC(N),
9>             FactorialServer()
9>     end
9> end.
#Fun<erl_eval.44.90072148>
34. Trigger The Bug
9> factorial_server:calc(10, Pid).
=ERROR REPORT==== 5-Apr-2015::06:04:40 ===
Error in process <0.59.0> with exit value: {{badmatch,2},[{factorial_server,factorial,1,[{file,"factorial_server.erl"},{line,15}]},{factorial_server,server,0,[{file,"factorial_server.erl"},{line,11}]}]}
** exception exit: {badmatch,2}
     in function  factorial_server:factorial/1 (factorial_server.erl, line 15)
     in call from factorial_server:server/0 (factorial_server.erl, line 11)
35. Ping The Server Again?
10> factorial_server:calc(10, Pid).
...Hangs Forever
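The hang is presumably the caller waiting in a `receive` for a reply from a process that has already died. One common way to avoid it is to monitor the server and add a timeout. The sketch below assumes a hypothetical module `safe_call_sketch` with its own message format; it is not the `factorial_server` module from the talk.

```erlang
%% Hypothetical sketch: a client call that cannot hang on a dead server.
-module(safe_call_sketch).
-export([start/0, calc/2]).

start() ->
    spawn(fun loop/0).

loop() ->
    receive
        {From, Ref, N} ->
            From ! {Ref, factorial(N)},
            loop()
    end.

%% Crashes with function_clause on negative input, standing in for a bug.
factorial(0) -> 1;
factorial(N) when N > 0 -> N * factorial(N - 1).

calc(N, Pid) ->
    %% If Pid is already dead, the monitor delivers 'DOWN' immediately.
    Ref = erlang:monitor(process, Pid),
    Pid ! {self(), Ref, N},
    receive
        {Ref, Result} ->
            erlang:demonitor(Ref, [flush]),
            {ok, Result};
        {'DOWN', Ref, process, Pid, Reason} ->
            {error, Reason}
    after 5000 ->
        erlang:demonitor(Ref, [flush]),
        {error, timeout}
    end.
```

With this shape, a call that crashes the server returns `{error, Reason}`, and later calls to the dead pid return `{error, noproc}` instead of blocking. `gen_server:call` provides similar protection out of the box.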
38. Behaviours are formalizations of common
patterns. The idea is to divide the code for
a process into a generic part (a behaviour
module) and a specific part (a callback
module).
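As an illustration of that split, here is a minimal callback module for the `gen_server` behaviour. The generic part (message loop, call/reply protocol, error reporting) lives in `gen_server`; only the specific part is written here. The module name `fact_gs_sketch` and the `{calc, N}` request format are assumptions made for this sketch, not the talk's `factorial_server_otp`.

```erlang
%% Hypothetical callback module: only the problem-specific code lives here.
-module(fact_gs_sketch).
-behaviour(gen_server).

%% Client API
-export([start_link/0, calc/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    %% Register the server locally under the module name.
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

calc(N) ->
    %% Synchronous request; gen_server handles monitoring and timeouts.
    gen_server:call(?MODULE, {calc, N}).

init([]) ->
    {ok, no_state}.

handle_call({calc, N}, _From, State) when is_integer(N), N >= 0 ->
    {reply, factorial(N), State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

factorial(0) -> 1;
factorial(N) -> N * factorial(N - 1).
```

After `fact_gs_sketch:start_link()`, a call such as `fact_gs_sketch:calc(9)` goes through the generic `gen_server` machinery and ends up in `handle_call/3`.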
47. Using The OTP Module
1> factorial_server_otp:start_link().
{ok,<0.34.0>}
2> gen_server:call(factorial_server_otp, 9).
362880
48. Triggering The Bug
5> gen_server:call(factorial_server_otp, 10).
=ERROR REPORT==== 5-Apr-2015::06:03:43 ===
** Generic server factorial_server_otp terminating
** Last message in was 10
** When Server state == state
** Reason for termination ==
…
62. In OTP, application denotes a component
implementing some specific functionality,
that can be started and stopped as a unit,
and which can be re-used in other systems
as well.
70. Observer is a graphical tool for observing the
characteristics of Erlang systems. Observer displays
system information, application supervisor trees, process
information, and ETS or Mnesia tables, and contains a frontend
for Erlang tracing.
73. Review
• Factorial Server is now an OTP application
• gen_server
• supervisor
• application
• Factorial Server has a bug where it crashes on certain inputs, such as 10.
• gen_server crashes are handled by the supervision tree
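How a supervision tree handles a crash can be sketched with a one-child supervisor. Everything here is hypothetical (module `sup_sketch`, the registered name `sketch_worker`, the `crash` message); for brevity the worker is a plain linked process rather than a `gen_server`.

```erlang
%% Hypothetical sketch: a supervisor that restarts a crashing worker.
-module(sup_sketch).
-behaviour(supervisor).

-export([start_link/0, start_worker/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% Restart only the failed child; give up after 5 restarts in 10 seconds.
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Child = #{id => worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Runs in the supervisor process, so spawn_link links the worker to it.
start_worker() ->
    Pid = spawn_link(fun worker_loop/0),
    register(sketch_worker, Pid),
    {ok, Pid}.

worker_loop() ->
    receive
        crash ->
            exit(simulated_bug);
        {From, ping} ->
            From ! pong,
            worker_loop()
    end.
```

Sending `sketch_worker ! crash` kills the worker; shortly afterwards `whereis(sketch_worker)` returns a different pid, because the supervisor has started a fresh process.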
74. Try Again
8> gen_server:call(factorial_server_otp, 10).
=ERROR REPORT==== 5-Apr-2015::07:45:00 ===
** Generic server factorial_server_otp terminating
** Last message in was 10
** When Server state == state
** Reason for termination ==
...
102. Distributed Erlang
• Useful for building highly reliable applications
• Useful for building applications that scale out
• Underlies nearly all Riak requests
• Authentication through cookies
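A small sketch of what location-transparent messaging looks like: the same `{Name, Node} ! Msg` send works whether `Node` is the local node or a connected remote one. The module `dist_sketch` and the registered name `dist_fact` are hypothetical.

```erlang
%% Hypothetical sketch: addressing a registered process by {Name, Node}.
-module(dist_sketch).
-export([start/0, remote_calc/2]).

start() ->
    Pid = spawn(fun loop/0),
    register(dist_fact, Pid),
    Pid.

loop() ->
    receive
        {From, Ref, N} ->
            From ! {Ref, fact(N)},
            loop()
    end.

fact(0) -> 1;
fact(N) when N > 0 -> N * fact(N - 1).

%% Node may be node() or the name of a connected remote node.
remote_calc(Node, N) ->
    Ref = make_ref(),
    {dist_fact, Node} ! {self(), Ref, N},
    receive
        {Ref, Result} -> {ok, Result}
    after 2000 ->
        {error, timeout}
    end.
```

To try it across nodes, start two shells with the same cookie, for example `erl -sname a -setcookie secret` and `erl -sname b -setcookie secret` (node names and cookie value are placeholders), run `dist_sketch:start()` on one, and call `dist_sketch:remote_calc/2` from the other with the first node's name. Nodes with different cookies will refuse to connect.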
108. Tying it Together
Checking whether the PLT out.plt is up-to-date... yes
Proceeding with analysis...
factorial_server_otp.erl:6: Invalid type specification for function
factorial_server_otp:handle_call/3. The success typing is
(pos_integer(),_,'state') -> {'reply',neg_integer(),'state'}
factorial_server_otp.erl:9: Invalid type specification for function
factorial_server_otp:factorial/1. The success typing is
(pos_integer()) -> neg_integer()
done in 0m0.79s
done (warnings were emitted)
110. Tying it Together
3c075477e55e:talk sdhillon$ dialyzer --plt out.plt factorial_server_otp.erl
Checking whether the PLT out.plt is up-to-date... yes
Proceeding with analysis... done in 0m0.89s
done (passed successfully)
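The warnings on the earlier slide come from `-spec` annotations that disagree with what Dialyzer infers. As an illustration of the kind of contract being checked, here is a small spec'd function. It is a sketch with a hypothetical module name, not the talk's `factorial_server_otp` source.

```erlang
%% Hypothetical sketch: a type specification that Dialyzer can check
%% against the inferred success typing of the function.
-module(spec_sketch).
-export([factorial/1]).

%% Contract: takes a non-negative integer, returns a positive integer.
-spec factorial(non_neg_integer()) -> pos_integer().
factorial(0) -> 1;
factorial(N) when N > 0 -> N * factorial(N - 1).
```

If the spec claimed something the code cannot do (say, a return type of `neg_integer()`), Dialyzer would be expected to flag the mismatch in the style shown above.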
122. Memory Model
• Split memory model: per-process heaps plus a shared heap
• Useful for building high performance systems
• Predictability is key
• Garbage collected
• Shared heap only GC’d under memory pressure
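Because each process has its own heap, memory can be inspected per process. A minimal sketch using `erlang:process_info/2` (the module name `mem_sketch` is hypothetical):

```erlang
%% Hypothetical sketch: inspect the heap of a single process.
-module(mem_sketch).
-export([heap_words/0]).

heap_words() ->
    %% A freshly spawned process that just waits for a stop message.
    Pid = spawn(fun() -> receive stop -> ok end end),
    %% Size is reported in machine words and belongs to this process only.
    {total_heap_size, Words} = erlang:process_info(Pid, total_heap_size),
    Pid ! stop,
    Words.
```

The returned figure covers only that one process, which is part of what makes per-process garbage collection pauses small and predictable.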
132. Riak Core is the distributed systems framework
that forms the basis of how Riak distributes
data and scales. More generally, it can be
thought of as a toolkit for building distributed,
scalable, fault-tolerant applications.
143. Troubles In Paradise
• Distributed Erlang can be problematic due to head-of-line blocking
• NIFs (native code called from Erlang) can be problematic for performance and debugging
• BEAM needs a JIT compiler
• Work being done here!