Exception Handling: Designing Robust Software in Ruby (with presentation note)
1. Exception Handling:
Designing Robust Software
ihower@gmail.com
2014/9/27@RailsPacific
Hello everybody,
Today I’m going to talk about exception handling, which is about how to design robust software.
I just realised that I don't always practice my English, but when I do, I do it at a conference. Q_Q
2. About me
• 張文鈿 a.k.a. ihower
• http://ihower.tw
• http://twitter.com/ihower
• Instructor at ALPHA Camp
• http://alphacamp.tw
• Rails Developer since 2006
• i.e. Rails 1.1.6 era
My name is ihower. Here are my blog and Twitter account. You can get the URL of my slides from my latest tweet.
I currently work for ALPHA Camp, a professional school for startups, where I teach the Ruby on Rails bootcamp.
I have been using Ruby on Rails since 2006. At that time, the Rails version was 1.1.6. Anyone older than me, raise your hand? Cool, we’re exceptional.
3. Agenda
• Why should we care? (5min)
• Exception handling in Ruby (10min)
• Caveat and Guideline (15min)
• Failure handling strategy (15min)
Here is today’s agenda.
First, I will talk about why we should care about exception handling.
Then we will quickly learn how to do exception handling in the Ruby programming language.
After that, we will look at some caveats and guidelines.
Finally, we will check out some failure handling strategies.
4. I’m standing on the two
great books, thanks.
Before starting my talk, I want to thank these two great books.
The first is a Chinese book by Dr. Teddy Chen, who specializes in exception handling. Although his examples are Java code, I still learned a lot of concepts from this book.
The second is Exceptional Ruby by Avdi. I learned many Ruby tricks from this book. Besides the book, he has also given talks about exception handling at RubyConf.
I highly recommend reading these books if you want to learn more after my talk.
5. 1. Why should we
care?
OK, let’s get started. Why should we care about exception handling?
6. Feature complete
==
Production ready?
Ask yourself: is feature complete equal to production ready? I think we all know it’s not.
The production environment is brutal and shit happens.
A remote server can go down, data can be incorrect, a request can time out, an API call may return an error, the system may run out of memory…
7. All tests pass?
Does “all tests pass” mean production ready? That’s not true either. It just means the software is correct according to the spec; it doesn’t mean the software is robust in production.
8. Happy Path!
The reason is that we all focus on the normal, happy path.
We assume that every method call succeeds, the data is correct and resources are always available.
10. Time
Pick Two?
Cost Quality
One reason we ignore the abnormal path and sacrifice software quality is that we want to meet our product deadline and budget.
After all, handling the abnormal path is a non-functional requirement: a quality attribute that our manager or our customer doesn’t see clearly.
11. abstract software
architecture to reduce
development cost
Another reason, I think, is that we often emphasize high-level abstract architecture: we learn design patterns, refactoring and TDD. These all make our code more portable and reusable, and reduce development cost.
12. Development cost
!=
Operational cost
But one important lesson I learned from my production experience is that development cost is not equal to operational cost.
13. “Don't avoid one-time development
expenses at the cost of recurring
operational expenses.”
– Michael T. Nygard, Release It!
This is my favorite quote from Release It!
Michael said:
Don’t avoid one-time development expenses at the cost of recurring operational expenses.
So the design decisions you make during development will greatly affect your quality of life after release.
14. My confession
I have to confess that I used to make decisions based on optimising development cost, not operational cost.
And I suffered for it. Our customers got angry and the company lost money. So I decided to change my mind. This is why I started to look at failure handling and why I am giving this talk.
So, if your project’s KPI is reducing development cost and meeting deadlines, I hope you can get a new perspective from my talk.
16. “If code in one routine encounters an
unexpected condition that it doesn’t know how
to handle, it throws an exception, essentially
throwing up its hands and yelling “I don’t know
what to do about this — I sure hope somebody
else knows how to handle it!.”
– Steve McConnell, Code Complete
Steve explains it in Code Complete.
He says that raising an exception is just like throwing up your hands and yelling that you hope somebody else knows how to handle the problem.
17. “Raising an exception means stopping normal
execution of the program and either dealing with
the problem that’s been encountered or exiting
the program completely.”
– David A. Black, The Well-Grounded Rubyist
David explains it more straightforwardly in The Well-Grounded Rubyist. He said…
18. 1. raise exception
begin
  # do something
  raise 'An error has occurred.'
rescue => e
  puts 'I am rescued.'
  puts e.message
  puts e.backtrace.inspect
end
1/5
Most modern programming languages have exception handling, such as Java, C#, PHP, Python and of course Ruby. Well, except Go, but that’s another story.
This is the basic exception handling syntax in Ruby.
I think most of you already know it.
The raise method interrupts the normal flow and jumps to the rescue part.
19. raise == fail
begin
  # do something
  fail 'An error has occurred.'
rescue => e
  puts 'I am rescued.'
  puts e.message
  puts e.backtrace.inspect
end
One thing you may not know is that the fail method is the same as raise.
20. “I almost always use the "fail" keyword. . .
[T]he only time I use “raise” is when I am
catching an exception and re-raising it,
because here I’m not failing, but explicitly
and purposefully raising an exception.”
– Jim Weirich, Rake author
There’s an interesting coding convention here: Jim Weirich almost always uses the fail method. He said that he only uses raise when he wants to re-raise an exception.
So if you check out the Rake source code, you can see this style. It’s a very interesting and subtle difference.
21. “raise” method signature
raise(exception_class_or_object,
      message,
      backtrace)
Back to the raise method. Let’s take a closer look.
This is its method signature; it can actually accept three arguments.
22. raise
raise

# is equivalent to:

raise RuntimeError
If you omit all arguments and no exception is currently being handled, it’s equivalent to raising RuntimeError.
23. raise(string)
raise 'An error has occurred.'

# is equivalent to:

raise RuntimeError, 'An error has occurred.'
If you give it a string, it’s equivalent to raising RuntimeError with that string as the message.
24. raise(exception, string)
raise RuntimeError, 'An error has occurred.'

# is equivalent to:

raise RuntimeError.new('An error has occurred.')
If you give it an exception class and a string, it’s equivalent to instantiating the exception class with the string as its parameter.
25. backtrace
(Array of string)
raise RuntimeError, 'An error'

# is equivalent to:

raise RuntimeError, 'An error', caller(0)
The third argument is the backtrace, which is just an array of strings.
The default is caller(0), which returns the current call stack.
26. global $! variable
• $!
• $ERROR_INFO (with require 'English')
• reset to nil if the exception is rescued
The raise method also sets a special global variable, $!, which holds the currently active exception.
It is reset to nil after the exception has been rescued.
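The behaviour of $! can be checked with a small sketch like the one below. The helper name current_error_states is made up for this illustration, and it assumes it is called while no other exception is being handled.

```ruby
# Hypothetical helper: records the value of $! before, during and after a rescue.
def current_error_states
  states = {}
  states[:before] = $!                 # nil: no exception is active yet
  begin
    raise 'An error has occurred.'
  rescue => e
    states[:during] = $!               # the exception currently being handled
    states[:same_object] = $!.equal?(e)
  end
  states[:after] = $!                  # back to nil once the rescue has finished
  states
end

p current_error_states
```

Inside the rescue clause, $! and the variable e refer to the same exception object.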
27. 2. rescue
rescue SomeError => e
  # ...
end
2/5
Let’s take a look at rescue, which is where we handle errors. The rescue keyword is followed by an exception class, then a hash rocket and the name of the variable to which the matching exception will be assigned.
28. Multiple class or module
rescue SomeError, SomeOtherError => e
  # ...
end
rescue supports multiple classes or modules.
29. stacking rescue
order matters
rescue SomeError => e
  # ...
rescue SomeOtherError => e
  # ...
end
It also supports stacking multiple rescue clauses.
Order matters: the first clause that matches handles the exception.
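Why order matters can be seen in a minimal sketch. ParentError and ChildError are hypothetical classes invented for this example.

```ruby
# Hypothetical error classes, only for illustration.
class ParentError < StandardError; end
class ChildError < ParentError; end

# The more specific class is listed first, so it gets its own handler.
def specific_first
  raise ChildError, 'boom'
rescue ChildError
  :child_handler
rescue ParentError
  :parent_handler
end

# The parent class is listed first, so it also matches ChildError
# and the second clause is never reached.
def general_first
  raise ChildError, 'boom'
rescue ParentError
  :parent_handler
rescue ChildError
  :child_handler
end

p specific_first  # => :child_handler
p general_first   # => :parent_handler
```

So list the most specific exception classes first and the most general ones last.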
30. rescue
  # ...
end

# is equivalent to:

rescue StandardError
  # ...
end
If you omit the exception class, rescue catches StandardError by default.
That does not include other exceptions such as NoMemoryError, LoadError and SyntaxError.
31. Ruby Exception Hierarchy
• Exception
• NoMemoryError
• LoadError
• SyntaxError
• StandardError -- default for rescue
• ArgumentError
• IOError
• NoMethodError
• ….
Why not capture the base Exception by default? The exceptions outside of the StandardError hierarchy typically represent conditions that can't reasonably be handled by a generic catch-all rescue clause. For instance, a SyntaxError represents a programmer’s typo. So Ruby requires you to be explicit when rescuing these and certain other types of exception.
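A small sketch makes the default visible. The helper bare_rescue_catches? is a made-up name; it uses rescue Exception in its outer clause only so that the demo itself keeps running.

```ruby
# Returns true if a bare rescue catches the given exception class.
def bare_rescue_catches?(exception_class)
  begin
    raise exception_class, 'test'
  rescue               # same as: rescue StandardError
    true
  end
rescue Exception       # only here so the demo can report the miss
  false
end

p bare_rescue_catches?(ArgumentError)        # => true  (a StandardError)
p bare_rescue_catches?(NotImplementedError)  # => false (a ScriptError)
p bare_rescue_catches?(NoMemoryError)        # => false
```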
32. Avoid rescuing Exception
class
rescue Exception => e
  # ...
end
Here is an infamous mistake: rescuing the Exception class. This rescues every exception, including the ones caused by your typos. In most cases this makes debugging harder.
33. rescue => error
  # ...
end

# is equivalent to:

rescue StandardError => error
  # ...
end
You can also omit the exception class when assigning the exception to a variable; the default is StandardError too.
34. support * splatted
active_support/core_ext/kernel/reporting.rb
suppress(IOError, SystemCallError) do
  open("NONEXISTENT_FILE")
end

puts 'This code gets executed.'
rescue also supports the splat operator (*).
For example, Rails has a suppress method that suppresses the exceptions you specify.
In this case, it ignores IOError and SystemCallError.
35. support * splatted (cont.)
active_support/core_ext/kernel/reporting.rb
def suppress(*exception_classes)
  yield
rescue *exception_classes
  # nothing to do
end
Here is the Rails source code. You can see it uses a star after the rescue keyword.
36. Like case, it supports ===
begin
  raise "Timeout while reading from socket"
rescue errors_with_message(/socket/)
  puts "Ignoring socket error"
end
The last trick about rescue is the triple equals sign (===). rescue matches like “case”, except that it requires a Class or Module.
For example, we can write a custom errors_with_message method that only rescues exceptions whose message contains “socket”.
37. def errors_with_message(pattern)
  m = Module.new
  m.singleton_class.instance_eval do
    define_method(:===) do |e|
      pattern === e.message
    end
  end
  m
end
And here is our custom method.
It creates a temporary module whose === method matches the exception message against a regular expression.
38. arbitrary block predicate
begin
  raise "Timeout while reading from socket"
rescue errors_matching { |e| e.message =~ /socket/ }
  puts "Ignoring socket error"
end
In fact, we can make it more general.
We can pass an arbitrary block as the predicate.
The errors_matching method takes a block that decides whether the exception should be rescued or not.
39. def errors_matching(&block)
  m = Module.new
  m.singleton_class.instance_eval do
    define_method(:===, &block)
  end
  m
end
Here is the method implementation.
40. “In practice, the rescue clause should be a
short sequence of simple instructions designed
to bring the object back to a stable state and to
either retry the operation or terminate with
failure.”
– Bertrand Meyer, Object Oriented Software Construction
So, what is the purpose of “rescue” after all?
Bertrand said in Object-Oriented Software Construction…
41. 3. ensure
begin
  # do something
  raise 'An error has occurred.'
rescue => e
  puts 'I am rescued.'
ensure
  puts 'This code gets executed always.'
end
3/5
The ensure part will always be executed, whether an exception is raised or not. It is useful for cleaning up resources.
42. 4. retry
be careful “giving up” condition
tries = 0
begin
  tries += 1
  puts "Trying #{tries}..."
  raise "Didn't work"
rescue
  retry if tries < 3
  puts "I give up"
end
4/5
Next is the retry keyword. Ruby is one of the few languages that offer a retry feature. retry gives us the ability to deal with an exception by restarting from the last begin block.
The most important thing about retry is that you should define a “giving up” condition. Otherwise it may become an infinite retry loop. One of the simplest “giving up” conditions is a counter. In this example the limit is three: after three tries it gives up.
43. 5. else
begin
  yield
rescue
  puts "Only on error"
else
  puts "Only on success"
ensure
  puts "Always executed"
end
5/5
The else keyword is the opposite of rescue: the rescue clause is only hit when an exception is raised, while else is only hit when no exception is raised.
44. Recap
• raise
• rescue
• ensure
• retry
• else
Let’s recap what we have learned.
We learned five pieces of Ruby syntax for exception handling.
They are raise, rescue, ensure, retry and else.
45. 3. Caveat and Guideline
(15min) Next, let’s check out some caveats and guidelines.
46. 1. Is the situation truly
unexpected?
1/6
The first important question is:
Is the situation truly unexpected?
Should we use an exception or not?
47. “ask yourself, ‘Will this code still run if I
remove all the exception handlers?’ If the
answer is "no", then maybe exceptions are
being used in non-exceptional
circumstances.”
– Dave Thomas and Andy Hunt, The Pragmatic Programmer
The answer is: it depends!
In The Pragmatic Programmer, Dave and Andy ask “Will this code still run if I remove all the exception handlers?”
For example, imagine that we’re going to open a file.
If the file should have been there, then using an exception for the missing file is good. But if you have no idea whether the file should exist or not, then it’s not an exceptional situation when you can’t find it.
48. User Input error?
User input error is a classic example: we can expect that users will make mistakes. We can predict with 100% certainty that it will happen during normal operation. Our user is either a fool or evil.
49. This is bad
def create
  @user = User.new params[:user]
  @user.save!
  redirect_to user_path(@user)
rescue ActiveRecord::RecordInvalid # save! raises this when validation fails
  flash[:notice] = 'Unable to create user'
  render :action => :new
end
Take a concrete Rails example. Since we can “expect” invalid input, we should NOT handle it via exceptions, because exceptions should only be used for unexpected situations.
50. def create
  @user = User.new params[:user]
  if @user.save
    redirect_to user_path(@user)
  else
    flash[:notice] = 'Unable to create user'
    render :action => :new
  end
end
So rather than using an exception, we should use normal if/else control flow. This makes our code more readable.
Exceptions should only be used for unexpected situations.
51. Record not found?
Another classic example is record not found.
If you can “expect” that a record may not be found, then you should not use an exception for the not-found situation.
52. begin
  user = User.find(params[:id])
  user.do_this
rescue ActiveRecord::RecordNotFound
  # ???
end
This is bad
One good way to deal with the record-not-found problem is to use a null object. So rather than raising an exception when the record is not found, you can just return a null object.
53. Use Null object
user = User.find_by_id(params[:id]) || NullUser.new
user.do_this
Like this, you can design a NullUser object.
This makes our code simpler.
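As a rough sketch of what such a null object could look like outside Rails: the User struct, the USERS hash and the find_user helper below are all stand-ins invented for this example, not the real ActiveRecord API.

```ruby
# Hypothetical stand-in for the User model; a real Rails app would use ActiveRecord.
User = Struct.new(:id, :name) do
  def do_this
    "#{name} did this"
  end
end

# Same interface as User, but every method is a harmless no-op.
class NullUser
  def name
    'Guest'
  end

  def do_this
    'nothing to do'
  end
end

USERS = { 1 => User.new(1, 'ihower') }.freeze

# Returns a NullUser instead of raising when the id is unknown.
def find_user(id)
  USERS[id] || NullUser.new
end

puts find_user(1).do_this    # => ihower did this
puts find_user(42).do_this   # => nothing to do
```

The caller never has to check for nil or rescue anything, because both objects answer the same messages.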
54. Replace Exception with Test
def execute(command)
  command.prepare rescue nil
  command.execute
end

# =>

def execute(command)
  command.prepare if command.respond_to? :prepare
  command.execute
end
Here is another example of exception abuse, from the Refactoring book.
The code rescues an exception on a condition that could have been checked first. The exception handling is unnecessary because you can just test for it.
55. “Exceptions should be used for exceptional
behaviour. They should not act as a substitute
for conditional tests. If you can reasonably
expect the caller to check the condition before
calling the operation, you should provide a test,
and the caller should use it.”
– Martin Fowler, Refactoring
In the Refactoring book, Martin said that exceptions should not act as a substitute for conditional tests.
56. Spare Handler?
begin
  # main implementation
rescue
  # alternative solution
end
A spare handler means you provide an alternative implementation inside the rescue part.
This is a bad smell: you just make the rescue part more complex.
57. begin
  user = User.find(params[:id])
rescue ActiveRecord::RecordNotFound
  user = NullUser.new
ensure
  user.do_this
end
This is bad
We can use the same null object example.
In fact, the null object is just the alternative solution when the record is not found.
So in this example exception handling is totally unnecessary.
58. user = User.find_by_id(params[:id]) ||
  NullUser.new

user.do_this
We can just check it in the normal implementation, without exception handling.
59. 2. raise during raise
begin
  raise "Error A" # this will be thrown away
rescue
  raise "Error B"
end
2/6
The second big issue in exception handling is raising during a raise. In this code example, the original exception Error A is thrown away and replaced by Error B.
We lose the information about Error A.
This is a real problem if you don’t intend to do this.
It makes debugging harder because we can’t trace Error A.
60. Wrapped exception
begin
  begin
    raise "Error A"
  rescue => error
    raise MyError, "Error B"
  end
rescue => e
  puts "Current failure: #{e.inspect}"
  puts "Original failure: #{e.original.inspect}"
end
I’m not saying we should never raise during a raise. Sometimes we want to convert a low-level exception into a high-level exception. This reduces implementation dependency and is easier for the high-level client to understand.
So how do we fix this? The solution is to wrap the original exception so that we keep it.
In this example, we raise Error B while handling Error A, using our own exception class called MyError.
61. Wrapped exception (cont.)
class MyError < StandardError
  attr_reader :original

  def initialize(msg, original = $!)
    super(msg)
    @original = original
    # keep the original backtrace, if there is an original exception
    set_backtrace original.backtrace if original
  end
end
Here is the MyError implementation.
We keep the original exception here.
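On Ruby 2.1 and later, the language records the original exception for you: an exception raised inside a rescue clause gets the rescued one as its cause. A minimal sketch, with ConfigError, load_config and describe_failure as hypothetical names:

```ruby
# Hypothetical high-level error class for illustration.
class ConfigError < StandardError; end

def load_config
  begin
    raise IOError, 'Error A'        # low-level failure
  rescue IOError
    raise ConfigError, 'Error B'    # Ruby stores Error A as #cause
  end
end

def describe_failure
  load_config
rescue ConfigError => e
  "Current failure: #{e.message}, original failure: #{e.cause.message}"
end

puts describe_failure
# => Current failure: Error B, original failure: Error A
```

A hand-written wrapper like MyError is still useful when you want to carry extra information or support older Ruby versions.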
62. Example: Rails uses the
technique a lot
• ActionDispatch::ExceptionWrapper
• ActionControllerError::BadRequest,
ParseError, SessionRestoreError
• ActionView::Template::Error
In fact, Rails uses this technique a lot too,
for example in ExceptionWrapper, ActionControllerError and Template::Error.
63. 3. raise during ensure
begin
  raise "Error A"
ensure
  raise "Error B"
  puts "Never execute"
end
3/6
The next issue is similar but worse: it’s possible to raise an exception during the ensure part. Not only does it overwrite the original error, it also leaves the rest of the cleanup unperformed.
Remember that the mission of “ensure” is to clean up resources, so if an exception is raised during the ensure part, resources may leak, and this is hard to debug.
So please avoid raising exceptions during ensure, and try to keep the ensure part very simple.
64. begin
  file1.open
  file2.open
  raise "Error"
ensure
  file1.close # if there's an error
  file2.close
end
Let’s take a more concrete example. If file1.close raises an error, then file2 will not be closed.
This causes a resource leak.
So try to confine your ensure clauses to safe and simple operations.
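One possible way to make sure the second close still runs is to nest another ensure. This is only a sketch: FlakyFile and close_both are hypothetical names standing in for file1 and file2 above.

```ruby
# Hypothetical resource whose close may fail, standing in for file1/file2.
class FlakyFile
  attr_reader :closed

  def initialize(fail_on_close)
    @fail_on_close = fail_on_close
    @closed = false
  end

  def close
    raise IOError, 'close failed' if @fail_on_close
    @closed = true
  end
end

# The nested ensure guarantees file2.close runs even if file1.close raises.
def close_both(file1, file2)
  begin
    file1.close
  ensure
    file2.close
  end
end

file1 = FlakyFile.new(true)
file2 = FlakyFile.new(false)
begin
  close_both(file1, file2)
rescue IOError => e
  puts "file1 failed to close (#{e.message}), file2 closed: #{file2.closed}"
end
```

The error from file1.close still propagates to the caller, but file2 no longer leaks.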
65. a more complex example
begin
  r1 = Resource.new(1)
  r2 = Resource.new(2)
  r2.run
  r1.run
rescue => e
  raise "run error: #{e.message}"
ensure
  r2.close
  r1.close
end
Combining the issues above, I give you a challenge here.
Suppose we have two resource objects, r1 and r2.
We run them, and there’s an ensure part that closes the resources.
66. class Resource
  attr_accessor :id

  def initialize(id)
    self.id = id
  end

  def run
    puts "run #{self.id}"
    raise "run error: #{self.id}"
  end

  def close
    puts "close #{self.id}"
    raise "close error: #{self.id}"
  end
end
Here is a Resource implementation that deliberately raises errors.
67. begin
  r1 = Resource.new(1)
  r2 = Resource.new(2)
  r2.run
  r1.run # raise exception!!!
rescue => e
  raise "run error: #{e.message}"
ensure
  r2.close # raise exception!!!
  r1.close # never execute!
end
Basically, it goes like this:
r2.run succeeds
r1.run fails
r2.close fails
So we can deduce that, sadly, r1 will not be closed.
68. Result
lost original r1 exception and fail to close r2
run 1
run 2
close 1
double_raise.rb:15:in `close': close error: 1 (RuntimeError)
Here is the output.
We not only lose the original r1 exception, but also fail to close r2.
How do we fix this? The basic idea is to implement a wrapping class and try to stack all the exceptions. I leave the challenge to you.
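One possible take on the challenge, as a sketch rather than the definitive answer: collect every failure and raise them together at the end. CompositeError, FailingResource and with_resources are hypothetical names; FailingResource mimics the Resource class from the slide, without the puts calls.

```ruby
# Hypothetical composite error that keeps every failure, not just the last one.
class CompositeError < StandardError
  attr_reader :errors

  def initialize(errors)
    @errors = errors
    super(errors.map(&:message).join('; '))
  end
end

# Same shape as the Resource class on the slide: both run and close fail.
class FailingResource
  def initialize(id)
    @id = id
  end

  def run
    raise "run error: #{@id}"
  end

  def close
    raise "close error: #{@id}"
  end
end

# Runs the block, then closes every resource, collecting all failures.
def with_resources(*resources)
  errors = []
  begin
    yield(*resources)
  rescue StandardError => e
    errors << e
  end
  resources.reverse_each do |resource|
    begin
      resource.close
    rescue StandardError => e
      errors << e
    end
  end
  raise CompositeError.new(errors) unless errors.empty?
end

begin
  with_resources(FailingResource.new(1), FailingResource.new(2)) do |r1, r2|
    r2.run
    r1.run
  end
rescue CompositeError => e
  puts e.message
end
```

Every resource gets a chance to close, and the caller can inspect e.errors to see each individual failure.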
69. 4. Exception is your method
interface too
• For library, either you document your exception, or
you should consider no-raise library API
• return values (error code)
• provide fallback callback
4/6
I think all of you have experienced that exceptions are one of the most common sources of surprise and frustration with 3rd-party libraries, because you can’t be sure what kinds of exceptions may be raised when you call them. Java tries to solve this issue with checked and unchecked exceptions, but it doesn’t seem very successful. So unfortunately there’s no perfect solution. What we can do is provide documentation, or consider a no-raise library API. A no-raise library API either returns an error code or provides a fallback callback. We will look at this later.
70. 5. Exception classification
module MyLibrary
  class Error < StandardError
  end
end
5/6
Every library codebase should put its exceptions under a namespace with a common base class. This gives client code something to match on when calling into the library.
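A small sketch of how client code can use such a base class. The subclasses ConnectionError and ParseError and the methods fetch and call_library are hypothetical, added only to show the matching.

```ruby
# MyLibrary's subclasses and fetch method are hypothetical, for illustration only.
module MyLibrary
  class Error < StandardError; end
  class ConnectionError < Error; end
  class ParseError < Error; end

  def self.fetch(kind)
    case kind
    when :offline then raise ConnectionError, 'cannot connect'
    when :garbage then raise ParseError, 'cannot parse response'
    else 'ok'
    end
  end
end

# Client code can match one specific failure or anything from the library.
def call_library(kind)
  MyLibrary.fetch(kind)
rescue MyLibrary::ConnectionError
  'retry later'
rescue MyLibrary::Error => e
  "library failed: #{e.message}"
end

puts call_library(:ok)       # => ok
puts call_library(:offline)  # => retry later
puts call_library(:garbage)  # => library failed: cannot parse response
```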
71. exception hierarchy
example
• StandardError
• ActionControllerError
• BadRequest
• RoutingError
• ActionController::UrlGenerationError
• MethodNotAllowed
• NotImplemented
• …
This way client code can easily choose to rescue the parent class or a child class. Here is the ActionController error hierarchy from Rails.
72. smart exception
adding more information
class MyError < StandardError
  attr_accessor :code

  def initialize(code)
    self.code = code
  end
end

begin
  raise MyError.new(1)
rescue => e
  puts "Error code: #{e.code}"
end
Another simple technique is to make our exceptions smarter by adding more information.
In this example, we add a code attribute to our error class.
73. 6. Readable exceptional
code
begin
  try_something
rescue
  begin
    try_something_else
  rescue
    # ...
  end
end
6/6
The last piece of advice is that we should keep our exceptional code readable and isolated. A single failure handling section leads to cleaner Ruby code.
We can refactor this nested exception handling by extracting a method.
74. Extract it!
def foo
  try_something
rescue
  bar
end

def bar
  try_something_else
  # ...
rescue
  # ...
end
It becomes like this. We can omit the begin keyword too. Quite simple and easy to read.
75. Recap
• Use exception when you need
• Wrap exception when re-raising
• Avoid raising during ensure
• Exception is your method interface too
• Classify your exceptions
• Readable exceptional code
Let’s recap what we have learned:
76. 4. Failure handling
strategy
(30min) OK, let’s see some failure handling strategies.
77. 1. Exception Safety
• no guarantee
• The weak guarantee (no-leak): If an exception is raised,
the object will be left in a consistent state.
• The strong guarantee (a.k.a. commit-or-rollback, all-or-nothing):
If an exception is raised, the object will be rolled
back to its beginning state.
• The nothrow guarantee (failure transparency): No
exceptions will be raised from the method. If an exception
is raised during the execution of the method it will be
handled internally.
1/12
First, there’s a concept we need to learn: how do we tell whether a piece of exception handling is safe or not? The definition of “exception safety” comes from the C++ standard library.
If you never think about whether your methods are exception safe, most of them will offer no guarantee, and it’s no wonder your code crashes in ways you don’t expect.
The weak guarantee is no leaking; the strong guarantee is commit-or-rollback.
The nothrow guarantee is that no exceptions will be raised.
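To make the strong guarantee concrete, here is a minimal sketch. The Account class and its methods are hypothetical and only illustrate the commit-or-rollback idea for an in-memory object.

```ruby
# Hypothetical account class illustrating the strong guarantee.
class Account
  attr_reader :balance, :history

  def initialize(balance)
    @balance = balance
    @history = []
  end

  def withdraw(amount)
    raise ArgumentError, 'insufficient funds' if amount > @balance
    @balance -= amount
    @history << amount
  end

  # Strong guarantee: either every withdrawal is applied or none is.
  def withdraw_all(amounts)
    saved_balance = @balance
    saved_history = @history.dup
    amounts.each { |amount| withdraw(amount) }
  rescue StandardError
    @balance = saved_balance      # roll back to the starting state
    @history = saved_history
    raise
  end
end

account = Account.new(100)
begin
  account.withdraw_all([30, 50, 40])   # the third withdrawal fails
rescue ArgumentError => e
  puts "#{e.message}, balance is still #{account.balance}"
end
# => insufficient funds, balance is still 100
```

Without the rollback in the rescue clause, the method would only leave the object in some consistent but partially updated state, which is the weak guarantee.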
78. 2. Operational errors
v.s.
programmer errors
https://www.joyent.com/developers/node/design/errors 2/12
Another great perspective is to ask what kinds of errors we can handle. We can divide them into two kinds: operational errors and programmer errors.
79. Operational errors
• failed to connect to server
• failed to resolve hostname
• request timeout
• server returned a 500 response
• socket hang-up
• system is out of memory
Operational errors represent run-time problems experienced by correctly-written programs. These are not bugs in the program.
For example: failing to connect to a server, a timeout, or running out of memory.
80. Programmer errors
like typo
Programmer errors are bugs in the program. These are things that can always be fixed by changing the code.
81. We can handle operational
errors
• Restore and Cleanup Resource
• Handle it directly
• Propagate
• Retry
• Crash
• Log it
We have many ways to handle operational errors.
82. But we can not handle
programmer errors
But we cannot handle programmer errors.
There's nothing you can do to handle them. By definition, the code that was supposed to do something was broken, so you can't fix the problem with more code.
83. 3. Robust levels
• Robustness: the ability of a system to resist change
without adapting its initial stable configuration.
• There’re four robust levels
http://en.wikipedia.org/wiki/Robustness 3/12
The third concept you should know is robust levels.
There are four robust levels, which can serve as your non-functional requirement.
84. Level 0: Undefined
• Service: Failing implicitly or explicitly
• State: Unknown or incorrect
• Lifetime: Terminated or continued
Level 0 is undefined, which means you do nothing.
Everything is unknown or incorrect when something goes wrong.
85. Level 1: Error-reporting
(Failing fast)
• Service: Failing explicitly
• State: Unknown or incorrect
• Lifetime: Terminated or continued
• How-achieved:
• Propagating all unhandled exceptions, and
• Catching and reporting them in the main program
Level 1 is error reporting, also called failing fast.
All unhandled exceptions will be caught and reported.
86. Anti-pattern:
Dummy handler (eating exceptions)
begin
  #...
rescue => e
  nil
end
Here is an anti-pattern: rescuing errors without reporting or propagating them.
It hides the problem and makes debugging harder.
87. Level 2: State-recovery
(weakly tolerant)
• Service: Failing explicitly
• State: Correct
• Lifetime: Continued
• How-achieved:
• Backward recovery
• Cleanup
Level 2 is state recovery, or “weakly tolerant”.
Your system state will be correct after an error.
We can achieve this via backward recovery and resource cleanup.
Basically, it means restoring the original state.
88. require 'open-uri'
page = "titles"
file_name = "#{page}.html"
# URI.open is needed on Ruby 3.0+, where plain open no longer handles URLs
web_page = URI.open("https://pragprog.com/#{page}")
output = File.open(file_name, "w")
begin
  while line = web_page.gets
    output.puts line
  end
  output.close
rescue => e
  STDERR.puts "Failed to download #{page}"
  output.close
  File.delete(file_name)
  raise
end
We have already talked about cleanup, so let’s emphasise backward recovery.
Here is another concrete example, from the Programming Ruby book.
It fetches a web page and saves it as a file.
The critical part of the rescue is the backward recovery: it deletes the file if there’s an exception.
89. Level 3: Behavior-recovery
(strongly tolerant)
• Service: Delivered
• State: Correct
• Lifetime: Continued
• How-achieved:
• retry, and/or
• design diversity, data diversity, and functional
diversity
Level 3 is behavior recovery, or “strongly tolerant”.
Your service will still be available. To achieve this, you have to implement retry and/or design alternative solutions.
Retrying the primary implementation may work for some temporary errors, like network problems.
But more often you have to implement an alternative solution for possible operational errors.
In fact, in this morning’s “Zero downtime payment platforms” talk, Prem gave a great explanation of how they designed a failure-tolerant system.
90. Improve exception handling
incrementally
• Level 0 is bad
• it’s better to require every method to reach Level 1
• Level 2 for critical operations, e.g. storage/database
operations
• Level 3 for critical features or when the customer requires it. It
means cost * 2 because we must have two solutions
everywhere.
What can we learn from these robust levels? We can define the robust level of our system from the beginning.
We know Level 0 is bad, and it’s better to require every method to reach Level 1. For critical operations like storage or database operations, we can aim for Level 2. Finally, for very critical features, or when the customer requires it, we can achieve Level 3. But Level 3 means at least doubling your development cost, because we must implement two solutions for the critical feature.
91. 4. Use timeout
for any external call
require 'timeout'

begin
  Timeout.timeout(3) do
    # ...
  end
rescue Timeout::Error => e
  # ...
end
4/12
Strictly speaking, a timeout is not about handling exceptions.
It’s an exception you should raise yourself whenever you make an external call. It helps make your software more robust.
HTTP libraries usually implement some timeout already, but please keep in mind that you need one.
For example, if you implement your RPC calls over AMQP with something like RabbitMQ, then you have to implement the timeout yourself.
92. 5. retry with circuit breaker
http://martinfowler.com/bliki/CircuitBreaker.html 5/12
We mentioned retry with a simple counter. Here is another elegant trick for retry: the circuit breaker.
The basic idea is to wrap a protected function call in a circuit breaker object which monitors for failures. Once the failures reach a certain threshold, the circuit breaker trips, and all further calls to the circuit breaker return with an error. For more implementation details, check out Martin Fowler's blog.
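The idea can be sketched in a few lines of Ruby. This is a simplified, single-threaded illustration, not the implementation from Martin Fowler's article; the class name, the OpenError class and the threshold, reset_after and clock parameters are all assumptions made for this example.

```ruby
# Minimal single-threaded sketch; class and error names are hypothetical.
class CircuitBreaker
  class OpenError < StandardError; end

  def initialize(threshold: 3, reset_after: 30, clock: -> { Time.now })
    @threshold = threshold      # consecutive failures before tripping
    @reset_after = reset_after  # seconds to wait before trying again
    @clock = clock              # injectable so a demo needs no sleep
    @failures = 0
    @opened_at = nil
  end

  def call
    raise OpenError, 'circuit open' if open?
    result = yield
    @failures = 0               # a success closes the circuit again
    @opened_at = nil
    result
  rescue OpenError
    raise                       # do not count our own rejection as a failure
  rescue StandardError
    @failures += 1
    @opened_at = @clock.call if @failures >= @threshold
    raise
  end

  private

  def open?
    return false unless @opened_at
    @clock.call - @opened_at < @reset_after
  end
end

breaker = CircuitBreaker.new(threshold: 2)
3.times do
  begin
    breaker.call { raise IOError, 'service down' }
  rescue IOError, CircuitBreaker::OpenError => e
    puts "#{e.class}: #{e.message}"
  end
end
```

The first two calls reach the failing service; the third is rejected immediately with OpenError, so the broken service is not called again until reset_after seconds have passed.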
93. 6. bulkheads
for external service and process
begin
  SomeExternalService.do_request
rescue Exception => e
  logger.error "Error from External Service"
  logger.error e.message
  logger.error e.backtrace.join("\n")
end
6/12
Bulkheads are a strategy that keeps a failure in one part of the system from destroying everything. Basically, a bulkhead stops the exception from propagating. It helps make your software more robust, especially in distributed systems.
94. 7. Failure reporting
• A central log server
• Email
• exception_notification gem
• 3rd-party exception-reporting service
• Airbrake, Sentry, New Relic…etc
7/12
Reporting exceptions to a central location is common sense now.
Besides building your own central log server, you can also use email or any 3rd-party service like Airbrake.
95. 8. Exception collection
class Item
  def process
    #...
    [true, result]
  rescue => e
    [false, result]
  end
end

collections.each do |item|
  results << item.process
end
8/12
Exception collection is a very useful strategy when you process data in bulk. If you have thousands of records to process, failing fast is not good for this scenario.
For example: running tests, provisioning, etc.
We want a report at the end telling us which parts succeeded and which parts failed.
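A runnable sketch of the same idea, assuming a hypothetical batch of strings that should be converted to integers. Outcome, process_item and process_all are made-up names; unlike the slide, the failed outcome keeps the exception itself so the report can show why it failed.

```ruby
# Hypothetical batch job: each item reports success or failure instead of raising.
Outcome = Struct.new(:item, :ok, :value)

def process_item(item)
  Outcome.new(item, true, Integer(item))   # Integer() raises on bad input
rescue ArgumentError, TypeError => e
  Outcome.new(item, false, e)              # keep the error for the report
end

def process_all(items)
  items.map { |item| process_item(item) }
end

outcomes = process_all(%w[1 2 oops 4])
failed = outcomes.reject(&:ok)
puts "#{outcomes.count(&:ok)} succeeded, #{failed.size} failed"
failed.each { |o| puts "  #{o.item}: #{o.value.message}" }
```

One bad record no longer stops the whole batch, and the final report lists every failure.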
96. 9. caller-supplied fallback
strategy
h.fetch(:optional) { "default value" }
h.fetch(:required) { raise "Required not found" }

arr.fetch(999) { "default value" }
arr.detect(lambda { "default value" }) { |x| x == "target" }
9/12
A caller-supplied fallback strategy is a useful trick when you design your library API.
If the library itself doesn’t have enough information to decide how to handle an exception, client code can inject a failure policy into the process.
The Ruby standard API uses this trick too, for example in fetch and detect. What happens when the key doesn’t exist? We can pass a block that returns a default value.
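Your own library methods can offer the same kind of hook. In this sketch, parse_port is a hypothetical method that lets the caller's block decide what happens on failure.

```ruby
# Hypothetical library method: the caller decides what happens on failure.
def parse_port(text)
  Integer(text)
rescue ArgumentError, TypeError => e
  raise unless block_given?   # no policy supplied: propagate as usual
  yield e                     # otherwise the caller's block picks the result
end

p parse_port('8080')                    # => 8080
p(parse_port('http') { 80 })            # => 80 (caller chose a default)
p(parse_port('http') { |e| e.class })   # => ArgumentError
```

Without a block the method behaves like a normal raising API, so callers only pay for the fallback when they want it.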
97. “In most cases, the caller should determine
how to handle an error, not the callee.”
– Brian Kernighan and Rob Pike, The Practice of Programming
In The Practice of Programming, Brian said: ….
So you can delegate the decision to the caller in some way.
99. 10. Avoid unexpected
termination
• rescue all exceptions at the outermost call stacks.
• Rails does a great job here
• developer will see all exceptions at development
mode.
• end user will not see exceptions at production
mode
10/12
This strategy says you should rescue all exceptions at the outermost level of the call stack, so the end user sees a graceful termination.
I think Rails does a great job here.
It provides detailed backtraces in development mode,
and it shows an error page to the end user in production mode.
100. 11. Error code v.s. exception
• error code problems
• it mixes normal code and exception handling
• programmer can ignore error code easily
11/12
Error codes are the traditional way to deal with errors.
In most cases they are an anti-pattern, because they mix normal code with error handling.
And more importantly, programmers can easily ignore an error code.
101. Error code problem
prepare_something # return error code
trigger_someaction # still run
http://yosefk.com/blog/error-codes-vs-exceptions-critical-code-vs-typical-code.html
For example, if prepare_something fails and returns an error code, and the programmer ignores it, then trigger_someaction will still run.
102. Replace Error Code with
Exception (from Refactoring)
def withdraw(amount)
  return -1 if amount > @balance
  @balance -= amount
  0
end

# =>

def withdraw(amount)
  raise BalanceError.new if amount > @balance
  @balance -= amount
end
The Refactoring book even has a refactoring called Replace Error Code with Exception.
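To see the effect on the calling side, here is a small runnable sketch. The slide does not define BalanceError, so the definition below and the Account class around withdraw are assumptions:

```ruby
# Assumed definition; the slide only uses the name BalanceError.
class BalanceError < StandardError; end

# Hypothetical class wrapped around the withdraw method from the slide.
class Account
  attr_reader :balance

  def initialize(balance)
    @balance = balance
  end

  def withdraw(amount)
    raise BalanceError, "insufficient funds" if amount > @balance
    @balance -= amount
  end
end

account = Account.new(100)
account.withdraw(30) # balance is now 70

begin
  account.withdraw(500)
rescue BalanceError => e
  # The failure cannot be silently ignored; the caller must handle it
  # or let it propagate.
  puts "withdrawal refused: #{e.message}"
end
```

With the error-code version, forgetting to check the -1 would let the program carry on as if the withdrawal had succeeded.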
103. Why does Go not have
exceptions?
• We believe that coupling exceptions to a control
structure, as in the try-catch-finally idiom, results in
convoluted code. It also tends to encourage
programmers to label too many ordinary errors, such as
failing to open a file, as exceptional. Go takes a different
approach. For plain error handling, Go's multi-value
returns make it easy to report an error without
overloading the return value. A canonical error type,
coupled with Go's other features, makes error handling
pleasant but quite different from that in other languages.
https://golang.org/doc/faq#exceptions
But there is a counterview too: Go.
Go doesn't have exceptions. Its FAQ says they believe reporting errors through return values is better.
104. “It's really hard to write good exception-based
code since you have to check every single line
of code (indeed, every sub-expression) and
think about what exceptions it might raise and
how your code will react to it.”
– Raymond Chen
http://blogs.msdn.com/b/oldnewthing/archive/2005/01/14/352949.aspx
An opponent of exceptions also points out their shortcoming: every piece of code can raise an exception.
105. When to use error codes?
• In low-level code where you must know the
behavior in every possible situation
• error codes may be better
• Otherwise we have to know every exception that
can be thrown by every line in the method to
know what it will do
http://stackoverflow.com/questions/253314/exceptions-or-error-codes
So, is there any guidance on when to use error codes instead of exceptions? One answer I found on Stack Overflow is relevant. It recommends using error codes for low-level code where you must know the behaviour in every possible situation. Otherwise we have to know every exception that can be raised by every line in the method to know what it will do.
106. 12. throw/catch flow
def halt(*response)
  #....
  throw :halt, response
end

def invoke
  res = catch(:halt) { yield }
  #...
end
12/12
Sometimes, if you really want a GOTO-like feature, Ruby provides throw/catch.
Using it makes clear that this flow is not exception handling. This example comes from Sinatra. I couldn't find any use of it in the Rails source code.
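A small self-contained sketch of the same mechanism, using a hypothetical find_cell helper: throw unwinds out of both loops at once, and the value passed to throw becomes the result of catch.

```ruby
# Hypothetical search helper; returns [x, y] of the first matching cell.
def find_cell(grid, target)
  catch(:found) do
    grid.each_with_index do |row, y|
      row.each_with_index do |cell, x|
        # Leaves both loops immediately; catch returns [x, y].
        throw :found, [x, y] if cell == target
      end
    end
    nil # value of the catch block when nothing was thrown
  end
end

grid = [[1, 2], [3, 4]]
find_cell(grid, 4) # => [1, 1]
find_cell(grid, 9) # => nil
```

No exception object or backtrace is created here, which fits the point that this is plain control flow and not error handling.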
107. Recap
• exception safety
• operational errors vs. programmer errors
• robust levels
Let's recap what we learned.
We learned how to decide whether a method is exception safe.
We learned that there are operational errors and programmer errors.
We learned about robust levels, and that we can improve our exception handling incrementally.
108. Recap (cont.)
• use timeout
• retry with circuit breaker pattern
• bulkheads
• failure reporting
• exception collection
• caller-supplied fallback strategy
• avoid unexpected termination
• error code
• throw/catch
We learned that we should use timeouts for external calls, and we learned some strategies: retry with circuit breaker, bulkheads, failure reporting, exception collection, caller-supplied fallback strategy, and avoiding unexpected termination.
We also learned when to use error codes and the throw/catch flow.
That's all I have to share with you today.
110. QUESTIONS!
begin
thanks!
raise if question?
rescue
puts "Please ask @ihower at twitter"
ensure
follow(@ihower).get_slides
applause!
end
I don't expect to have time for Q&A; we have probably already run out of time. Please tweet me if you have questions, or catch me at teatime.
Thanks again.