The history of
fire escapes
Tanya Reilly @whereistanya
Abstract:
When a datacenter goes offline, a server gets overloaded, or a binary hits a crashing bug,
we usually have a contingency plan. We reduce damage, redirect traffic, page someone,
drop low-priority requests, follow documented procedures. But why do many failures still
come as a surprise? In this talk, we look at some real life analogs to preventing and
managing software failures. Fire partitions. Public safety campaigns. Smoke alarms.
Sprinkler systems. Doors that say “This is not an exit”. And fire escapes. What can we
learn from the real world about expecting failure and designing for it?
---
https://commons.wikimedia.org/wiki/Category:Fire_escapes#/media/File:ISO_7010_E016.svg Public domain.
Slide template started as Oivia from SlidesCarnival and then drifted into something
very else.
"When we first dropped
our bags on apartment
floors…"
Welcome To New York
Taylor Swift
Good morning! So, I'm a New Yorker. I'm not from the US -- I'm an immigrant -- but
one of the many things I love about New York City is that you move here, and it’s
immediately your city. The number one criterion for being a New Yorker is wanting to
be a New Yorker. It's a welcoming place. So good morning to my fellow New Yorkers,
wherever you're originally from, and, if you've travelled to be here, welcome to New
York. We're glad to have you.
I work in Site Reliability and I'm especially interested in what happens when things
fail, the contingency plans we use to recover when something breaks. And last year I
was thinking about that a lot and walking around the city and I started really noticing
that New York is *covered* in fire escapes. They’re a contingency plan too. They’re
for incident response. You don’t use them until all of your regular methods of getting
out of the building have failed.
So I started reading about fire escapes.
----
https://unsplash.com/photos/Iyd__3m4XF8 CC0
content warning: fire
Before I say more about that, let’s talk content. This talk looks at disaster
prevention and disaster recovery in software through parallels in building fires.
This will include stories of some of the worst fires in the history of New York City.
We'll be looking at the reasons fires started, the stuff that helped them spread and
how people died. There are also some pictures of buildings on fire. Nothing lurid, but
there are pictures.
If you have raw feelings related to recent fires, this could be rough.
If you'd be more comfortable skipping this one, you should do that with my blessing.
While you're packing up, I'll even tell you what I'm going to say, so you don't miss
anything:
Tony Fischer
CC BY-2.0
Fireproof buildings are more
effective than fire escapes.
Fireproof software is more
effective than incident
response.
Where's our fire code?
Here's my thesis
● fire escapes are a hacky bit of afterthought tacked on to the outside of a
building after the building is finished. If you're using fire escapes, it's worth
making them as good as possible, but you’ll prevent more fires if you build
better buildings.
● Similarly, incident response is often a hacky bit of afterthought tacked on long
after software is released. Again, great incident response can help you recover
faster than if you don’t have it but… you’ll prevent more outages if you build
better software.
● Finally, buildings have an extremely detailed fire code, but we don't really have
an extremely detailed systems engineering code for software, and I think we
should have.
Now I'm going to say the same thing but take 35 minutes.
---
How Much is that Doggie in the Window? https://flic.kr/p/72Lhz1 CC BY 2.0
Claudia Heidelberger
CC BY-ND 2.0
Greenwich
Village
Fire escapes were really only built in New York City for a hundred years. They weren't
common until the 1860s, and in the 1960s they stopped being allowed on new
construction.
There's some debate now about whether we should start removing them in places
where the building has been upgraded, or whether they should be preserved as part
of the city's history.
I think at least some of them should be preserved. Look how beautiful that is!
--
Claudia Heidelberger CC BY-ND 2.0. https://flic.kr/p/oqYYv1
Dan DeLuca
CC BY-2.0
East
Village
And here's another lovely one. They made an effort to have it match the style of the
building, not feel like a separate thing tacked on at the end. And I think that's key.
--
Dan DeLuca CC BY 2.0
https://flic.kr/p/76Jmb2
"fire escapes were
haphazardly
attached to the
most elaborately
designed facades"
Richard Plunz, a History of
Housing in New York City
But most of the time, the people adding the fire escape didn't think of it as part of the
building. As this quote says, fire escapes were haphazardly attached to the most
elaborately designed facades. The facade of the building was architecture but the
fire escape was law.
It was an external contingency plan, not part of the main structure. And I think that's
part of why fire escapes ended up not being successful.
---
https://books.google.com/books?id=fcKlDAAAQBAJ&pg=PA24
A brief history of
New York City fires
(With apologies to actual historians)
But I'm jumping to the end. Let's look at the evolution of New York City's fire code.
By the way, my great fear now is that there’s a building historian in the room who
will listen to this and be like “Nope, that is really not what happened." Please forgive
any errors, building historian! If I made mistakes, I would love it if you would come tell
me at the end!
The
Financial
District
1835
On to the history. We’re skipping the great fire of 1776, and jumping straight to 1835
and the Financial District.
This was a commercial, not residential area, and as a result the number of fatalities
was comparatively low -- two people -- I mean, still, two too many, but this is mostly
remembered as a fire that cost a LOT of money. Almost 700 buildings were
destroyed. The city had 26 fire insurance companies. This fire put 23 of them out of
business.
---
https://en.wikipedia.org/wiki/Great_Fire_of_New_York#/media/File:The_Great_Fire_of_the_City_of_New_York_Dec_16_1835.jpg Public domain.
no failure domains
contingency plans failed
exhausted incident responders
what happened?
1835
The fire was caused by a burst gas pipe in a maze of wooden warehouses. Wood
burns easily so there were no failure domains: the fire spread very quickly. Inside two
hours it covered 17 city blocks, most of the financial district.
The city's water supplies were low and the typical contingency plan was to pull water
from the rivers, but it was a freezing night in December and first the firefighters had to
cut through ice.
At the time it was also common to use gunpowder to level buildings and stop the fire
spreading. But they had used up all their gunpowder on a fire two days earlier. That
fire involved the entire fire department of 1500 people, and they were still exhausted.
Still, they fought the fire for 15 hours until marines from the Brooklyn Navy Yard
arrived with more gunpowder and blew up some buildings along Wall Street to make
a barrier.
dedicated incident responders: a
professional fire department
new infrastructure: the Croton Aqueduct
better incident response
1835
As a result of the fire, the city stopped using volunteer firefighters and moved to a
professional force with better equipment.
And they built the Croton Dam and Aqueduct. It was built because of the fire,
but a reliable water source is good for lots of reasons!
---
No longer in use, btw. It was replaced with the New Croton Dam, which still
supplies a small fraction of the city's water. The old one is on the National
Register of Historic Places.
robust structures: they rebuilt in
stone
better buildings
1835
But more importantly, as well as better incident response, they took
the opportunity to make a more resilient city. The fire spread fast
because the buildings were made of wood. They rebuilt with stone and
brick.
And this paid off, ten years later, when there was another enormous fire. The
great fire of 1845 was very bad -- thirty people died -- but it didn’t spread
as far or as fast, because it slowed down when it hit those new brick buildings.
1860
Tenements
Let’s jump forward 25 years and talk about tenements. Tenements were extremely
dense, extremely terrible housing. I'd read about tenements but hadn't realised the
scale of them. In the 1860s, nearly 500 thousand people -- more than half the city --
lived in tenements.
The population of New York City doubled every decade between 1800 and 1880.
Maybe you've seen this with teams and software systems: when you grow rapidly,
you can build some culture problems and some technical debt. This was certainly the
case here. Landlords made more accommodation by splitting big rooms into many
smaller ones, mostly with no light or ventilation. These were really awful places to live.
They were crime riddled, filthy and filled with disease. Every report about them
mentioned that they were fire traps.
In 1860, two tenement fires happened back to back.
--
https://en.wikipedia.org/wiki/Tenement#/media/File:LowerEastSideTenements.JPG
Public domain.
Elm St:
http://www.nytimes.com/1860/02/03/news/calamitous-fire-tenement-house-elm-street-destroyed-thirty-persons-supposed-have.html?mtrref=www.google.com
45th street fire:
http://www.nytimes.com/1860/03/29/news/destructive-fires-four-tenement-houses-destroyed-two-mothers-eight-children.html?pagewanted=all
Quote about the buildings from that second article:
“If a skillful man, with a deadly hatred of his race in his heart, sat down to plan a
human residence in which to entrap and destroy those who should dwell in it, it is
extremely probable that if he had seen these houses in West Forty-fifth-street he
would take them as a model.”
what happened?
1860
no isolation
obsolete contingency plans
no failure domains
The first one, on Elm Street, started in a bakery on the ground floor of a large
residential building. Terrible place for a bakery, but that's where it was. The baker was
storing a lot of hay and wood shavings, and when they burned they made dense
smoke, killing some of the people who lived in the higher floors before the fire even
got up there.
The wooden stairway quickly burned away, trapping people on the top floors.
Firefighters arrived with ladders, but the ladders only went to the fourth floor and this
was a six-storey building. At least 10 people died.
A month later four houses burned on west 45th street. These houses had roof
hatches called scuttles, which should have let people escape across the roofs, but
they all were missing their ladders so nobody could get up there. Another ten people
died.
An optimistic disaster plan is a useless
disaster plan
These escape plans -- the ladders and scuttles and the roof -- had worked fine for a
previous iteration of shorter NYC buildings, but they hadn't been updated for the new
shape of the city.
Just like with the water and the gunpowder, there was a plan in place for a fire
disaster. And just like them, the plan only worked in the most optimistic
circumstance.
We see that all the time. Backups that will work if we lose the database in a very
specific way. Failover plans that only work if we have two weeks' notice of the failover
and the old data center doesn't lose power.
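One way to make a backup plan less optimistic is to rehearse the restore, not just the backup. Here's a minimal Python sketch of a restore drill, assuming a single-file "database". The names `take_backup` and `restore_drill` are made up for illustration and don't come from any real tool. The drill restores the backup into a scratch directory and checks that what comes back matches a checksum of the original.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def take_backup(source: Path, backup_dir: Path) -> Path:
    """Copy the source file into backup_dir and return the backup's path."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / (source.name + ".bak")
    shutil.copy2(source, target)
    return target


def restore_drill(backup: Path, expected_digest: str) -> bool:
    """Restore the backup into a scratch directory and verify the result.

    Returns False instead of raising, so a scheduled drill can report a
    failed restore rather than crash.
    """
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / "restored"
        try:
            shutil.copy2(backup, restored)
        except OSError:
            return False  # the backup is missing or unreadable
        return sha256_of(restored) == expected_digest


if __name__ == "__main__":
    # Hypothetical demo data: a tiny stand-in for a real database file.
    with tempfile.TemporaryDirectory() as workdir:
        database = Path(workdir) / "orders.db"
        database.write_text("order 1: one fire-proof staircase\n")
        digest = sha256_of(database)
        backup = take_backup(database, Path(workdir) / "backups")
        print("restore drill passed:", restore_drill(backup, digest))
```

Run on a schedule, a drill like this finds the missing ladder before the fire does. A real system would restore into a scratch database and run queries against it; the checksum comparison here just stands in for that.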
better buildings
1860
new law: an Act to Provide Against
Unsafe Buildings in the City of
New York
The city immediately passed a law to make the tenements more robust against fire.
They even put an injunction on new tenement construction until the law was passed.
Now houses for more than eight families (kind of specific) had to have fire-proof
stairs either inside or outside the building.
What’s frustrating about this is that four years earlier a commission had reported that,
if there was a fire, tenants on the 6th and 7th floors of tenements had basically zero
chance of survival. They recommended fire-proof stairs. But nothing happened until
a bunch of people died.
---
● Tenements must have fire escapes...
The
Tenement
House Act
1867
Seven years later, the Draft Riots (which are a whole separate awful thing in which a
whole bunch of people died) led to another law: the Tenement House Act. This act
had good goals but it was extremely unsuccessful.
Buildings had to have a fire escape, but they didn't have to make anyone safer! So
landlords put up fire escapes that couldn’t hold the number of people in the house, or
that weren’t well attached to the walls or that were just a rusty ladder. And what even
was a fire escape? Well, it wasn't well defined.
Let's take a diversion and look at some fire escape patents.
As we look at them, you might want to think of disaster recovery plans you have
known and loved.
---
The picture’s actually from 1900 but whatever :-D
https://commons.wikimedia.org/wiki/File:New_York,_N.Y.,_yard_of_tenement_LOC_det.4a18586.jpg Public domain.
William
Houghton's
fire escape
1891
This is a ladder with a counterweight. Imagine climbing down from the 7th floor of
your building on one of these. With your six children. In the rain. In a dress that went
to your ankles.
--
https://en.wikipedia.org/wiki/Fire_escape#/media/File:Houghton%27s_Fire_Escape_1877.jpg Public domain.
Mary
McArthur's
fire escape
1904
This is a kind of rope ladder that attaches to a window sill.
http://www.google.com.pg/patents/US800934
William
Bedinger's
fire escape
1915
This is a parachute that rolls up very small. The idea was that you'd carry it with you
everywhere in case you were in any tall building fire situations.
https://www.google.com/patents/US1168465
Henry
Vieregg's
fire escape
1902
According to this patent, and I quote: "A person desiring to escape seizes one
member of the cord, rope, or chain, as shown in Fig. 1, and forthwith jumps out of the
window. [...]"
Like, I am looking at this thing and do not feel like I could forthwith jump out of
anything.
https://www.google.com/patents/US708846
Anna
Gonnelly's
fire escape
1887
Anna Gonnelly's fire escape was a bridge that you could sling from your roof to
another building. It had side rails, so it was only moderately terrifying.
https://www.google.com/patents/US368816
Pasquale
Nigro's
fire escape
1909
This one is just fantastically ludicrous. But good if you want to fight supervillain crime?
All of these patents were granted, btw.
GOOGLE PATENT US 912152 A
BB
Oppenheimer's
fire escape
1879
And this one… You might think that this is just a parachute helmet. It is not. It is a
parachute helmet and a pair of very bouncy shoes.
GOOGLE PATENT US 221855 A
Nicholas
Borgfeldt's
fire escape
1882
Finally, I've read this patent three times and I'm fairly convinced that the guy invented
a rope. It's the most Silicon Valley invention of 1882.
Though, let's be clear, rope was a popular kind of fire escape. In fact, it was the state
of the art for hotels.
https://www.google.com/patents/US267399
Every hotel's
fire escape
1887
Puck Magazine, 1887
I don't mean a ladder made of rope, I mean literally a rope. Every hotel room had to
have a rope and that was the only fire escape. Even at the time, people found that
pretty terrible.
This is part of a snarky cartoon from a magazine called Puck, published in 1887, of a
whole lot of people trying to use the ropes.
Like most of those other patents, it's designed for the easiest case: someone
with upper body strength and agility who isn't wearing a skirt or carrying a
child. If your disaster plan only works for the easiest case, it's not a good plan.
I want to emphasise here that a rope is better than nothing. In fact, probably every
one of these fire escapes, even mister parachute hat, is better than nothing. But these
escape plans are not where I would put my efforts if I wanted to have fewer people
die in fires. But this is what the law focused on.
--
https://books.google.com/books?id=XwAjAQAAMAAJ&pg=PA48
Pre 1923 so public domain
1867
Tenements must also
have windows...
The
Tenement
House Act
(continued)
Anyway! The Tenement House Act.
Even with fire escapes, tenements were still terrible. They were badly constructed,
overcrowded, and -- I find this amazing -- it was perfectly legal to store lots of
combustible materials in them.
One other thing the Tenement House Act said was that every room now had to have a
window. And just like “what even is a fire escape” it didn’t define “what even is a
window”. So the landlords cut holes in interior walls between rooms and called them
"interior windows".
A decade later, the law said sigh, ok, exterior windows. So landlords started
constructing buildings with air shafts, little narrow gaps between buildings. Now,
picture it, you have no indoor plumbing and the bathroom is down six flights of stairs
and now you have an air shaft. You can imagine how that goes. One article I read
described the air shafts as “festering tubes of disease”. Very poetic!
And many of the fire escapes just led down to these air shafts and there was no way
out from there.
---
https://en.wikipedia.org/wiki/Old_Law_Tenement#/media/File:Airshaft_of_a_dumbbell_tenement,_New_York_City,_taken_from_the_roof,_ca._1900_-_NARA_-_535468.jpg Public domain.
1871
Carla Geisser CC BY THANK YOU CARLA <3
Tenements
must have
usable fire
escapes.
By 1871, iron fire escapes were becoming common and of course people were using
them as extra space. You still see that now -- they're used for bikes and gardening
and barbecues and cat runs. All of that has been illegal since 1871. Because it makes
the fire escape very hard to use in a fire!
A later law said that every fire escape had to have a cast-iron sign saying that you
could be fined for obstructing your fire escape. And it was fair, because usable fire
escapes are better than unusable ones.
But, again, it was still perfectly legal to run your explosive business out of a tenement
basement and tons of residential fires started because of deep frying crullers. And
anyway, the regulations were mostly not enforced, so people didn't pay much
attention.
------------
The encumbrance sign thing is from 1885, but encumbrances were illegal from 1871
and mentioning this many dates makes *my* ears glaze over and I'm already
interested in this. So we're conflating two things to keep it moving along.
Image by Carla Geisser, used with permission.
1876
The
Brooklyn
Theater
Fire
In 1876, the Brooklyn Theater on Cadman Plaza.
The final act of the play was about to start and the stage manager noticed a very tiny
fire on the left of the stage.
---
https://en.wikipedia.org/wiki/Brooklyn_Theatre_fire#/media/File:BrooklynTheatre_From_Johnson_Street_Looking_East.jpg Public domain.
obsolete contingency plans
encumbrances
unpracticed incident response
delayed escalation
restricted access
what happened?
1876
It was typical to keep buckets of water next to the stage, but there weren't any. There
was a fire hose, but too much scenery was piled beside the stage and he couldn't get
to it. There's those encumbrances again.
The stage manager asked a couple of carpenters to put the fire out by beating it with
poles. This didn't work and actually spread some sparks, setting fire to the loft.
The actors -- laudably -- wanted to avoid a panic, so they announced that the fire was
part of the show, and that people shouldn't freak out, but once the audience realised,
they stampeded. And they had trouble getting out. We have a real stampeding herd
problem here: there was only one stairway down from the cheap seats at the top, and
everyone was trying to use it at once. It filled with smoke. There were no fire escapes and
some exits were locked to keep out gate crashers so people couldn't get out
that way.
278 people died. At the time, it was the worst theater fire in US history. It's now the
third worst because we really don't learn.
accountability: prosecutions
new laws for exits and encumbrances
automated response: sprinklers!
better buildings
1876
The jury blamed the theater owners for not obeying a bunch of existing fire laws, and
new laws were written, including widening exits and not storing stuff on the stage. In
1882, the building code said that theatres had to have automatic sprinklers: it's the
first type of building in the city to require sprinklers. The first automated response.
What I find remarkable is that this fire happened nine years after regulation said that
tenements had to have safe exits, but those laws didn't carry over to theatres, or to
other types of buildings like hotels, schools, factories, ships, offices. I'm going to
spare you most of the horror stories, but we'll look at factories in a minute, after…
1890-
1901
Even more
Tenement
House Acts!
...we get proper no-kidding tenement regulation at last! And we even do it without a
bunch of people dying! Thank you Jacob Riis!
In 1890, this guy called Jacob Riis published a book about tenement life called How
the Other Half Lives and did a lecture tour on it. And up until now the upper and
middle class people of New York City had sort of known the tenements were awful,
but for the first time ever, there were photographs. It was harder to ignore. Well, it was
probably part empathy, part fear of smallpox coming out of there but, whatever, over
the next decade, people started to care.
I was really reassured when I read this, because until then it had been all “there was a
horrific fire and we added a very specific law and then there was a different horrific
fire and we added a different very specific law”. And it was mostly like that! But this
Tenement House Act came from someone saying “wow, look how much this sucks” in
a compelling way. And that gives me hope!
Anyway, the next couple of Tenement House Acts included having to have actual
windows, not air shafts, and fire escapes couldn't be ladders any more: they had to
have open balconies and stairs and be properly attached to the wall. Even better:
your neighbours can no longer boil oil in the basement! Hurray! And all new
construction has to have interior fire partitions. Failure domains!
We're finally looking at stopping fires from starting and spreading, not just escaping
from them. And, best of all, it’s all actually going to be enforced. Welcome to the 20th
century!
But, oh yeah, it still sucks in factories.
--
https://commons.wikimedia.org/wiki/Category:How_the_Other_Half_Lives#/media/File:How_the_Other_Half_Lives_front_cover.png Public domain.
http://www.americanyawp.com/text/how-the-other-half-lived-photographs-of-jacob-riis/
Public domain because pre 1923.
https://commons.wikimedia.org/wiki/File:Jacob_Riis_portrait.jpg Public domain
because pre 1923.
The Newark
Factory Fire
1910
The Triangle Shirtwaist fire is the famous one, but the Newark factory fire a few months
earlier is a textbook disaster waiting to happen so I wanted to talk about it.
This building had two fire escapes -- look at the size of this building! One of them was
a really heavy ladder that needed to be lifted into place. Another emergency plan that
only worked for people with good upper body strength. In the fire, the young women
who worked in this factory weren't able to lift down the ladder. So... only one fire
escape.
--
http://www.oldnewark.com/histories/factoryfirearticle.php
no isolation
no monitoring
ignored warnings
delayed escalation
what happened?
1910
blameful culture
restricted access
untested contingency plans
no drills
etc, etc
The building was shared by a couple of paper box companies, a nightgown factory
and a lamp manufacturer. It had previously been used by machine companies and the
floors were soaked in oil.
A fire started in the lamp factory. There was no fire alarm, and the bottom three floors
had evacuated before they realised that 116 people up on the 4th didn't know there
was a fire.
This building had had ten fires in ten years and the buildings department had
condemned this factory three times, but the factory owners basically ignored them
and kept running. All of that was expensive for insurance and they didn't want another
fire on their record, so they delayed calling in the firefighters, even though the
firehouse was just across the street.
The firehouse had a policy of reprimanding their firefighters for false alarms -- no
blameless post-mortems here! -- so before raising a general alarm, they sent a
couple of guys over with a fire extinguisher, delaying the real response even more.
The only door up to the 4th floor was kept locked, which was against the law. The
windows wouldn't open and the victims had to break glass with their hands. The
window sills were four feet off the ground and the platform up to them broke under the
weight of people trying to get out.
And the victims had never been in a fire drill and they had no idea what to do. They,
quite reasonably, freaked out.
25 people died, 32 more were very badly injured.
I feel like I could spend an hour just talking about this fire. There's so much to learn
from it.
---
(http://www.oldnewark.com/histories/factoryfirearticle.php is really good and I
recommend it, if you don't mind being angry)
Human error is never the root cause
When officials investigated, they said the root cause was not the walls soaked in
grease, or delaying calling firefighters, or the locked door, or the lack of smoke
alarms or the unusable fire escapes. It was that "the victims merely succumbed to
panic".
The way humans react to a disaster can definitely make the situation worse --
remember those carpenters with sticks in the theater -- but that is in no way their fault.
Humans will act in human ways. If your systems can't handle that, and you haven't
invested a lot of time in training the humans to act in some other way, your systems
are crap.
---
State Farm CC BY 2.0
Ref: https://www.uvm.edu/histpres/HPJ/AndreThesis.pdf
“They died from misadventure and
accident.”
outcome...?
1910
Coroner's Jury,
December 1910
So what happened? Nothing. The jury didn't convict, though at least one juror later
said he regretted it. New Yorkers did look a bit at their factories and say "huh, I
wonder if we should care about that"... but nothing changed. Is it because it
happened ten miles away instead of on the island of Manhattan? No idea. The New
York Fire Chief said "This city may have a fire as deadly as the one in Newark at any
time".
Four months later…
---
"They died from misadventure and accident" from
http://www.nytimes.com/2011/02/24/nyregion/24towns.html
"This city may have a fire as deadly as the one in Newark at any time." from
http://trianglefire.ilr.cornell.edu/primary/testimonials/tf_warnings.html
1911
The Triangle
Shirtwaist
Factory
146 people died inside 18 minutes. The famous Triangle Shirtwaist Fire.
--
https://timesmachine.nytimes.com/timesmachine/1911/03/26/104859694.html
https://en.wikipedia.org/wiki/Triangle_Shirtwaist_Factory_fire#/media/File:Image_of_Triangle_Shirtwaist_Factory_fire_on_March_25_-_1911.jpg Public domain.
http://www.baruch.cuny.edu/nycdata/disasters/fires-triangle_shirtwaist.html
what happened?
no isolation
obsolete contingency plans
restricted access
ignored warnings
1911
This building was considered fireproof. They had done it right. They built a good
building. But it was packed with garments hanging so tightly together that the building
might as well have been made out of cloth.
The building should have had three fire escapes; it had one and that collapsed under
the weight of people escaping. Firefighters came but the fire ladders and the water
could only get to the 6th floor and the city had gotten taller again: the factory was on
the 7th to 9th.
One exit was locked; the guy with the key escaped without unlocking it.
And the employers already knew about the problems. Employees had organised a
strike the previous year to protest the working conditions, and they'd been fired. The
building had had a recent warning notice from the department of sanitary control, but
they hadn't fixed their violations.
better tools: stronger pump, longer
ladder
better incident response
1911
The fire department developed a stronger water pump and a longer ladder, so
they could reach taller buildings.
laws: 60 in three years
automated response: sprinklers
accountability: the American Society of
Safety Engineers
better buildings
1911
But more importantly, building conditions took a big step forwards. There were 60
new laws over the next three years. Again, everyone knew factories were bad. But,
again, the law didn't change until a bunch of people died ON THE ISLAND OF
MANHATTAN.
Sprinklers started to be required in factories. (But only factories over seven stories
tall. Very specific again.)
A professional organisation, the American Society of Safety Engineers (which still
exists), was founded.
--
After the fire, the owners of Triangle Shirtwaist factory, Harris and Blanck, were
brought to court on charges of manslaughter but were eventually acquitted. They
were fined $75 for each life lost. However their insurance policy paid them a total of
$60,000, at the rate of $400 per life lost, so they actually profited from the tragedy.
Two years later, they were still locking exit doors and were fined for several
safety code violations. The worst people :-(
Phil Roeder
CC BY-2.0
"...a type of exit condemned by
the experience of many fires"
NFPA report, 1914
And at last, people started to look at fire escapes differently. After the disaster, a
report called them "a pitiful delusion" and "a type of exit condemned by the
experience of many fires".
---
http://www.nfpa.org/News-and-Research/Publications/NFPA-Journal/2014/September-October-2014/Features/Fire-Escapes/1914-Sound-the-Alarm
Barbara L Hanson CC BY 2.0
Dan DeLuca CC BY 2.0
Eden, Janine and Jim CC BY 2.0
don toye CC-BY-ND 2.0
Kristine Paulus CC-BY-ND 2.0
"...a type of exit condemned by
the experience of many fires"
NFPA report, 1914
The report called out a lot of reasons fire escapes are terrible:
● the platforms are too small
● people put stuff on them
● they don't get a lot of maintenance
● snow and ice make them slippy and dangerous
But most importantly
● they never, ever get tested.
---
Images:
Kristine Paulus CC BY 2.0. https://flic.kr/p/fszEDf (plants)
Dan DeLuca CC BY 2.0. https://flic.kr/p/5hsnTM (chairs)
Eden, Janine and Jim. CC BY 2.0. https://flic.kr/p/7G1tWZ (snow)
Barbara L. Hanson. CC BY 2.0. https://flic.kr/p/8uxpcf (rain)
Don toye, CC BY 2.0 https://flic.kr/p/9XrAs (bike)
“ ... fire escape collapses during
times of intense use – such as
during actual fires.”
John W. Cramer, The Story
of a Tenement House
Fire escapes were known to collapse during times of intense use. But they
pretty much have one time of intense use. If they're going to collapse, it's
going to be during a fire.
So what do we do?
We have a couple of options here. We can add more regulations around fire escapes:
you have to maintain them, you have to try them out every year! There actually was a
law about regularly painting your fire escape. To prevent slipping you have to
build a textured floor into the fire escape and leave a pair of shoes with good grips on
the top of each one… Or we could step back and ask whether we're optimising for the
wrong thing.
----
Quote via
http://www.boweryboogie.com/2014/10/favorite-pastime-tenement-fire-escapes/
A photo called "Fire Escape Collapse" received a Pulitzer in 1976. It's fairly
harrowing, so I'm not linking it here -- extreme content warning if you decide
to go look at it -- but it made Boston rewrite its fire escape safety laws.
Journalists are amazing.
New York Times, February 25th, 1923
1923
In 1923, the New York Times had an article praising fireproof interior walls: "For six
years there has been no loss of life by fire in the 200 buildings so treated."
It blows my mind that a group of 206 buildings having no fire deaths in six years was
considered newsworthy.
In 1929 those fireproof walls became code: all new buildings over 75 feet in height
had to have them, and also had to have two fully enclosed staircases! Failure
domains are part of the code at last!
---
https://timesmachine.nytimes.com/timesmachine/1923/02/25/105849722.html?pageNumber=141
1968
"Fire escapes
shall not be
permitted on new
construction"
John VanderHaagen CC BY 2.0
The idea of building better buildings gained traction and in 1968 fire escapes stopped
being allowed at all. The code still says "Fire escapes shall not be permitted on new
construction".
The 1968 code also required sprinklers for hotels and high-rise office buildings, but
not nightclubs or residential buildings.
----
" Fire escapes shall not be permitted on new construction, with the exception of group
homes. Fire escapes may be used as exits on buildings existing on December sixth,
nineteen hundred sixty-eight when such buildings are altered, subject to the approval
of the commissioner, or as provided in subdivision (b) hereof. "
https://commons.wikimedia.org/wiki/File:New_York,_New_York,_April_1968.jpg CC
BY 2.0
More fires. More very specific laws.
1975 - 2018
● In 1975, seven people died in a nightclub, so sprinklers were required for
nightclubs.
● In 1998 there were two bad residential fires, and now you have to have
sprinklers for residences with four or more units.
● And I'm sure this story is not over and the code will be expanded many more
times in response to very specific things in which a bunch of people die.
Btw, there's no retrofitting of existing buildings. The laws only apply to new buildings
and existing buildings get better as they're renovated. So buildings in NYC comply with
the safety standard of whenever they were renovated last. Think about that, wherever
you sleep tonight.
---
https://pxhere.com/en/photo/900057 CC0
Fire deaths
decreased
because we
built better
buildings.
So that was 150 years of fire codes. For decades we considered it inevitable that
fires would start and spread, and we optimised for escaping from them. And we
definitely got good at responding to massive fire disasters. But slowly we made
progress on other, more important parts of the fire life cycle. Which I'm going to
describe in four stages:
https://commons.wikimedia.org/wiki/File:An_Old_Rear-Tenement_in_Roosevelt_Street.png
Public domain.
1 prevention
making it harder for the fire to start
We prevented sparks. A certain amount of sparks are ok! We need to cook food and
have birthday candles. But by becoming more deliberate about when we make
sparks, we made it harder for the fire to start at all. We moved bakeries out of
residential buildings, began doing wiring inspections, did public safety campaigns
about cooking and smoking.
---
https://www.pexels.com/photo/fire-match-smoke-flame-54627/ CC0
2 detection
stopping it while it's small
We worked on detection and immediate amateur response: smoke alarms, fire
blankets, fire extinguishers, and more public safety campaigns. And we introduced
sprinklers.
---
https://commons.wikimedia.org/wiki/File:Fire-blanket-on-display.jpg
Public domain
3 isolation
preventing it from spreading
3. We introduced failure domains, to keep the fire to one small part of the building or
city. We started using materials that were hard to ignite so the fire would spread
slowly. And we did fire drills, to move humans quickly and safely away from the
danger area and to prevent the kind of panic that makes things worse.
--
https://unsplash.com/photos/MApjpqu9V7E CC0
4 response
okay, we're fighting a fire
And only then, 4, emergency response. We also got better at responding to massive
fires. The New York Fire Department is *very good*.
But step 4, this is our last resort and we should try not to rely on our last resort. We
gained more from stopping the fire from getting to this point.
And, if you missed my extremely subtle metaphor here, it's the same for
software.
---
Image: skeeze. CC0. https://pixabay.com/en/firefighters-training-live-fire-696167/
reliability is everyone's job
1 prevention
2 detection
3 isolation
4 response
The most important reliability work is making problems stop before they get to that
fourth stage.
This means that reliability is everyone's problem. Everyone who's writing code or
designing systems should have reliability in mind.
Yeah, some people have a site reliability team. Just as we have people who specialise in
UI or security, both of which we should all care about, we can have people who specialise
in reliability and advocate for it. But, while SREs may occasionally act as firefighters, the
more important part of their job is to be the fire safety engineers, handing out smoke
alarms, legislating fire partitions, pointing out buildings that are made of wood,
advocating for the removal of clutter, educating everyone.
The part of their job which is being last resort firefighters? That skillset should be used
rarely. You don't want the FDNY running into your kitchen every time you burn toast. If
you're calling them in, it's a sign that something's gone horribly wrong. But it's still very
common to have firefighters reacting to every software problem.
There's a really nice tradition in the ops and SRE communities, where if a site is
down, people send #hugops on twitter to the people working on it. I want to
particularly call out Baron Schwartz sending hugops in advance to people running
mail servers on GDPR day :-D
I love #hugops. I send #hugops. But one thing you'll notice if you follow the hashtag is
that… a lot of things break and nobody is really surprised.
We're at the stage of software evolution where we expect software to fail. We need
to build better buildings in software too.
And that means we think about those same four stages.
---
Tweets used with permission.
1 prevention
making it harder for the fire to start
Just like with buildings, a certain amount of sparks are fine for us too! We need to
make changes. Maybe something gets overloaded or a user does something we
didn't plan for. Many of us use the concept of error budgets: depending on how close
we are to missing our SLAs, we make more or fewer changes.
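
To make the error budget idea concrete, here is a minimal Python sketch. The 99.9% availability target, the request counts, the thresholds and the function names are all made up for illustration; a real team would pick its own target and its own policy.

```python
def error_budget_remaining(target, total_requests, failed_requests):
    """Return the fraction of the error budget still unspent.

    The result goes negative once more requests have failed than the
    target allows.
    """
    allowed_failures = (1.0 - target) * total_requests
    if allowed_failures == 0:
        return 0.0
    return 1.0 - failed_requests / allowed_failures


def change_policy(remaining):
    """Hypothetical policy: make fewer changes as the budget runs out."""
    if remaining <= 0.0:
        return "freeze"    # budget spent: only reliability fixes go out
    if remaining < 0.25:
        return "cautious"  # nearly spent: fewer, smaller changes
    return "normal"        # plenty left: ship as usual


# 1,000,000 requests at a 99.9% target allows roughly 1,000 failures.
remaining = error_budget_remaining(0.999, 1_000_000, 400)
print(change_policy(remaining))
```

The point of the sketch is only that the decision to make more or fewer changes can be driven by a number, rather than by how nervous everyone feels that week.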
We can reduce our sparks:
---
https://www.pexels.com/photo/fire-match-smoke-flame-54627/ CC0
hiding the matches
Michael Chen CC BY 2.0
We can think about how users use our tools and provide clean, safe, validated
interfaces that are hard to get wrong. We can restrict their access to functionality or
data they don't need. A stove igniter is a better tool than a box of matches.
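
One way to picture a "stove igniter" interface is a small wrapper that refuses risky requests before they reach the real tool. This is a sketch under stated assumptions: the function name, the environment names and the 25% limit are hypothetical, not something from a particular system.

```python
ALLOWED_ENVIRONMENTS = {"staging", "production"}  # hypothetical names
MAX_RESTART_FRACTION = 0.25  # assumed limit: at most a quarter at once


def plan_restart(environment, instances, fraction):
    """Validate a restart request and return the instances to restart.

    Raises ValueError instead of letting a risky request through.
    """
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    if not 0 < fraction <= MAX_RESTART_FRACTION:
        raise ValueError(
            f"fraction must be in (0, {MAX_RESTART_FRACTION}], got {fraction}"
        )
    # Always restart at least one instance, never more than the limit allows.
    count = max(1, int(len(instances) * fraction))
    return sorted(instances)[:count]
```

A typo in the environment name or a request to restart everything at once fails loudly here, which is much cheaper than finding out in production.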
---
https://flic.kr/p/LdPYz Michael Chen CC BY 2.0
operating with care
Reproduced from NFPA's website, © NFPA (2018).
The fire department recommends that you don't operate a stove while drunk or
sleepy, and the same goes for root prompts or code merges. Many outages are
caused by changes, so we can make them deliberately and carefully, with design
review, code review and change management.
---
http://www.nfpa.org/termsofuse
Liberal use of NFPA fact sheets and news releases is allowable with attribution. Please use
the following: "Reproduced from NFPA's website, © NFPA (year)." NFPA does not grant
permission for its content to be displayed on other Web sites.
State Farm CC BY 2.0
wiring inspections
We can make it a standard to inspect our systems, looking for regressions, looking for
what has bitrotted or become overloaded. A thorough test suite is like a wiring
inspection that runs on every deploy.
And we can do chaos engineering: continually testing the system's resilience against
chaotic events.
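
As a toy version of that idea, the sketch below injects failures into a call and checks that a retry wrapper copes. Both helper names (flaky, call_with_retries) are invented for this example, and real chaos engineering tools work at the level of machines and networks rather than single functions.

```python
import random


def flaky(func, failure_rate, rng=None):
    """Wrap func so it raises ConnectionError with probability failure_rate."""
    rng = rng or random.Random()

    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise ConnectionError("injected fault")
        return func(*args, **kwargs)

    return wrapper


def call_with_retries(func, attempts=3):
    """Call func up to `attempts` times, re-raising the last ConnectionError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except ConnectionError as err:
            last_error = err
    raise last_error


# Inject faults into half of the calls and see whether retries cope.
unreliable = flaky(lambda: "ok", failure_rate=0.5, rng=random.Random(42))
try:
    print(call_with_retries(unreliable, attempts=5))
except ConnectionError:
    print("retries were not enough this time")
```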
--
https://flic.kr/p/duWtgw State Farm CC BY 2.0
2 detection
stopping it while it's small
But, ok, sometimes, inevitably, things go wrong. We have an opportunity to put this
fire out while it's tiny.
---
https://commons.wikimedia.org/wiki/File:Fire-blanket-on-display.jpg Public domain
topquark22 CC BY 2.0
smoke alarms,
fire extinguishers
Humans can react quickest if the right fire extinguishers are available. Provide a
one-click rollback for all your changes. Use canaries: push the change to one
instance before pushing it to all the instances. And launch with feature flags to push out
new features in a way that makes it very fast to turn them off if you need to.
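
A canary rollout can be sketched in a few lines. The deploy, is_healthy and rollback arguments are hypothetical hooks into whatever deployment tooling you use; nothing here is a real tool's API.

```python
def canary_rollout(instances, deploy, is_healthy, rollback):
    """Deploy to one instance first; continue only if it stays healthy.

    deploy, is_healthy and rollback are caller-supplied callables that
    each take one instance. Returns the instances now running the change.
    """
    if not instances:
        return []
    canary, rest = instances[0], instances[1:]
    deploy(canary)
    if not is_healthy(canary):
        rollback(canary)  # the fire extinguisher: undo before it spreads
        return []
    for instance in rest:
        deploy(instance)
    return list(instances)


# Usage with stand-in hooks that just record what happened.
deployed, rolled_back = [], []
result = canary_rollout(
    ["web-1", "web-2", "web-3"],
    deploy=deployed.append,
    is_healthy=lambda instance: False,  # pretend the canary went bad
    rollback=rolled_back.append,
)
print(result, deployed, rolled_back)
```

In the usage example the bad change only ever reaches web-1, which is the whole point of the canary.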
Alerts need a fine balance, as everyone knows who’s ever had an over-enthusiastic
smoke alarm in their kitchen. An occasional false alarm is ok, but having humans
continuously react to small problems can burn them out. It's using up your gunpowder
on small fires and not having enough left for the big ones! So aim to keep your false
alarms low.
---
https://flic.kr/p/6AcBru topquark22 CC BY-2.0
https://pixabay.com/en/fire-extinguisher-fire-delete-99915/ Public domain.
HomeSpot HQ CC BY 2.0
sprinklers
But even better, don't get humans involved at all for small things. Add automatic
recovery. If a machine dies, it should automatically be replaced. If a backend goes
missing, we should be able to coast for a while. Health checking and load balancing
should move traffic from an unhealthy region to a healthy one.
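
The traffic-shifting part of this can be sketched as a function that prefers one region but quietly moves on when it is unhealthy. The region names and the is_healthy callable are placeholders; a real load balancer does this with continuous health checks rather than a dictionary.

```python
def pick_backend(backends, is_healthy, preferred):
    """Return the preferred backend if healthy, else the first healthy one.

    Returns None when nothing is healthy, so the caller can fall back to
    a cached or degraded response and coast for a while.
    """
    if preferred in backends and is_healthy(preferred):
        return preferred
    for backend in backends:
        if is_healthy(backend):
            return backend
    return None


# Hypothetical health state: the preferred region has just gone bad.
health = {"us-east": False, "us-west": True, "eu-west": True}
chosen = pick_backend(list(health), health.get, preferred="us-east")
print(chosen)
```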
Maybe you want to let humans know, but the message they should get is "everything
is under control but you might want to look at this when you get a chance". Not
"WELCOME TO 3AM! A MACHINE DID A THING".
--
https://flic.kr/p/fmr7a7 HomeSpot HQ www.homespothq.com
3 isolation
preventing it from spreading
Stage 3: Ok, there's a fire, it's happening. Now we want to not let it get on anything it's
not already on.
--
https://unsplash.com/photos/MApjpqu9V7E CC0
Achim Hering CC BY 3.0
fire barriers
Failure domains split our systems up so that only one part should be affected by
any given outage. And if the problem's going to move as components get overloaded,
we want that to be slow enough that we can control it, not an immediate cascade. And
we have our own version of moving bakeries out of residential buildings: we can
isolate risky customers on their own replicas or shards.
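
Here is a small sketch of what isolating risky customers might look like. The customer and shard names are invented, and the hashing scheme is just one simple way to get a stable assignment.

```python
import hashlib


def assign_shard(customer_id, shared_shards, dedicated):
    """Pick a shard for a customer.

    `dedicated` maps known-risky customers to their own shard (the bakery
    moved out of the residential building). Everyone else is spread over
    `shared_shards` with a stable hash, so a given customer always lands
    on the same shard.
    """
    if customer_id in dedicated:
        return dedicated[customer_id]
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shared_shards)
    return shared_shards[index]


shared = ["shard-0", "shard-1", "shard-2"]
dedicated = {"big-batch-co": "shard-risky-1"}  # hypothetical risky customer
print(assign_shard("big-batch-co", shared, dedicated))
print(assign_shard("alice", shared, dedicated))
```

If big-batch-co sets its shard on fire, the customers on the shared shards are in a different failure domain and should not notice.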
---
https://commons.m.wikimedia.org/wiki/File:Durasteel_fire_barrier.jpg
State Farm CC BY 2.0
fire drills
Just like we make it incredibly common to hear a smoke alarm and find our way
outside, make it so that a disaster is never a surprise. Humans will panic the first time
they hit a situation that's outside their comfort zone. At intervals, tell people you're
doing a controlled outage, and take a system offline.
---
https://pixabay.com/en/safety-helmet-construction-hat-295057/ CC0
avoiding
encumbrances
You know the phenomenon where you're fixing something and you hit a bunch of
unintuitive commands, or out of date documentation, and it ends up taking you much
longer to do something simple? Or you even end up breaking something else? These
traps are a basement full of straw, or a fire hose with cluttered scenery on top of it. It's
making it very, very hard for you to move around safely as you try to fix the real
problem. Push back on technical debt and clutter.
Fatigue is an encumbrance too. You're way more likely to make a mistake if you're
exhausted. Set rules about how long a person should deal with an incident before
their on call shift is over and someone else needs to swap in. Enforce those rules.
--
photo by me.
4 response
okay, we're fighting a fire
And sometimes we will still get to stage 4, fighting a massive outage. But we should
aim to not get here often. Firefighting is not good for your SLAs and it's also not great
for the health of the humans involved.
---
Image: skeeze. CC0. https://pixabay.com/en/firefighters-training-live-fire-696167/
controlled burns
Jereme Rauckman CC BY 2.0
Ideally we'll get to a point where our firefighters mostly train using controlled outages,
like many real fire departments do. But we're not there yet.
Many of us are still fixing unreliable software by focusing on this fourth stage, with
human response and escape routes...
--
https://flic.kr/p/pjPGD6
Software
without
built-in
reliability?
That's a
tenement.
..., that means they're building tenements. Foul air is coming in through the air
shafts, and it's not somewhere humans should live. Reliability can't be added after
the building is finished. It needs to be built in. Failure needs to be built in.
Building better buildings makes a huge difference.
---
https://commons.wikimedia.org/wiki/File:Two_officials_of_the_New_York_City_Tenement_House_Department_inspect_a_cluttered_basement_living_room,_ca._1900_-_NARA_-_535469.jpg
Public domain.
NYC had 48 civilian
fire deaths in 2016.
That's the lowest in
100 years.
Reprinted with permission from NFPA Report U.S. Fire
Death Rates by State copyright © 2017,
National Fire Protection Association, Quincy, MA. All
rights reserved.
In 2016, 48 people died by fires in New York City. This is still a lot of people! But 2016
was the lowest number since they started recording a hundred years ago, even
though the population of the city continues to grow.
That Bronx fire in December that killed 12 people was the deadliest in 25 years. How
did we get from the fire traps of the 1800s to here?
---
https://www.nfpa.org/News-and-Research/Fire-statistics-and-reports/Fire-statistics/Fires-in-the-US/Overall-fire-problem/Fire-deaths-by-state
Used with permission from copyrightrequests@nfpa.org
444 →
pages!
Well, this helped. This is the New York City fire code. It has 444 pages and costs
$140, which I know because I really wanted to bring one in here today and
dramatically wave it at everyone. The guy at the library was really confused about why
I'd want a physical copy. He was like "Look, do you have access to the internet?"
---
Book: http://shop.iccsafe.org/2014-new-york-city-fire-code.html
Fire code:
https://www1.nyc.gov/site/fdny/about/resources/code-and-rules/nyc-fire-code.page
444 →
pages!
And fire safety is also mentioned plenty in the city building code, the city construction
code, the state building code, the National Fire Protection Association electrical code and
I’m sure plenty of other dense legislation. Don’t ask me what's in each of these.
There’s a lot of code, that’s all I’m saying.
But we don't have a fire code for software. We have a bunch of O’Reilly books and
they're great. But nothing makes us adhere to our best practices, or prioritises one set
of rules over the others. Why don't we have a fire code yet?
“
"No computer
software failure has
killed or injured a
large number of
people.
It is just conceivable
that such a tragedy
could occur." Software: A Vital Key to UK Competitiveness
(C) Crown Copyright 1986
via Risks Digest (https://catless.ncl.ac.uk/Risks)
h/t Joe Thompson @caffeinepresent
1986
It has been proposed from time to time!
I found this report from 1986 called "Software: a vital key to UK competitiveness",
which had a whole appendix on safety critical software. It starts with “No computer
software failure has killed or injured a large number of people. It is just
conceivable that such a tragedy could occur.”
----
https://catless.ncl.ac.uk/Risks/4/14#subj3.1
https://twitter.com/caffeinepresent/status/945079032445620226
https://pxhere.com/en/photo/1111021 CC0
“
"Each life-critical system
must be operated by a
Certified Software
Engineer who is named as
being personally
responsible for the
system."
Proposal from the UK
Advisory Council for
Applied Research and
Development, 1986
1986
The Advisory Council predicted a time when it wouldn’t be possible to recover from
software failure by just switching off the computer and doing the thing manually -- this
was written in 1986, remember. We're there now. They wanted certification: you
would only be able to operate a life-critical computer system if you had a license and
a Certified Software Engineer to sign off on it -- and they would be personally
liable! -- and a bunch of other stuff, and you'd have to get re-certified every five
years.
They also proposed what’s basically on call shifts, disaster recovery practice drills,
and post-mortems, including post-mortems for near misses. A lot of this feels
prescient and we ended up doing it, but we never required certification.
---
https://catless.ncl.ac.uk/Risks/4/14#subj3.1
https://twitter.com/caffeinepresent/status/945079032445620226
https://pxhere.com/en/photo/1111021 CC0
slide from @jkuroda's
amazing LISA 2017 keynote.
Used with permission.
If you were at LISA in November, you might have seen Jon Kuroda's fantastic closing
keynote about aviation safety. Like buildings, plane travel got safer only after a lot of
bad accidents.
Jon pointed out that, while we might think of computing as a new field, it's the same
age as a bunch of others. Software, aviation, power, emergency medicine all took a
big jump forward after World War 2. But our industry is significantly less mature than
any of the others.
----
https://people.eecs.berkeley.edu/~jkuroda/talks/jkuroda-systemcrash-planecrash-lisa2017.pdf
Image by me.
The stakes are lower?
Is that because the stakes are lower? It's at least part of the reason. Mostly, the
stakes have been lower. Software mostly hasn't had the ability to cause
massive disasters.
Researching this talk, I read a ton about deaths from software -- it really was a
cheerful time creating this talk -- and found surprisingly few. Most of the news about
software and deaths was about how software is IMPROVING things. By making
processes repeatable and precise, we're saving lives.
But we have had some famously dangerous software bugs.
The stakes are lower?
Ars Technica, August 2013
The Independent, October 1992
New York Times, June 1986
The Therac-25 radiation therapy machine had a concurrent programming bug that
made it occasionally give its patients radiation doses that were hundreds of times
greater than they should have been. Three people died.
In college I remember studying the London Ambulance dispatch failure. A new
software system was deployed that hadn't been load tested, and it had a memory
leak. It couldn't keep track of where the ambulances were, which led to them arriving
hours late. 46 people died who might have been ok if the ambulance had arrived on
time.
And some near misses. Like, I haven't heard of any actual negative outcomes from
the OCR bug that went around in 2013, but you can see how it might end up with
numbers in prescriptions or structural engineering documents being catastrophically
wrong.
And the news is full of software concerns in vehicles, self-driving or otherwise.
--
https://www.nytimes.com/1986/06/21/us/fatal-radiation-dose-in-therapy-attributed-to-computer-mistake.html
https://www.independent.co.uk/news/ambulance-chief-quits-after-patients-die-in-computer-failure-1560111.html
https://www.wired.com/2009/10/1026london-ambulance-computer-meltdown/
https://arstechnica.com/information-technology/2013/08/confused-photocopiers-randomly-rewriting-scanned-documents
https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=426747
“"It took a Newark fire
and a Triangle fire to
bring New York State's
fire legislation to its
present inefficiency."
Inis Weed, New Outlook
volume 104, 1913
1913
But none of those has been our Triangle fire. So far software has been able to kill
people one or a few at a time. We haven’t had the wide-scale disasters that have
shocked other industries into growing up.
Aviation regulations came from a bunch of people dying. Mining regulations came
from a bunch of people dying. Professional engineering organisations came from a
bunch of people dying. To quote my new favourite 1910s journalist, Inis Weed, "It took
a Titanic disaster to improve the safety of vessels. It took a Newark Fire and a
Triangle fire to bring New York State's fire legislation to its present inefficiency".
The use of software for life-critical systems grows every year. And every day we send
#hugops on Twitter to the people working on the latest massive software outage. At
some point these will overlap. Hope is not a strategy.
Are we ready for this kind of responsibility?
We, all of us here, are people who are responsible for software. The world will need a
lot of software over the next few decades. Some people in this room will run life
critical systems. We are 1890s landlords looking at a whole lot of new opportunity. We
know there's money to be made from cutting all of the corners, but we have a choice.
I don't want us to wait for a disaster...
------------------
New Outlook, volume 104. https://books.google.com/books?id=URCzNkpDZp0C
Inis Weed or Inis Weed Jones made topics like medicine, sociology and science
exciting for regular people. She wrote extensively for Harper’s, Scribner’s and the
Reader’s Digest. She lived, at least for a while, at 337 West 22nd St. She wrote tons
about working conditions and humanised anonymous workers. She was an
investigator for the US Commission on Industrial Relations. She wrote articles like
The Reasons Why The Copper Miners Struck (about a strike), and Safer Childbirth
with Less Pain, and Acne: the Plague of Youth and Not By Bread Alone (about young
people returning to farming). She also published a book called "Peetie: the story of a
real cat", which is $72 on abebooks.com and I won't deny, I'm tempted. She reads like
a tremendously compassionate person who wrote about things people needed to care
about in an engaging way and made them care. (Please don't be a milkshake duck,
Inis).
Let's choose not to build tenements.
...to decide not to build tenements.
Remember, some regulations didn't come from fires! Some came from a lot of people
deciding to care about the same thing at the same time.
We can decide now what good systems look like. We can create professional
standards and industry safety codes, and create and opt in to a professional
organisation to keep ourselves honest. And then, like the fire code, we can keep
revising and improving it until huge software outages are rare and shocking.
The entire industry should learn from every major outage. No secrets.
http://noidea.dog/fires
● Fire Escapes in Urban America: History and
Preservation, Elizabeth Mary Andre
● No exit: the rise and demise of the outside fire
escape: Sara E Wermiel
● How Fire Disaster Shaped the Evolution of the
New York City Building Code, Charles Shelhamer
● The Creative and forgotten fire escape designs of
the 1800s, Lauren Young
● New Outlook vol 104 (May-August 1913)
● RISKS Digest
● 1910 Newark Factory Fire, Mary Alden Hopkins
● New York City (NYC) Disasters, Baruch College
● Presentation template by SlidesCarnival
Questions? Comments?
Find me at @whereistanya
or fires@noidea.dog
#GetAlarmedNYC
Before I finish: if you're in New York, the FDNY and the Red Cross have a
shared campaign to give people free smoke alarms and free batteries. They'll
even come install it for you. If you don't have a smoke alarm, please search for
#GetAlarmedNYC and fill in their form. http://fw.to/Kzv1G4f
(Two SREs live in my apartment, so we already have two redundant meshes of
networked alarms from different manufacturers and also a few standalone alarms.)
This slide lists a few references that I found especially useful or interesting while
writing this talk. That first one contains a list of all the others, so hit up
http://noidea.dog/fires if you want a lot of links to read more about fires and fire
escapes.
If you have comments on the talk, or questions or you're a building historian who is
willing to tell me what I got wrong, you can find me at @whereistanya on Twitter or
fires@noidea.dog.
---
https://commons.wikimedia.org/wiki/File:Smoke_alarm.JPG CC0

OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 

The History of Fire Escapes

  • 1. The history of fire escapes Tanya Reilly @whereistanya Abstract: When a datacenter goes offline, a server gets overloaded, or a binary hits a crashing bug, we usually have a contingency plan. We reduce damage, redirect traffic, page someone, drop low-priority requests, follow documented procedures. But why do many failures still come as a surprise? In this talk, we look at some real life analogs to preventing and managing software failures. Fire partitions. Public safety campaigns. Smoke alarms. Sprinkler systems. Doors that say “This is not an exit”. And fire escapes. What can we learn from the real world about expecting failure and designing for it? --- https://commons.wikimedia.org/wiki/Category:Fire_escapes#/media/File:ISO_7010_E016.svg Public domain. Slide template started as Oivia from SlidesCarnival and then drifted into something very else.
  • 2. "When we first dropped our bags on apartment floors…" Welcome To New York Taylor Swift Good morning! So, I'm a New Yorker. I'm not from the US -- I'm an immigrant -- but one of the many things I love about New York City is that you move here, and it’s immediately your city. The number one criterion for being a New Yorker is wanting to be a New Yorker. It's a welcoming place. So good morning to my fellow New Yorkers, wherever you're originally from, and, if you've travelled to be here, welcome to New York. We're glad to have you. I work in Site Reliability and I'm especially interested in what happens when things fail, the contingency plans we use to recover when something breaks. And last year I was thinking about that a lot and walking around the city and I started really noticing that New York is *covered* in fire escapes. They’re a contingency plan too. They’re for incident response. You don’t use them until all of your regular methods of getting out of the building have failed. So I started reading about fire escapes. ---- https://unsplash.com/photos/Iyd__3m4XF8 CC0
  • 3. content warning: fire Before I say more about that, let’s talk content. This talk looks at disaster prevention and disaster recovery in software, by looking at parallels in building fires. This will include stories of some of the worst fires in the history of New York City. We'll be looking at the reasons fires started, the stuff that helped them spread and how people died. There are also some pictures of buildings on fire. Nothing lurid, but there are pictures. If you have raw feelings related to recent fires, this could be rough. If you'd be more comfortable skipping this one, you should do that with my blessing. While you're packing up, I'll even tell you what I'm going to say, so you don't miss anything:
  • 4. Tony Fischer CC BY-2.0 Fireproof buildings are more effective than fire escapes. Fireproof software is more effective than incident response. Where's our fire code? Here's my thesis ● fire escapes are a hacky bit of afterthought tacked on to the outside of a building after the building is finished. If you're using fire escapes, it's worth making them as good as possible, but you’ll prevent more fires if you build better buildings. ● Similarly, incident response is often a hacky bit of afterthought tacked on long after software is released. Again, great incident response can help you recover faster than if you don’t have it but… you’ll prevent more outages if you build better software. ● Finally, buildings have an extremely detailed fire code, but we don't really have an extremely detailed systems engineering code for software, and I think we should have. Now I'm going to say the same thing but take 35 minutes. --- How Much is that Doggie in the Window? https://flic.kr/p/72Lhz1 CC BY 2.0
  • 5. Claudia Heidelberger CC BY-ND 2.0 Greenwich Village Fire escapes were really only built in New York City for a hundred years. They weren't common until the 1860s, and in the 1960s they stopped being allowed on new construction. There's some debate now about whether we should start removing them in places where the building has been upgraded, or whether they should be preserved as part of the city's history. I think at least some of them should be preserved. Look how beautiful that is! -- Claudia Heidelberger CC BY-ND 2.0. https://flic.kr/p/oqYYv1
  • 6. Dan DeLuca CC BY-2.0 East Village And here's another lovely one. They made an effort to have it match the style of the building, not feel like a separate thing tacked on at the end. And I think that's key. -- Dan DeLuca CC BY 2.0 https://flic.kr/p/76Jmb2
  • 7. "fire escapes were haphazardly attached to the most elaborately designed facades" Richard Plunz, A History of Housing in New York City But most of the time, the people adding the fire escape didn't think of it as part of the building. As this quote says, fire escapes were haphazardly attached to the most elaborately designed facades. The facade of the building was architecture but the fire escape was law. It was an external contingency plan, not part of the main structure. And I think that's part of why fire escapes ended up not being successful. --- https://books.google.com/books?id=fcKlDAAAQBAJ&pg=PA24
  • 8. A brief history of New York City fires (With apologies to actual historians) But I'm jumping to the end. Let's look at the evolution of New York City's fire code. By the way, my great fear now is that there’s a building historian in the room who will listen to this and be like “Nope, that is really not what happened." Please forgive any errors, building historian! If I made mistakes, I would love it if you would come tell me at the end!
  • 9. The Financial District 1835 On to the history. We’re skipping the great fire of 1776, and jumping straight to 1835 and the Financial District. This was a commercial, not residential area, and as a result the number of fatalities was comparatively low -- two people -- I mean, still, two too many, but this is mostly remembered as a fire that cost a LOT of money. Almost 700 buildings were destroyed. The city had 26 fire insurance companies. This fire put 23 of them out of business. --- https://en.wikipedia.org/wiki/Great_Fire_of_New_York#/media/File:The_Great_Fire_of_the_City_of_New_York_Dec_16_1835.jpg Public domain.
  • 10. no failure domains contingency plans failed exhausted incident responders what happened? 1835 The fire was caused by a burst gas pipe in a maze of wooden warehouses. Wood burns easily so there were no failure domains: the fire spread very quickly. Inside two hours it covered 17 city blocks, most of the financial district. The city's water supplies were low and the typical contingency plan was to pull water from the rivers, but it was a freezing night in December and first the firefighters had to cut through ice. At the time it was also common to use gunpowder to level buildings and stop the fire spreading. But they had used up all their gunpowder on a fire two days earlier. That fire involved the entire fire department of 1500 people, and they were still exhausted. Still, they fought the fire for 15 hours until marines from the Brooklyn Navy Yard arrived with more gunpowder and blew up some buildings along Wall Street to make a barrier.
  • 11. dedicated incident responders: a professional fire department new infrastructure: the Croton Aqueduct better incident response 1835 As a result of the fire, the city stopped using volunteer firefighters and moved to a professional force with better equipment. And they built the Croton Dam and Aqueduct. It was built because of the fire, but a reliable water source is good for lots of reasons! --- No longer in use, btw. It was replaced with the New Croton Dam, which still supplies a small fraction of the city's water. The old one is on the National Register of Historic Places.
  • 12. robust structures: they rebuilt in stone better buildings 1835 But more importantly, as well as better incident response, they took the opportunity to make a more resilient city. The fire spread fast because the buildings were made of wood. They rebuilt with stone and brick. And this paid off, ten years later, when there was another enormous fire. The great fire of 1845 was very bad -- thirty people died -- but it didn’t spread as far or as fast, because it slowed down when it hit those new brick buildings.
  • 13. 1860 Tenements Let’s jump forward 25 years and talk about tenements. Tenements were extremely dense, extremely terrible housing. I'd read about tenements but hadn't realised the scale of them. In the 1860s, nearly 500 thousand people -- more than half the city -- lived in tenements. The population of New York City doubled every decade between 1800 and 1880. Maybe you've seen this with teams and software systems: when you grow rapidly, you can build some culture problems and some technical debt. This was certainly the case here. Landlords made more accommodation by splitting big rooms into many smaller ones, mostly with no light or ventilation. These were really awful places to live. They were crime-ridden, filthy and filled with disease. Every report about them mentioned that they were fire traps. In 1860, two tenement fires happened back to back. -- https://en.wikipedia.org/wiki/Tenement#/media/File:LowerEastSideTenements.JPG Public domain. Elm St: http://www.nytimes.com/1860/02/03/news/calamitous-fire-tenement-house-elm-street-destroyed-thirty-persons-supposed-have.html?mtrref=www.google.com 45th street fire: http://www.nytimes.com/1860/03/29/news/destructive-fires-four-tenement-houses-destroyed-two-mothers-eight-children.html?pagewanted=all
  • 14. Quote about the buildings from that second article: “If a skillful man, with a deadly hatred of his race in his heart, sat down to plan a human residence in which to entrap and destroy those who should dwell in it, it is extremely probable that if he had seen these houses in West Forty-fifth-street he would take them as a model.”
  • 15. what happened? 1860 no isolation obsolete contingency plans no failure domains The first one, on Elm Street, started in a bakery on the ground floor of a large residential building. Terrible place for a bakery, but that's where it was. The baker was storing a lot of hay and wood shavings, and when they burned they made dense smoke, killing some of the people who lived on the higher floors before the fire even got up there. The wooden stairway quickly burned away, trapping people on the top floors. Firefighters arrived with ladders, but the ladders only went to the fourth floor and this was a six-storey building. At least 10 people died. A month later four houses burned on West 45th Street. These houses had roof hatches called scuttles, which should have let people escape across the roofs, but they were all missing their ladders so nobody could get up there. Another ten people died.
  • 16. An optimistic disaster plan is a useless disaster plan These escape plans -- the ladders and scuttles and the roof -- had worked fine for a previous iteration of shorter NYC buildings, but they hadn't been updated for the new shape of the city. Just like with the water and the gunpowder, there was a plan in place for a fire disaster. And just like them, the plan only worked in the most optimistic circumstance. We see that all the time. Backups that will work if we lose the database in a very specific way. Failover plans that only work if we have two weeks' notice of the failover and the old data center doesn't lose power.
  • 17. better buildings 1860 new law: an Act to Provide Against Unsafe Buildings in the City of New York The city immediately passed a law to make the tenements more robust against fire. They even put an injunction on new tenement construction until the law was passed. Now houses for more than eight families (kind of specific) had to have fire-proof stairs either inside or outside the building. What’s frustrating about this is that four years earlier a commission had reported that, if there was a fire, tenants on the 6th and 7th floors of tenements had basically zero chance of survival. They recommended fire proof stairs. But nothing happened until a bunch of people died. ---
  • 18. ● Tenements must have fire escapes... The Tenement House Act 1867 Seven years later, the Draft Riots (which are a whole separate awful thing in which a whole bunch of people died) led to another law: the Tenement House Act. This act had good goals but it was extremely unsuccessful. Buildings had to have a fire escape, but they didn't have to make anyone safer! So landlords put up fire escapes that couldn’t hold the number of people in the house, or that weren’t well attached to the walls or that were just a rusty ladder. And what even was a fire escape? Well, it wasn't well defined. Let's take a diversion and look at some fire escape patents. As we look at them, you might want to think of disaster recovery plans you have known and loved. --- The picture’s actually from 1900 but whatever :-D https://commons.wikimedia.org/wiki/File:New_York,_N.Y.,_yard_of_tenement_LOC_det.4a18586.jpg Public domain.
  • 19. William Houghton's fire escape 1891 This is a ladder with a counterweight. Imagine climbing down from the 7th floor of your building on one of these. With your six children. In the rain. In a dress that went to your ankles. -- https://en.wikipedia.org/wiki/Fire_escape#/media/File:Houghton%27s_Fire_Escape_1877.jpg Public domain.
  • 20. Mary McArthur's fire escape 1904 This is a kind of rope ladder that attaches to a window sill. http://www.google.com.pg/patents/US800934
  • 21. William Bedinger's fire escape 1915 This is a parachute that rolls up very small. The idea was that you'd carry it with you everywhere in case you were in any tall building fire situations. https://www.google.com/patents/US1168465
  • 22. Henry Vieregg's fire escape 1902 According to this patent, and I quote: "A person desiring to escape seizes one member of the cord, rope, or chain, as shown in Fig. 1, and forthwith jumps out of the window. [...]" Like, I am looking at this thing and do not feel like I could forthwith jump out of anything. https://www.google.com/patents/US708846
  • 23. Anna Gonnelly's fire escape 1887 Anna Gonnelly's fire escape was a bridge that you could sling from your roof to another building. It had side rails, so it was only moderately terrifying. https://www.google.com/patents/US368816
  • 24. Pasquale Nigro's fire escape 1909 This one is just fantastically ludicrous. But good if you want to fight supervillain crime? All of these patents were granted, btw. GOOGLE PATENT US 912152 A
  • 25. BB Oppenheimer's fire escape 1879 And this one… You might think that this is just a parachute helmet. It is not. It is a parachute helmet and a pair of very bouncy shoes. GOOGLE PATENT US 221855 A .
  • 26. Nicholas Borgfeldt's fire escape 1882 Finally, I've read this patent three times and I'm fairly convinced that the guy invented a rope. It's the most silicon valley invention of 1882. Though, let's be clear, rope was a popular kind of fire escape. In fact, it was the state of the art for hotels. https://www.google.com/patents/US267399
  • 27. Every hotel's fire escape 1887 Puck Magazine, 1887 I don't mean a ladder made of rope, I mean literally a rope. Every hotel room had to have a rope and that was the only fire escape. Even at the time, people found that pretty terrible. This is part of a snarky cartoon from a magazine called Puck, published in 1887, of a whole lot of people trying to use the ropes. Like most of those other patents, it's designed for the easiest case: someone with upper body strength and agility who isn't wearing a skirt or carrying a child. If your disaster plan only works for the easiest case, it's not a good plan. I want to emphasise here that a rope is better than nothing. In fact, probably every one of these fire escapes, even mister parachute hat, is better than nothing. But these escape plans are not where I would put my efforts if I wanted to have fewer people die in fires. But this is what the law focused on. -- https://books.google.com/books?id=XwAjAQAAMAAJ&pg=PA48 Pre 1923 so public domain
  • 28. 1867 Tenements must also have windows... The Tenement House Act (continued) Anyway! The Tenement House Act. Even with fire escapes, tenements were still terrible. They were badly constructed, overcrowded, and -- I find this amazing -- it was perfectly legal to store lots of combustible materials in them. One other thing the tenement act said, was that every room now had to have a window. And just like “what even is a fire escape” it didn’t define “what even is a window”. So the landlords cut holes in interior walls between rooms and called them "interior windows". A decade later, the law said sigh, ok, exterior windows. So landlords started constructing buildings with air shafts, little narrow gaps between buildings. Now, picture it, you have no indoor plumbing and the bathroom is down six flights of stairs and now you have an air shaft. You can imagine how that goes. One article I read described the air shaft as “festering tubes of disease”. Very poetic! And many of the fire escapes just led down to these air shafts and there was no way out from there. --- https://en.wikipedia.org/wiki/Old_Law_Tenement#/media/File:Airshaft_of_a_dumbbell
  • 30. 1871 Carla Geisser CC BY THANK YOU CARLA <3 Tenements must have usable fire escapes. By 1871, iron fire escapes were becoming common and of course people were using them as extra space. You still see that now -- they're used for bikes and gardening and barbecues and cat runs. All of that has been illegal since 1871. Because it makes the fire escape very hard to use in a fire! A later law said that every fire escape had to have a cast-iron sign saying that you could be fined for obstructing your fire escape. And it was fair, because usable fire escapes are better than unusable ones. But, again, it was still perfectly legal to run your explosive business out of a tenement basement and tons of residential fires started because of deep frying crullers. And anyway, the regulations were mostly not enforced, so people didn't pay much attention. ------------ The encumbrance sign thing is from 1885, but encumbrances were illegal from 1871 and mentioning this many dates makes *my* ears glaze over and I'm already interested in this. So we're conflating two things to keep it moving along. Image by Carla Geisser, used with permission.
  • 31. 1876 The Brooklyn Theater Fire In 1876, the Brooklyn Theater on Cadman Plaza. The final act of the play was about to start and the stage manager noticed a very tiny fire on the left of the stage. --- https://en.wikipedia.org/wiki/Brooklyn_Theatre_fire#/media/File:BrooklynTheatre_From_Johnson_Street_Looking_East.jpg Public domain.
  • 32. obsolete contingency plans encumbrances unpracticed incident response delayed escalation restricted access what happened? 1876 It was typical to keep buckets of water next to the stage, but there weren't any. There was a fire hose, but too much scenery was piled beside the stage and he couldn't get to it. There's those encumbrances again. The stage manager asked a couple of carpenters to put the fire out by beating it with poles. This didn't work and actually spread some sparks, setting fire to the loft. The actors -- laudably -- wanted to avoid a panic, so they announced that the fire was part of the show, and that people shouldn't freak out, but once the audience realised, they stampeded. And they had trouble getting out. We have a real stampeding herd problem here: there was only one stairway down from the cheap seats at the top, and everyone was trying to use it at once. It filled with smoke. There were no fire escapes and some exits were locked to keep out gate crashers so people couldn't get out that way. 278 people died. At the time, it was the worst theater fire in US history. It's now the third worst because we really don't learn.
  • 33. accountability: prosecutions new laws for exits and encumbrances automated response: sprinklers! better buildings 1876 The jury blamed the theater owners for not obeying a bunch of existing fire laws, and new laws were written, including widening exits and not storing stuff on the stage. In 1882, the building code said that theatres had to have automatic sprinklers: it's the first type of building in the city to require sprinklers. The first automated response. What I find remarkable is that this fire happened nine years after regulation said that tenements had to have safe exits, but those laws didn't carry over to theatres, or to other types of buildings like: hotels, schools, factories, ships, offices. I'm going to spare you most of the horror stories, but we'll look at factories in a minute, after….
  • 34. 1890-1901 Even more Tenement House Acts! ...we get proper no-kidding tenement regulation at last! And we even do it without a bunch of people dying! Thank you Jacob Riis! In 1890, this guy called Jacob Riis published a book about tenement life called How the Other Half Lives and did a lecture tour on it. And up until now the upper and middle class people of New York City had sort of known the tenements were awful, but for the first time ever, there were photographs. It was harder to ignore. Well, it was probably part empathy, part fear of smallpox coming out of there but, whatever, over the next decade, people started to care. I was really reassured when I read this, because until then it had been all “there was a horrific fire and we added a very specific law and then there was a different horrific fire and we added a different very specific law”. And it was mostly like that! But this Tenement House Act came from someone saying “wow, look how much this sucks” in a compelling way. And that gives me hope! Anyway, the next couple of Tenement House Acts included having to have actual windows, not air shafts, and fire escapes couldn't be ladders any more: they had to have open balconies and stairs and be properly attached to the wall. Even better: your neighbours can no longer boil oil in the basement! Hurray! And all new construction has to have interior fire partitions. Failure domains!
  • 35. We're finally looking at stopping fires from starting and spreading, not just escaping from them. And, best of all, it’s all actually going to be enforced. Welcome to the 20th century! But, oh yeah, it still sucks in factories. -- https://commons.wikimedia.org/wiki/Category:How_the_Other_Half_Lives#/media/File :How_the_Other_Half_Lives_front_cover.png Public domain. http://www.americanyawp.com/text/how-the-other-half-lived-photographs-of-jacob-riis/ Public domain because pre 1923. https://commons.wikimedia.org/wiki/File:Jacob_Riis_portrait.jpg Public domain because pre 1923.
  • 36. The Newark Factory Fire 1910 The triangle shirtwaist is the famous one, but the Newark factory fire a few months earlier is a textbook disaster waiting to happen so I wanted to talk about it. This building had two fire escapes -- look at the size of this building! One of them was a really heavy ladder that needed to be lifted into place. Another emergency plan that only worked for people with good upper body strength. In the fire, the young women who worked in this factory weren't able to lift down the ladder. So.. only one fire escape. -- http://www.oldnewark.com/histories/factoryfirearticle.php
  • 37. no isolation no monitoring ignored warnings delayed escalation what happened? 1910 blameful culture restricted access untested contingency plans no drills etc, etc The building was shared by a couple of paper box companies, a nightgown factory and a lamp manufacturer. It had previously been used by machine companies and the floors were soaked in oil. A fire started in the lamp factory. There was no fire alarm, and the bottom three floors had evacuated before they realised that 116 people up on the 4th didn't know there was a fire. This building had had ten fires in ten years and the buildings department had condemned this factory three times, but the factory owners basically ignored them and kept running. All of that was expensive for insurance and they didn't want another fire on their record, so they delayed calling in the firefighters, even though the firehouse was just across the street. The firehouse had a policy of reprimanding their firefighters for false alarms -- no blameless post-mortems here! -- so before raising a general alarm, they sent a couple of guys over with a fire extinguisher, delaying the real response even more. The only door up to the 4th floor was kept locked, which was against the law. The windows wouldn't open and the victims had to break glass with their hands. The window sills were four feet off the ground and the platform up to them broke under the weight of people trying to get out. And the victims had never been in a fire drill and they had no idea what to do. They,
  • 38. quite reasonably, freaked out. 25 people died, 32 more were very badly injured. I feel like I could spend an hour just talking about this fire. There's so much to learn from it. --- http://www.oldnewark.com/histories/factoryfirearticle.php is really good and I recommend it, if you don't mind being angry)
  • 39. Human error is never the root cause When officials investigated, they said the root cause was not the walls soaked in grease, or delaying calling fire fighters, or the locked door, or the lack of smoke alarms or the unusable fire escapes. It was that "the victims merely succumbed to panic" The way humans react to a disaster can definitely make the situation worse -- remember those carpenters with sticks in the theater -- but that is in no way their fault. Humans will act in human ways. If your systems can't handle that, and you haven't invested a lot of time in training the humans to act in some other way, your systems are crap. ---State Farm CC BY 2.0 Ref: https://www.uvm.edu/histpres/HPJ/AndreThesis.pdf
  • 40. “They died from misadventure and accident.” outcome...? 1910 Coroner's Jury, December 1910 So what happened? Nothing. The jury didn't convict, though at least one juror later said he regretted it. New Yorkers did look a bit at their factories and say "huh, I wonder if we should care about that"..., but nothing changed. Is it because it happened ten miles away instead of on the island of Manhattan? No idea. The New York Fire Chief said "This city may have a fire as deadly as the one in Newark at any time". Four months later… --- "They died from misadventure and accident" from http://www.nytimes.com/2011/02/24/nyregion/24towns.html "This city may have a fire as deadly as the one in Newark at any time." from http://trianglefire.ilr.cornell.edu/primary/testimonials/tf_warnings.html
  • 41. 1911 The Triangle Shirtwaist Factory 146 people died inside 18 minutes. The famous Triangle Shirtwaist Fire. -- https://timesmachine.nytimes.com/timesmachine/1911/03/26/104859694.html https://en.wikipedia.org/wiki/Triangle_Shirtwaist_Factory_fire#/media/File:Image_of_T riangle_Shirtwaist_Factory_fire_on_March_25_-_1911.jpg Public domain. http://www.baruch.cuny.edu/nycdata/disasters/fires-triangle_shirtwaist.html
  • 42. what happened? no isolation obsolete contingency plans restricted access ignored warnings 1911 This building was considered fireproof. They had done it right. They built a good building. But it was packed with garments hanging so tightly together that the building might as well have been made out of cloth. The building should have had three fire escapes; it had one and that collapsed under the weight of people escaping. Fire fighters came but the fire ladders and the water could only get to the 6th floor and the city had gotten taller again: the factory was on the 7th to 9th. One exit was locked; the guy with the key escaped without unlocking it. And the employers already knew about the problems. Employees had organised a strike the previous year to protest the working conditions, and they'd been fired. The building had had a recent warning notice from the department of sanitary control, but they hadn't fixed their violations.
  • 43. better tools: stronger pump, longer ladder better incident response 1911 The fire department developed a stronger water pump and a longer ladder, so they could reach taller buildings.
  • 44. laws: 60 in three years automated response: sprinklers accountability: the American Society of Safety Engineers better buildings 1911 But more importantly, building conditions took a big step forwards. There were 60 new laws over the next three years. Again, everyone knew factories were bad. But, again, the law didn't change until a bunch of people died ON THE ISLAND OF MANHATTAN. Sprinklers started to be required in factories. (But only factories over seven stories tall. Very specific again.) A professional organisation, the American Society of Safety Engineers (which still exists), was founded. -- After the fire, the owners of Triangle Shirtwaist factory, Harris and Blanck, were brought to court on charges of manslaughter but were eventually acquitted. They were fined $75 for each life lost. However their insurance policy paid them a total of $60,000, at the rate of $400 per life lost, so they actually profited from the tragedy. After two years, they continued to lock the doors to exits and were fined for several safety code violations. The worst people :-(
  • 45. Phil Roeder CC BY-2.0 "...a type of exit condemned by the experience of many fires" NFPA report, 1914 And at last, people started to look at fire escapes differently. After the disaster, a report called them "a pitiful delusion." and "a type of exit condemned by the experience of many fires". --- http://www.nfpa.org/News-and-Research/Publications/NFPA-Journal/2014/September -October-2014/Features/Fire-Escapes/1914-Sound-the-Alarm
  • 46. Barbara L Hanson CC BY 2.0 Dan DeLuca CC BY 2.0 Eden, Janine and Jim CC BY 2.0 don toye CC-BY-ND 2.0 Kristine Paulus CC-BY-ND 2.0 "...a type of exit condemned by the experience of many fires" NFPA report, 1914 The report called out a lot of reasons fire escapes are terrible: ● the platforms are too small ● people put stuff on them ● they don't get a lot of maintenance ● snow and ice make them slippy and dangerous But most importantly ● they never, ever get tested. --- Images: Kristine Paulus CC BY 2.0. https://flic.kr/p/fszEDf (plants) Dan DeLuca CC BY 2.0. https://flic.kr/p/5hsnTM (chairs) Eden, Janine and Jim. CC BY 2.0. https://flic.kr/p/7G1tWZ (snow) Barbara L. Hanson. CC BY 2.0. https://flic.kr/p/8uxpcf (rain) Don toye, CC BY 2.0 https://flic.kr/p/9XrAs (bike)
  • 47. “ ... fire escape collapses during times of intense use – such as during actual fires. John W. Cramer, The Story of a Tenement House Fire escapes were known to collapse during times of intense use. But they pretty much have one time of intense use. If they're going to collapse, it's going to be during a fire. So what do we do? We have a couple of options here. We can add more regulations around fire escapes: you have to maintain them, you have to try them out every year! There actually was a law about regularly painting your fire escape. To prevent slipping you have to build a textured floor into the fire escape and leave a pair of shoes with good grips on the top of each one… Or we could step back and ask whether we're optimising for the wrong thing. ---- Quote via http://www.boweryboogie.com/2014/10/favorite-pastime-tenement-fire-escapes/ A photo called "Fire Escape Collapse" received a Pulitzer in 1976. It's fairly harrowing, so I'm not linking it here -- extreme content warning if you decide to go look at it -- but it made Boston rewrite its fire escape safety laws. Journalists are amazing.
  • 48. “ New York Times, February 25th, 1923 1923 In 1923, the New York Times had an article praising fireproof interior walls: "For six years there has been no loss of life by fire in the 200 buildings so treated." It blows my mind that a group of 206 buildings having no fire deaths in six years was considered newsworthy. In 1929 those fireproof walls became code: all new buildings over 75 feet in height had to have them, and also had to have two fully enclosed staircases! Failure domains are part of the code at last! --- https://timesmachine.nytimes.com/timesmachine/1923/02/25/105849722.html?pageN umber=141
  • 49. 1968 "Fire escapes shall not be permitted on new construction" John VanderHaagen CC BY 2.0 The idea of building better buildings gained traction and in 1968 fire escapes stopped being allowed at all. The code still says "Fire escapes shall not be permitted on new construction". The 1968 code also required sprinklers for hotels and high-rise office buildings, but not nightclubs or residential buildings. ---- " Fire escapes shall not be permitted on new construction, with the exception of group homes. Fire escapes may be used as exits on buildings existing on December sixth, nineteen hundred sixty-eight when such buildings are altered, subject to the approval of the commissioner, or as provided in subdivision (b) hereof. " https://commons.wikimedia.org/wiki/File:New_York,_New_York,_April_1968.jpg CC BY 2.0
  • 50. More fires. More very specific laws. 1975 - 2018 ● In 1975, seven people died in a nightclub, so sprinklers are required for nightclubs. ● In 1998 there were two bad residential fires, and now you have to have sprinklers for residences with four or more units. ● And I'm sure this story is not over and the code will be expanded many more times in response to very specific things in which a bunch of people die. Btw, there's no retrofitting of existing buildings. The laws only apply to new buildings and existing buildings get better as they're renovated. So buildings in NYC comply with the safety standard of whenever they were renovated last. Think about that, wherever you sleep tonight. --- https://pxhere.com/en/photo/900057 CC0
  • 51. Fire deaths decreased because we built better buildings. So that was 150 years of fire codes. For decades we considered it inevitable that fires would start and spread, and we optimised for escaping from them. And we definitely got good at responding to massive fire disasters. But slowly we made progress on other, more important parts of the fire life cycle. Which I'm going to describe in four stages: https://commons.wikimedia.org/wiki/File:An_Old_Rear-Tenement_in_Roosevelt_Stree t.png Public domain.
  • 52. 1prevention making it harder for the fire to start We prevented sparks. A certain amount of sparks are ok! We need to cook food and have birthday candles. But by becoming more deliberate about when we make sparks, we made it harder for the fire to start at all. We moved bakeries out of residential buildings, began doing wiring inspections, did public safety campaigns about cooking and smoking. --- https://www.pexels.com/photo/fire-match-smoke-flame-54627/ CC0
  • 53. 2detection stopping it while it's small We worked on detection and immediate amateur response: smoke alarms, fire blankets, fire extinguishers, and more public safety campaigns. And we introduced sprinklers. --- https://commons.wikimedia.org/wiki/File:Fire-blanket-on-display.jpg Public domain
  • 54. 3 isolation preventing it from spreading 3. We introduced failure domains, to keep the fire to one small part of the building or city. We started using materials that were hard to ignite so the fire would spread slowly. And we did fire drills, to move humans quickly and safely away from the danger area and to prevent the kind of panic that makes things worse. -- https://unsplash.com/photos/MApjpqu9V7E CC0
  • 55. 4response okay, we're fighting a fire And only then, 4, emergency response. We also got better at responding to massive fires. The New York Fire Department is *very good*. But step 4, this is our last resort and we should try not to rely on our last resort. We gained more from stopping the fire from getting to this point. And, if you missed my extremely subtle metaphor here, it's the same for software. --- Image: skeeze. CC0. https://pixabay.com/en/firefighters-training-live-fire-696167/
  • 56. reliability is everyone's job 1 prevention 2 detection 3 isolation 4 response The most important reliability work is making problems stop before they get to that fourth stage. This means that reliability is everyone's problem. Everyone who's writing code or designing systems should have reliability in mind. Yeah, some people have a site reliability team. Just as we have people who specialise in UI or security, both of which we should all care about, we can have people who specialise in reliability and advocate for it. But, while SREs may occasionally act as firefighters, the more important part of their job is to be the fire safety engineers, handing out smoke alarms, legislating fire partitions, pointing out buildings that are made of wood, advocating for the removal of clutter, educating everyone. The part of their job which is being last resort firefighters? That skillset should be used rarely. You don't want the NYFD running into your kitchen every time you burn toast. If you're calling them in, it's a sign that something's gone horribly wrong. But it's still very common to have firefighters reacting to every software problem.
  • 57. There's a really nice tradition in the ops and SRE communities, where if a site is down, people send #hugops on twitter to the people working on it. I want to particularly call out Baron Schwartz sending hugops in advance to people running mail servers on GDPR day :-D I love #hugops. I send #hugops. But one thing you'll notice if you follow the hashtag is that… a lot of things break and nobody is really surprised. We're at the stage of software evolution where we expect software to fail. We need to build better buildings in software too. And that means we think about those same four stages. --- Tweets used with permission.
  • 58. 1prevention making it harder for the fire to start Just like with buildings, a certain amount of sparks are fine for us too! We need to make changes. Maybe something gets overloaded or a user does something we didn't plan for. Many of us use the concept of error budgets: depending on how close we are to missing our SLAs, we make more or fewer changes. We can reduce our sparks: --- https://www.pexels.com/photo/fire-match-smoke-flame-54627/ CC0
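The error budget idea on the slide above can be sketched in a few lines. This is only an illustration: the function names, the 25% "slow down" threshold, and the three policy labels are hypothetical choices, not anything prescribed in the talk.

```python
def error_budget_remaining(target: float, total_requests: int, failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent.

    `target` is the availability objective (e.g. 0.999). A result of 1.0
    means no budget has been used; zero or below means it is all gone.
    """
    allowed_failures = (1.0 - target) * total_requests
    if allowed_failures == 0:
        # No failures are permitted at all: any failure exhausts the budget.
        return 0.0 if failed_requests else 1.0
    return 1.0 - failed_requests / allowed_failures


def release_policy(remaining: float) -> str:
    """Map remaining budget to how freely we make changes (hypothetical thresholds)."""
    if remaining <= 0.0:
        return "freeze"   # budget spent: stop making sparks
    if remaining < 0.25:
        return "slow"     # close to the limit: fewer, more careful changes
    return "normal"


# Example: a 99.9% target over one million requests allows about 1000 failures.
remaining = error_budget_remaining(0.999, 1_000_000, 900)
print(release_policy(remaining))
```

The point of the sketch is the feedback loop: the closer the service is to missing its target, the fewer changes get made.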
  • 59. hiding the matches Michael Chen CC BY 2.0 We can think about how users use our tools and provide clean, safe, validated interfaces that are hard to get wrong. We can restrict their access to functionality or data they don't need. A stove igniter is a better tool than a box of matches. --- https://flic.kr/p/LdPYz Michael Chen CC BY 2.0
  • 60. operating with care Reproduced from NFPA's website, © NFPA (2018). The fire department recommends that you don't operate a stove while drunk or sleepy, and the same goes for root prompts or code merges. Many outages are caused by changes, so we can make them deliberately and carefully, with design review, code review and change management. --- http://www.nfpa.org/termsofuse Liberal use of NFPA fact sheets and news releases is allowable with attribution. Please use the following: "Reproduced from NFPA's website, © NFPA (year)." NFPA does not grant permission for its content to be displayed on other Web sites.
  • 61. State Farm CC BY 2.0 wiring inspections We can make it a standard to inspect our systems, looking for regressions, looking for what has bitrotted or become overloaded. A thorough test suite is like a wiring inspection that runs on every deploy. And we can do chaos engineering: continually testing the system's resilience against chaotic events. -- https://flic.kr/p/duWtgw State Farm CC BY 2.0
  • 62. detection stopping it while it's small 2 But, ok, sometimes, inevitably, things go wrong. We have an opportunity to put this fire out while it's tiny. --- https://commons.wikimedia.org/wiki/File:Fire-blanket-on-display.jpg Public domain
  • 63. topquark22 CC BY 2.0 smoke alarms, fire extinguishers Humans can react quickest if the right fire extinguishers are available. Provide a one-click rollback for all your changes. Use canaries: push the change to one instance before pushing it to all the instances. And launch with feature flags to push out new features in a way that makes it very fast to turn them off if you need to. Alerts need a fine balance, as everyone knows who’s ever had an over-enthusiastic smoke alarm in their kitchen. An occasional false alarm is ok, but having humans continuously react to small problems can burn them out. It's using up your gunpowder on small fires and not having enough left for the big ones! So aim to keep your false alarms low. --- https://flic.kr/p/6AcBru topquark22 CC BY-2.0 https://pixabay.com/en/fire-extinguisher-fire-delete-99915/ Public domain.
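As a rough sketch of the fire extinguishers named on the slide above, here is a canary rollout with automatic rollback, plus a tiny in-memory feature flag store. Everything here is hypothetical and simplified for illustration: `canary_rollout`, `FeatureFlags`, and the `is_healthy` callback are invented names, and a real deployment system would do far more.

```python
from typing import Callable, Dict, List, Set


def canary_rollout(
    instances: List[str],
    versions: Dict[str, str],
    new_version: str,
    is_healthy: Callable[[str], bool],
) -> bool:
    """Push `new_version` to one instance first; continue only if it stays healthy.

    `versions` maps instance name to deployed version and is updated in place.
    Returns True if the rollout completed, False if the canary was rolled back.
    """
    if not instances:
        return True
    canary, rest = instances[0], instances[1:]
    previous = versions.get(canary)
    versions[canary] = new_version
    if not is_healthy(canary):
        # One-step rollback: restore whatever the canary ran before.
        if previous is None:
            versions.pop(canary, None)
        else:
            versions[canary] = previous
        return False
    for name in rest:
        versions[name] = new_version
    return True


class FeatureFlags:
    """Minimal in-memory flag store: a feature can be switched off without a deploy."""

    def __init__(self) -> None:
        self._enabled: Set[str] = set()

    def enable(self, name: str) -> None:
        self._enabled.add(name)

    def disable(self, name: str) -> None:
        self._enabled.discard(name)

    def is_enabled(self, name: str) -> bool:
        return name in self._enabled
```

The design choice in both pieces is the same: the undo path is decided before the change goes out, so the human reaching for the extinguisher does not have to improvise.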
  • 64. HomeSpot HQ CC BY 2.0 sprinklers But even better, don't get humans involved at all for small things. Add automatic recovery. If a machine dies, it should automatically be replaced. If a backend goes missing, we should be able to coast for a while. Health checking and load balancing should move traffic from an unhealthy region to a healthy one. Maybe you want to let humans know, but the message they should get is "everything is under control but you might want to look at this when you get a chance". Not "WELCOME TO 3AM! A MACHINE DID A THING". -- https://flic.kr/p/fmr7a7 HomeSpot HQ www.homespothq.com
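A minimal sketch of the sprinkler idea above, assuming health check results are already available as a simple mapping from region name to a boolean. The function name `route`, the region names, and the wording of the note are hypothetical.

```python
from typing import Dict, Optional, Tuple


def route(preferred: str, health: Dict[str, bool]) -> Tuple[str, Optional[str]]:
    """Pick a region for traffic, failing over automatically when needed.

    Returns the chosen region and an optional low-urgency note for humans.
    Raises RuntimeError only when no region is healthy, which is the one
    case here that actually needs someone woken up.
    """
    if health.get(preferred, False):
        return preferred, None
    # Sorted order keeps the fallback choice deterministic.
    for region in sorted(health):
        if health[region]:
            note = (
                f"FYI: {preferred} is unhealthy, traffic moved to {region}. "
                "Everything is under control; look at this when you get a chance."
            )
            return region, note
    raise RuntimeError("no healthy region available; page a human")
```

For example, `route("us-east", {"us-east": False, "us-west": True})` selects `us-west` and returns a note rather than a page.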
  • 65. 3 isolation preventing it from spreading Stage 3: Ok, there's a fire, it's happening. Now we want to not let it get on anything it's not already on. -- https://unsplash.com/photos/MApjpqu9V7E CC0
  • 66. Achim Hering CC BY 3.0 fire barriers Failure domains split our systems up so that only one part of them should be affected by any given outage. And if the problem's going to move as components get overloaded, we want that to be slow enough that we can control it, not an immediate cascade. And we have our own version of moving bakeries out of residential buildings: we can isolate risky customers on their own replicas or shards. --- https://commons.m.wikimedia.org/wiki/File:Durasteel_fire_barrier.jpg State Farm CC BY 2.0
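One way to picture "moving the bakeries out" is a shard assignment that gives risky customers a dedicated shard and spreads everyone else over a fixed pool. This is a sketch under assumptions: the function name, the `dedicated-`/`shared-` naming scheme, and the idea that risky customers are known in advance as a set are all hypothetical.

```python
import hashlib
from typing import Set


def assign_shard(customer: str, risky: Set[str], shared_shards: int) -> str:
    """Return the shard name for a customer.

    Customers listed in `risky` get their own shard, so an overload they
    cause stays inside their own failure domain. Everyone else is placed
    on one of `shared_shards` shards using a stable hash of their name.
    """
    if shared_shards < 1:
        raise ValueError("shared_shards must be at least 1")
    if customer in risky:
        return f"dedicated-{customer}"
    # hashlib gives the same result on every run, unlike the built-in hash().
    digest = hashlib.sha256(customer.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % shared_shards
    return f"shared-{index}"
```

Note that a plain modulo like this reshuffles most customers when the pool size changes; it is enough to show the isolation idea but not a resharding strategy.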
  • 67. fire drills Just like we make it incredibly common to hear a smoke alarm and find our way outside, make it so that a disaster is never a surprise. Humans will panic the first time they hit a situation that's outside their comfort zone. At intervals, tell people you're doing a controlled outage, and take a system offline. --- https://pixabay.com/en/safety-helmet-construction-hat-295057/ CC0
  • 68. avoiding encumbrances You know the phenomenon where you're fixing something and you hit a bunch of unintuitive commands, or out of date documentation, and it ends up taking you much longer to do something simple? Or you even end up breaking something else? These traps are a basement full of straw, or a fire hose with cluttered scenery on top of it. It's making it very, very hard for you to move around safely as you try to fix the real problem. Push back on technical debt and clutter. Fatigue is an encumbrance too. You're way more likely to make a mistake if you're exhausted. Set rules about how long a person should deal with an incident before their on call shift is over and someone else needs to swap in. Enforce those rules. -- photo by me.
  • 69. 4response okay, we're fighting a fire And sometimes we will still get to stage 4, fighting a massive outage. But we should aim to not get here often. Firefighting is not good for your SLAs and it's also not great for the health of the humans involved. --- Image: skeeze. CC0. https://pixabay.com/en/firefighters-training-live-fire-696167/
  • 70. controlled burns Jereme Rauckman CC BY 2.0 Ideally we'll get to a point where our firefighters mostly train using controlled outages, like many real fire departments do. But we're not there yet. Many of us are still fixing unreliable software by focusing on this fourth stage, with human response and escape routes... -- https://flic.kr/p/pjPGD6
  • 71. Software without built-in reliability? That's a tenement. ..., that means they're building tenements. Foul air is coming in through the air shafts, and it's not somewhere humans should live. Reliability can't be added after the building is finished. It needs to be built in. Failure needs to be built in. Building better buildings makes a huge difference. --- https://commons.wikimedia.org/wiki/File:Two_officials_of_the_New_York_City_Tenem ent_House_Department_inspect_a_cluttered_basement_living_room,_ca._1900_-_N ARA_-_535469.jpg Public domain.
  • 72. NYC had 48 civilian fire deaths in 2016. That's the lowest in 100 years. Reprinted with permission from NFPA Report U.S. Fire Death Rates by State copyright © 2017, National Fire Protection Association, Quincy, MA. All rights reserved. In 2016, 48 people died by fires in New York City. This is still a lot of people! But 2016 was the lowest number since they started recording a hundred years ago, even though the population of the city continues to grow. That Bronx fire in December that killed 12 people was the deadliest in 25 years. How did we get from the fire traps of the 1800s to here? --- https://www.nfpa.org/News-and-Research/Fire-statistics-and-reports/Fire-statistics/Fires-i n-the-US/Overall-fire-problem/Fire-deaths-by-state Used with permission from copyrightrequests@nfpa.org
  • 73. 444 → pages! Well, this helped. This is the New York City fire code. It has 444 pages and costs $140, which I know because I really wanted to bring one in here today and dramatically wave it at everyone. The guy at the library was really confused about why I'd want a physical copy. He was like "Look, do you have access to the internet?" --- Book: http://shop.iccsafe.org/2014-new-york-city-fire-code.html Fire code: https://www1.nyc.gov/site/fdny/about/resources/code-and-rules/nyc-fire-code.page
  • 74. 444 → pages! And fire safety is also mentioned plenty in the city building code, the city construction code, the state building code, the National Fire Protection Association electrical code and I’m sure plenty of other dense legislation. Don’t ask me what's in each of these. There’s a lot of code, that’s all I’m saying. But we don't have a fire code for software. We have a bunch of O’Reilly books and they're great. But nothing makes us adhere to our best practices, or prioritises one set of rules over the others. Why don't we have a fire code yet?
  • 75. “ "No computer software failure has killed or injured a large number of people. It is just conceivable that such a tragedy could occur." Software: A Vital Key to UK Competitiveness (C) Crown Copyright 1986 via Risks Digest (https://catless.ncl.ac.uk/Risks) h/t joe Thompson @caffeinepresent 1986 It has been proposed from time to time! I found this report from 1986 called "Software: a vital key to UK competitiveness", which had a whole appendix on safety critical software. It starts with “No computer software failure has killed or injured a large number of people. It is just conceivable that such a tragedy could occur.” ---- https://catless.ncl.ac.uk/Risks/4/14#subj3.1 https://twitter.com/caffeinepresent/status/945079032445620226 https://pxhere.com/en/photo/1111021 CC0
  • 76. “ "Each life-critical system must be operated by a Certified Software Engineer who is named as being personally responsible for the system." Proposal from the UK Advisory Council for Applied Research and Development, 1986 1986 The Advisory Council predicted a time when it wouldn’t be possible to recover from software failure by just switching off the computer and doing the thing manually -- this was written in 1986, remember. We're there now. They wanted certification: you would only be able to operate a life-critical computer system if you had a license and a Certified Software Engineer to sign off on it -- and they would be personally liable! -- and a bunch of other stuff, and you'd have to get re-certified every five years. They also proposed what’s basically on call shifts, disaster recovery practice drills, and post-mortems, including post-mortems for near misses. A lot of this feels prescient and we ended up doing it, but we never required certification. --- https://catless.ncl.ac.uk/Risks/4/14#subj3.1 https://twitter.com/caffeinepresent/status/945079032445620226 https://pxhere.com/en/photo/1111021 CC0
  • 77. slide from @jkuroda's amazing LISA 2017 keynote. Used with permission. If you were at LISA in November, you might have seen Jon Kuroda's fantastic closing keynote about aviation safety. Like buildings, plane travel got safer only after a lot of bad accidents. Jon pointed out that, while we might think of computing as a new field, it's the same age as a bunch of others. Software, aviation, power, emergency medicine all took a big jump forward after World War 2. But our industry is significantly less mature than any of the others. ---- https://people.eecs.berkeley.edu/~jkuroda/talks/jkuroda-systemcrash-planecrash-lisa2017.pdf Image by me.
  • 78. The stakes are lower? Is that because the stakes are lower? It's at least part of the reason. Mostly, the stakes have been lower. Software mostly hasn't had the ability to cause massive disasters. Researching this talk, I read a ton about deaths from software -- it really was a cheerful time creating this talk -- and found surprisingly few. Most of the news about software and deaths was about how software is IMPROVING things. By making processes repeatable and precise, we're saving lives. But we have had some famously dangerous software bugs.
  • 79. The stakes are lower? Ars Technica, August 2013 The Independent, October 1992 New York Times, June 1986 The Therac-25 radiation therapy machine had a concurrent programming bug that made it occasionally give its patients radiation doses that were hundreds of times greater than they should have been. Three people died. In college I remember studying the London Ambulance dispatch failure. A new software system was deployed that hadn't been load tested, and it had a memory leak. It couldn't keep track of where the ambulances were, which led to them arriving hours late. 46 people died who might have been ok if the ambulance had arrived on time. And some near misses. Like, I haven't heard of any actual negative outcomes from the OCR bug that went around in 2013, but you can see how it might end up with numbers in prescriptions or structural engineering documents being catastrophically wrong. And the news is full of software concerns in vehicles, self-driving or otherwise. -- https://www.nytimes.com/1986/06/21/us/fatal-radiation-dose-in-therapy-attributed-to-computer-mistake.html https://www.independent.co.uk/news/ambulance-chief-quits-after-patients-die-in-computer-failure-1560111.html https://www.wired.com/2009/10/1026london-ambulance-computer-meltdown/ https://arstechnica.com/information-technology/2013/08/confused-photocopiers-randomly-rewriting-scanned-documents
  • 81. “"It took a Newark fire and a Triangle fire to bring New York State's fire legislation to its present inefficiency." Inis Weed, New Outlook volume 104, 1913 1913 But none of those has been our Triangle fire. So far software has been able to kill people one or a few at a time. We haven’t had the wide-scale disasters that have shocked other industries into growing up. Aviation regulations came from a bunch of people dying. Mining regulations came from a bunch of people dying. Professional engineering organisations came from a bunch of people dying. To quote my new favourite 1910s journalist, Inis Weed, "It took a Titanic disaster to improve the safety of vessels. It took a Newark Fire and a Triangle fire to bring New York State's fire legislation to its present inefficiency". The use of software for life-critical systems grows every year. And every day we send #hugops on Twitter to the people working on the latest massive software outage. At some point these will overlap. Hope is not a strategy. Are we ready for this kind of responsibility? We, all of us here, are people who are responsible for software. The world will need a lot of software over the next few decades. Some people in this room will run life critical systems. We are 1890s landlords looking at a whole lot of new opportunity. We know, there's money to be made from cutting all of the corners, but we have a choice. I don't want us to wait for a disaster... ------------------ New Outlook, volume 104. https://books.google.com/books?id=URCzNkpDZp0C
  • 82. Inis Weed or Inis Weed Jones made topics like medicine, sociology and science exciting for regular people. She wrote extensively for Harper’s, Scribner’s and the Reader’s Digest. She lived, at least for a while, at 337 West 22nd St. She wrote tons about working conditions and humanised anonymous workers. She was an investigator for the US Commission on Industrial Relations. She wrote articles like The Reasons Why The Copper Miners Struck (about a strike), and Safer Childbirth with Less Pain, and Acne: the Plague of Youth and Not By Bread Alone (about young people returning to farming). She also published a book called "Peetie: the story of a real cat", which is $72 on abebooks.com and I won't deny, I'm tempted. She reads like a tremendously compassionate person who wrote about things people needed to care about in an engaging way and made them care. (Please don't be a milkshake duck, Inis).
  • 83. Let's choose not to build tenements. ...to decide not to build tenements. Remember, some regulations didn't come from fires! Some came from a lot of people deciding to care about the same thing at the same time. We can decide now what good systems look like. We can create professional standards and industry safety codes, and create and opt in to a professional organisation to keep ourselves honest. And then, like the fire code, we can keep revising and improving it until huge software outages are rare and shocking. The entire industry should learn from every major outage. No secrets.
  • 84. 78 http://noidea.dog/fires ● Escapes in Urban America: History and Preservation, Elizabeth Mary Andre ● No exit: the rise and demise of the outside fire escape: Sara E Wermiel ● How Fire Disaster Shaped the Evolution of the New York City Building Code, Charles Shelhamer ● The Creative and forgotten fire escape designs of the 1800s, Lauren Young ● New Outlook vol 104 (May-August 1913) ● RISKS Digest ● 1910 Newark Factory Fire, Mary Alden Hopkins ● New York City (NYC) Disasters, Baruch College ● Presentation template by SlidesCarnival Questions? Comments? Find me at @whereistanya or fires@noidea.dog #GetAlarmedNYC Before I finish: if you're in New York, the NYFD and the Red Cross have a shared campaign to give people free smoke alarms and free batteries. They'll even come install it for you. If you don't have a smoke alarm, please search for #GetAlarmedNYC and fill in their form. http://fw.to/Kzv1G4f (Two SREs live in my apartment, so we already have two redundant meshes of networked alarms from different manufacturers and also a few standalone alarms.) This slide lists a few references that I found especially useful or interesting while writing this talk. That first one contains a list of all the others, so hit up http://noidea.dog/fires if you want a lot of links to read more about fires and fire escapes. If you have comments on the talk, or questions or you're a building historian who is willing to tell me what I got wrong, you can find me at @whereistanya on Twitter or fires@noidea.dog. --- https://commons.wikimedia.org/wiki/File:Smoke_alarm.JPG CC0