Making better spreadsheets
Why 95% of spreadsheets contain errors,
and what we can do about it
www.i-nth.com
February 2017
Spreadsheets are riddled with errors
To reduce the risks, our practices must improve
Objectives:
 Highlight the risks of heedless
spreadsheet development.
 Outline ways to reduce those risks.
Based on:
 Hard-fought practical experience.
 Academic literature on spreadsheet
risks and best practice.
www.i-nth.com/resources/bibliography
“ Developing an error-free spreadsheet has been a problem
since the beginning of end-user computing.”
Mireault, 2015
Plimpton 322: this early spreadsheet, a Babylonian clay tablet from c.1800 BC, contains several errors.
Spreadsheet risk is real, with substantial impacts
These are actual news stories
Reinhart, Rogoff, and the Excel error that changed history (Bloomberg, 18 April 2013)
$1.5M went missing as staff managed “monstrous spreadsheets” (Metro West Daily News, 15 October 2011)
£4.3M spreadsheet error leads to resignation of Mouchel chief executive (Daily Express, 7 October 2011)
Clallam County cashier hides rows in a spreadsheet to cover up theft (Peninsula Daily News, 21 July 2011)
$15 million mistake: that representative doesn't work for the company anymore (Technology Marketing Corporation, 21 November 2009)
Accountant omits minus sign on a net capital loss of $1.3 billion (The Risks Digest, November 1994)
$182M blunder in cashflow forecasting spreadsheet model (Stuff, 1 March 2012)
An alarming number of scientific papers contain Excel errors (Washington Post, 26 August 2016)
“ Few incidents of spreadsheet errors are made public
and these are usually not revealed by choice.”
Kruck & Sheetz, 2001
Spreadsheet errors have many causes
But spreadsheets are not to blame – we are
Why spreadsheet errors happen:
 Most of a spreadsheet’s complexity is hidden.
 Spreadsheet complexity leads to cognitive overload.
 We’re human, so we make mistakes.
 Calculations, and errors, cascade from cell to cell.
 We’re overconfident, despite complexity & overload.
 Management fail to recognise spreadsheet risk.
“ The results given by spreadsheets
are often just wrong.”
Sajaniemi, 1998
Most of a spreadsheet’s complexity is hidden
We see only one formula and the formatted values
Consider this simple model:
 Looks OK (typically the
only assessment we do).
 But the formulae in cells
F2:F5 exclude column E.
 The formula in E6 is also wrong (doesn’t deduct the tax in E5).
 The repeated values in Row 5 are suspicious – are some values
hard-coded rather than being calculated?
 Are the formulae, references, logic, formats, etc. correct?
 Even a simple spreadsheet can be wrong in many ways.
“ Even obvious, elementary errors in very simple, clearly
documented spreadsheets are... difficult to find.”
Galletta et al, 1993
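The model pictured on this slide is not reproduced in the text, so the following is only a sketch with invented formulas. It assumes a hypothetical layout with data in columns B to E and row totals in column F, and shows how listing the formulae (rather than the displayed values) can expose a total whose range stops one column short. The cell addresses, formulas, and the function name `totals_missing_last_column` are all assumptions made for illustration.

```python
import re

# Hypothetical formulas for a row-total column; not taken from the slide's figure.
formulas = {
    "F2": "=SUM(B2:D2)",  # stops at column D
    "F3": "=SUM(B3:D3)",  # stops at column D
    "F4": "=SUM(B4:E4)",  # covers all data columns
}

# Matches a simple single-range SUM and captures the start and end column letters.
SUM_RANGE = re.compile(r"=SUM\(([A-Z]+)\d+:([A-Z]+)\d+\)")


def totals_missing_last_column(formulas, last_column):
    """Return cells whose SUM range does not end at the last data column."""
    flagged = []
    for cell, formula in sorted(formulas.items()):
        match = SUM_RANGE.fullmatch(formula)
        if match and match.group(2) != last_column:
            flagged.append(cell)
    return flagged


print(totals_missing_last_column(formulas, "E"))
```

A check like this only covers one narrow pattern; it illustrates why the displayed values alone say little about whether the underlying formulae are right.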
Spreadsheet complexity leads to cognitive overload
The layers of a spreadsheet are hard to understand
Spreadsheets have multiple layers:
 Presentation layer – i.e. what you see.
 Formulae (one visible at a time).
 Logic (Data → Formulae → Results).
 Formatting.
 Data types (text, number, date, Boolean).
 VBA, charts, PivotTables, Solver, Data Validation, Tables, Slicers,
Filters, Queries, Outlines, Protection, Names, Print Range, etc.
Attempting to form a mental model of all the layer interactions can
produce cognitive overload, which makes errors more likely.
“ Spreadsheets are often hard, if not impossible,
to understand.”
Mireault & Gresham, 2015
We’re human, so we make mistakes
Typical spreadsheets have errors in 1% to 5% of cells
We make errors; they are inherent in how we think.
Typical error rates for simple, nontrivial activities:
 Type short number: 1.0% (per number).
 Grammatical errors: 1.1% (of words).
 Simple arithmetic: 2.0% (of calculations).
 Software development: 3.7% (per line of code).
 Type 10 digits: 5.0% (per number).
Experiments in spreadsheet development observe
similar rates, with errors in 1% to 5% of cells:
 Called the “cell error rate” (CER).
“ The issue is not whether there is an error but how many errors
there are and how serious they are.”
Panko, 2007
Calculations, and errors, cascade from cell to cell
95% probability of error overall for just 100 cells and 3% CER
Spreadsheet calculations are
linked, so errors cascade from
cell to cell.
If a spreadsheet contains:
 100 cells (small).
 3% CER (moderate).
Then, the probability of at
least one error is about:
 95% (i.e. almost certain).
More cells → more errors.
“ Most large spreadsheets have dozens
or even hundreds of errors.”
Panko & Ordway, 2005
The probability of at least one error:
$$P(\text{at least one error}) = 1 - (1 - \text{Cell Error Rate})^{\text{Number of cells}}$$
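As a rough check of the formula above, the short sketch below evaluates it for a few spreadsheet sizes and cell error rates. It assumes, as the formula itself does, that errors in different cells occur independently; the function name and the particular sizes tried are illustrative choices, not part of the original slides.

```python
def probability_of_any_error(cell_error_rate, number_of_cells):
    """P(at least one error) = 1 - (1 - CER) ** N, assuming independent cell errors."""
    return 1 - (1 - cell_error_rate) ** number_of_cells


# Tabulate the probability for a few sizes and the 1% to 5% range of cell error rates.
for cells in (10, 100, 1000):
    for cer in (0.01, 0.03, 0.05):
        p = probability_of_any_error(cer, cells)
        print(f"{cells:>5} cells, CER {cer:.0%}: {p:.1%}")
```

For 100 cells and a 3% cell error rate this gives roughly 95%, matching the figure quoted on the slide.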
Calculation cascade: A common cause of catastrophe
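The cascade described on this slide can be pictured as a walk through a dependency graph. The sketch below assumes a small, invented set of cells in which each formula cell lists the cells it reads, then collects every cell that would be contaminated by an error in one input. The cell addresses and the function name `affected_cells` are hypothetical.

```python
from collections import deque

# Hypothetical precedents: each formula cell maps to the cells its formula reads.
precedents = {
    "B4": ["B2", "B3"],
    "B5": ["B4"],
    "B6": ["B4", "B5"],
    "C6": ["C2"],
}


def affected_cells(precedents, bad_cell):
    """Return every cell that depends, directly or indirectly, on bad_cell."""
    # Invert the map so each cell points to the cells that read it.
    dependents = {}
    for cell, inputs in precedents.items():
        for source in inputs:
            dependents.setdefault(source, []).append(cell)

    # Breadth-first walk from the erroneous cell through its dependents.
    seen = set()
    queue = deque([bad_cell])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)


print(affected_cells(precedents, "B2"))
```

In this toy graph a single wrong value in B2 reaches three downstream cells, while C6 is untouched because it reads only C2.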
We’re overconfident, despite complexity & overload
Ignorance of the risk reinforces our overconfidence
Our typical approach to spreadsheets creates a vicious cycle:
 We underestimate the incidence of spreadsheet errors.
 If a spreadsheet produces a result, we assume it is correct.
 We see little need to test our spreadsheets.
 So we do little or no testing of our spreadsheets.
 Consequently, we find few errors.
 When we find an error, our inflated perception of our
error-finding prowess is reinforced.
 With little evidence to the contrary, we are overconfident
that our spreadsheets are correct.
“ Overconfidence is one of the most substantial
causes of spreadsheet errors.”
Sakal et al, 2015
Management fail to recognise spreadsheet risk
Leading to inadequate training and insufficient oversight
Managers underestimate the risks of spreadsheets:
 It is “just a spreadsheet”.
 But actually, spreadsheets can be highly complex analytical
applications that are an essential part of decision making.
Because spreadsheet risk is not well understood, managers:
 Provide inadequate training, focusing only on software features
rather than on good development practices and quality.
 Perform insufficient oversight, mistaking proficiency with the
tools for robustness of results.
Poor spreadsheet quality leads to poor decision making.
“ Despite overwhelming and unanimous evidence...
companies have continued to ignore spreadsheet error risks.”
Panko, 2014
What we can do
The solutions are well understood, but not well used
The prescription for reducing spreadsheet errors is clear:
 Learn from our software development colleagues.
 Spreadsheet users need to recognise the risk.
 Managers need to recognise the risk.
 Control development lifecycle of critical spreadsheets.
 Adopt good development practices to reduce risk.
 Enhance spreadsheet training to focus on quality.
 Test, test, test.
“ The software that end users are creating…
is riddled with errors.”
Burnett & Myers, 2014
Learn from our software development colleagues
Spreadsheet development = Software development
In software development:
 Typically 40% to 60% of total time is devoted to testing.
 Initial average error rate: 3.7% (per line of code).
 Extensive testing reduces errors by 90%, but some bugs remain.
Building a spreadsheet is equivalent to writing software:
 Spreadsheet formulae are like program code.
 Software development issues of bugs, data integrity, version
control, error handling, and testing also apply to spreadsheets.
 We wouldn’t accept software built like we build spreadsheets.
“ The untested spreadsheet is as dangerous
and untrustworthy as an untested program.”
Price, 2006
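The software figures quoted on this slide imply a residual error rate after testing, which the arithmetic below works through. The two rates come from the slide; the 1,000-line program size is a hypothetical number chosen only to make the counts concrete.

```python
initial_error_rate = 0.037  # errors per line of code before testing (from the slide)
testing_reduction = 0.90    # share of errors removed by extensive testing (from the slide)

# Errors that survive testing, per line of code.
residual_error_rate = initial_error_rate * (1 - testing_reduction)
print(f"Residual error rate: {residual_error_rate:.2%} per line")

lines_of_code = 1000  # hypothetical program size
print(f"Expected errors before testing: {initial_error_rate * lines_of_code:.0f}")
print(f"Expected errors after testing: {residual_error_rate * lines_of_code:.1f}")
```

Even after a 90% reduction some errors remain, which is the slide's point that extensive testing helps greatly but does not remove every bug.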
Spreadsheet users need to recognise the risk
Adopt a professional approach to spreadsheet development
Spreadsheet users need to:
 Recognise that spreadsheet error is a real and substantial risk.
 Acknowledge and seek to mitigate the effects of overconfidence.
 Seek knowledge that goes beyond just the software features.
 Learn and apply good development practices.
 Recognise and manage factors that increase risk, such as poor
development practices and excessive complexity.
 Test spreadsheets before releasing them for use.
“ Never assume a spreadsheet is right,
even your own.”
Raffensperger, 2001
Managers need to recognise the risk
Spreadsheet quality requires management action
Managers need to:
 Recognise that spreadsheet error is a real and substantial risk.
 Recognise and mitigate the hazard created by overconfidence.
 Provide appropriate training for spreadsheet users.
 Insist on good development practices.
 Recognise and manage factors that increase risk, such as tight
deadlines and inadequate processes.
 Ensure that spreadsheets are properly tested before being used.
“ Most executives do not really check or verify the accuracy or
validity of the spreadsheets before they use the solutions.”
Teo & Tan, 1999
Control development lifecycle of critical spreadsheets
Formalise management and control where appropriate
Spreadsheets generally go through a lifecycle:
 Planning → Analysis → Design → Construction → Testing → Implementation → Archive.
 Often the emphasis is almost exclusively on
the Construction phase, with insufficient time
spent in other phases (especially testing).
For most spreadsheets, informal management of
the lifecycle, with minimal control, is sufficient.
For critical spreadsheets, more formal management and control
of the lifecycle may be appropriate – just like for other software.
“ The principal objective of a structured and disciplined methodology…
is to reduce the occurrence of user-generated errors in the models.”
Rajalingham et al, 2002
Adopt good development practices to reduce risk
Make spreadsheets easier to use, understand, and test
Spreadsheets are very flexible – that is part of their appeal.
But they are often hard to use, harder to maintain, and have errors.
To minimise risk, adopt good practices such as:
 Include documentation.
 Separate data, analysis, and results.
 Use consistent structure.
 Use short formulae.
 Avoid hard-coded values in formulae.
 Include self-checks.
 Use formatting for a purpose, not decoration.
 Set up printing for each sheet.
 Use cell protection.
 Use Data Validation & Tables.
 Always test your spreadsheet.
“ Most practitioners would agree on the basic aims of best practice, as
being to make understanding and testing reasonably straightforward.”
Murphy, 2007
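One of the practices listed on this slide, avoiding hard-coded values in formulae, lends itself to a simple automated scan. The sketch below is a naive illustration: it assumes formulas are available as plain strings and uses a regular expression to pick out numeric literals that are not part of a cell reference. The function name `hard_coded_values` and the sample formulas are hypothetical, and the pattern will misjudge some cases (for example whole-row references such as `2:2`, or sheet names that contain digits).

```python
import re

# A numeric literal that is not preceded by a letter, "$", digit, or "."
# (so the row number in B12 or $C$3 is not mistaken for a constant).
NUMBER = re.compile(r"(?<![A-Za-z$\d.])(?:\d+(?:\.\d+)?|\.\d+)")


def hard_coded_values(formula):
    """Return numeric literals that appear directly in a formula string."""
    without_strings = re.sub(r'"[^"]*"', "", formula)  # ignore text literals
    return NUMBER.findall(without_strings)


for formula in ("=B2*0.15", "=B2*TaxRate", "=SUM(B2:B10)/12"):
    print(formula, hard_coded_values(formula))
```

A formula such as `=B2*TaxRate`, which refers to a named input instead of embedding the number, passes the scan with no findings.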
Enhance spreadsheet training to focus on quality
Learn about good practice and mitigating risks
Most training is about features – e.g. how to build a PivotTable.
While such training is necessary, it is not sufficient.
We also need to learn about building good spreadsheets:
 Principles of spreadsheet design, including techniques for
building robust and reliable spreadsheets.
 Being aware of risks and how to mitigate those risks.
 Recognising and managing overconfidence.
 Methods for testing spreadsheets.
 Understanding and using the spreadsheet development life cycle.
“ Training specifically aimed at teaching spreadsheet design principles…
can significantly help to reduce the incidence of spreadsheet errors.”
Beaman et al, 2005
Test, test, test
There is no substitute for inspecting every cell
Extensive spreadsheet testing is essential:
 Recognise that spreadsheets are like any other software; that is,
they contain bugs and need to be tested.
 Testing is a formal process, requiring training and practice.
 Inspect every cell and object (chart, PivotTable, etc.).
 Best done in teams, as different people find different errors.
 Testing will find errors, which can be fed back into the
development and training process to improve future quality.
 As suggested by the Spreadsheet Development Life Cycle, testing
typically requires as much time and effort as construction.
“ To reduce spreadsheet errors... only one technique, cell-by-cell
code inspection, has been demonstrated to be effective.”
Panko, 2000
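Cell-by-cell inspection is a manual activity, but a simple reconciliation can sit alongside it as one of the self-checks mentioned earlier in the deck. The sketch below is an assumption-laden illustration, not a replacement for inspection: it takes a hypothetical data table together with the totals a spreadsheet reports, recomputes those totals independently, and lists any disagreement. All values and the function name `check_totals` are invented.

```python
import math

# Hypothetical data table and the totals reported by the spreadsheet.
table = [
    [100, 200, 300],
    [10, 20, 30],
]
reported_row_totals = [600, 30]  # second total omits the last column
reported_grand_total = 630


def check_totals(table, reported_row_totals, reported_grand_total, tolerance=0.005):
    """Recompute totals from the data and describe every mismatch found."""
    problems = []
    for index, (row, reported) in enumerate(zip(table, reported_row_totals), start=1):
        if not math.isclose(sum(row), reported, abs_tol=tolerance):
            problems.append(f"row {index}: reported {reported}, recomputed {sum(row)}")
    recomputed_grand = sum(sum(row) for row in table)
    if not math.isclose(recomputed_grand, reported_grand_total, abs_tol=tolerance):
        problems.append(
            f"grand total: reported {reported_grand_total}, recomputed {recomputed_grand}"
        )
    return problems


for problem in check_totals(table, reported_row_totals, reported_grand_total):
    print(problem)
```

A reconciliation like this catches only errors that break an arithmetic identity; wrong logic that still adds up would pass, which is why the slide stresses inspecting every cell.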
We know how to make better spreadsheets
Improve our spreadsheets and make better decisions
The problem:
 Spreadsheet quality is poor: 95% contain errors.
The solution:
 Recognise that spreadsheet errors are a problem.
 Reduce overconfidence through awareness.
 Better control of critical spreadsheets.
 Use good spreadsheet development practices.
 Training to focus on quality.
 Greatly increase the amount of testing that we do.
“ Good software (this applies to spreadsheets as well) does not
happen by chance, but is engineered/designed.”
Sakal et al, 2014
Reinforce better practice through feedback

Contenu connexe

En vedette

Lifecycle admissions overview
Lifecycle admissions overviewLifecycle admissions overview
Lifecycle admissions overviewJim Goecker
 
Trinity Daily Oct 1, 2016
Trinity Daily Oct 1, 2016Trinity Daily Oct 1, 2016
Trinity Daily Oct 1, 2016Arun Surendran
 
Target audience profile
Target audience profileTarget audience profile
Target audience profileibrahim95
 
Autopista del mar báltico 1
Autopista del mar báltico 1Autopista del mar báltico 1
Autopista del mar báltico 1Adrián Aguado
 
Excel 2010 training presentation create your first spreadsheet (revised)
Excel 2010 training presentation create your first spreadsheet (revised)Excel 2010 training presentation create your first spreadsheet (revised)
Excel 2010 training presentation create your first spreadsheet (revised)MFMinickiello
 
презантация кафедры вариант щербаковой
презантация кафедры вариант щербаковойпрезантация кафедры вариант щербаковой
презантация кафедры вариант щербаковойshdp1
 

En vedette (9)

Google Docs
Google DocsGoogle Docs
Google Docs
 
Lifecycle admissions overview
Lifecycle admissions overviewLifecycle admissions overview
Lifecycle admissions overview
 
Trinity Daily Oct 1, 2016
Trinity Daily Oct 1, 2016Trinity Daily Oct 1, 2016
Trinity Daily Oct 1, 2016
 
Target audience profile
Target audience profileTarget audience profile
Target audience profile
 
Psicomotricidad3b
Psicomotricidad3bPsicomotricidad3b
Psicomotricidad3b
 
Infinite Spreadsheet
Infinite SpreadsheetInfinite Spreadsheet
Infinite Spreadsheet
 
Autopista del mar báltico 1
Autopista del mar báltico 1Autopista del mar báltico 1
Autopista del mar báltico 1
 
Excel 2010 training presentation create your first spreadsheet (revised)
Excel 2010 training presentation create your first spreadsheet (revised)Excel 2010 training presentation create your first spreadsheet (revised)
Excel 2010 training presentation create your first spreadsheet (revised)
 
презантация кафедры вариант щербаковой
презантация кафедры вариант щербаковойпрезантация кафедры вариант щербаковой
презантация кафедры вариант щербаковой
 

Dernier

Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degreeyuu sss
 
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...ttt fff
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一F sss
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 

Dernier (20)

Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
 
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 

Making better spreadsheets

  • 1.  A B C D E F G H I 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Making better spreadsheets Why 95% of spreadsheets contain errors, and what we can do about it www.i-nth.com February 2017 www.i-nth.com
  • 2.  A B C D E F G H I 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Spreadsheets are riddled with errors To reduce the risks, our practices must improve Introduction Stories Why errors happen ConclusionWhat we can do Objectives:  Highlight the risks of heedless spreadsheet development.  Outline ways to reduce those risks. Based on:  Hard-fought practical experience.  Academic literature on spreadsheet risks and best practice. www.i-nth.com/resources/bibliography “ Developing an error-free spreadsheet has been a problem since the beginning of end-user computing.” Mireault, 2015 www.i-nth.com This early spreadsheet, a Babylonian clay tablet from c.1800 BC, contains several errors. Plimpton 322
  • 3.  A B C D E F G H I 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Spreadsheet risk is real, with substantial impacts These are actual news stories Introduction Stories Why errors happen ConclusionWhat we can do www.i-nth.com Reinhart, Rogoff, and the Excel error that changed history Bloomberg, 18 April 2013 $1.5M went missing as staff managed “monstrous spreadsheets” Metro West Daily News, 15 October 2011 £4.3M spreadsheet error leads to resignation of Mouchel chief executive Daily Express, 7 October 2011 Clallam County cashier hides rows in a spreadsheet to cover up theft Peninsula Daily News, 21 July 2011 $15 million mistake: that representative doesn't work for the company anymore Technology Marketing Corporation, 21 November 2009 Accountant omits minus sign on a net capital loss of $1.3 billion The Risks Digest, November 1994 $182M blunder in cashflow forecasting spreadsheet model Stuff, 1 March 2012 An alarming number of scientific papers contain Excel errors Washington Post, 26 August 2016 “ Few incidents of spreadsheet errors are made public and these are usually not revealed by choice.” Kruck & Sheetz, 2001
  • 4.  A B C D E F G H I 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Spreadsheet errors have many causes But spreadsheets are not to blame – we are Introduction Stories Why errors happen ConclusionWhat we can do Why spreadsheet errors happen:  Most of a spreadsheet’s complexity is hidden.  Spreadsheet complexity leads to cognitive overload.  We’re human, so we made mistakes.  Calculations, and errors, cascade from cell to cell.  We’re overconfident, despite complexity & overload.  Management fail to recognise spreadsheet risk. www.i-nth.com “ The results given by spreadsheets are often just wrong.” Sajaniemi, 1998
  • 5.  A B C D E F G H I 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Most of a spreadsheet’s complexity is hidden We see only one formula and the formatted values Introduction Stories Why errors happen ConclusionWhat we can do www.i-nth.com Consider this simple model:  Looks OK (typically the only assessment we do).  But the formulae in cells F2:F5 exclude column E.  The formula in E6 is also wrong (doesn’t deduct the tax in E5).  The repeated values in Row 5 are suspicious – are some values hard-coded rather than being calculated?  Are the formulae, references, logic, formats, etc. correct?  Even a simple spreadsheet can be wrong in many ways. “ Even obvious, elementary errors in very simple, clearly documented spreadsheets are... difficult to find.” Galletta et al, 1993
  • 6.  A B C D E F G H I 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Spreadsheet complexity leads to cognitive overload The layers of a spreadsheet are hard to understand Introduction Stories Why errors happen ConclusionWhat we can do www.i-nth.com Spreadsheets have multiple layers:  Presentation layer – ie. what you see.  Formulae (one visible at a time).  Logic (Data  Formulae  Results).  Formatting.  Data types (text, number, date, Boolean).  VBA, charts, PivotTables, Solver, Data Validation, Tables, Slicers, Filters, Queries, Outlines, Protection, Names, Print Range, etc. Attempting to form a mental model of all the layer interactions can produce cognitive overload, which makes errors more likely. “ Spreadsheets are often hard, if not impossible, to understand.” Mireault & Gresham, 2015
  • 7.  A B C D E F G H I 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 We’re human, so we make mistakes Typical spreadsheets have errors in 1% to 5% of cells Introduction Stories Why errors happen ConclusionWhat we can do We make errors; they are inherent in how we think. Typical error rates for simple, nontrivial activities:  Type short number: 1.0% (per number).  Grammatical errors: 1.1% (of words).  Simple arithmetic: 2.0% (of calculations).  Software development: 3.7% (per line of code).  Type 10 digits: 5.0% (per number). www.i-nth.com Experiments in spreadsheet development observe similar rates, with errors in 1% to 5% of cells:  Called the “cell error rate” (CER). “ The issue is not whether there is an error but how many errors there are and how serious they are.” Panko, 2007
  • 8.  A B C D E F G H I 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Calculations, and errors, cascade from cell to cell 95% probability of error overall for just 100 cells and 3% CER Introduction Stories Why errors happen ConclusionWhat we can do Spreadsheet calculations are linked, so errors cascade from cell to cell. If a spreadsheet contains:  100 cells (small).  3% CER (moderate). Then, the probability of at least one error is about:  95% (ie. almost certain). More cells  more errors. www.i-nth.com “ Most large spreadsheets have dozens or even hundreds of errors.” Panko & Ordway, 2005 The probability of at least one error = 1 – (1 – Cell Error Rate)Number of cells Calculation cascade: A common cause of catastrophe
  • 9. We’re overconfident, despite complexity & overload
    Ignorance of the risk reinforces our overconfidence
    Our typical approach to spreadsheets creates a vicious cycle:
    - We underestimate the incidence of spreadsheet errors.
    - If a spreadsheet produces a result, we assume it is correct.
    - We see little need to test our spreadsheets.
    - So we do little or no testing of our spreadsheets.
    - Consequently, we find few errors.
    - When we find an error, our inflated perception of our error-finding prowess is reinforced.
    - With little evidence to the contrary, we are overconfident that our spreadsheets are correct.
    “Overconfidence is one of the most substantial causes of spreadsheet errors.” Sakal et al., 2015
  • 10. Management fail to recognise spreadsheet risk
    Leading to inadequate training and insufficient oversight
    Managers underestimate the risks of spreadsheets:
    - It is “just a spreadsheet”.
    - But actually, spreadsheets can be highly complex analytical applications that are an essential part of decision making.
    Because spreadsheet risk is not well understood, managers:
    - Provide inadequate training, focusing only on software features rather than on good development practices and quality.
    - Perform insufficient oversight, mistaking proficiency with the tools for robustness of results.
    Poor spreadsheet quality leads to poor decision making.
    “Despite overwhelming and unanimous evidence... companies have continued to ignore spreadsheet error risks.” Panko, 2014
  • 11. What we can do
    The solutions are well understood, but not well used
    The prescription for reducing spreadsheet errors is clear:
    - Learn from our software development colleagues.
    - Spreadsheet users need to recognise the risk.
    - Managers need to recognise the risk.
    - Control development lifecycle of critical spreadsheets.
    - Adopt good development practices to reduce risk.
    - Enhance spreadsheet training to focus on quality.
    - Test, test, test.
    “The software that end users are creating… is riddled with errors.” Burnett & Myers, 2014
  • 12. Learn from our software development colleagues
    Spreadsheet development = Software development
    In software development:
    - Typically 40% to 60% of total time is devoted to testing.
    - Initial average error rate: 3.7% (per line of code).
    - Extensive testing reduces errors by 90%, but some bugs remain.
    Building a spreadsheet is equivalent to writing software:
    - Spreadsheet formulae are like program code.
    - Software development issues of bugs, data integrity, version control, error handling, and testing also apply to spreadsheets.
    - We wouldn’t accept software built like we build spreadsheets.
    “The untested spreadsheet is as dangerous and untrustworthy as an untested program.” Price, 2006
  • 13. Spreadsheet users need to recognise the risk
    Adopt a professional approach to spreadsheet development
    Spreadsheet users need to:
    - Recognise that spreadsheet error is a real and substantial risk.
    - Acknowledge and seek to mitigate the effects of overconfidence.
    - Seek knowledge that goes beyond just the software features.
    - Learn and apply good development practices.
    - Recognise and manage factors that increase risk, such as poor development practices and excessive complexity.
    - Test spreadsheets before releasing them for use.
    “Never assume a spreadsheet is right, even your own.” Raffensperger, 2001
  • 14. Managers need to recognise the risk
    Spreadsheet quality requires management action
    Managers need to:
    - Recognise that spreadsheet error is a real and substantial risk.
    - Recognise and mitigate the hazard created by overconfidence.
    - Provide appropriate training for spreadsheet users.
    - Insist on good development practices.
    - Recognise and manage factors that increase risk, such as tight deadlines and inadequate processes.
    - Ensure that spreadsheets are properly tested before being used.
    “Most executives do not really check or verify the accuracy or validity of the spreadsheets before they use the solutions.” Teo & Tan, 1999
  • 15. Control development lifecycle of critical spreadsheets
    Formalise management and control where appropriate
    Spreadsheets generally go through a lifecycle:
    - Planning → Analysis → Design → Construction → Testing → Implementation → Archive.
    - Often the emphasis is almost exclusively on the Construction phase, with insufficient time spent in other phases (especially testing).
    For most spreadsheets, informal management of the lifecycle, with minimal control, is sufficient.
    For critical spreadsheets, more formal management and control of the lifecycle may be appropriate, just like for other software.
    “The principal objective of a structured and disciplined methodology… is to reduce the occurrence of user-generated errors in the models.” Rajalingham et al., 2002
  • 16. Adopt good development practices to reduce risk
    Make spreadsheets easier to use, understand, and test
    Spreadsheets are very flexible; that is part of their appeal. But they are often hard to use, harder to maintain, and have errors.
    To minimise risk, adopt good practices such as:
    - Include documentation.
    - Separate data, analysis, and results.
    - Use consistent structure.
    - Use short formulae.
    - Avoid hard-coded values in formulae.
    - Include self-checks.
    - Use formatting for a purpose, not decoration.
    - Set up printing for each sheet.
    - Use cell protection.
    - Use Data Validation & Tables.
    - Always test your spreadsheet.
    “Most practitioners would agree on the basic aims of best practice, as being to make understanding and testing reasonably straightforward.” Murphy, 2007
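As one illustration of the "Include self-checks" practice, the sketch below recomputes a total from its components and compares it with the figure actually reported. It is not taken from the deck: the function name `totals_agree`, the tolerance, and the sample figures are all hypothetical, and a real spreadsheet would express the same idea as a check cell rather than in Python.

```python
def totals_agree(components, reported_total, tolerance=0.005):
    """Self-check: does the reported total match the sum of its components?

    `tolerance` absorbs rounding differences; the default is an
    illustrative choice (half a cent for currency values).
    """
    return abs(sum(components) - reported_total) <= tolerance


if __name__ == "__main__":
    line_items = [120, 80, 45]  # hypothetical input cells

    # A correctly keyed total passes the check.
    print(totals_agree(line_items, 245))

    # A transposed-digit typo (254 instead of 245) fails the check.
    print(totals_agree(line_items, 254))
```

The design choice is to make the check a separate, visible result, so that a failed reconciliation is flagged rather than silently carried into downstream calculations.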
  • 17. Enhance spreadsheet training to focus on quality
    Learn about good practice and mitigating risks
    Most training is about features, e.g. how to build a PivotTable. While such training is necessary, it is not sufficient.
    We also need to learn about building good spreadsheets:
    - Principles of spreadsheet design, including techniques for building robust and reliable spreadsheets.
    - Being aware of risks and how to mitigate those risks.
    - Recognising and managing overconfidence.
    - Methods for testing spreadsheets.
    - Understanding and using the spreadsheet development life cycle.
    “Training specifically aimed at teaching spreadsheet design principles… can significantly help to reduce the incidence of spreadsheet errors.” Beaman et al., 2005
  • 18. Test, test, test
    There is no substitute for inspecting every cell
    Extensive spreadsheet testing is essential:
    - Recognise that spreadsheets are like any other software; that is, they contain bugs and need to be tested.
    - Testing is a formal process, requiring training and practice.
    - Inspect every cell and object (chart, PivotTable, etc.).
    - Best done in teams, as different people find different errors.
    - Testing will find errors, which can be fed back into the development and training process to improve future quality.
    - As suggested by the Spreadsheet Development Life Cycle, testing typically requires as much time and effort as construction.
    “To reduce spreadsheet errors... only one technique, cell-by-cell code inspection, has been demonstrated to be effective.” Panko, 2000
  • 19. We know how to make better spreadsheets
    Improve our spreadsheets and make better decisions
    The problem:
    - Spreadsheet quality is poor: 95% contain errors.
    The solution:
    - Recognise that spreadsheet errors are a problem.
    - Reduce overconfidence through awareness.
    - Better control of critical spreadsheets.
    - Use good spreadsheet development practices.
    - Training to focus on quality.
    - Greatly increase the amount of testing that we do.
    Reinforce better practice through feedback
    “Good software (this applies to spreadsheets as well) does not happen by chance, but is engineered/designed.” Sakal et al., 2014