Oh Boy! These A/B tests appear to be bullshit!
@OptimiseOrDie
• UX and Analytics (1999) 
• User Centred Design (2001) 
• Agile, Startups, No budget (2003) 
• Funnel optimisation (2004) 
• Multivariate & A/B (2005) 
• Conversion Optimisation (2005) 
• Persuasive Copywriting (2006) 
• Joined Twitter (2007) 
• Lean UX (2008) 
• Holistic Optimisation (2009) 
Was : Consulting all over the place 
Now : Optimiser of Everything, 
Spareroom.co.uk
Hands on!
AB Test Hype Cycle 
Zen Plumbing 
Timeline
• Tested stupid ideas, lots
• Most AB or MVT tests are bullshit
• Discovered AB testing
• Triage, Triangulation, Prioritisation, Maths
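Craig’s Cynical Quadrant
• Improves revenue: Yes, Improves UX: Yes – Client fucking delighted
• Improves revenue: Yes, Improves UX: No – Client delighted (and fires you for another agency)
• Improves revenue: No, Improves UX: Yes – Client fires you (then wins an award for your work)
• Improves revenue: No, Improves UX: No – Client absolutely fucking furious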
#1 : You’re doing it in the wrong place 
There are 4 areas a CRO expert always looks at: 
1. Inbound attrition (medium, source, landing page, 
keyword, intent and many more…) 
2. Key conversion points (product, basket, registration) 
3. Processes, lifecycles and steps (forms, logins, 
registration, checkout, onboarding, emails, push) 
4. Layers of engagement (search, category, product, add) 
1. Use visitor flow reports for attrition – very useful. 
2. For key conversion points, look at loss rates & 
interactions 
3. Processes and steps – look at funnels or make your own 
4. Layers and engagement – make a ring model 
Examples – Concept
Layers: Bounce, Engage, Outcome
Examples – 16-25Railcard.co.uk
Layers: Bounce, Login to Account, Content, Engage, Start Application, Type and Details, Eligibility, Photo, Complete
Examples – Guide Dogs
Layers: Bounce, Content, Engage, Donation Pathway, Donation Page, Starts process, Funnel steps, Complete
Within a layer
Pages: Page 1, Page 2, Page 3, Page 4, Page 5
Ways out: Exit, Deeper Layer
Micro Conversions: Email, Wishlist, Contact, Like
#1 : Make a Money Model 
• Get to know the flow and loss (leaks) inbound, inside and 
through key processes or conversion points. 
• Once you know the key steps you’re losing people at and how 
much traffic you have – make a money model. 
• 20,000 see the basket page – what’s the basket page to 
checkout page ratio? 
• Estimate how much you think you can shift the key metric 
(e.g. basket adds, basket -> checkout) 
• What downstream revenue or profit would that generate? 
• Sort by the money column 
• Congratulations – you’ve now built the world’s first IT plan for 
growth with a return on investment estimate attached! 
• I’ll talk more about prioritising later – but a good real world 
analogy for you to use: 
Think like a store owner!
If you can’t refurbish the entire store, which floors or departments will you invest in optimising?
Wherever there is:
• Footfall
• Low return
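The money model from #1 can be sketched in a few lines. This is a rough illustration, not a finished tool: only the 20,000 basket page views come from the slide, and every other rate, lift and order value below is a made-up assumption, as are the names Opportunity and money_model. Swap in your own analytics numbers.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    name: str
    monthly_visitors: int    # traffic reaching this step (footfall)
    step_rate: float         # current share who continue, e.g. basket -> checkout
    expected_lift: float     # relative shift you think you can get (your estimate)
    downstream_rate: float   # share of those who go on to complete an order
    value_per_order: float   # average revenue or profit per order

    def extra_monthly_value(self) -> float:
        # Extra people getting past this step, then converted into money.
        extra_continuers = self.monthly_visitors * self.step_rate * self.expected_lift
        return extra_continuers * self.downstream_rate * self.value_per_order


def money_model(opportunities):
    """Sort by the money column, biggest estimated return first."""
    return sorted(opportunities, key=lambda o: o.extra_monthly_value(), reverse=True)


# Illustrative numbers only (hypothetical, apart from the 20,000 basket views).
opportunities = [
    Opportunity("Product -> basket add", 80_000, 0.10, 0.10, 0.20, 60.0),
    Opportunity("Basket -> checkout", 20_000, 0.40, 0.05, 0.50, 60.0),
    Opportunity("Registration form", 5_000, 0.30, 0.20, 0.25, 60.0),
]

for o in money_model(opportunities):
    print(f"{o.name:<24} {o.extra_monthly_value():>10,.0f}")
```

The point of the sort is the store owner question: high footfall with a low return floats to the top, and the estimates only need to be good enough to rank the list.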
#2 : Your hypothesis is crap!
Insight - Inputs #FAIL
• Competitor copying
• Guessing
• Dice rolling
• Panic
• Competitor change
• An article the CEO read
• Ego
• Opinion
• Cherished notions
• Marketing whims
• Cosmic rays
• Not ‘on brand’ enough
• IT inflexibility
• Internal company needs
• Some dumbass consultant
• Shiny feature blindness
• Knee jerk reactions
#2 : These are the inputs you need…
Insight - Inputs
• Eye tracking
• Segmentation
• Surveys
• Sales and Call Centre
• Customer contact
• Social analytics
• Session Replay
• Usability testing
• Forms analytics
• Search analytics
• Voice of Customer
• Market research
• A/B and MVT testing
• Big & unstructured data
• Web analytics
• Competitor evals
• Customer services
#2 : Brainstorming the test 
• Check your inputs 
• Assemble the widest possible team 
• Share your data and research 
• Design Emotive Writing guidelines
#2 : Emotive Writing - example 
Customers do not know what to do and need support and advice 
• Emphasize the fact that you understand that their situation is stressful 
• Emphasize your expertise and leadership in vehicle glazing and that you will help them get the best 
solution for their situation 
• Explain what they will need to do online and during the call-back so that they know what the 
next steps will be 
• Explain that they will be able to ask any other questions they might have during the call-back 
Customers do not feel confident in assessing the damage 
• Emphasize the fact that you will help them assess the damage correctly online 
Customers need to understand the benefits of booking online 
• Emphasize that the online booking system is quick, easy and provides all the information 
they need about their appointment, plus general cost information 
Customers mistrust insurers and find dealing with their insurance situation very frustrating 
• Where possible communicate the fact that the job is most likely to be free for insured 
customers, or good value for money for cash customers 
• Show that you understand the hassle of dealing with insurance companies – emphasise that 
you will help with their insurance paperwork for them, freeing them of this burden 
Some customers cannot be bothered to take action to fix their car glass 
• Emphasize the consequences of not doing anything, 
e.g. ‘It’s going to cost you more if the chip develops into a crack’
#2 : THE DARK SIDE 
“Keep your family safe and get back on the 
road fast with Autoglass.”
#2 : NOW YOU CAN BEGIN 
• You should have inputs, research, data, guidelines 
• Sit down with the team and prompt with 12 questions: 
– Who is this page (or process) for? 
– What problem does this solve for the user? 
– How do we know they need it? 
– What is the primary action we want people to take? 
– What might prompt the user to take this action? 
– How will we know if this is doing what we want it to do? 
– How do people get to this page? 
– How long are people here on this page? 
– What can we remove from this page? 
– How can we test this solution with people? 
– How are we solving the user’s needs in different and better ways than other 
places on our site? 
– If this is a homepage, ask these too (bit.ly/1fX2RAa)
#2 : PROMPT YOURSELF 
• Check your UX or Copywriting 
guidelines. 
• Use Get Mental Notes 
• What levers can we apply now? 
• Create a hypothesis: 
“WE BELIEVE THAT DOING [A] FOR PEOPLE [B] WILL MAKE OUTCOME [C] HAPPEN. WE'LL KNOW THIS WHEN WE SEE DATA [D] AND FEEDBACK [E]”
www.GetMentalNotes.com
#2 : THE FUN BIT! 
• Collaborative Sketching 
• Brainwriting 
• Refine and Test!
We believe that doing [A] for People [B] will make outcome [C] happen.
We’ll know this when we observe data [D] and obtain feedback [E]. (reverse)
#2 : Solutions 
• You need multiple tool inputs 
– Tool decks are here : www.slideshare.net/sullivac 
• Collaborative, Customer connected team 
– If you’re not doing this, you’re hosed 
• Session replay tools provide vital input 
– Get vital additional customer evidence 
• Simple page Analytics don’t cut it 
– Invest in your analytics, especially event tracking
• Ego, Opinion, Cherished notions – fill gaps
– Fill these vacuums with insights and data
• Champion the user
– Give them a chair at every meeting
#2 : HYPOTHESIS DESIGN SUMMARY 
• Inputs – get the right stuff 
• Research, Guidelines, Data 
• Framing the problem(s) 
• Questions to get you going 
• Use card prompts for Psychology 
• Create a hypothesis 
• Collaborative Sketching 
• Brainwriting 
• Refine and Check Hypothesis 
• Instrument and Test
#3 : No analytics integration 
• Investigating problems with tests 
• Segmentation of results 
• Tests that fail, flip or move around 
• Tests that don’t make sense 
• Broken test setups 
• What drives the averages you see? 
A B B A
These Danish porn sites are so hardcore!
We’re still waiting for our AB tests to finish!
#4 : The test will finish after you die 
• Use a test length calculator like this one: 
• visualwebsiteoptimizer.com/ab-split-test-duration/
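To get a feel for what a test length calculator is doing, here is a minimal sketch using the textbook two-proportion sample size formula (normal approximation). It is not the method of the calculator linked above, which may differ; the function names, the 5% baseline and the 3,000 daily visitors are all assumptions for illustration.

```python
import math
from statistics import NormalDist


def visitors_per_variation(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH variation to detect the lift.

    Standard two-sided, two-proportion formula with a normal approximation.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def duration_days(per_variation, variations, daily_visitors):
    """Days needed when daily_visitors are split evenly across all variations."""
    return math.ceil(per_variation * variations / daily_visitors)


# Hypothetical site: 5% baseline conversion, 3,000 visitors a day into the test.
for lift in (0.05, 0.10, 0.20):
    n = visitors_per_variation(0.05, lift)
    print(f"lift {lift:.0%}: {n:,} per variation, "
          f"{duration_days(n, 2, 3_000)} days for an A/B test")
```

Halving the lift you want to detect roughly quadruples the traffic you need, which is how a test on a small change ends up finishing after you die.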
#5 : You don’t test for long enough 
• The minimum length 
– 2 business cycles (so you can cross check) 
– Usually a week, 2 weeks, Month 
– Always test ‘whole’ not partial cycles 
– Be aware of multiple cycles 
– Don’t self stop! 
– PURCHASE CYCLES – KNOW THEM
Business & Purchase Cycles
(Chart labels: Start Test, Finish, Avg Cycle)
• Customers change 
• Your traffic mix changes 
• Markets, competitors 
• Be aware of all the waves 
• Always test whole cycles 
• Minimum 2 cycles (wk/mo) 
• Don’t exclude slower buyers
#5 : You don’t test for long enough 
• How long after that 
– I aim for a minimum 250 outcomes, ideally 350+ for each ‘creative’ 
– If you test 4 recipes, that’s 1400 outcomes needed 
– You should have worked out how long each batch of 350 needs 
before you start! 
– 95% confidence is the cherry – not the cake - BUT BIG SECRET -> (p values are unreliable)
– If you segment, you’ll need more data
– It may need a bigger sample if the response rates are similar*
– Use a test length calculator but be aware of BARE MINIMUM TO EXPECT
– Important insider tip – watch the error bars! The +/- stuff
* Stats geeks know I’m glossing over something here. That test time depends on how the two experiments separate in terms of relative performance as well as how volatile the test response is. I’ll talk about this when I record this one! This is why testing similar stuff sux.
#5 : You put faith in the Confidence value
95%, 99%, 99.99% 
‘Confidence’ or ‘Chance to beat baseline’ – what’s that? 
• It’s a stats thing 
• Seriously, look at this one LAST in your testing 
• Purchase Cycle, Business Cycles, Sample Size, Error bar 
separation – ALL come before this one. Got it? 
• Why? It’s to do with p-values. Read this article: 
• http://bit.ly/1gq9dtd 
• If you rely on confidence, you are relying upon 
something that’s unreliable and moves around, 
particularly early in testing. 
• Don’t be fooled by your testing package – watch the 
error bars instead of confidence.
#5 : The tennis court 
– Let’s say we want to estimate, on average, what height Roger Federer 
and Nadal hit the ball over the net at. So, let’s start the match: 
First Set – Federer 6-4
– We start to collect values
62cm +/- 2cm
63.5cm +/- 2cm
Second Set – Nadal 7-6
– Nadal starts sending them low over the net
62cm +/- 1cm
62.5cm +/- 1cm
Final Set – Nadal 7-6
– We start to collect values
61.8cm +/- 0.3cm
62cm +/- 0.3cm
Let’s look at this a different way
62.5cm +/- 1cm
9.1% ± 0.3 and 9.3% ± 0.3
9.1% ± 0.5 and 9.3% ± 0.5
9.1% ± 0.2 and 9.3% ± 0.2
9.1% ± 0.1 and 9.3% ± 0.1
Graph is a range, not a line:
9.1 ± 1.9%, 9.1 ± 0.9%, 9.1 ± 0.3%
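The error bars above can be reproduced roughly with a normal (Wald) approximation for a conversion rate. This is a sketch under that assumption; your testing package may use a different interval, and the visitor counts and function names here are invented for illustration.

```python
import math
from statistics import NormalDist


def conversion_interval(conversions, visitors, confidence=0.95):
    """Return (rate, margin): the conversion rate and its +/- error bar."""
    rate = conversions / visitors
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(rate * (1 - rate) / visitors)
    return rate, margin


def intervals_overlap(a, b):
    """True when the two (rate, margin) ranges still overlap."""
    (rate_a, margin_a), (rate_b, margin_b) = a, b
    return abs(rate_a - rate_b) < margin_a + margin_b


# Same 9.1% rate, more and more data: the range tightens.
for visitors in (1_000, 4_000, 36_000):
    rate, margin = conversion_interval(round(0.091 * visitors), visitors)
    print(f"{visitors:>6} visitors: {rate:.1%} ± {margin:.1%}")

# 9.1% vs 9.3% with 10,000 visitors each: the ranges still overlap.
a = conversion_interval(910, 10_000)
b = conversion_interval(930, 10_000)
print("overlap:", intervals_overlap(a, b))
```

Like the tennis match, the estimate for each creative only becomes a usable number once the +/- has shrunk enough for the two ranges to pull apart.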
#5 : How long to test? 
• The minimum length: 
– 2 business cycles and > purchase cycle as a minimum, regardless of 
outcomes. Test for less and you’re biasing the sample. 
– ALWAYS ALWAYS TEST WHOLE CYCLES. 
– 250 ABSOLUTE MINIMUM FOR ANY SAMPLE, 350+ nicer, 1000 sweet! 
– Error bar separation (or minimal overlap) between creatives 
– Ignore 95%+ confidence (it’s unreliable) 
– Use a test calculator (VWO have a nice one). 
– Work out your ‘test units’ – how long to get 350 outcomes for each 
creative in your test. 
– This is a minimum you should expect but sample size (or overlap) may 
mean you need longer 
– When to stop? 
#5 : When to stop 
• Self stopping is a huge problem: 
– “I stopped the test when it looked good” 
– “It hit 20% on Thursday, so I figured – time to cut and run” 
– “We need test time for something else. Looks good to us” 
– “We’ve got a big sample now so why not finish it today?” 
• False Positives and Negatives 
– If you cut part of a business cycle, you bias the segments you have in 
the test. 
– So if you ignore weekend shoppers by stopping your test on Friday, that 
will affect results 
– The other problem is FALSE POSITIVES and FALSE NEGATIVES
#5 : When to stop
                         Scenario 1     Scenario 2     Scenario 3     Scenario 4
After 200 observations   Insignificant  Insignificant  Significant!   Significant!
After 500 observations   Insignificant  Significant!   Insignificant  Significant!
End of experiment        Insignificant  Significant!   Insignificant  Significant!

                         Scenario 1     Scenario 2     Scenario 3     Scenario 4
After 200 observations   Insignificant  Insignificant  Significant!   Significant!
After 500 observations   Insignificant  Significant!   trial stopped  trial stopped
End of experiment        Insignificant  Significant!   Significant!   Significant!
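A small simulation makes the self-stopping problem concrete. The sketch below runs A/A tests, where both variations have the same true rate, so every "Significant!" is a false positive. It compares stopping at the first significant look (after 200, 500 or 1,000 observations) with only checking at the end. The z-test, the 10% rate, the look points and the function names are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist


def is_significant(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Two-sided two-proportion z-test (pooled, normal approximation)."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    if pooled in (0, 1):
        return False  # no variation at all, nothing to test
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_a / n_a - conv_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z))) < alpha


def simulate(runs=500, rate=0.10, looks=(200, 500, 1000), seed=1):
    """Return (false positive share when peeking, share when waiting)."""
    rng = random.Random(seed)
    peeking_hits = fixed_hits = 0
    for _run in range(runs):
        conv_a = conv_b = seen = 0
        significant_at = []
        for look in looks:
            for _visitor in range(look - seen):
                conv_a += rng.random() < rate  # same true rate in both arms
                conv_b += rng.random() < rate
            seen = look
            significant_at.append(is_significant(conv_a, seen, conv_b, seen))
        peeking_hits += any(significant_at)  # stop at the first "Significant!"
        fixed_hits += significant_at[-1]     # only judge at the planned end
    return peeking_hits / runs, fixed_hits / runs


peeking, fixed = simulate()
print(f"false positives when peeking: {peeking:.1%}, when waiting: {fixed:.1%}")
```

Because the final look is one of the peeks, stopping early can never produce fewer false positives than waiting, only more.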
#5 : When to stop 
• So – what to do? 
• Run a test calculator 
• Set the test time to hit the highest of the minimums 
• What minimums do you mean? 
– Minimum sample (250, 350, higher) 
– Business cycles (2+) 
– Purchase cycles (1 or 2+) 
– What your test calculator says 
• The longest one is how long it’s gonna take. 
• Set the test time 
• Run the test 
• Stop the test at the end, on a whole cycle 
• Analyse 
• That’s it! 
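The "highest of the minimums" rule can be written down as a small helper. This is a sketch with hypothetical names and inputs; the 350 outcomes per recipe and the 2 business cycles come from the slides, the traffic and cycle lengths in the example are made up.

```python
import math


def planned_test_days(
    outcomes_per_recipe,       # minimum sample, e.g. 250, 350 or higher
    recipes,                   # number of creatives in the test
    daily_outcomes,            # outcomes per day across the whole test
    business_cycle_days,       # e.g. 7 for a weekly cycle
    purchase_cycle_days,       # typical time from first visit to purchase
    calculator_days,           # what your test calculator says
    min_business_cycles=2,
    min_purchase_cycles=1,
):
    """Longest of the minimums, rounded up to whole business cycles."""
    sample_days = math.ceil(outcomes_per_recipe * recipes / daily_outcomes)
    minimums = [
        sample_days,
        min_business_cycles * business_cycle_days,
        min_purchase_cycles * purchase_cycle_days,
        calculator_days,
    ]
    whole_cycles = math.ceil(max(minimums) / business_cycle_days)
    return whole_cycles * business_cycle_days


# 4 recipes at 350 outcomes each (1,400 outcomes), 60 outcomes a day,
# weekly business cycle, 10 day purchase cycle, calculator says 18 days.
print(planned_test_days(350, 4, 60, 7, 10, 18))  # prints 28
```

Here the sample minimum (24 days) is the longest, and rounding up to whole weeks gives 28 days. Set that as the test time before you start and stop there.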
#6 : The early stages of a test… 
• Ignore the graphs. Don’t draw conclusions. Don’t dance. Calm down. 
• Get a feel for the test but don’t do anything yet! 
• Remember – in A/B - 50% of returning visitors will see a new shiny website! 
• Until your test has had at least 2 business cycles and 250+ outcomes, don’t bother 
even getting remotely excited! 
• Watching regularly is good though. You’re looking for anything that looks really 
odd – if everyone is looking (but not concluding) then oddities will get spotted. 
• All tests move around or show big swings early in the testing cycle. Here is a very 
high traffic site – it still takes 10 days to start settling. Lower traffic sites will 
stretch this period further. 
#7 : No QA testing for the AB test?
#7 – BIG SECRET! 
• Over 40% of tests have had QA issues. 
• It’s very easy to break or bias the testing 
Browser testing:
www.crossbrowsertesting.com
www.browserstack.com
www.spoon.net
www.cloudtesting.com
www.multibrowserviewer.com
www.saucelabs.com
Mobile devices:
www.deviceanywhere.com
www.perfectomobile.com
www.opendevicelab.com
#7 : What other QA testing should I do? 
• Testing from several locations (office, home, elsewhere) 
• Testing the IP filtering is set up 
• Test tags are firing correctly (analytics and the test tool) 
• Test as a repeat visitor and check session timeouts 
• Cross check figures from 2+ sources 
• Monitor closely from launch, recheck, watch 
• WATCH FOR BIAS! 
#8 : Tests are random and not prioritised
Once you have a list of potential test areas, rank them by opportunity vs. effort.
The common ranking metrics that I use include:
• Opportunity (revenue, impact)
• Dev resource
• Time to market
• Risk / Complexity
Make yourself a quadrant
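One way to turn the ranking metrics into a quadrant is a simple score sheet. This is only a sketch: the 1 to 5 scales, the averaging of the three effort metrics, the threshold of 3 and the quadrant labels are all assumptions, and the test ideas are invented.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    opportunity: int      # 1 (low) to 5 (high): revenue, impact
    dev_resource: int     # 1 (little) to 5 (lots)
    time_to_market: int   # 1 (fast) to 5 (slow)
    risk: int             # 1 (simple, safe) to 5 (complex, risky)

    @property
    def effort(self) -> float:
        return (self.dev_resource + self.time_to_market + self.risk) / 3

    @property
    def quadrant(self) -> str:
        high_opportunity = self.opportunity >= 3
        low_effort = self.effort < 3
        if high_opportunity and low_effort:
            return "do first"
        if high_opportunity:
            return "plan"
        if low_effort:
            return "fill-in"
        return "avoid"


def rank(ideas):
    """Rank by opportunity per unit of effort, best first."""
    return sorted(ideas, key=lambda i: i.opportunity / i.effort, reverse=True)


ideas = [
    Idea("Rebuild site search", 2, 5, 5, 4),
    Idea("New basket template", 5, 4, 4, 3),
    Idea("Checkout trust copy", 4, 1, 1, 2),
    Idea("Footer link colour", 1, 1, 1, 1),
]

for idea in rank(ideas):
    print(f"{idea.name:<22} {idea.quadrant}")
```

The scores are guesses, but writing them down forces the opportunity vs. effort conversation instead of testing whatever came up last.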
#9 : Your cycles are too slow
(Chart: Conversion over Months, axis 0, 6, 12, 18)
#9 : Solutions 
• Give Priority Boarding for opportunities 
– The best seats reserved for metric shifters 
• Release more often to close the gap 
– More testing resource helps, analytics ‘hawk eye’ 
• Kaizen – continuous improvement 
– Others call it JFDI (just f***ing do it) 
• Make changes AS WELL as tests, basically! 
– These small things add up 
• RUSH Hair booking – Over 100 changes 
– No functional changes at all – 37% improvement 
• In between product lifecycles? 
– The added lift for 10 days work, worth 360k 
#9 : Make your own cycles 
#10 : How do I know when it’s ready? 
• The hallmarks of a cooked test are: 
– It’s done at least 1 or preferably 2+ business cycles and at least one if 
not two purchase cycles 
– You have at least 250-350 outcomes for each recipe 
– It’s not moving around hugely at creative or segment level 
performance 
– The test results are clear – even if the precise values are not 
– The intervals are not overlapping (much) 
– If a test is still moving around, you need to investigate 
– FIND OUT WHAT MARKETING ARE DOING 
– FIND OUT WHAT EVERYONE IS DOING 
– Be careful about limited time period campaigns (e.g. TV, print, 
online) 
– If you know when TV (or other big campaigns) are running, try one week with TV and one without during tests – very interesting.
#11: Your test fails 
• Learn from the failure! If you can’t learn from the failure, you’ve 
designed a crap test. 
• Next time you design, imagine all your stuff failing. What would 
you do? If you don’t know or you’re not sure, get it changed so 
that a negative becomes insightful. 
• So : failure itself at a creative or variable level should tell you 
something. 
• On a failed test, always analyse the segmentation and analytics 
• One or more segments will be over and under 
• Check for varied performance 
• Now add the failure info to your Knowledge Base: 
• Look at it carefully – what does the failure tell you? Which 
element do you think drove the failure? 
• If you know what failed (e.g. making the price bigger) then you 
have very useful information 
• You turned the handle the wrong way 
• Now brainstorm a new test 
#12 : The test is ‘about the same’ 
• Analyse the segmentation 
• Check the analytics and instrumentation 
• One or more segments may be over and under 
• They may be cancelling out – the average is a lie 
• The segment level performance will help you (beware of 
small sample sizes) 
• If you genuinely have a test which failed to move any 
segments, it’s a crap test – be bolder 
• This usually happens when it isn’t bold or brave enough in 
shifting away from the original design, particularly on 
lower traffic sites 
• Get testing again! 
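Here is a worked example of segments cancelling out, with made-up numbers. Desktop responds well to B, mobile responds badly, and the blended result looks like nothing happened.

```python
# Hypothetical results: (visitors_a, conversions_a, visitors_b, conversions_b)
segments = {
    "desktop": (5_000, 500, 5_000, 600),
    "mobile": (5_000, 400, 5_000, 300),
}


def relative_lift(visitors_a, conv_a, visitors_b, conv_b):
    """Relative change in conversion rate of B compared with A."""
    rate_a = conv_a / visitors_a
    rate_b = conv_b / visitors_b
    return rate_b / rate_a - 1


def overall_lift(segments):
    """Lift for all segments added together (the 'average')."""
    totals = [sum(column) for column in zip(*segments.values())]
    return relative_lift(*totals)


for name, counts in segments.items():
    print(f"{name:<8} {relative_lift(*counts):+.0%}")
print(f"{'overall':<8} {overall_lift(segments):+.0%}")
```

Desktop comes out at roughly +20% and mobile at -25%, while the overall figure is flat. The same warning from the slide applies: at segment level the samples are smaller, so check the error bars before acting on a split.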
#13 : The test keeps moving around
• There are three reasons it is moving around 
– Your sample size (outcomes) is still too small 
– The external traffic mix, customers or reaction has 
suddenly changed or 
– Your inbound marketing driven traffic mix is 
completely volatile (very rare) 
• Check the sample size 
• Check all your marketing activity 
• Check the instrumentation 
• If no reason, check segmentation 
#14 : The test has flipped on me 
• Something like this can happen: 
• Check your sample size. If it’s still small, then expect this until the test 
settles. 
• If the test does genuinely flip – and quite severely – then something has 
changed with the traffic mix, the customer base or your advertising. 
Maybe the PPC budget ran out? Seriously! 
• To analyse a flipped test, you’ll need to check your segmented data. This 
is why you have a split testing package AND an analytics system. 
• The segmented data will help you to identify the source of the shift in 
response to your test. I rarely get a flipped one and it’s always something
#15 : Should I run an A/A test first?
• No – and this is why:
– It’s a waste of time
– It’s easier to test and monitor instead
– You are eating into test time
– Also applies to A/A/B/B testing
– A/B/A running at 25%/50%/25% is the best
• Read my post here : http://bit.ly/WcI9EZ
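For the A/B/A split at 25%/50%/25%, here is one way the allocation could be done. The slides do not say how to assign visitors; hashing the visitor id is a common approach and is an assumption here, as are the arm names and the test name. Most testing packages handle the split for you.

```python
import hashlib
from collections import Counter

# Two identical control arms (A1, A2) around the variation B.
ARMS = (("A1", 0.25), ("B", 0.50), ("A2", 0.25))


def assign(visitor_id: str, test_name: str = "example-test") -> str:
    """Deterministically map a visitor to an arm, so repeat visits match."""
    digest = hashlib.sha256(f"{test_name}:{visitor_id}".encode()).hexdigest()
    point = int(digest[:8], 16) / 0x1_0000_0000  # uniform in [0, 1)
    cumulative = 0.0
    for arm, share in ARMS:
        cumulative += share
        if point < cumulative:
            return arm
    return ARMS[-1][0]


counts = Counter(assign(f"visitor-{i}") for i in range(10_000))
for arm, _share in ARMS:
    print(arm, counts[arm])
```

With two A arms running alongside B, the identical creatives can be compared with each other during the real test, instead of spending test time on a separate A/A run first.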
#16 : Nobody feels the test
• You promised a 25% rise in checkouts - you only see 2% 
• Traffic, Advertising, Marketing may have changed 
• Check they’re using the same precise metrics 
• Run a calibration exercise 
• I often leave a 5 or 10% stub running in a test 
• This tracks old creative once new one goes live 
• If conversion is also down for that one, BINGO! 
• Remember – the AB test is an estimate – it doesn’t 
precisely record future performance 
• This is why infrequent testing is bad 
• Always be trying a new test instead of basking in the 
glory of one you ran 6 months ago. You’re only as good 
as your next test. 
#17 : You forgot about Mobile & Tablet
• If you’re AB testing a responsive site, pay attention 
• Content will break differently on many screens 
• Know thy users and their devices 
• Use Bango or Google Analytics to define a test list 
• Make sure you test mobile devices & viewports 
• What looks good on your desk may not be for the user 
• Harder to design cross device tests 
• You’ll need to segment mobile, tablet & desktop response 
in the analytics or AB testing package 
• Your personal phone is not a device mix 
• Ask me about making your device list 
• Buy core devices, rent the rest from deviceanywhere.com 
#18 : Oh shit – no traffic 
• If small volumes, contact customers – reach out. 
• If data volumes aren’t there, there are still customers! 
• Drive design from levers you can apply – game the system 
• Pick clean and simple clusters of change (hypothesis driven) 
• Use a goal at an earlier ring stage or funnel step 
• Beware of using clickthroughs when attrition is high on the 
other side 
• Try before and after testing on identical time periods 
(measure in analytics model) 
• Be careful about small sample sizes (<100 outcomes) 
• Are you working automated emails? 
• Fix JFDI, performance and UX issues too!
#18 : Oh shit – no traffic (continued) 
• Forget MVT or A/B/N tests – run your numbers 
• Test things with high impact – don’t be a wuss! 
• Use UX, Session Replay to aid insight 
• Run a task gap survey (4Q style) 
• Run a dropped basket survey (LF style) 
• Run a general survey + check social + other sites 
• Run sitewide tests that appear on all pages or large clusters of pages:
– UVPs (“We are a cool brand”), USPs (“Free returns!”), UCPs (“10% off today”).
– Headers, Footers, Nudge Bars, USP bars, footer changes, Navigation, Product pages, Delivery info etc.
#19 : I chose the wrong test type
• A/B testing – good for: 
– A single change of content or design layout 
– A group of related changes (e.g. payment security) 
– Finding a new and radical shift for a template design 
– Lower traffic pages or shorter test times 
• Multivariate testing – good for: 
– Higher traffic pages 
– Groups of unrelated changes (e.g. delivery & security) 
– Multiple content or design style changes 
– Finding specific drivers of test lifts 
– Testing multiple versions (e.g. click here, book now, go) 
– Where you need to understand strong and weak cross variable 
interactions 
– Don’t use to settle arguments or sloppy thinking!
Netherlands A/B Shift Example 
Previous winner 
+7.25% 
+8.19% additional lift
#20 – Other flavours of testing 
• Micro testing (tiny change) – good for: 
– Proving to the boss that testing works 
– Demonstrating to IT that it works without impact 
– Showing the impact of a seemingly tiny change 
– Proof of concept before larger test 
• Funnel testing – good for: 
– Checkouts 
– Lead gen 
– Forms processes 
– Quotations 
– Any multi-step process with data entry 
• Fake it and Build it – good for: 
– Testing new business ideas 
– Trying out promotions on a test sample 
– Estimating impact before you build 
– Helps you calculate ROI 
– You can even split test entire server farms 
Vs.
#20 – Other flavours of testing
“Congratulations! Today you’re the lucky winner of our random awards programme. You get all these extra features for free, on us. Enjoy.”
Top F***ups for 2014 
1. Testing in the wrong place 
2. Your hypothesis inputs are crap 
3. No analytics integration 
4. Your test will finish after you die 
5. You don’t test for long enough 
6. You peek before it’s ready 
7. No QA for your split test 
8. Opportunities are not prioritised 
9. Testing cycles are too slow 
10. You don’t know when tests are ready 
11. Your test fails 
12. The test is ‘about the same’ 
13. Test keeps moving around 
14. Test flips behaviour 
15. You run an A/A test and waste time 
16. Nobody ‘feels’ the test 
17. You forgot you were responsive 
18. You forgot you had no traffic 
19. You ran the wrong test type 
20. You didn’t try all the flavours of testing 
WE’RE ALL WINGING IT
2004 Headspace 
What I thought 
I knew in 2004 
Reality
2014 Headspace 
What I 
know I 
know 
On a 
good day
Guessaholics Anonymous
Rumsfeldian 
Space 
#1 Smart Talented Polymath People 
The 5 Legged Rumsfeldian Barstool 
Flexible and Agile teams
Fittest? Agile! 
#2 : Analytics Investment (tools, people, dev time)
#3 : User research and insight
#3 : THE BEST IDEAS COME FROM? 
#4 : GREAT COPYWRITING 
“On the average, five times as many people read the headline as read the body copy. When you have written your headline, you have spent eighty cents out of your dollar.”
David Ogilvy
“In 9 years and 40M split tests with visitors, the majority of my testing success came from playing with the words.”
@OptimiseOrDie
The 5 Legged Optimisation Barstool
#1 Culture & Team
#2 UX, CX, Service Design, Insight
#3 Toolkit & Analytics investment
#4 Persuasive Copywriting
#5 Experimentation tools & process
READ STUFF
#5 : FIND STUFF 
@danbarker Analytics 
@fastbloke Analytics 
@timlb Analytics 
@jamesgurd Analytics 
@therustybear Analytics 
@carmenmardiros Analytics 
@davechaffey Analytics 
@priteshpatel9 Analytics 
@cutroni Analytics 
@avinash Analytics 
@Aschottmuller Analytics, CRO 
@cartmetrix Analytics, CRO
@Kissmetrics CRO / UX 
@Unbounce CRO / UX 
@Morys CRO / Neuro 
@UXFeeds UX / Neuro 
@Psyblog Neuro 
@Gfiorelli1 SEO / Analytics 
@PeepLaja CRO 
@TheGrok CRO 
@UIE UX 
@LukeW UX / Forms 
@cjforms UX / Forms 
@axbom UX 
@iatv UX 
@Chudders Photo UX 
@JeffreyGroks Innovation 
@StephanieRieger Innovation 
@BrianSolis Innovation 
@DrEscotet Neuro 
@TheBrainLady Neuro 
@RogerDooley Neuro 
@Cugelman Neuro 
@Smashingmag Dev / UX 
@uxmag UX 
@Webtrends UX / CRO
#5 : LEARN STUFF 
Baymard.com 
Lukew.com 
Smashingmagazine.com 
ConversionXL.com 
Medium.com 
Whichtestwon.com 
Unbounce.com 
Measuringusability.com 
RogerDooley.com 
Kissmetrics.com 
Uxmatters.com 
Smartinsights.com 
Econsultancy.com 
Cutroni.com 
www.GetMentalNotes.com
The Best Companies… 
• Invest continually in analytics instrumentation, tools, people 
• Use an Agile, iterative, cross-silo, one team project culture 
• Prefer collaborative tools to having lots of meetings 
• Prioritise development based on numbers and insight 
• Practice real continuous product improvement, not SLEDD* 
• Are fixing bugs, cruft, bad stuff as well as optimising 
• Source photos and content that support persuasion and utility 
• Have cross channel, cross device design, testing and QA 
• Segment their data for valuable insights, every test or change 
• Continually reduce cycle (iteration) time in their process 
• Blend ‘long’ design, continuous improvement AND split tests 
• Make optimisation the engine of change, not the slave of ego 
* Single Large Expensive Doomed Developments
THE FUTURE OF TESTING – 
CONDUCTRICS.COM 
slidesha.re/1ivS68s
Thank You!
@OptimiseOrDie
Email : sullivac@gmail.com
Slides : slideshare.com/sullivac
linkd.in/pvrg14

More Related Content

What's hot

128 High Converting Growth Hacks - the most epic growth hacking list
128 High Converting Growth Hacks - the most epic growth hacking list128 High Converting Growth Hacks - the most epic growth hacking list
128 High Converting Growth Hacks - the most epic growth hacking listHelvijs Smoteks
 
Lean Analytics: Using Data to Build a Better Business Faster
Lean Analytics: Using Data to Build a Better Business FasterLean Analytics: Using Data to Build a Better Business Faster
Lean Analytics: Using Data to Build a Better Business FasterLean Startup Co.
 
Data Analytics for better Product Decision Making by Mixpanel PM
Data Analytics for better Product Decision Making by Mixpanel PMData Analytics for better Product Decision Making by Mixpanel PM
Data Analytics for better Product Decision Making by Mixpanel PMProduct School
 
How We Built 1,000+ Links Per Month With This 6-Month Sprint.
How We Built 1,000+ Links Per Month With This 6-Month Sprint.How We Built 1,000+ Links Per Month With This 6-Month Sprint.
How We Built 1,000+ Links Per Month With This 6-Month Sprint.Search Engine Journal
 
Introduction to Lean Analytics for Lean Startup Circle SF
Introduction to Lean Analytics for Lean Startup Circle SFIntroduction to Lean Analytics for Lean Startup Circle SF
Introduction to Lean Analytics for Lean Startup Circle SFLean Analytics
 
Seo basicsfrom-digital marketing paathshala
Seo basicsfrom-digital marketing paathshalaSeo basicsfrom-digital marketing paathshala
Seo basicsfrom-digital marketing paathshalaSimplilearn
 
Dropbox Startup Lessons Learned
Dropbox Startup Lessons LearnedDropbox Startup Lessons Learned
Dropbox Startup Lessons Learnedgueste94e4c
 
Talks@Coursera - A/B Testing @ Internet Scale
Talks@Coursera - A/B Testing @ Internet ScaleTalks@Coursera - A/B Testing @ Internet Scale
Talks@Coursera - A/B Testing @ Internet Scalecourseratalks
 
Growth marketing
Growth marketingGrowth marketing
Growth marketingOnur Polat
 
Product Led Growth: The Rise of the User
Product Led Growth: The Rise of the UserProduct Led Growth: The Rise of the User
Product Led Growth: The Rise of the UserOpenView
 
Booking.com - Data science and experimentation at Booking.com: a data-driven ...
Booking.com - Data science and experimentation at Booking.com: a data-driven ...Booking.com - Data science and experimentation at Booking.com: a data-driven ...
Booking.com - Data science and experimentation at Booking.com: a data-driven ...BigDataExpo
 
eMetrics London - The AB Testing Hype Cycle
eMetrics London - The AB Testing Hype CycleeMetrics London - The AB Testing Hype Cycle
eMetrics London - The AB Testing Hype CycleCraig Sullivan
 
Customer Centricity and Product Led Growth by Airbnb Product & Growth
Customer Centricity and Product Led Growth by Airbnb Product & Growth Customer Centricity and Product Led Growth by Airbnb Product & Growth
Customer Centricity and Product Led Growth by Airbnb Product & Growth Product School
 
Go-to-Market Strategy Reboot Camp (Overview)
Go-to-Market Strategy Reboot Camp (Overview)Go-to-Market Strategy Reboot Camp (Overview)
Go-to-Market Strategy Reboot Camp (Overview)SP Home Run Inc.
 
SAMPLE SIZE – The indispensable A/B test calculation that you’re not making
SAMPLE SIZE – The indispensable A/B test calculation that you’re not makingSAMPLE SIZE – The indispensable A/B test calculation that you’re not making
SAMPLE SIZE – The indispensable A/B test calculation that you’re not makingZack Notes
 
Establishing an Audit Framework
Establishing an Audit FrameworkEstablishing an Audit Framework
Establishing an Audit FrameworkAnnie Cushing
 
SMX Advanced: Thriving in the New World of Pagination
SMX Advanced: Thriving in the New World of PaginationSMX Advanced: Thriving in the New World of Pagination
SMX Advanced: Thriving in the New World of PaginationLily Ray
 
The Only Metric That Matters by a Partner at Greylock Partners
The Only Metric That Matters by a Partner at Greylock PartnersThe Only Metric That Matters by a Partner at Greylock Partners
The Only Metric That Matters by a Partner at Greylock PartnersProduct School
 

20 top AB testing mistakes and how to avoid them

  • 1. Oh Boy! These A/B tests appear to be bullshit!
  • 2. @OptimiseOrDie • UX and Analytics (1999) • User Centred Design (2001) • Agile, Startups, No budget (2003) • Funnel optimisation (2004) • Multivariate & A/B (2005) • Conversion Optimisation (2005) • Persuasive Copywriting (2006) • Joined Twitter (2007) • Lean UX (2008) • Holistic Optimisation (2009) Was : Consulting all over the place Now : Optimiser of Everything, Spareroom.co.uk
  • 4.
  • 5. AB Test Hype Cycle Zen Plumbing @OptimiseOrDie Timeline Tested stupid ideas, lots Most AB or MVT tests are bullshit Discovered AB testing Triage, Triangulation, Prioritisation, Maths
  • 6. Craig’s Cynical Quadrant Improves revenue Yes Client delighted No Yes Improves UX No (and fires you for another agency) Client fucking delighted Client absolutely fucking furious Client fires you (then wins an award for your work)
  • 7. #1 : You’re doing it in the wrong place @OptimiseOrDie
  • 8. #1 : You’re doing it in the wrong place There are 4 areas a CRO expert always looks at: 1. Inbound attrition (medium, source, landing page, keyword, intent and many more…) 2. Key conversion points (product, basket, registration) 3. Processes, lifecycles and steps (forms, logins, registration, checkout, onboarding, emails, push) 4. Layers of engagement (search, category, product, add) 1. Use visitor flow reports for attrition – very useful. 2. For key conversion points, look at loss rates & interactions 3. Processes and steps – look at funnels or make your own 4. Layers and engagement – make a ring model @OptimiseOrDie
  • 9. Examples – Concept Bounce Engage Outcome @OptimiseOrDie
  • 10. Examples – 16-25Railcard.co.uk Bounce Login to Account Content Engage Start Application Type and Details Eligibility Photo Complete @OptimiseOrDie
  • 11. Examples – Guide Dogs Bounce Content Engage Donation Pathway Donation Page Starts process Funnel steps Complete @OptimiseOrDie
  • 12. Within a layer Page 1 Page 2 Page 3 Page 4 Page 5 Exit Deeper Layer Email Wishlist Contact Like Micro Conversions @OptimiseOrDie
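The advice on slide 8 to look at loss rates and to build your own funnel can be sketched in a few lines of Python. This is a minimal illustration, not the author's tooling: the step names come from the 16-25 Railcard example, while the visitor counts and the `loss_rates` helper are hypothetical.

```python
def loss_rates(funnel):
    """Return (from_step, to_step, visitors_lost, loss_rate) for each adjacent pair.

    `funnel` is an ordered list of (step_name, visitors) tuples.
    """
    rows = []
    for (name, visitors), (next_name, next_visitors) in zip(funnel, funnel[1:]):
        lost = visitors - next_visitors
        rate = lost / visitors if visitors else 0.0
        rows.append((name, next_name, lost, rate))
    return rows


# Hypothetical counts, for illustration only.
funnel = [
    ("Start Application", 1000),
    ("Type and Details", 800),
    ("Eligibility", 600),
    ("Photo", 300),
    ("Complete", 270),
]

rows = loss_rates(funnel)
worst = max(rows, key=lambda row: row[3])

for from_step, to_step, lost, rate in rows:
    print(f"{from_step} -> {to_step}: lost {lost} ({rate:.0%})")
print(f"Biggest leak: {worst[0]} -> {worst[1]}")
```

The step with the highest loss rate is a candidate for attention, but it still needs weighing against how much traffic reaches it.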
  • 13. #1 : Make a Money Model • Get to know the flow and loss (leaks) inbound, inside and through key processes or conversion points. • Once you know the key steps you’re losing people at and how much traffic you have – make a money model. • 20,000 see the basket page – what’s the basket page to checkout page ratio? • Estimate how much you think you can shift the key metric (e.g. basket adds, basket -> checkout) • What downstream revenue or profit would that generate? • Sort by the money column • Congratulations – you’ve now built the world’s first IT plan for growth with a return on investment estimate attached! • I’ll talk more about prioritising later – but a good real-world analogy for you to use: @OptimiseOrDie
  • 14. Think like a store owner! If you can’t refurbish the entire store, which floors or departments will you invest in optimising? Wherever there is: • Footfall • Low return @OptimiseOrDie
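The money model from slide 13 can be sketched as a small calculation: traffic at a step, times the current step rate, times the lift you think you can achieve, times what that extra progression is worth downstream. The sketch below is one possible way to lay it out. Only the 20,000 basket-page visitors come from the slide; every other number, the `Step` fields and the function names are hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    visitors: int           # traffic reaching this step in the period
    step_rate: float        # current share that continues to the next step
    expected_lift: float    # estimated relative improvement (0.05 = +5%)
    downstream_rate: float  # share of those continuing who end up buying
    order_value: float      # average revenue per completed order


def extra_revenue(step):
    """Estimated additional revenue if the expected lift is achieved."""
    extra_progressions = step.visitors * step.step_rate * step.expected_lift
    return extra_progressions * step.downstream_rate * step.order_value


def money_model(steps):
    """Sort candidate steps by the money column, largest first."""
    return sorted(steps, key=extra_revenue, reverse=True)


# Illustrative figures only; replace with your own analytics data.
steps = [
    Step("Basket -> Checkout", 20_000, 0.40, 0.05, 0.50, 60.0),
    Step("Product -> Basket", 50_000, 0.10, 0.10, 0.20, 60.0),
    Step("Registration", 5_000, 0.60, 0.05, 0.30, 60.0),
]

ranked = money_model(steps)
for step in ranked:
    print(f"{step.name}: {extra_revenue(step):,.0f}")
```

This mirrors the store-owner analogy: a step ranks highly when it combines footfall with a low return that you believe you can shift.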
  • 15. #2 : Your hypothesis is crap! Insight - Inputs #FAIL Competitor copying Guessing Dice rolling Panic Competitor change An article the CEO read Ego Opinion Cherished notions Marketing whims Cosmic rays Not ‘on brand’ enough IT inflexibility Internal company needs Some dumbass consultant Shiny feature blindness Knee jerk reactions @OptimiseOrDie
  • 16. #2 : These are the inputs you need… Insight - Inputs Insight Eye tracking Segmentation Surveys Sales and Call Centre Customer contact Social analytics Session Replay Usability testing Forms analytics Search analytics Voice of Customer Market research A/B and MVT testing Big & unstructured data Web analytics Competitor evals Customer services @OptimiseOrDie
  • 17. Insight - Inputs @OptimiseOrDie #2 : Brainstorming the test • Check your inputs • Assemble the widest possible team • Share your data and research • Design Emotive Writing guidelines
  • 18. Insight - Inputs @OptimiseOrDie #2 : Emotive Writing - example Customers do not know what to do and need support and advice • Emphasize the fact that you understand that their situation is stressful • Emphasize your expertise and leadership in vehicle glazing and that you will help them get the best solution for their situation • Explain what they will need to do online and during the call-back so that they know what the next steps will be • Explain that they will be able to ask any other questions they might have during the call-back Customers do not feel confident in assessing the damage • Emphasize the fact that you will help them assess the damage correctly online Customers need to understand the benefits of booking online • Emphasize that the online booking system is quick, easy and provides all the information they need regarding their appointment and general cost information Customers mistrust insurers and find dealing with their insurance situation very frustrating • Where possible communicate the fact that the job is most likely to be free for insured customers, or good value for money for cash customers • Show that you understand the hassle of dealing with insurance companies – emphasize that you will handle their insurance paperwork for them, freeing them of this burden Some customers cannot be bothered to take action to fix their car glass • Emphasize the consequences of not doing anything, e.g. ‘It’s going to cost you more if the chip develops into a crack’
  • 19. #2 : THE DARK SIDE “Keep your family safe and get back on the road fast with Autoglass.”
  • 20. #2 : NOW YOU CAN BEGIN • You should have inputs, research, data, guidelines • Sit down with the team and prompt with 12 questions: – Who is this page (or process) for? – What problem does this solve for the user? – How do we know they need it? – What is the primary action we want people to take? – What might prompt the user to take this action? – How will we know if this is doing what we want it to do? – How do people get to this page? – How long are people here on this page? – What can we remove from this page? – How can we test this solution with people? – How are we solving the users’ needs in different and better ways than other places on our site? – If this is a homepage, ask these too (bit.ly/1fX2RAa)
  • 21. #2 : PROMPT YOURSELF • Check your UX or Copywriting guidelines. • Use Get Mental Notes (www.GetMentalNotes.com) • What levers can we apply now? • Create a hypothesis: “WE BELIEVE THAT DOING [A] FOR PEOPLE [B] WILL MAKE OUTCOME [C] HAPPEN. WE'LL KNOW THIS WHEN WE SEE DATA [D] AND FEEDBACK [E]”
  • 22. #2 : THE FUN BIT! • Collaborative Sketching • Brainwriting • Refine and Test!
  • 23. We believe that doing [A] for People [B] will make outcome [C] happen. We’ll know this when we observe data [D] and obtain feedback [E]. (reverse) @OptimiseOrDie
  • 24. #2 : Solutions • You need multiple tool inputs – Tool decks are here : www.slideshare.net/sullivac • Collaborative, Customer connected team – If you’re not doing this, you’re hosed • Session replay tools provide vital input – Get vital additional customer evidence • Simple page Analytics don’t cut it – Invest in your analytics, especially event tracking • Ego, Opinion, Cherished notions – fill gaps – Fill these vacuums with insights and data • Champion the user – Give them a chair at every meeting @OptimiseOrDie
  • 25. #2 : HYPOTHESIS DESIGN SUMMARY • Inputs – get the right stuff • Research, Guidelines, Data • Framing the problem(s) • Questions to get you going • Use card prompts for Psychology • Create a hypothesis • Collaborative Sketching • Brainwriting • Refine and Check Hypothesis • Instrument and Test
  • 26. We believe that doing [A] for People [B] will make outcome [C] happen. We’ll know this when we observe data [D] and obtain feedback [E]. (reverse) @OptimiseOrDie
  • 27. #3 : No analytics integration • Investigating problems with tests • Segmentation of results • Tests that fail, flip or move around • Tests that don’t make sense • Broken test setups • What drives the averages you see? @OptimiseOrDie
  • 28. A B B A
  • 29. These Danish porn sites are so hardcore! We’re still waiting for our AB tests to finish! #4 : The test will finish after you die • Use a test length calculator like this one: • visualwebsiteoptimizer.com/ab-split-test-duration/
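To get a feel for what a test length calculator is doing, here is a rough sketch. It assumes the textbook two-proportion sample size formula (two-sided, 95% confidence, 80% power), which is not necessarily what the VWO calculator uses. The function names and parameters are made up for illustration.

```python
import math
from statistics import NormalDist


def visitors_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed in EACH variant to detect the given lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_finish(baseline_rate, relative_lift, daily_visitors, variants=2):
    """Days of traffic needed, assuming visitors are split evenly across variants."""
    total = visitors_per_variant(baseline_rate, relative_lift) * variants
    return math.ceil(total / daily_visitors)


# Small expected lifts on low traffic are what make a test "finish after you die".
for lift in (0.05, 0.10, 0.20):
    print(f"lift {lift:.0%}: {days_to_finish(0.05, lift, daily_visitors=1000)} days")
```

Treat the number as a bare minimum to plan around, not a stopping rule.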
  • 30. #5 : You don’t test for long enough • The minimum length – 2 business cycles (so you can cross check) – Usually a week, 2 weeks, Month – Always test ‘whole’ not partial cycles – Be aware of multiple cycles – Don’t self stop! – PURCHASE CYCLES – KNOW THEM
  • 31. Business & Purchase Cycles @OptimiseOrDie Start Test Finish Avg Cycle • Customers change • Your traffic mix changes • Markets, competitors • Be aware of all the waves • Always test whole cycles • Minimum 2 cycles (wk/mo) • Don’t exclude slower buyers
  • 32. #5 : You don’t test for long enough • How long after that – I aim for a minimum of 250 outcomes, ideally 350+ for each ‘creative’ – If you test 4 recipes, that’s 1400 outcomes needed – You should have worked out how long each batch of 350 needs before you start! – 95% confidence is the cherry – not the cake - BUT BIG SECRET -> (p values are unreliable) – If you segment, you’ll need more data – It may need a bigger sample if the response rates are similar* – Use a test length calculator but be aware it gives the BARE MINIMUM TO EXPECT – Important insider tip – watch the error bars! The +/- stuff * Stats geeks know I’m glossing over something here: test time depends on how the two experiments separate in terms of relative performance as well as how volatile the test response is. I’ll talk about this when I record this one! This is why testing similar stuff sux.
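The "work out your batch of 350 before you start" arithmetic can be sketched like this. The 350 default and the 4-recipe example come from the slide. The function names and the daily outcome figure are assumptions for illustration, and the even traffic split is an assumption too.

```python
import math


def outcomes_needed(recipes, outcomes_per_recipe=350):
    """Total outcomes (conversions) needed across all recipes."""
    return recipes * outcomes_per_recipe


def days_for_outcomes(recipes, daily_outcomes_total, outcomes_per_recipe=350):
    """Days until every recipe has its outcomes, assuming an even traffic split."""
    return math.ceil(outcomes_needed(recipes, outcomes_per_recipe) / daily_outcomes_total)


print(outcomes_needed(4))                             # 4 recipes x 350 outcomes
print(days_for_outcomes(4, daily_outcomes_total=60))  # hypothetical: 60 outcomes a day
```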
  • 33. #5 : You put faith in the Confidence value 95%, 99%, 99.99% ‘Confidence’ or ‘Chance to beat baseline’ – what’s that? • It’s a stats thing • Seriously, look at this one LAST in your testing • Purchase Cycle, Business Cycles, Sample Size, Error bar separation – ALL come before this one. Got it? • Why? It’s to do with p-values. Read this article: • http://bit.ly/1gq9dtd • If you rely on confidence, you are relying upon something that’s unreliable and moves around, particularly early in testing. • Don’t be fooled by your testing package – watch the error bars instead of confidence.
  • 34. #5 : The tennis court – Let’s say we want to estimate, on average, what height Roger Federer and Nadal hit the ball over the net at. So, let’s start the match: @OptimiseOrDie
  • 35. First Set Federer 6-4 – We start to collect values 62cm +/- 2cm 63.5cm +/- 2cm @OptimiseOrDie
  • 36. Second Set – Nadal 7-6 – Nadal starts sending them low over the net 62cm +/- 1cm 62.5cm +/- 1cm @OptimiseOrDie
  • 37. Final Set Nadal 7-6 – We start to collect values 61.8cm +/- .3cm 62cm +/- .3cm
  • 38. Let’s look at this a different way 62.5cm +/- 1cm @OptimiseOrDie 9.1% ± 0.3 9.3% ± 0.3
  • 39. 62.5cm +/- 1cm @OptimiseOrDie 9.1% ± 0.5 9.3% ± 0.5 9.1% ± 0.2 9.3% ± 0.2 9.1% ± 0.1 9.3% ± 0.1
  • 40. Graph is a range, not a line: 9.1 ± 1.9% 9.1 ± 0.9% 9.1 ± 0.3%
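The "range, not a line" idea can be sketched as below, assuming a plain normal-approximation interval for a conversion rate. Your testing package may calculate its error bars differently. The visitor counts are invented; the 9.1% and 9.3% rates are the ones from the slides.

```python
import math
from statistics import NormalDist


def error_bar(conversions, visitors, confidence=0.95):
    """Return (rate, margin) so the range is rate +/- margin."""
    rate = conversions / visitors
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(rate * (1 - rate) / visitors)
    return rate, margin


def bars_overlap(a, b):
    """True if the two (rate, margin) ranges overlap."""
    (rate_a, margin_a), (rate_b, margin_b) = a, b
    return abs(rate_a - rate_b) <= margin_a + margin_b


# Same two rates, more and more visitors: the bars shrink, the rates don't move.
for visitors in (1_000, 10_000, 100_000, 1_000_000):
    a = error_bar(round(0.091 * visitors), visitors)
    b = error_bar(round(0.093 * visitors), visitors)
    print(f"{visitors:>9} visitors: "
          f"A {a[0]:.1%} ± {a[1]:.2%}, B {b[0]:.1%} ± {b[1]:.2%}, "
          f"overlap: {bars_overlap(a, b)}")
```

Two creatives this close together need a lot of data before the bars separate, which is why testing similar stuff takes so long.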
  • 41. #5 : How long to test? • The minimum length: – 2 business cycles and > purchase cycle as a minimum, regardless of outcomes. Test for less and you’re biasing the sample. – ALWAYS ALWAYS TEST WHOLE CYCLES. – 250 ABSOLUTE MINIMUM FOR ANY SAMPLE, 350+ nicer, 1000 sweet! – Error bar separation (or minimal overlap) between creatives – Ignore 95%+ confidence (it’s unreliable) – Use a test calculator (VWO have a nice one). – Work out your ‘test units’ – how long to get 350 outcomes for each creative in your test. – This is a minimum you should expect but sample size (or overlap) may mean you need longer – When to stop? @OptimiseOrDie
  • 42. #5 : When to stop • Self stopping is a huge problem: – “I stopped the test when it looked good” – “It hit 20% on Thursday, so I figured – time to cut and run” – “We need test time for something else. Looks good to us” – “We’ve got a big sample now so why not finish it today?” • False Positives and Negatives – If you cut part of a business cycle, you bias the segments you have in the test. – So if you ignore weekend shoppers by stopping your test on Friday, that will affect results – The other problem is FALSE POSITIVES and FALSE NEGATIVES @OptimiseOrDie
  • 43. #5 : When to stop @OptimiseOrDie
    Without early stopping:
                              Scenario 1     Scenario 2     Scenario 3     Scenario 4
    After 200 observations    Insignificant  Insignificant  Significant!   Significant!
    After 500 observations    Insignificant  Significant!   Insignificant  Significant!
    End of experiment         Insignificant  Significant!   Insignificant  Significant!
    With early stopping:
                              Scenario 1     Scenario 2     Scenario 3     Scenario 4
    After 200 observations    Insignificant  Insignificant  Significant!   Significant!
    After 500 observations    Insignificant  Significant!   trial stopped  trial stopped
    End of experiment         Insignificant  Significant!   Significant!   Significant!
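To see why self stopping creates false positives, here is a small simulation sketch. It runs A/A tests (both arms have the same true rate, so every "winner" is false) and compares declaring a winner at the first significant peek against checking only once at the end. The z-test, the peek interval and every number are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_a / n_a - conv_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rates(runs=500, visitors=2000, peek_every=100, rate=0.1, seed=1):
    """Return (false positive rate when peeking, false positive rate at end only)."""
    rng = random.Random(seed)
    peeking_hits = 0
    end_only_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        looked_significant = False
        for n in range(1, visitors + 1):
            conv_a += rng.random() < rate  # True counts as 1
            conv_b += rng.random() < rate
            if n % peek_every == 0 and p_value(conv_a, n, conv_b, n) < 0.05:
                looked_significant = True  # a peeker would have stopped here
        peeking_hits += looked_significant
        end_only_hits += p_value(conv_a, visitors, conv_b, visitors) < 0.05
    return peeking_hits / runs, end_only_hits / runs
```

Calling `false_positive_rates()` returns the two rates side by side. The peeking rate can never be lower than the end-only rate here, because the final check is also one of the peeks.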
  • 44. #5 : When to stop • So – what to do? • Run a test calculator • Set the test time to hit the highest of the minimums • What minimums do you mean? – Minimum sample (250, 350, higher) – Business cycles (2+) – Purchase cycles (1 or 2+) – What your test calculator says • The longest one is how long it’s gonna take. • Set the test time • Run the test • Stop the test at the end, on a whole cycle • Analyse • That’s it! @OptimiseOrDie
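"The longest one is how long it's gonna take" can be sketched as a tiny planner. The function and parameter names are made up, and rounding up to a whole business cycle is my reading of "stop the test at the end, on a whole cycle".

```python
import math


def planned_test_days(days_for_min_sample, business_cycle_days, purchase_cycle_days,
                      calculator_days, business_cycles=2, purchase_cycles=1):
    """Pick the highest of the minimums, then round up to whole business cycles."""
    minimums = [
        days_for_min_sample,                    # time to reach 250 / 350+ outcomes
        business_cycles * business_cycle_days,  # 2+ business cycles
        purchase_cycles * purchase_cycle_days,  # 1 or 2+ purchase cycles
        calculator_days,                        # what your test calculator says
    ]
    longest = max(minimums)
    return math.ceil(longest / business_cycle_days) * business_cycle_days


# Hypothetical site: weekly business cycle, 12 day purchase cycle.
print(planned_test_days(days_for_min_sample=10, business_cycle_days=7,
                        purchase_cycle_days=12, calculator_days=17))
```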
  • 45. #6 : The early stages of a test… • Ignore the graphs. Don’t draw conclusions. Don’t dance. Calm down. • Get a feel for the test but don’t do anything yet! • Remember – in A/B - 50% of returning visitors will see a new shiny website! • Until your test has had at least 2 business cycles and 250+ outcomes, don’t bother even getting remotely excited! • Watching regularly is good though. You’re looking for anything that looks really odd – if everyone is looking (but not concluding) then oddities will get spotted. • All tests move around or show big swings early in the testing cycle. Here is a very high traffic site – it still takes 10 days to start settling. Lower traffic sites will stretch this period further.
  • 46. #7 : No QA testing for the AB test?
  • 47. #7 – BIG SECRET! • Over 40% of tests have had QA issues. • It’s very easy to break or bias the testing Browser testing www.crossbrowsertesting.com www.browserstack.com www.spoon.net www.cloudtesting.com www.multibrowserviewer.com www.saucelabs.com Mobile devices www.deviceanywhere.com www.perfectomobile.com www.opendevicelab.com @OptimiseOrDie
  • 48. #7 : What other QA testing should I do? • Testing from several locations (office, home, elsewhere) • Testing the IP filtering is set up • Test tags are firing correctly (analytics and the test tool) • Test as a repeat visitor and check session timeouts • Cross check figures from 2+ sources • Monitor closely from launch, recheck, watch • WATCH FOR BIAS! @OptimiseOrDie
  • 49. #8 : Tests are random and not prioritised Once you have a list of potential test areas, rank them by opportunity vs. effort. The common ranking metrics that I use include: •Opportunity (revenue, impact) •Dev resource •Time to market •Risk / Complexity Make yourself a quadrant
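One way to turn the ranking metrics above into a quadrant, as a sketch only. The 1 to 5 scales, the thresholds, the quadrant labels and the example ideas are all invented for illustration, so swap in whatever scoring your team agrees on.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    opportunity: int     # revenue / impact: 1 (low) to 5 (high)
    dev_resource: int    # 1 (cheap) to 5 (expensive)
    time_to_market: int  # 1 (fast) to 5 (slow)
    risk: int            # risk / complexity: 1 (safe) to 5 (scary)

    @property
    def effort(self) -> int:
        return self.dev_resource + self.time_to_market + self.risk

    @property
    def quadrant(self) -> str:
        high_opportunity = self.opportunity >= 3
        low_effort = self.effort <= 9
        if high_opportunity and low_effort:
            return "do first"
        if high_opportunity:
            return "plan it"
        if low_effort:
            return "filler"
        return "avoid"


def rank(ideas):
    """Best opportunity per unit of effort first."""
    return sorted(ideas, key=lambda idea: idea.opportunity / idea.effort, reverse=True)


ideas = [
    Idea("Rebuild checkout", opportunity=5, dev_resource=5, time_to_market=4, risk=4),
    Idea("Basket page USP bar", opportunity=5, dev_resource=1, time_to_market=1, risk=1),
    Idea("Footer link tidy", opportunity=2, dev_resource=1, time_to_market=2, risk=1),
]
for idea in rank(ideas):
    print(f"{idea.name}: {idea.quadrant}")
```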
  • 50. #9 : Your cycles are too slow [Chart: Conversion over Months 0, 6, 12, 18] @OptimiseOrDie
  • 51. #9 : Solutions • Give Priority Boarding for opportunities – The best seats reserved for metric shifters • Release more often to close the gap – More testing resource helps, analytics ‘hawk eye’ • Kaizen – continuous improvement – Others call it JFDI (just f***ing do it) • Make changes AS WELL as tests, basically! – These small things add up • RUSH Hair booking – Over 100 changes – No functional changes at all – 37% improvement • In between product lifecycles? – The added lift for 10 days’ work, worth 360k @OptimiseOrDie
  • 52. #9 : Make your own cycles @OptimiseOrDie
  • 53. #10 : How do I know when it’s ready? • The hallmarks of a cooked test are: – It’s done at least 1 or preferably 2+ business cycles, and at least one if not two purchase cycles – You have at least 250-350 outcomes for each recipe – It’s not moving around hugely at creative or segment level – The test results are clear – even if the precise values are not – The intervals are not overlapping (much) – If a test is still moving around, you need to investigate – FIND OUT WHAT MARKETING ARE DOING – FIND OUT WHAT EVERYONE IS DOING – Be careful about limited time period campaigns (e.g. TV, print, online) – If you know when TV (or other big campaigns) are running, try one week with TV and one without during tests – very interesting.
  • 54. #11 : Your test fails @OptimiseOrDie
  • 55. #11: Your test fails • Learn from the failure! If you can’t learn from the failure, you’ve designed a crap test. • Next time you design, imagine all your stuff failing. What would you do? If you don’t know or you’re not sure, get it changed so that a negative becomes insightful. • So : failure itself at a creative or variable level should tell you something. • On a failed test, always analyse the segmentation and analytics • One or more segments will be over and under • Check for varied performance • Now add the failure info to your Knowledge Base: • Look at it carefully – what does the failure tell you? Which element do you think drove the failure? • If you know what failed (e.g. making the price bigger) then you have very useful information • You turned the handle the wrong way • Now brainstorm a new test @OptimiseOrDie
  • 56. #12 : The test is ‘about the same’ • Analyse the segmentation • Check the analytics and instrumentation • One or more segments may be over and under • They may be cancelling out – the average is a lie • The segment level performance will help you (beware of small sample sizes) • If you genuinely have a test which failed to move any segments, it’s a crap test – be bolder • This usually happens when it isn’t bold or brave enough in shifting away from the original design, particularly on lower traffic sites • Get testing again! @OptimiseOrDie
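"The average is a lie" is easy to show with a sketch. The segment names and numbers below are invented so that the two segments cancel out exactly; real data will be messier and the segment samples smaller.

```python
def rate(conversions, visitors):
    return conversions / visitors


def lift_by_segment(control, variant):
    """Relative lift per segment. Inputs map segment -> (conversions, visitors)."""
    return {
        segment: rate(*variant[segment]) / rate(*control[segment]) - 1
        for segment in control
    }


def overall_lift(control, variant):
    """Relative lift with all segments lumped together."""
    def total_rate(arm):
        conversions = sum(c for c, _ in arm.values())
        visitors = sum(v for _, v in arm.values())
        return rate(conversions, visitors)
    return total_rate(variant) / total_rate(control) - 1


# Invented example: desktop loves the new design, mobile hates it.
control = {"desktop": (500, 5000), "mobile": (500, 5000)}
variant = {"desktop": (600, 5000), "mobile": (400, 5000)}

print(f"overall: {overall_lift(control, variant):+.1%}")
for segment, lift in lift_by_segment(control, variant).items():
    print(f"{segment}: {lift:+.1%}")
```

Before acting on a segment result, check that the segment itself has enough outcomes.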
  • 57. #13 : The test keeps moving around • There are three reasons it is moving around – Your sample size (outcomes) is still too small – The external traffic mix, customers or reaction has suddenly changed or – Your inbound marketing driven traffic mix is completely volatile (very rare) • Check the sample size • Check all your marketing activity • Check the instrumentation • If no reason, check segmentation @OptimiseOrDie
  • 58. #14 : The test has flipped on me • Something like this can happen: • Check your sample size. If it’s still small, then expect this until the test settles. • If the test does genuinely flip – and quite severely – then something has changed with the traffic mix, the customer base or your advertising. Maybe the PPC budget ran out? Seriously! • To analyse a flipped test, you’ll need to check your segmented data. This is why you have a split testing package AND an analytics system. • The segmented data will help you to identify the source of the shift in response to your test. I rarely get a flipped one and it’s always something
  • 59. #15 : Should I run an A/A test first? • No – and this is why: – It’s a waste of time – It’s easier to test and monitor instead – You are eating into test time – Also applies to A/A/B/B testing – A/B/A running at 25%/50%/25% is the best • Read my post here : http://bit.ly/WcI9EZ
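If you do want the A/B/A arrangement, the 25%/50%/25% split could be sketched like this. Hashing the visitor ID is an assumption about how you might bucket people (your testing tool will have its own method), and the arm labels and experiment name are made up.

```python
import hashlib


def assign_arm(visitor_id, experiment="example-test"):
    """Deterministically bucket a visitor into A1 (25%), B (50%) or A2 (25%)."""
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    if bucket < 25:
        return "A1"  # first copy of the original
    if bucket < 75:
        return "B"   # the new creative
    return "A2"      # second copy of the original


# The same visitor always lands in the same arm, so returning visitors stay put.
print(assign_arm("visitor-123"), assign_arm("visitor-123"))
```

The two A arms should track each other closely. If they don't, look at the instrumentation before you believe anything B is telling you.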
  • 60. #16 : Nobody feels the test • You promised a 25% rise in checkouts - you only see 2% • Traffic, Advertising, Marketing may have changed • Check they’re using the same precise metrics • Run a calibration exercise • I often leave a 5 or 10% stub running in a test • This tracks old creative once new one goes live • If conversion is also down for that one, BINGO! • Remember – the AB test is an estimate – it doesn’t precisely record future performance • This is why infrequent testing is bad • Always be trying a new test instead of basking in the glory of one you ran 6 months ago. You’re only as good as your next test. @OptimiseOrDie
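The stub calibration can be sketched as simple arithmetic. This assumes an outside change (traffic, advertising, season) scales the old and new creative by the same proportion, which is a simplification. The function names and the rates in the example are invented.

```python
def drift(old_rate_during_test, old_rate_in_stub):
    """Relative change in the OLD creative between the test and the live stub."""
    return old_rate_in_stub / old_rate_during_test - 1


def expected_new_rate(new_rate_during_test, old_rate_during_test, old_rate_in_stub):
    """What the new creative should convert at now, if drift hit both creatives equally."""
    return new_rate_during_test * (old_rate_in_stub / old_rate_during_test)


# Invented example: the test showed 4% -> 5%, but the stub now converts at 3.3%.
print(f"drift in old creative: {drift(0.04, 0.033):+.1%}")
print(f"expected new creative rate: {expected_new_rate(0.05, 0.04, 0.033):.3%}")
```

If the old creative in the stub is down as well, the drop came from outside the test, not from the winning creative.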
  • 61. #17 : You forgot about Mobile & Tablet • If you’re AB testing a responsive site, pay attention • Content will break differently on many screens • Know thy users and their devices • Use Bango or Google Analytics to define a test list • Make sure you test mobile devices & viewports • What looks good on your desk may not be for the user • Harder to design cross device tests • You’ll need to segment mobile, tablet & desktop response in the analytics or AB testing package • Your personal phone is not a device mix • Ask me about making your device list • Buy core devices, rent the rest from deviceanywhere.com @OptimiseOrDie
  • 62. #18 : Oh shit – no traffic • If small volumes, contact customers – reach out. • If data volumes aren’t there, there are still customers! • Drive design from levers you can apply – game the system • Pick clean and simple clusters of change (hypothesis driven) • Use a goal at an earlier ring stage or funnel step • Beware of using clickthroughs when attrition is high on the other side • Try before and after testing on identical time periods (measure in analytics model) • Be careful about small sample sizes (<100 outcomes) • Are you working automated emails? • Fix JFDI, performance and UX issues too!
  • 63. #18 : Oh shit – no traffic • Forget MVT or A/B/N tests – run your numbers • Test things with high impact – don’t be a wuss! • Use UX, Session Replay to aid insight • Run a task gap survey (4Q style) • Run a dropped basket survey (LF style) • Run a general survey + check social + other sites • Run sitewide tests that appear on all pages or large clusters of pages: • UVPs (“We are a cool brand”), USPs (“Free returns!”), UCPs (“10% off today”). • Headers, Footers, Nudge Bars, USP bars, footer changes, Navigation, Product pages, Delivery info etc.
  • 64. #19 : I chose the wrong test type • A/B testing – good for: – A single change of content or design layout – A group of related changes (e.g. payment security) – Finding a new and radical shift for a template design – Lower traffic pages or shorter test times • Multivariate testing – good for: – Higher traffic pages – Groups of unrelated changes (e.g. delivery & security) – Multiple content or design style changes – Finding specific drivers of test lifts – Testing multiple versions (e.g. click here, book now, go) – Where you need to understand strong and weak cross variable interactions – Don’t use to settle arguments or sloppy thinking!
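Why multivariate testing needs higher traffic: the recipes multiply. The sketch below assumes a full factorial design (every combination gets tested) and reuses the 350 outcomes per recipe rule of thumb from #5. The factor names are made up, and the button labels are the ones from the slide.

```python
from itertools import product


def mvt_recipes(factors):
    """Every combination of every factor level (full factorial)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


factors = {
    "button_text": ["click here", "book now", "go"],
    "delivery_message": ["shown", "hidden"],
    "security_badge": ["shown", "hidden"],
}

recipes = mvt_recipes(factors)
print(len(recipes), "recipes")
print(len(recipes) * 350, "outcomes needed at 350 per recipe")
```

Run these numbers before you commit to MVT on a low traffic page.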
  • 65. Netherlands A/B Shift Example Previous winner +7.25% +8.19% additional lift
  • 66. #20 – Other flavours of testing • Micro testing (tiny change) – good for: – Proving to the boss that testing works – Demonstrating to IT that it works without impact – Showing the impact of a seemingly tiny change – Proof of concept before larger test • Funnel testing – good for: – Checkouts – Lead gen – Forms processes – Quotations – Any multi-step process with data entry • Fake it and Build it – good for: – Testing new business ideas – Trying out promotions on a test sample – Estimating impact before you build – Helps you calculate ROI – You can even split test entire server farms Vs.
  • 67. #20 – Other flavours of testing “Congratulations! Today you’re the lucky winner of our random awards programme. You get all these extra features for free, on us. Enjoy.”
  • 68. Top F***ups for 2014 1. Testing in the wrong place 2. Your hypothesis inputs are crap 3. No analytics integration 4. Your test will finish after you die 5. You don’t test for long enough 6. You peek before it’s ready 7. No QA for your split test 8. Opportunities are not prioritised 9. Testing cycles are too slow 10. You don’t know when tests are ready 11. Your test fails 12. The test is ‘about the same’ 13. Test keeps moving around 14. Test flips behaviour 15. You run an A/A test and waste time 16. Nobody ‘feels’ the test 17. You forgot you were responsive 18. You forgot you had no traffic 19. You ran the wrong test type 20. You didn’t try all the flavours of testing @OptimiseOrDie
  • 70. 2004 Headspace What I thought I knew in 2004 Reality
  • 71. 2014 Headspace What I know I know On a good day
  • 75. The 5 Legged Rumsfeldian Barstool #1 : Smart, Talented, Polymath People. Flexible and Agile teams @OptimiseOrDie
  • 77. #2 : Analytics Investment (tools, people, dev time) @OptimiseOrDie
  • 78. @OptimiseOrDie #3 : User research and insight
  • 79. #3 : THE BEST IDEAS COME FROM? @OptimiseOrDie
  • 80. #4 : GREAT COPYWRITING “On the average, five times as many people read the headline as read the body copy. When you have written your headline, you have spent eighty cents out of your dollar.” David Ogilvy “In 9 years and 40M split tests with visitors, the majority of my testing success came from playing with the words.” @OptimiseOrDie
  • 81. The 5 Legged Optimisation Barstool #1 Culture & Team #2 UX, CX, Service Design, Insight #3 Toolkit & Analytics investment #4 Persuasive Copywriting #5 Experimentation tools & process @OptimiseOrDie
  • 84. #5 : FIND STUFF @OptimiseOrDie @danbarker Analytics @fastbloke Analytics @timlb Analytics @jamesgurd Analytics @therustybear Analytics @carmenmardiros Analytics @davechaffey Analytics @priteshpatel9 Analytics @cutroni Analytics @avinash Analytics @Aschottmuller Analytics, CRO @cartmetrix Analytics, CRO @Kissmetrics CRO / UX @Unbounce CRO / UX @Morys CRO / Neuro @UXFeeds UX / Neuro @Psyblog Neuro @Gfiorelli1 SEO / Analytics @PeepLaja CRO @TheGrok CRO @UIE UX @LukeW UX / Forms @cjforms UX / Forms @axbom UX @iatv UX @Chudders Photo UX @JeffreyGroks Innovation @StephanieRieger Innovation @BrianSolis Innovation @DrEscotet Neuro @TheBrainLady Neuro @RogerDooley Neuro @Cugelman Neuro @Smashingmag Dev / UX @uxmag UX @Webtrends UX / CRO
  • 85. #5 : LEARN STUFF @OptimiseOrDie Baymard.com Lukew.com Smashingmagazine.com ConversionXL.com Medium.com Whichtestwon.com Unbounce.com Measuringusability.com RogerDooley.com Kissmetrics.com Uxmatters.com Smartinsights.com Econsultancy.com Cutroni.com www.GetMentalNotes.com
  • 86. The Best Companies… • Invest continually in analytics instrumentation, tools, people • Use an Agile, iterative, cross-silo, one team project culture • Prefer collaborative tools to having lots of meetings • Prioritise development based on numbers and insight • Practice real continuous product improvement, not SLEDD* • Are fixing bugs, cruft, bad stuff as well as optimising • Source photos and content that support persuasion and utility • Have cross channel, cross device design, testing and QA • Segment their data for valuable insights, every test or change • Continually reduce cycle (iteration) time in their process • Blend ‘long’ design, continuous improvement AND split tests • Make optimisation the engine of change, not the slave of ego * Single Large Expensive Doomed Developments
  • 87. THE FUTURE OF TESTING – CONDUCTRICS.COM slidesha.re/1ivS68s
  • 88. Thank You! Twitter : @OptimiseOrDie Email : sullivac@gmail.com Slides : slideshare.com/sullivac LinkedIn : linkd.in/pvrg14