                 A/B Testing Framework
                     Design Issues

        Patrick McKenzie 2010
          (This presentation is meant to be read. It is released
        under the Creative Commons By Attribution license –
        feel free to spread it or use it.)




                                                                         www.abingo.org
 By Patrick McKenzie 2010. Please use or send to people who'd benefit.
                                 A/B Testing Frameworks



                           •   Why You Should Care
                           •   Core Use Scenarios
                           •   A/B Test Lifecycle
                           •   Design Decisions
                           •   Technical Considerations
                           •   API Considerations




                                   Why You Should Care


                           There is a paucity of A/B testing frameworks.

                             "I can probably name a dozen different systems for
                             building high scale applications (distributed storage,
                             message queues, caching layers, search engines,
                             etc), but I can’t name a single AB testing framework
                             other than Google Website Optimizer. That seems
                             like a serious inversion of priorities for most
                             startups."




                             http://www.tomkleinpeter.com/2009/01/21/where-are-the-ab-testing-frameworks/

                                    Why You Should Care



                           •   A/B testing helps you validate your
                               hypotheses about customers and product.
                           •   A/B testing is drop-dead easy if your tech
                               supports it.
                           •   You won't do it otherwise, because it feels
                               like boring busywork.
                                   "The goal is to have split-testing be a continuous part of our
                                   development process, so much so that it is considered a
                                   completely routine part of developing a new feature. In fact,
                                   I've seen this approach work so well that it would be
                                   considered weird and kind of silly for anyone to ship a new
                                   feature without subjecting it to a split-test. That's when this
                                   approach can pay huge dividends."

                                                     Eric Ries in blog post


                                   Why You Should Care



                           •   There are only two decent A/B test
                               frameworks for Rails. Both less than 9
                               months old.
                           •   There are (to the best of my knowledge) no
                               OSS frameworks for Java, Python, etc.
                           •   You should write one. V1.0 can be done
                               in 10 man-hours in modern MVC
                               frameworks. Will be best ROI you ever
                               get.
                           •   This presentation hopes to save you time
                               by telling you where the hard decisions
                               are.

                                    Three Use Scenarios



                           •   Customers interacting with site.
                           •   Implementers coding A/B test.
                           •   Somebody interpreting results.




                           User View of A/B Test (What Cindy Sees)




                           User View of A/B Test (What Bob Sees)




                                    Key Points For Users



                           •   Users get consistent behavior. Cindy
                               always sees her alternative. Bob always
                               sees his.
                           •   A/B test doesn't break usage of site.
                               (Sounds obvious, can be non-trivial. Test
                               for interactions!)
                           •   Ending A/B test doesn't break site.


                                   Did you know that in Google Website Optimizer
                                   users can bookmark individual A/B alternatives
                                   because they have distinct URLs? And that after
                                   the test is over they may 404? Yeah. Don't do
                                   that.

                                    What Developers See



                           •   One line to add a test.
                           •   One line to track it.




                           •   No thought required beyond creating
                               alternatives.



                               What Internal Customers See



                           •   Simple, clear, actionable results.
                           •   Stats 101 not required.




                                     Your marketing team might know math.
                                     That doesn't mean they should have to.



                                      A/B Test Lifecycle



                           •   Come up with alternatives.
                           •   Code alternatives.
                           •   Test alternatives.
                           •   Deploy to site.
                           •   Users interact with alternatives.
                           •   Analyze results.
                           •   End test.

                                      When designing your A/B testing framework,
                                      keep in mind that you'll be doing all of the
                                      above. Eliminate as much friction from each
                                      step as possible – this decreases total time
                                      through the loop.

                                 Come up with alternatives.



                           •   Not generally a technical problem.
                           •   Inspiration can come from anywhere – a
                               blog post, a passing fancy, customer
                               comments.
                           •   Should never have to say "We can't do
                               that!"
                           •   Strong recommendation: If we pay your
                               salary, you are authorized to test.

                                   Customers do not think in terms of
                                   Model/View/Controller interfaces. They just want
                                   to know what the app can do. You should be able
                                   to A/B test from any point in the app.

                                      Code Alternatives



                           •   Programming is hard, but you have to do it
                               anyway.
                           •   Programming A/B tests is easy – a one-liner
                               and an if statement.
                           •   Testing framework handles all
                               bookkeeping – programmers never care.
                           •   Re-use conversion code. Typical
                               businesses have lots of tests, few defined
                               conversions. No need to reinvent wheel
                               every single time.



                                       Test Alternatives



                           •   A/B tests are live code. They can have
                               bugs. You should be able to unit test like
                               normal.
                           •   Helpful for developers to have access to
                               quick "switch what test I'm seeing"
                               functionality. Simplest example: manually
                               add parameter to URL
                               (&exampleTest=altA). Turn off feature in
                               production.
                           •   Be careful of test interactions. They are very
                               easy to create once you start testing behavior in
                               addition to display.
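
The URL-parameter switch described above could look roughly like the following. This is only a sketch: the function name `alternative_for_request`, the `production` flag, and passing the query string as a plain dict are assumptions made for illustration, not part of any particular framework.

```python
def alternative_for_request(test_name, alternatives, query_params, assigned,
                            production=False):
    """Return a developer-forced alternative if allowed, else the assigned one.

    Hypothetical helper: query_params stands in for the parsed URL query
    string, and assigned is whatever the framework chose for this user.
    """
    forced = query_params.get(test_name)
    # Only honor the override outside production, and only for known alternatives.
    if not production and forced in alternatives:
        return forced
    return assigned


alts = ["altA", "altB"]
params = {"exampleTest": "altA"}  # i.e. ...&exampleTest=altA

dev_view = alternative_for_request("exampleTest", alts, params, assigned="altB")
prod_view = alternative_for_request("exampleTest", alts, params, assigned="altB",
                                    production=True)
```

In development the forced value wins; with `production=True` the parameter is ignored and the user keeps the assigned alternative.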

                                          Deploy to site.



                           •   Avoid pointless work here. "Push code
                               live, test starts automatically" is the ideal.
                           •   Testing framework should handle its own
                               setup first time test is called. After that,
                               re-use.
                           •   Note this decision is going to be made
                               thousands or hundreds of thousands of
                               times, possibly right after you push live:
                               consider performance implications.
                           •   Can make code default to old version,
                               control start/stop of test via dashboard.
                               Could be worth it, adds complexity.

                               Users interact with alternatives.



                           •   Happily, this takes very little work for you...
                           •   … except when it creates Heisenbugs.
                           •   In addition to thorough testing, make sure
                               your "What The User Is Seeing" feature
                               (you have one, right?) reflects their A/B
                               tests.




                                        Analyze results.



                           •   Stats behind A/B tests may not be well
                               understood. Impress that stats are real,
                               measured, and actionable. It doesn't
                               matter if they think it is magic as long as
                               they trust the magic.
                           •   Do significance testing so it isn't magic.
                           •   Doing significance testing is grunt work: let
                               the computer do it.
                           •   Spend the extra time to make internal
                               dashboard pretty. People trust pretty
                               things.
                           •   A/B tests not a good place to dig for data.
                               One glance tells you all you need.
                                                End test


                           •   Simple solution: rip code out, test stops.
                           •   Simple solution requires redeploy. In event of bug
                               or strong test result ("Oh my God what were we
                               thinking!?!") might want immediate end button on
                               dashboard. Be able to specify alternative.
                           •   Automatic end of test? Probably a misfeature, but
                               easy to implement.
                           •   Ending test should switch all users to winner (or
                               else you get to support old tests until doomsday).
                               However, users have memories.
                           •   Negatively affected users (e.g. you end test in favor
                               of higher price, user planning on buying later saw
                               lower price) may be mad. Not big problem, but be
                               ready.

                                   Design Considerations



                           •   Tracking and managing identity.
                           •   How to choose alternatives by identity.
                           •   Where to store test participation.
                           •   Where to store alternatives.
                           •   Stats is hard, let's go shopping.
                           •   Presenting results.




                                       Tracking Identity



                           •   Cindy is Cindy, Bob is Bob, Cindy should
                               always see Cindy's tests.
                           •   Cindy is not a cookie. Cindy is not a
                               session. Cindy is freaking Cindy. Even
                               when she is on different computer.
                           •   You already have identity via user
                               authentication. Probably want to punt
                               identity problem there. Have it inform
                               framework of current user identity.
                           •   Important edge case: new user signup
                               should persist “identity” from anonymous
                               visitor to identifiable user.

                                       Tracking Identity



                           •   Easiest identity is random number thrown
                               into cookie. Associate with user accounts.
                               Restore on login. Bam, done.
                           •   However, you will occasionally have A/B
                               test conversions outside of Cindy's HTTP
                               cycle. (e.g. Purchase notification comes
                               from PayPal, not from Cindy. Cindy calls
                               up to place order.) Think it through – not
                               terribly difficult if you plan for it.
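
The cookie-plus-account scheme above can be sketched as follows. Everything named here is hypothetical: the `IdentityTracker` class, the `ab_identity` cookie name, and the use of plain dicts in place of real cookies and a persistent user table are assumptions for illustration.

```python
import uuid


class IdentityTracker:
    """Hypothetical sketch: random identity in a cookie, tied to user accounts."""

    def __init__(self):
        # user id -> A/B identity; a persistent table in a real application.
        self.identity_by_user = {}

    def identity_for_visitor(self, cookies):
        """Return the visitor's identity, minting a random one if none is set."""
        if "ab_identity" not in cookies:
            cookies["ab_identity"] = uuid.uuid4().hex
        return cookies["ab_identity"]

    def on_signup(self, user_id, cookies):
        """Carry the anonymous visitor's identity over to the new account."""
        self.identity_by_user[user_id] = self.identity_for_visitor(cookies)

    def on_login(self, user_id, cookies):
        """Restore the stored identity, replacing whatever this browser had."""
        stored = self.identity_by_user.get(user_id)
        if stored is None:
            # No identity recorded for this account yet: adopt the browser's.
            self.on_signup(user_id, cookies)
        else:
            cookies["ab_identity"] = stored
        return cookies["ab_identity"]


tracker = IdentityTracker()
home_cookies = {}                      # Cindy's browser at home
anonymous_id = tracker.identity_for_visitor(home_cookies)
tracker.on_signup("cindy", home_cookies)

work_cookies = {}                      # a different computer, no cookie yet
work_id = tracker.on_login("cindy", work_cookies)
```

Because the identity is stored against the account, a conversion that arrives outside Cindy's HTTP cycle (a payment callback that only knows her user id, say) can look it up in `identity_by_user` instead of relying on a cookie.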




                                How To Choose Alternatives



                           •   If you have N alternatives, picking
                               randomly and persisting it by identity works
                               decently.
                           •   Another approach: MD5(identity) %
                               number_of_alts. Saves space
                               (marginally).
                           •   Don't need to save what test Cindy is
                               seeing as long as you can reproduce it.
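
The hashing approach above can be sketched in a few lines. This is a minimal illustration rather than any framework's actual code: the function name `choose_alternative` is made up, and mixing the test name into the hash (so one identity is not bucketed identically in every test) is an assumption added here.

```python
import hashlib


def choose_alternative(identity, test_name, alternatives):
    """Deterministically map an identity to one of the alternatives.

    Hypothetical helper: hashing test_name together with identity is an
    assumption, made so each test buckets people independently.
    """
    digest = hashlib.md5(f"{test_name}:{identity}".encode("utf-8")).hexdigest()
    # Interpret the hex digest as an integer and reduce it modulo N.
    return alternatives[int(digest, 16) % len(alternatives)]


alts = ["red_button", "green_button"]
first = choose_alternative("cindy-42", "button_color", alts)
again = choose_alternative("cindy-42", "button_color", alts)
```

`first` and `again` are always equal, which is the point: the choice can be recomputed on demand, so it does not have to be stored.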




                               Where to store test participation



                           •   Cookie/session bad idea: Cindy will log in
                               at work tomorrow. She should see
                               consistent behavior.
                           •   Cache (memcached) possible, but if Cindy
                               is evicted from cache or cache resets,
                               tough for Cindy and tough for you.
                           •   Persistent data store best bet. Will talk
                               about specific data stores later in slides.




                                 Where to store alternatives



                           •   Many approaches. Whatever works for
                               you.
                           •   A/Bingo puts alternatives directly in code.
                               Easiest place, always right in front of
                               developer, no conceptual overhead.
                           •   Vanity puts alternatives in special
                               experiment files. Arguably cleaner code,
                               but have to context-switch.
                           •   Google Website Optimizer has you define
                               alternatives on a web form. Great for
                               marketing department at insurance
                               company. Don't do this. Greatly limits
                               possibilities, increases integration work,
                               blows testing to heck and back.
                                           Doing Stats



                           •   If possible, call out to dedicated stats
                               modules/libraries to do stats.
                           •   Many types of possible stats for A/B
                               testing. Pick one, stick with it. I use
                               Z-scores because a) I remember them and
                               b) implementation was drop-dead easy.
                           •   Sadly, Ruby lacks many good stats
                               libraries. Oh, to be a Perl programmer...
                           •   This subject worth its own presentation.
                               See Ben Tilly.
                               http://elem.com/~btilly/effective-ab-testing/
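
For readers who have not seen one, a Z-score comparison of two conversion rates can be sketched as below. The slides do not say which exact formula is used, so this is an assumption: it shows the common two-proportion form with an unpooled standard error (a pooled variant also exists), and the function names are illustrative.

```python
import math


def z_score(conversions_a, participants_a, conversions_b, participants_b):
    """Two-proportion z-score using the unpooled standard error."""
    rate_a = conversions_a / participants_a
    rate_b = conversions_b / participants_b
    standard_error = math.sqrt(
        rate_a * (1 - rate_a) / participants_a
        + rate_b * (1 - rate_b) / participants_b
    )
    if standard_error == 0:
        raise ValueError("not enough variation to compute a z-score")
    return (rate_b - rate_a) / standard_error


def two_sided_p_value(z):
    """Probability of a |z| at least this large under the normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up numbers: 100/1000 conversions for A versus 130/1000 for B.
z = z_score(100, 1000, 130, 1000)
p = two_sided_p_value(z)
```

A dashboard would typically turn `p` into a plain sentence ("B is better than A, and we are about 95% sure") rather than showing the raw number.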


                                      Presenting Results



                           •   Text is easy! Graphs not quite.
                           •   Google's confidence bars are sexy... and
                               pretty useless.
                           •   Use simple, human language to describe what
                               confidence intervals and statistical
                               significance mean.
                           •   De-emphasize null results (A > B but not
                               statistically significantly so) but don't hide
                               them. (After all, the fact that "this test was
                               too close to call" tells you something
                               useful.)


                                  Technical Considerations



                           •   Less than 1,000 visitors per hour? Skip
                               these slides.
                           •   A/B testing turns performance
                               assumptions on head: heavy writes in very
                               bursty fashion ("as soon as test goes
                               live"), very non-relational data, fairly
                               infrequent reads (~3X writes on my site),
                               extraordinarily infrequent use of summary
                               statistics.
                           •   Practically tailor-made for key/value store,
                               not so much for SQL.


                           Queries You Have To Answer FAST



                           •   Who is Cindy? (user → identity)
                           •   Is Cindy participating in Test X?
                           •   If so, what alternative has she seen?
                           •   If not, what alternative should she see?
                           •   Record fact that Cindy is participating in
                               Test X.
                           •   Has Cindy converted in Test X?
                           •   Record fact that Cindy converted for Test
                               X.



                           Queries You Can Answer Leisurely



                           •   How many people have participated in
                               Experiment X?
                           •   How many saw Alternative A?
                           •   Umm, do that stats magic for me.




                                Query You Will NEVER ASK



                           •   Who saw Alternative A in Experiment X?




                                   Possible Architectures



                           •   Summary statistics (participant counts &
                               conversion counts) in MySQL table with
                               "fairly few" rows. Simple increment
                               statements for updates.
                           •   Participation information (Cindy,
                               Experiment X, Alternative A) in key/value
                               store.
                           •   Or whole thing in key/value store.




                           Quick Speed Improvement for SQL



                           •   Give each of your alternatives a unique
                               string ID like MD5(experiment name,
                               alternative name). Calculate that in
                               application code. Index on column.
                           •   UPDATE alternatives SET participants =
                               participants + 1 WHERE lookup_code =
                               'CALCULATED IN APPLICATION';
                           •   This avoids having to translate human
                               name in code to ID in table. (Or having to
                               use multi-column index for lookup.)
                           •   Note: I am not a very good guy with DBs,
                               but I am informed this is fairly fast. Test
                               for yourself.
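
The lookup-code idea above can be tried end to end with a small script. This is a sketch under stated assumptions: it uses an in-memory SQLite database as a stand-in for MySQL so that it is self-contained, the `conversions` column and the `/` separator inside the hash input are illustrative choices, and it makes no claim about relative speed.

```python
import hashlib
import sqlite3


def lookup_code(experiment_name, alternative_name):
    """Compute the row key in application code, as the slide suggests."""
    raw = f"{experiment_name}/{alternative_name}"
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
# PRIMARY KEY gives the single-column index on lookup_code.
conn.execute(
    "CREATE TABLE alternatives ("
    " lookup_code TEXT PRIMARY KEY,"
    " participants INTEGER NOT NULL DEFAULT 0,"
    " conversions INTEGER NOT NULL DEFAULT 0)"
)

code = lookup_code("button_color", "green")
conn.execute("INSERT INTO alternatives (lookup_code) VALUES (?)", (code,))

# The hot-path write: one indexed row, one increment, no name-to-ID join.
conn.execute(
    "UPDATE alternatives SET participants = participants + 1"
    " WHERE lookup_code = ?",
    (code,),
)

count = conn.execute(
    "SELECT participants FROM alternatives WHERE lookup_code = ?", (code,)
).fetchone()[0]
```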
                           Specific Key/Value Store Recommendations

                           •   MySQL with big string columns for key,
                               value: ewwwwww. I mean, ewwwwww.
                           •   Memcached: Acceptable (and fast) but not
                               persistent. Also tends to only go down
                               when server does. For A/B testing, might
                               just re-run all in-progress tests if it dies.
                           •   MemcacheDB: Tried it. Has unacceptable
                               performance when BerkeleyDB flushes to
                               disk. (5 seconds+!)
                           •   Redis: Tried it. Not in production yet. My
                               recommendation – very fast. Vanity also
                               uses it.

                                      API Considerations


                           Only need to expose two methods:

                              •   ab_test(name, alternatives, conversion_name)
                              •   conversion(conversion_name)


                           Note lack of identity in method calls. Let the
                           framework worry about that.

                           How you specify alternatives is up to you.
                           Array of strings is easy to understand.


                                          Consuming API


                           ab_test(name, alternatives, conversion_name) returns
                           the chosen alternative, handles all bookkeeping as
                           side effect.

                           Typically:

                           if (ab_test(...) == "something") {
                             # do something
                           } else {
                             # do something else
                           }

                           Fun opportunity for blocks/binding if your language
                           supports that.
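
To make the two-method API concrete, here is one way it might be sketched in Python. This is an illustration, not A/Bingo's or Vanity's implementation: the `ABFramework` class, its attribute names, and the in-memory dicts (standing in for a persistent store) are all assumptions, and identity is set once by the surrounding application rather than passed into each call.

```python
import random


class ABFramework:
    """Hypothetical in-memory sketch of the ab_test / conversion API."""

    def __init__(self):
        self.identity = None        # set per request by the auth layer
        self.participation = {}     # (identity, test name) -> alternative
        self.test_conversion = {}   # test name -> conversion name
        self.converted = set()      # (identity, test name) already counted
        self.participants = {}      # (test name, alternative) -> count
        self.conversions = {}       # (test name, alternative) -> count

    def ab_test(self, name, alternatives, conversion_name):
        """Return this identity's alternative; bookkeeping is a side effect."""
        self.test_conversion[name] = conversion_name
        key = (self.identity, name)
        if key not in self.participation:
            choice = random.choice(alternatives)
            self.participation[key] = choice
            counter = (name, choice)
            self.participants[counter] = self.participants.get(counter, 0) + 1
        return self.participation[key]

    def conversion(self, conversion_name):
        """Score every test tied to this conversion, at most once per identity."""
        for name, tracked in self.test_conversion.items():
            key = (self.identity, name)
            if (tracked == conversion_name and key in self.participation
                    and key not in self.converted):
                self.converted.add(key)
                counter = (name, self.participation[key])
                self.conversions[counter] = self.conversions.get(counter, 0) + 1


ab = ABFramework()
ab.identity = "cindy"
if ab.ab_test("button_color", ["red", "green"], "purchase") == "green":
    button = "green"
else:
    button = "red"
ab.conversion("purchase")
ab.conversion("purchase")  # repeated conversions are counted once
```

Calling code never mentions Cindy: it asks for an alternative and later reports a conversion, and the framework ties both to the current identity.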
                                        Got Questions?


                           Great A/B testing resources:
                           • Eric Ries (startuplessonslearned.com) – heavy on
                              motivation, less on stats/design decisions
                           • #abtests and @abtests on Twitter. Good
                              community, many ideas for inspiration.
                           • http://abtests.com – ditto
                           • http://www.bingocardcreator.com/abingo/resources
                              – links I use when I forget the math.
                           • http://www.kalzumeus.com – my blog
                           • patrick@bingocardcreator.com –
                           I'm always happy to chat about A/B testing, with
                           anybody. Potentially available for consulting.



          www.abingo.org

Contenu connexe

Tendances

A/B testing at Spotify
A/B testing at SpotifyA/B testing at Spotify
A/B testing at SpotifyAli Sarrafi
 
Test for Success: A Guide to A/B Testing on Emails & Landing Pages
Test for Success: A Guide to A/B Testing on Emails & Landing PagesTest for Success: A Guide to A/B Testing on Emails & Landing Pages
Test for Success: A Guide to A/B Testing on Emails & Landing PagesOptimizely
 
SXSW 2016 - Everything you think about A/B testing is wrong
SXSW 2016 - Everything you think about A/B testing is wrongSXSW 2016 - Everything you think about A/B testing is wrong
SXSW 2016 - Everything you think about A/B testing is wrongDan Chuparkoff
 
Practical Introduction to A/B Testing
Practical Introduction to A/B TestingPractical Introduction to A/B Testing
Practical Introduction to A/B TestingAlex Alwan
 
Experimentation Platform at Netflix
Experimentation Platform at NetflixExperimentation Platform at Netflix
Experimentation Platform at NetflixSteve Urban
 
Why everything is an A/B Test at Pinterest
Why everything is an A/B Test at PinterestWhy everything is an A/B Test at Pinterest
Why everything is an A/B Test at PinterestKrishna Gade
 
Controlled Experimentation aka A/B Testing for PMs by Tinder Sr PM
Controlled Experimentation aka A/B Testing for PMs by Tinder Sr PMControlled Experimentation aka A/B Testing for PMs by Tinder Sr PM
Controlled Experimentation aka A/B Testing for PMs by Tinder Sr PMProduct School
 
Ab testing 101
Ab testing 101Ab testing 101
Ab testing 101Ashish Dua
 
Netflix JavaScript Talks - Scaling A/B Testing on Netflix.com with Node.js
Netflix JavaScript Talks - Scaling A/B Testing on Netflix.com with Node.jsNetflix JavaScript Talks - Scaling A/B Testing on Netflix.com with Node.js
Netflix JavaScript Talks - Scaling A/B Testing on Netflix.com with Node.jsChris Saint-Amant
 
A guide to product metrics by Mixpanel
A guide to product metrics by MixpanelA guide to product metrics by Mixpanel
A guide to product metrics by MixpanelHarsha MV
 
A/B Testing at Pinterest: Building a Culture of Experimentation
A/B Testing at Pinterest: Building a Culture of Experimentation A/B Testing at Pinterest: Building a Culture of Experimentation
A/B Testing at Pinterest: Building a Culture of Experimentation WrangleConf
 
How to Focus On the Problem, Not the Solution by Spotify PM
How to Focus On the Problem, Not the Solution by Spotify PMHow to Focus On the Problem, Not the Solution by Spotify PM
How to Focus On the Problem, Not the Solution by Spotify PMProduct School
 
A/B Testing Pitfalls and Lessons Learned at Spotify
A/B Testing Pitfalls and Lessons Learned at SpotifyA/B Testing Pitfalls and Lessons Learned at Spotify
A/B Testing Pitfalls and Lessons Learned at SpotifyDanielle Jabin
 
Tag-it 2016 slides: UX + A/B Testing at Booking.com: Design focused on conver...
Tag-it 2016 slides: UX + A/B Testing at Booking.com: Design focused on conver...Tag-it 2016 slides: UX + A/B Testing at Booking.com: Design focused on conver...
Tag-it 2016 slides: UX + A/B Testing at Booking.com: Design focused on conver...Maria Lígia Klokner
 
The Scientific Method of Experimentation by Google PM
The Scientific Method of Experimentation by Google PMThe Scientific Method of Experimentation by Google PM
The Scientific Method of Experimentation by Google PMProduct School
 
Zuora Sales Deck
Zuora Sales DeckZuora Sales Deck
Zuora Sales DeckRyan Gum
 
Lean Analytics for Startups and Enterprises
Lean Analytics for Startups and EnterprisesLean Analytics for Startups and Enterprises
Lean Analytics for Startups and EnterprisesLean Analytics
 
Usability Testing 101 - an introduction
Usability Testing 101 - an introductionUsability Testing 101 - an introduction
Usability Testing 101 - an introductionElizabeth Snowdon
 
Inverting The Testing Pyramid
Inverting The Testing PyramidInverting The Testing Pyramid
Inverting The Testing PyramidNaresh Jain
 

A/B Testing Framework Design

  • 1. A/B Testing Framework Design Issues. Patrick McKenzie, 2010. (This presentation is meant to be read. It is released under the Creative Commons By Attribution license – feel free to spread it or use it.) www.abingo.org. Please use or send to people who'd benefit.
  • 2. A/B Testing Frameworks
      • Why You Should Care
      • Core Use Scenarios
      • A/B Test Lifecycle
      • Design Decisions
      • Technical Considerations
      • API Considerations
  • 3. Why You Should Care
      There is a paucity of A/B testing frameworks.
      "I can probably name a dozen different systems for building high scale applications (distributed storage, message queues, caching layers, search engines, etc), but I can’t name a single AB testing framework other than Google Website Optimizer. That seems like a serious inversion of priorities for most startups."
      http://www.tomkleinpeter.com/2009/01/21/where-are-the-ab-testing-frameworks/
  • 4. Why You Should Care
      • A/B testing helps you validate your hypotheses about customers and product.
      • A/B testing is drop-dead easy if your tech supports it.
      • You won't do it otherwise, because it feels like boring busywork.
      "The goal is to have split-testing be a continuous part of our development process, so much so that it is considered a completely routine part of developing a new feature. In fact, I've seen this approach work so well that it would be considered weird and kind of silly for anyone to ship a new feature without subjecting it to a split-test. That's when this approach can pay huge dividends." (Eric Ries, in a blog post)
  • 5. Why You Should Care
      • There are only two decent A/B test frameworks for Rails. Both are less than 9 months old.
      • There are (to the best of my knowledge) no OSS frameworks for Java, Python, etc.
      • You should write one. V1.0 can be done in 10 man-hours in modern MVC frameworks. It will be the best ROI you ever get.
      • This presentation hopes to save you time by telling you where the hard decisions are.
  • 6. Three Use Scenarios
      • Customers interacting with site.
      • Implementers coding A/B test.
      • Somebody interpreting results.
  • 7. User View of A/B Test (What Cindy Sees)
  • 8. User View of A/B Test (What Bob Sees)
  • 9. Key Points For Users
      • Users get consistent behavior. Cindy always sees her alternative. Bob always sees his.
      • A/B test doesn't break usage of site. (Sounds obvious, can be non-trivial. Test for interactions!)
      • Ending A/B test doesn't break site.
      Did you know that in Google Website Optimizer users can bookmark individual A/B alternatives because they have distinct URLs? And that after the test is over they may 404? Yeah. Don't do that.
  • 10. What Developers See
      • One line to add a test.
      • One line to track it.
      • No thought required beyond creating alternatives.
  • 11. What Internal Customers See
      • Simple, clear, actionable results.
      • Stats 101 not required.
      Your marketing team might know math. That doesn't mean they should have to.
  • 12. A/B Test Lifecycle
      • Come up with alternatives.
      • Code alternatives.
      • Test alternatives.
      • Deploy to site.
      • Users interact with alternatives.
      • Analyze results.
      • End test.
      When designing your A/B testing framework, keep in mind that you'll be doing all of the above. Eliminate as much friction from each step as possible – this decreases total time through the loop.
  • 13. Come up with alternatives.
      • Not generally a technical problem.
      • Inspiration can come from anywhere – a blog post, a passing fancy, customer comments.
      • Should never have to say "We can't do that!"
      • Strong recommendation: If we pay your salary, you are authorized to test.
      Customers do not think in terms of Model/View/Controller interfaces. They just want to know what the app can do. You should be able to A/B test from any point in the app.
  • 14. Code Alternatives
      • Programming is hard, but you have to do it anyway.
      • Programming A/B tests is easy – one liner and if statement.
      • Testing framework handles all bookkeeping – programmers never care.
      • Re-use conversion code. Typical businesses have lots of tests, few defined conversions. No need to reinvent wheel every single time.
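A minimal sketch of what the "one liner and if statement" might look like, written in Python rather than the Rails setting the slides assume. The class `ABTests` and its methods `ab_test` and `track` are hypothetical names invented for this illustration, not the API of A/Bingo or any other framework, and all state is kept in memory.

```python
import hashlib
from collections import defaultdict


class ABTests:
    """Hypothetical in-memory framework; a real one would persist this state."""

    def __init__(self):
        self.assignments = {}                        # (test, identity) -> alternative
        self.tests_by_conversion = defaultdict(set)  # conversion name -> test names
        self.participants = defaultdict(int)         # (test, alternative) -> count
        self.conversions = defaultdict(int)          # (test, alternative) -> count
        self.converted = set()                       # (test, identity) already scored

    def ab_test(self, identity, test, alternatives, conversion):
        """Return this identity's alternative and do all the bookkeeping."""
        key = (test, identity)
        if key not in self.assignments:
            digest = hashlib.md5(f"{test}:{identity}".encode()).hexdigest()
            choice = alternatives[int(digest, 16) % len(alternatives)]
            self.assignments[key] = choice
            self.participants[(test, choice)] += 1
            self.tests_by_conversion[conversion].add(test)
        return self.assignments[key]

    def track(self, identity, conversion):
        """Score one conversion against every test that shares its name."""
        for test in self.tests_by_conversion[conversion]:
            key = (test, identity)
            if key in self.assignments and key not in self.converted:
                self.converted.add(key)
                self.conversions[(test, self.assignments[key])] += 1


ab = ABTests()

# The one-liner plus if statement a programmer writes at the call site.
if ab.ab_test("cindy", "signup_button", ["green", "red"], conversion="signup") == "green":
    button_html = "<button class='green'>Sign up</button>"
else:
    button_html = "<button class='red'>Sign up</button>"

# A second test reuses the same conversion; one track() call covers both.
headline = ab.ab_test("cindy", "headline", ["short", "long"], conversion="signup")
ab.track("cindy", "signup")
```

Keying tests by a shared conversion name is one way to get the re-use the slide asks for: adding a new test never requires new tracking code.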
  • 15. Test Alternatives
      • A/B tests are live code. They can have bugs. You should be able to unit test like normal.
      • Helpful for developers to have access to quick "switch what test I'm seeing" functionality. Simplest example: manually add parameter to URL (&exampleTest=altA). Turn off feature in production.
      • Be careful of test interactions. They are very easy to create once you start testing behavior in addition to display.
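The URL override could be sketched as follows. The function names, the `allow_override` flag, and the example URLs are assumptions made for illustration; only the `exampleTest=altA` parameter comes from the slide.

```python
from urllib.parse import urlparse, parse_qs


def forced_alternative(url, test, alternatives, allow_override):
    """Return the alternative named in the query string, or None.

    allow_override should be False in production, per the slide.
    """
    if not allow_override:
        return None
    params = parse_qs(urlparse(url).query)
    requested = params.get(test, [None])[0]
    # Ignore values that are not real alternatives of this test.
    return requested if requested in alternatives else None


def choose(url, test, alternatives, normal_choice, allow_override=False):
    """Use the forced alternative if allowed and present, else the normal one."""
    forced = forced_alternative(url, test, alternatives, allow_override)
    return forced if forced is not None else normal_choice


# Development: the developer forces altA through the URL.
dev = choose("http://localhost/pricing?plan=pro&exampleTest=altA",
             "exampleTest", ["altA", "altB"], normal_choice="altB",
             allow_override=True)

# Production: the same parameter is ignored.
prod = choose("http://example.com/pricing?exampleTest=altA",
              "exampleTest", ["altA", "altB"], normal_choice="altB")
```

Forced views should probably not be counted as participation, or developers will skew the results; that bookkeeping is left out of this sketch.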
  • 16. Deploy to site.
      • Avoid pointless work here. "Push code live, test starts automatically" is the ideal.
      • Testing framework should handle its own setup first time test is called. After that, re-use.
      • Note that this decision is going to be made thousands or hundreds of thousands of times, possibly right after you push live: consider performance implications.
      • Can make code default to old version, control start/stop of test via dashboard. Could be worth it, adds complexity.
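One way to read "handle its own setup first time test is called" while keeping the hot path cheap is a per-process cache in front of the store. This is a sketch under that assumption; `Store` and `LazyRegistry` are hypothetical stand-ins, and the lookup counter exists only to make the saving visible.

```python
class Store:
    """Stand-in for a persistent store; counts lookups to show the saving."""

    def __init__(self):
        self.tests = {}
        self.lookups = 0

    def create_if_missing(self, name, alternatives):
        self.lookups += 1
        self.tests.setdefault(name, list(alternatives))


class LazyRegistry:
    """Sets a test up on first use, then answers from process memory."""

    def __init__(self, store):
        self.store = store
        self._known = set()   # tests this process has already set up

    def ensure(self, name, alternatives):
        if name in self._known:
            return            # hot path: no trip to the store
        self.store.create_if_missing(name, alternatives)
        self._known.add(name)


store = Store()
registry = LazyRegistry(store)

# Simulate a burst of requests right after the code goes live.
for _ in range(10_000):
    registry.ensure("signup_button", ["green", "red"])
```

With several application processes, each one pays the setup cost once, so `create_if_missing` has to be safe to call repeatedly.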
  • 17. Users interact with alternatives.
      • Happily, this takes very little work for you...
      • … except when it creates Heisenbugs.
      • In addition to thorough testing, make sure your "What The User Is Seeing" feature (you have one, right?) reflects their A/B tests.
  • 18. Analyze results.
      • Stats behind A/B tests may not be well understood. Impress on people that stats are real, measured, and actionable. It doesn't matter if they think it is magic as long as they trust the magic.
      • Do significance testing so it isn't magic.
      • Doing significance testing is grunt work: let the computer do it.
      • Spend the extra time to make internal dashboard pretty. People trust pretty things.
      • A/B tests are not a good place to dig for data. One glance tells you all you need.
  • 19. End test
      • Simple solution: rip code out, test stops.
      • Simple solution requires redeploy. In event of bug or strong test result ("Oh my God what were we thinking!?!") might want immediate end button on dashboard. Be able to specify alternative.
      • Automatic end of test? Probably a misfeature, but easy to implement.
      • Ending test should switch all users to winner (or else you get to support old tests until doomsday). However, users have memories.
      • Negatively affected users (e.g. you end test in favor of higher price, user planning on buying later saw lower price) may be mad. Not big problem, but be ready.
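The "immediate end button" with a specified alternative could be sketched like this. `ended_tests`, `end_test`, and `pick` are hypothetical names, the dictionary stands in for whatever store the dashboard writes to, and the hash-based assignment is only a placeholder for the framework's normal choice.

```python
import hashlib

ended_tests = {}   # test name -> winning alternative, e.g. written by a dashboard


def end_test(test, winner):
    """Record the winner; takes effect on the next request, no redeploy."""
    ended_tests[test] = winner


def pick(identity, test, alternatives):
    """Return the winner for an ended test, else the user's usual alternative."""
    if test in ended_tests:
        return ended_tests[test]
    digest = hashlib.md5(f"{test}:{identity}".encode()).hexdigest()
    return alternatives[int(digest, 16) % len(alternatives)]


prices = ["$19", "$29"]
before = pick("cindy", "price_test", prices)   # whatever Cindy was assigned

end_test("price_test", "$29")                  # someone presses the button
after = [pick(user, "price_test", prices) for user in ("cindy", "bob", "dave")]
```

The test code can then be ripped out at the next regular deploy, since every user already sees the winner.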
  • 20. Design Considerations
      • Tracking and managing identity.
      • How to choose alternatives by identity.
      • Where to store test participation.
      • Where to store alternatives.
      • Stats is hard, let's go shopping.
      • Presenting results.
  • 21. Tracking Identity
      • Cindy is Cindy, Bob is Bob, Cindy should always see Cindy's tests.
      • Cindy is not a cookie. Cindy is not a session. Cindy is freaking Cindy. Even when she is on different computer.
      • You already have identity via user authentication. Probably want to punt identity problem there. Have it inform framework of current user identity.
      • Important edge case: new user signup should persist “identity” from anonymous visitor to identifiable user.
  • 22. Tracking Identity
      • Easiest identity is random number thrown into cookie. Associate with user accounts. Restore on login. Bam, done.
      • However, you will occasionally have A/B test conversions outside of Cindy's HTTP cycle. (e.g. Purchase notification comes from Paypal, not from Cindy. Cindy calls up to place order.) Think it through – not terribly difficult if you plan for it.
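The cookie-based identity scheme from the two slides above can be sketched as below. Plain dictionaries stand in for the browser's cookie jar and for a column on the users table; the cookie name `ab_identity` and all function names are assumptions made for this example.

```python
import secrets

user_identities = {}   # user id -> A/B identity (stand-in for a users-table column)


def identity_for_visitor(cookies):
    """Return the visitor's A/B identity, minting a random one if absent."""
    if "ab_identity" not in cookies:
        cookies["ab_identity"] = secrets.token_hex(16)
    return cookies["ab_identity"]


def on_signup(user_id, cookies):
    """Carry the anonymous identity over to the new account."""
    user_identities[user_id] = identity_for_visitor(cookies)


def on_login(user_id, cookies):
    """Restore the account's identity, whatever this browser had before."""
    if user_id in user_identities:
        cookies["ab_identity"] = user_identities[user_id]
    else:
        user_identities[user_id] = identity_for_visitor(cookies)
    return cookies["ab_identity"]


# Cindy browses anonymously at home, then signs up.
home_cookies = {}
anonymous_identity = identity_for_visitor(home_cookies)
on_signup("cindy", home_cookies)

# Next day at work: a fresh browser gets a throwaway identity until she logs in.
work_cookies = {}
throwaway_identity = identity_for_visitor(work_cookies)
restored_identity = on_login("cindy", work_cookies)
```

Because the identity is stored against the account, a conversion that arrives outside Cindy's HTTP cycle (a payment notification, a phone order) can look it up by user id, for example `user_identities["cindy"]`, instead of needing her cookie.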
  • 23. How To Choose Alternatives
      • If you have N alternatives, picking randomly and persisting it by identity works decently.
      • Another approach: MD5(identity) % number_of_alts. Saves space (marginally).
      • Don't need to save what test Cindy is seeing as long as you can reproduce it.
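A minimal sketch of the MD5(identity) % number_of_alts approach. The function name is hypothetical, and mixing the test name into the hash is an addition of this sketch rather than something the slide states: without it, the same users would land in the same bucket position for every test.

```python
import hashlib


def choose_alternative(identity, test_name, alternatives):
    """Deterministically map an identity to one alternative of a test."""
    # Assumption: salt with the test name so tests bucket users independently.
    digest = hashlib.md5(f"{test_name}:{identity}".encode("utf-8")).hexdigest()
    return alternatives[int(digest, 16) % len(alternatives)]


alternatives = ["altA", "altB"]

# Nothing is stored, yet Cindy gets the same answer on every call.
first = choose_alternative("cindy", "exampleTest", alternatives)
second = choose_alternative("cindy", "exampleTest", alternatives)

# Over many identities the split comes out close to even.
counts = {alt: 0 for alt in alternatives}
for n in range(10_000):
    counts[choose_alternative(f"visitor-{n}", "exampleTest", alternatives)] += 1
```

The trade-off is that the list of alternatives must not change while the test is running, since changing its length reshuffles who sees what.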
  • 25. Where to store test participation
      • Cookie/session bad idea: Cindy will log in at work tomorrow. She should see consistent behavior.
      • Cache (memcached) possible, but if Cindy is evicted from cache or cache resets, tough for Cindy and tough for you.
      • Persistent data store best bet. Will talk about specific data stores later in slides.
  • 26. Where to store alternatives
      • Many approaches. Whatever works for you.
      • A/Bingo puts alternatives directly in code. Easiest place, always right in front of developer, no conceptual overhead.
      • Vanity puts alternatives in special experiment files. Arguably cleaner code, but have to context-switch.
      • Google Website Optimizer has you define alternatives on a web form. Great for marketing department at insurance company. Don't do this. Greatly limits possibilities, increases integration work, blows testing to heck and back.
  • 27. Doing Stats • If possible, call out to dedicated stats modules/libraries to do stats. • Many types of stats are possible for A/B testing. Pick one, stick with it. I use Z-scores because a) I remember them and b) implementation was drop-dead easy. • Sadly, Ruby does not have many good stats libraries. Oh, to be a Perl programmer... • This subject is worth its own presentation. See Ben Tilly: http://elem.com/~btilly/effective-ab-testing/
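To give a sense of how little code a Z-score needs, here is a sketch of a two-proportion z-test on conversion counts. It assumes the unpooled standard error, which is one common choice and not necessarily the one the author's framework uses. The function names are hypothetical.

```python
import math


def z_score(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-score (A minus B) using the unpooled standard error."""
    if n_a == 0 or n_b == 0:
        raise ValueError("each alternative needs at least one participant")
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    std_err = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    if std_err == 0:
        raise ValueError("no variation in the data; z-score is undefined")
    return (p_a - p_b) / std_err


def two_sided_p_value(z):
    """Chance of a z at least this extreme if the alternatives truly perform the same."""
    return math.erfc(abs(z) / math.sqrt(2))
```

For example, 100 conversions out of 1,000 against 130 out of 1,000 gives a z of about -2.1, which clears the usual 95% threshold of 1.96 in absolute value.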
  • 28. Presenting Results • Text is easy! Graphs, not quite. • Google's confidence bars are sexy... and pretty useless. • Use simple, human language to describe what confidence intervals and statistical significance mean. • De-emphasize null results (A > B, but not statistically significantly so) but don't hide them. (After all, the fact that "this test was too close to call" tells you something useful.)
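One possible way to turn a z-score into plain text is sketched below. The thresholds are the standard two-sided critical values of the normal distribution. The function name and the exact wording of the sentences are illustrative assumptions, and the "confident" phrasing is a deliberately loose reading of statistical significance.

```python
def describe_result(name_a, name_b, z):
    """Summarize a z-score (A minus B) in one plain-language sentence."""
    # Two-sided critical values for 99%, 95%, and 90% significance.
    levels = [(2.576, "99%"), (1.960, "95%"), (1.645, "90%")]
    better, worse = (name_a, name_b) if z >= 0 else (name_b, name_a)
    for critical, label in levels:
        if abs(z) >= critical:
            return (
                f"{better} beat {worse}, and we are {label} confident "
                "the difference is real."
            )
    # Null result: still reported, just without any claim of a winner.
    return f"{better} is ahead of {worse}, but the test is too close to call so far."
```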
  • 29. Technical Considerations • Less than 1,000 visitors per hour? Skip these slides. • A/B testing turns performance assumptions on their head: heavy writes in very bursty fashion ("as soon as test goes live"), very non-relational data, fairly infrequent reads (~3X writes on my site), extraordinarily infrequent use of summary statistics. • Practically tailor-made for a key/value store, not so much for SQL.
  • 30. Queries You Have To Answer FAST • Who is Cindy? (user → identity) • Is Cindy participating in Test X? • If so, what alternative has she seen? • If not, what alternative should she see? • Record fact that Cindy is participating in Test X. • Has Cindy converted in Test X? • Record fact that Cindy converted for Test X.
  • 31. Queries You Can Answer Leisurely • How many people have participated in Experiment X? • How many saw Alternative A? • Umm, do that stats magic for me.
  • 32. Query You Will NEVER ASK • Who saw Alternative A in Experiment X?
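The three query lists above suggest a key layout in which every fast query is a single key lookup and no key ever enumerates who saw an alternative. The sketch below uses a plain dict as a stand-in for the key/value store. The class, the key naming scheme, and the rule of counting at most one conversion per participant are assumptions for illustration.

```python
class ParticipationStore:
    """Toy key/value layout; a plain dict stands in for the real store."""

    def __init__(self):
        self.kv = {}

    def alternative_seen(self, identity, test):
        """Fast: is Cindy in this test, and which alternative did she see?"""
        return self.kv.get(f"participating:{test}:{identity}")

    def record_participation(self, identity, test, alternative):
        """Fast: remember the alternative and bump its counter only once."""
        key = f"participating:{test}:{identity}"
        if key not in self.kv:
            self.kv[key] = alternative
            counter = f"participants:{test}:{alternative}"
            self.kv[counter] = self.kv.get(counter, 0) + 1

    def record_conversion(self, identity, test):
        """Fast: count a conversion at most once, and only for participants."""
        alternative = self.alternative_seen(identity, test)
        key = f"converted:{test}:{identity}"
        if alternative is None or key in self.kv:
            return
        self.kv[key] = True
        counter = f"conversions:{test}:{alternative}"
        self.kv[counter] = self.kv.get(counter, 0) + 1

    def summary(self, test, alternative):
        """Leisurely: (participants, conversions) for one alternative."""
        return (
            self.kv.get(f"participants:{test}:{alternative}", 0),
            self.kv.get(f"conversions:{test}:{alternative}", 0),
        )
```

Answering "who saw Alternative A?" would require scanning every key, which is acceptable precisely because that query is never asked.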
  • 33. Possible Architectures • Summary statistics (participant counts & conversion counts) in a MySQL table with "fairly few" rows. Simple increment statements for updates. • Participation information (Cindy, Experiment X, Alternative A) in a key/value store. • Or the whole thing in a key/value store.
  • 34. Quick Speed Improvement for SQL • Give each of your alternatives a unique string ID like MD5(experiment name, alternative name). Calculate that in application code. Index on that column. • UPDATE alternatives SET participants = participants + 1 WHERE lookup_code = 'CALCULATED IN APPLICATION'; • This avoids having to translate the human name in code to an ID in the table (or having to use a multi-column index for lookup). • Note: I am not a very good guy with DBs, but I am informed this is fairly fast. Test for yourself.
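The lookup-code trick can be tried end to end with the sketch below. It uses an in-memory SQLite database as a stand-in for MySQL so that it runs anywhere. The `conversions` column, the index name, and the `/` separator inside the hash are assumptions. Only the `alternatives` table, the `participants` and `lookup_code` columns, and the UPDATE statement come from the slide.

```python
import hashlib
import sqlite3


def lookup_code(experiment, alternative):
    """MD5 of the experiment and alternative names, computed in application code."""
    return hashlib.md5(f"{experiment}/{alternative}".encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE alternatives ("
    " lookup_code TEXT NOT NULL,"
    " participants INTEGER NOT NULL DEFAULT 0,"
    " conversions INTEGER NOT NULL DEFAULT 0)"
)
conn.execute(
    "CREATE UNIQUE INDEX idx_alternatives_lookup ON alternatives (lookup_code)"
)

code = lookup_code("signup_button", "green")
conn.execute("INSERT INTO alternatives (lookup_code) VALUES (?)", (code,))

# The hot-path write: one indexed lookup and one increment, no name-to-ID join.
conn.execute(
    "UPDATE alternatives SET participants = participants + 1 WHERE lookup_code = ?",
    (code,),
)
conn.commit()

participants = conn.execute(
    "SELECT participants FROM alternatives WHERE lookup_code = ?", (code,)
).fetchone()[0]
```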
  • 35. Specific Key/Value Store Recommendations • MySQL with big string columns for key, value: ewwwwww. I mean, ewwwwww. • Memcached: Acceptable (and fast) but not persistent. Also tends to only go down when the server does. For A/B testing, might just re-run all in-progress tests if it dies. • MemcacheDB: Tried it. Has unacceptable performance when BerkeleyDB flushes to disk. (5 seconds+!) • Redis: Tried it. Not in production yet. My recommendation: very fast. Vanity also uses it.
  • 36. API Considerations • Only need to expose two methods: • ab_test(name, alternatives, conversion_name) • conversion(conversion_name) • Note the lack of identity in the method calls. Let the framework worry about that. • How you specify alternatives is up to you. An array of strings is easy to understand.
  • 37. Consuming API • ab_test(name, alternatives, conversion_name) returns the chosen alternative and handles all bookkeeping as a side effect. • Typically: if (ab_test(...) == "something") { #do something } else { #do something else } • Fun opportunity for blocks/binding if your language supports that.
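A minimal in-memory sketch of the two-method API might look like the following. It is not A/Bingo's code: the `ABFramework` class, its attributes, and the use of `random.choice` with stored assignments are assumptions. Only the two method signatures come from the slides. Identity is set once by the authentication or cookie layer, so it never appears in the calls.

```python
import random


class ABFramework:
    """Minimal in-memory sketch of the two-method API."""

    def __init__(self):
        self.identity = None           # set per request by the auth/cookie layer
        self.assignments = {}          # (test name, identity) -> alternative
        self.tests_by_conversion = {}  # conversion_name -> set of test names
        self.converted = set()         # (test name, identity) already counted

    def ab_test(self, name, alternatives, conversion_name):
        """Return this identity's alternative; bookkeeping is a side effect."""
        self.tests_by_conversion.setdefault(conversion_name, set()).add(name)
        key = (name, self.identity)
        if key not in self.assignments:
            self.assignments[key] = random.choice(alternatives)
        return self.assignments[key]

    def conversion(self, conversion_name):
        """Mark the current identity as converted in every test tied to this name."""
        for name in self.tests_by_conversion.get(conversion_name, ()):
            key = (name, self.identity)
            if key in self.assignments:
                self.converted.add(key)


ab = ABFramework()
ab.identity = "cindy-123"  # normally done by the framework, not the caller
if ab.ab_test("signup_button", ["green", "red"], "signup") == "green":
    button_css = "btn-green"
else:
    button_css = "btn-red"
ab.conversion("signup")
```

Keying conversions by name rather than by test lets one `conversion("signup")` call score every test that aims at the same goal.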
  • 38. Got Questions? • Great A/B testing resources: • Eric Ries (startuplessonslearned.com) – heavy on motivation, less on stats/design decisions • #abtests and @abtests on Twitter. Good community, many ideas for inspiration. • http://abtests.com – ditto • http://www.bingocardcreator.com/abingo/resources – links I use when I forget the math. • http://www.kalzumeus.com – my blog • patrick@bingocardcreator.com – I'm always happy to chat about A/B testing, with anybody. Potentially available for consulting.