Recommender systems
evaluation: a 3D benchmark
  Alan Said (1), Domonkos Tikk (2), Yue Shi (3),
  Martha Larson (3), Klára Stumpf (2),
  Paolo Cremonesi (4)

1: TU Berlin
2: Gravity R&D
3: TU Delft
4: Politecnico di Milano/Moviri
Motivation
• Current recsys evaluation benchmarks are
  insufficient
  – mostly focused on IR measures (RMSE,
    MAP@X, precision/recall)
  – do not consider the needs of all stakeholders
    (users, content provider, recsys vendor)
  – technological and business requirements are
    mostly overlooked
• 3D Recommender System Benchmarking
  Model
Stakeholders
• users
• content or service provider
• recommender
The Proposed 3D model
Recent benchmarks (1)

• pros:
  – Large scale
  – very well organized
• cons:
  – qualitative assessment of recommendation:
    simplified to RMSE
  – rating prediction (not ranking)
  – no focus on direct business and technical
    parameters (scalability, robustness, reactivity)
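
A minimal sketch of why an RMSE-only assessment says little about ranking quality. The ratings and the two models below are hypothetical and chosen only for illustration; they are not taken from any benchmark.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error over paired rating predictions."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical true ratings for three items (true order: A > B > C).
actual = [5.0, 4.0, 3.0]

# Model X predicts the same rating for every item: no ordering information.
model_x = [4.0, 4.0, 4.0]

# Model Y preserves the true order but is consistently 1.5 stars too low.
model_y = [3.5, 2.5, 1.5]

rmse_x = rmse(model_x, actual)
rmse_y = rmse(model_y, actual)

# Model X has the lower RMSE although it cannot rank the items at all.
print(f"RMSE X: {rmse_x:.3f}  RMSE Y: {rmse_y:.3f}")
```

Under these assumed numbers, the model that cannot order the items still wins on RMSE, which is the gap between rating prediction and ranking noted in the cons above.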
Recent benchmarks (2)


• pros:
  – constraints on training and response time
  – real traffic (only planned)
  – major driver: revenue increase
• cons:
  – only business goals, but otherwise unclear
    optimization criteria
  – user needs are neglected
  – organization
Recent benchmarks (3)


• pros:
  – availability of additional metadata (compared to
    KDD Cup 2011)
  – not rating based (implicit feedback)
  – ranking based evaluation metric (MAP@500)
• cons:
  – offline evaluation
  – size does not matter anymore (lower interest)
  – no business requirements or technical constraints
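
A minimal sketch of a ranking-based metric of the MAP@K family mentioned above. The normalisation by min(number of relevant items, K) is one common convention and is an assumption here; the benchmark's exact definition may differ. User ids and items are hypothetical.

```python
def average_precision_at_k(ranked, relevant, k):
    """Average precision of one ranked list, cut off at position k.

    Assumes `ranked` contains no duplicate items.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    # Assumed normalisation: best achievable number of hits within k.
    return score / min(len(relevant), k)


def map_at_k(rankings, relevants, k):
    """Mean of average_precision_at_k over all users in `rankings`."""
    if not rankings:
        return 0.0
    total = sum(
        average_precision_at_k(ranked, relevants.get(user, ()), k)
        for user, ranked in rankings.items()
    )
    return total / len(rankings)


# Hypothetical recommendations and held-out implicit feedback.
rankings = {"u1": ["a", "b", "c", "d"], "u2": ["x", "y", "z"]}
relevants = {"u1": {"a", "c"}, "u2": {"z"}}

score = map_at_k(rankings, relevants, k=3)
print(f"MAP@3 = {score:.4f}")
```

Such a metric needs only held-out implicit feedback, so it can be computed offline, which is also why it cannot capture business requirements or technical constraints.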
3D MODEL
User requirements
• functional (quality-related)
  – relevant, interesting, novel, diverse,
    serendipitous, context-aware, ethical, etc.
• non-functional (technology-related)
  – real-time
  – usability-related
Business requirements
• Business model
  – for-profit: revenue stream
  – non-profit style: award-driven (reputation,
    community building)
• KPI depends on the application area
  – Revenue increase
  – CTR
  – Raise awareness of content or service
Technical constraints
• data driven
  – availability of user feedback (e.g. satellite TV)
• system driven
  – hardware/software limitations (device-dependent)
• scalability
  – typical response time
• robustness
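
A minimal sketch of how a response-time constraint could be checked. The nearest-rank percentile, the 95th-percentile default, and the function names are assumptions for illustration; the slides only name "typical response time" as a constraint.

```python
import math
import time


def percentile(values, q):
    """Nearest-rank percentile of `values`, with q in (0, 100]."""
    if not values:
        raise ValueError("no values given")
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def measure_latencies(recommend, requests):
    """Time each call of the hypothetical `recommend` callable, in seconds."""
    latencies = []
    for request in requests:
        start = time.perf_counter()
        recommend(request)
        latencies.append(time.perf_counter() - start)
    return latencies


def meets_response_time(latencies, budget_seconds, q=95):
    """True if the q-th percentile latency stays within the budget."""
    return percentile(latencies, q) <= budget_seconds


# Hypothetical latencies in seconds: four fast requests and one slow one.
sample = [0.01, 0.02, 0.03, 0.04, 0.20]
print(meets_response_time(sample, budget_seconds=0.1))        # 95th percentile
print(meets_response_time(sample, budget_seconds=0.1, q=50))  # median
```

Checking a high percentile rather than the mean is a design choice: a single slow request is visible in the tail even when the average looks acceptable.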
Example
• VoD recommendation scenario (TV)
  – user: easy content exploration, context-awareness
    (time, viewer identification)
  – business: increase VoD sales & awareness
    (user base)
  – technical: middleware, HW/SW of the
    provider, response time
Conclusion
• Recommendation tasks have many aspects
  typically overlooked
• Tasks define the important user, business,
  and technical quality measures
  – the fulfilment of all is required at a certain level
  – trade-off is usually required
• Proposal: our 3D evaluation concept enables
  a more comprehensive evaluation
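
A minimal sketch of an evaluation along the three dimensions, where every requirement must be met at a certain level. The class, the measure names, the thresholds, and the two candidate systems are all hypothetical; the slides do not prescribe a concrete procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    dimension: str              # "user", "business", or "technical"
    name: str                   # name of the measured quantity
    threshold: float            # minimum (or maximum) acceptable level
    higher_is_better: bool = True

    def is_met(self, value):
        if self.higher_is_better:
            return value >= self.threshold
        return value <= self.threshold


def evaluate(requirements, measured):
    """Return {dimension: [names of unmet or unmeasured requirements]}."""
    failures = {"user": [], "business": [], "technical": []}
    for req in requirements:
        value = measured.get(req.name)
        if value is None or not req.is_met(value):
            failures.setdefault(req.dimension, []).append(req.name)
    return failures


def acceptable(failures):
    """A system is acceptable only if no dimension has a failure."""
    return all(not names for names in failures.values())


# Hypothetical requirements for a VoD scenario.
requirements = [
    Requirement("user", "map_at_10", 0.20),
    Requirement("business", "ctr", 0.03),
    Requirement("technical", "p95_response_seconds", 0.20, higher_is_better=False),
]

# Hypothetical measurements for two candidate recommenders.
system_a = {"map_at_10": 0.31, "ctr": 0.040, "p95_response_seconds": 0.45}
system_b = {"map_at_10": 0.25, "ctr": 0.035, "p95_response_seconds": 0.12}

failures_a = evaluate(requirements, system_a)
failures_b = evaluate(requirements, system_b)
print("A acceptable:", acceptable(failures_a), failures_a)
print("B acceptable:", acceptable(failures_b), failures_b)
```

With these assumed numbers, the more accurate system A is rejected for its response time while system B passes on all three dimensions, which illustrates the trade-off.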

Recommender Systems Evaluation: A 3D Benchmark - presented at RUE 2012 workshop at ACM Recsys 2012
