FLYING BLIND ON A ROCKET CYCLE
PIONEERING EXPERIENCE-CENTERED PRODUCT
STRATEGY FOR EMERGING SPACES
JOE LAMANTIA
Currently: VP Design & Development @ Bottomline Technologies
Previous 20 years: end-to-end customer experience, all stages of product and
service development, and digital / business transformation, focusing on
emerging business and technology.
Archetype(s): Sometime Entrepreneur / Proto-academic / Arm-chair Pro Cyclist
https://www.linkedin.com/in/digitaljoelamantia/
@mojoe
JoeLamantia.com [joelamantia.net]
Businesses around the world depend on Bottomline Technologies
(NASDAQ: EPAY) solutions to help them make complex business
payments simple, smart and secure, including some of the world’s largest
banks, and private and publicly traded companies.
This case study describes building a learning-
driven strategy capability to guide an
adventurous product development group
focused on the new domains of big data
analytics and machine intelligence.
I’ll share the outcomes of our efforts to launch
new products chartered directly around
customer experience value; outline the
methods, tools, and perspectives that powered
product discovery and strategic planning; share
a framework and patterns for identifying and
understanding emerging domains; and review
the application of this toolkit to new situations.
EMERGING SPACES
ROADS?!
WHERE WE’RE GOING, WE DON’T
NEED ROADS…!
= $$
DATA SCIENCE
MACHINE INTELLIGENCE
PRODUCT STRATEGY?
STRATEGY…
BUSINESS STRATEGY IS ABOUT
IDENTIFYING YOUR BUSINESS
OBJECTIVES AND DECIDING WHERE TO
INVEST TO BEST ACHIEVE THOSE
OBJECTIVES.
Marty Cagan
http://svpg.com/business-strategy-vs-product-strategy/
THE PRODUCT STRATEGY SPEAKS TO
HOW YOU HOPE TO DELIVER ON THE
BUSINESS STRATEGY.
Marty Cagan
http://svpg.com/business-strategy-vs-product-strategy/
http://rethinkingproductmanagement.blogspot.com/2012/06/product-strategy-what-does-it-mean-need.html
http://melissaperri.com/2016/07/14/what-is-good-product-strategy/
PRODUCT STRATEGY
http://mashable.com/2015/02/13/fifty-shades-of-grey-mad-libs/#7Te9vMnONqqF
OPPORTUNITY ASSESSMENT
“I ASK PRODUCT MANAGERS TO ANSWER TEN FUNDAMENTAL QUESTIONS”
1. Exactly what problem will this solve? (value proposition)
2. For whom do we solve that problem? (target market)
3. How big is the opportunity? (market size)
4. What alternatives are out there? (competitive landscape)
5. Why are we best suited to pursue this? (our differentiator)
6. Why now? (market window)
7. How will we get this product to market? (go-to-market strategy)
8. How will we measure success/make money from this product? (metrics/revenue strategy)
9. What factors are critical to success? (solution requirements)
10. Given the above, what’s the recommendation? (go or no-go)
http://svpg.com/assessing-product-opportunities/
Assessing
Product
Opportunities
by Marty Cagan | Dec 13, 2006
PRODUCT DISCOVERY
MODERN PRODUCT DISCOVERY
• Introduction [:26]
• Modern Product Discovery [:54]
• The Evolution of Modern Product Discovery [4:15]
• The Agile Manifesto [7:06]
• The Rise of User Experience Design [8:47]
• The Lean Startup: Eric Ries [9:49]
• The Jobs-To-Be-Done Framework: Clayton Christensen and Anthony Ulwick [10:42]
• OKRs and Design Sprints [12:12]
• The Goal of Modern Product Discovery [14:27]
• Putting Discovery Practices Into Context: The Opportunity Solution Tree [21:32]
• The Future of Product Discovery [29:42]
https://www.producttalk.org/2017/02/evolution-product-discovery/
The Evolution of
Modern Product
Discovery

February 8, 2017 by Teresa Torres
http://rethinkingproductmanagement.blogspot.com/2012/06/product-strategy-what-does-it-mean-need.html
WHY ARE YOU HERE…?
WHERE ARE YOU
GOING…?
PRODUCT STRATEGY CHARTS A DESIRED
SET OF COURSES THROUGH THE SPACE
OF POSSIBLE PRODUCTS FOR A DOMAIN
Joe Lamantia
http://svpg.com/business-strategy-vs-product-strategy/
OBSERVE
ORIENT
ACT
DECIDE
OPPORTUNITY
ASSESSMENT
PRODUCT
DISCOVERY
INVEST…?
PORTFOLIO PLANNING
WHAT AM I LOOKING
FOR…?
DEEP STRUCTURE
CHANGE VECTORS
EARLY SIGNALS
INFLECTION POINTS
EMERGING SPACES
HOLISTIC EXPERIENCES
EACH ASPECT =
POTENTIAL
LEVERAGE POINT
FOR STRATEGIC
ENGAGEMENT
DEEP STRUCTURE
ENTERPRISE / B2B
• Business process
• Activity
• Social structure: Organizational model
• Boundaries
• Regulation
• IT / Systems architecture
• Lifecycle
• Flows: capital, information, people
• Frame: shareholder value, social enterprise
CONSUMER / B2C
• Value scheme: wealth, love,
knowledge, safety
• Demographics
• Boundaries
• Mores
• Culture
• Social structure: community / group
• Frame: active lifestyle, sustainability
ONCE UPON A TIME…
Information Visibility through Endeca Discovery Applications
MDEX Engine
Rapidly changing data and content
Large volumes of highly attributed records
Structured and unstructured information
Discovery Applications
Intuitive user experience guides
untrained users to discover relationships
in data
Specialized Database
High performance database purpose
built for data-driven search, navigation,
and analytics
Flexible Data Integration
Consolidate structured and unstructured
data to bridge whitespace between
enterprise systems
$$$$
ASSIMILATE!
…NOW WHAT…?
THE
NEW
GIG
1. GET IN THE HEADS OF DATA SCIENTISTS
2. BE THE SPIRIT OF THE PRODUCT
BUT HOW…?
CONTINUOUS LEARNING
LEAN STRATEGY
CONTINUOUS LEARNING
UNDERSTAND & EMPATHIZE
WITH CUSTOMER PERSPECTIVES
>>ARTICULATE CUSTOMER VALUE SOURCES
IDENTIFY BUSINESS IMPLICATIONS
>> INFORM ALL STAGES OF PRODUCT & SERVICE DEVELOPMENT
INVESTIGATING CUSTOMERS
EXPLORING HYPOTHESES ABOUT VALUE
INVESTIGATING CUSTOMERS:
“WHAT DO AP MANAGERS NEED (TO BE
MORE EFFECTIVE (AT IMPROVING
RECONCILIATION PROCESSES))? WHY?”
OUTCOMES
VALUE CHAINS MAP, CUSTOMER
LANDSCAPE / SEGMENTS, PERSONAS,
CAPABILITY MODELS, DOMAIN MODELS
EXPLORING HYPOTHESES ABOUT VALUE:
“AUTOMATION OF RECONCILIATION ACTIVITIES
WILL ENABLE
ACCOUNTS PAYABLE GROUPS IN MID-MARKET
COMPANIES
TO
HANDLE 30% MORE TRANSACTIONS.”
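A value hypothesis in this form can be written down as a small structured record so that later investigation has something concrete to test against. The following is a minimal sketch, not the team's actual tooling: the `ValueHypothesis` class, its field names, and the transaction counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ValueHypothesis:
    """Hypothetical record for a '<capability> will enable <segment> to <outcome>' statement."""
    capability: str
    segment: str
    metric: str
    target_uplift: float  # 0.30 means "30% more"

    def statement(self) -> str:
        return (f"{self.capability} will enable {self.segment} "
                f"to handle {self.target_uplift:.0%} more {self.metric}.")

    def is_supported(self, baseline: float, observed: float) -> bool:
        # Relative change of the observed metric against the baseline.
        uplift = (observed - baseline) / baseline
        return uplift >= self.target_uplift


hypothesis = ValueHypothesis(
    capability="Automation of reconciliation activities",
    segment="accounts payable groups in mid-market companies",
    metric="transactions",
    target_uplift=0.30,
)

print(hypothesis.statement())
# Made-up monthly transaction counts, before and after a pilot.
print(hypothesis.is_supported(baseline=1000, observed=1350))
```

Keeping the target as an explicit number makes it clear what evidence would confirm or refute the hypothesis during investigation.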
PRODUCT DEVELOPMENT IMPACT
INNOVATION OPPORTUNITIES
PRODUCT HYPOTHESES FOR VALIDATION
PRODUCT CONCEPTS FOR PROTOTYPING
PLANNING GUIDANCE (ROADMAP > EPIC > QA)
DELIVERY GUIDANCE: FEATURES AND FUNCTIONS
INCREMENTAL
EXPLORATORY
PROGRESSIVE
CUMULATIVE
STRUCTURED
ADAPTIVE
DUAL-TRACK AGILE
1. Hypothesis A “Lorem ipsum…”
2. Hypothesis B
3. Investigate A
4. Hypothesis C
5. Investigate B
6. Investigate C
INVESTIGATE
Data Scientist
Square - San Francisco Bay Area
Job Description
Square is hiring a Data Scientist on our Risk team. The Risk team at Square is responsible for enabling growth while mitigating financial loss associated with transactions. We work
closely with our Product and Growth teams to craft a fantastic experience for our buyers and sellers.
Desired Skills & Experience
As a Data Scientist on our Risk team, you will use machine learning and data mining techniques to assess and mitigate the risk of every entity and event in our network. You will
sift through a growing stream of payments, settlements, and customer activities to identify suspicious behavior with high precision and recall. You will explore and understand our
customer base deeply, become an expert in Risk, and contribute to a world-class underwriting system that helps Square provide delightful service to both buyers and sellers.



To accomplish this, you are comfortable writing production code in Java and conducting exploratory data analysis in R and Python. You can take statistical and engineering ideas
from prototype to production. You excel in a small team setting and you apply expert knowledge in engineering and statistics.



Responsibilities
1. Investigate, prototype and productionize features and machine learning models to identify good and bad behavior.
2. Design, build, and maintain robust production machine learning systems.
3. Create visualizations that enable rapid detection of suspicious activity in our user base.
4. Become a domain expert in Risk.
5. Participate in the engineering life-cycle.
6. Work closely with analysts and engineers.
Requirements
1. Ability to find a needle in the haystack. With data.
2. Extensive programming experience in Java and Python or R.
3. Knowledge of one or more of the following: classification techniques in machine learning, data mining, applied statistics, data visualization.
4. Concise verbal and written articulation of complex ideas.
Even Better
1. Contagious passion for Square’s mission.
2. Data mining or machine learning competition experience.
Company Description
Square is a revolutionary service that enables anyone to accept credit cards anywhere. Square offers an easy to use, free credit card reader that plugs into a phone or iPad. It's
simple to sign up. There is no extra equipment, complicated contracts, monthly fees or merchant account required.



Co-founded by Jim McKelvey and Jack Dorsey in 2009, the company is headquartered in San Francisco.
The Conway Model The ‘Subway’ Model
WHAT SORT OF PERSON?
▸ They seem different than analysts:
▸ problem set
▸ relationship to discovery tools
▸ skills and professional profile
▸ discovery / analytical methods
▸ perspective
▸ workflow and collaboration
▸ Are they? How?
AREAS OF INVESTIGATION
▸ Workflow
▸ Environment
▸ Organizational model
▸ Pain points
▸ Tools
▸ Data landscape
▸ Analytical practices
▸ Project structure
▸ Unmet needs
DISCUSSION GUIDE
Can you please walk me through a recent or current project?
a. How was the project initiated?
b. How defined was the business problem in the beginning? Did the problem change?
c. Where/who did you obtain data sets from? How did you make the decision?
d. Describe the data you used: What did the data sets look like? How big were they? Were they structured or unstructured?
e. What tools or techniques did you use to do the analyses? Did they map to the specific steps you mentioned just now?
f. How did you decide these were the tools/techniques to use? To what extent were these decisions made by yourself and to what extent were
they standardized by your group/team?
g. How did you present the results of your analyses? What tools did you use? What do you like and dislike about your current tool set?
h. Which stage of this project was the most challenging? To what extent did the tools satisfy what you intended to do? What features were lacking?
i. How much collaboration was there during each stage of the project?
i. Background and role of collaborators
ii. Collaboration modes
iii. Types of information shared
Thinking about the projects you have worked on, is there a common approach you take to address these problems?
How did you decide on this approach/tools?
NEEDS
What are the most common and useful statistical techniques you use during discovery and analysis efforts?
“(1) The most commonly used statistical techniques used to date (in our strategic planning work) are:  dimensionality
reduction (partition clustering, multiple correspondence analysis), factor analysis, partition clustering (k-means, k-medoids,
fuzzy clustering), cluster validation techniques (silhouette, dunn’s index, connectivity), multivariate outlier detection, linear
regression, and logistic regression.”
What statistical capabilities or functions would be very useful if provided within Endeca discovery applications, and where
would they be useful?
“(2) Techniques that would assist with identifying outliers or invalid data. Much of this work seems to be done by hand. I
believe that we are also getting to the point where we could start using linear regression and splines (for showing trends).”
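The respondent's request for help identifying outliers or invalid data can be illustrated with a simple interquartile-range rule. This is a minimal sketch using only the Python standard library; the `iqr_outliers` helper and the sample values are assumptions for illustration, not something the respondent or Endeca used.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up measurements with one suspicious entry.
measurements = [3, 4, 4, 5, 5, 5, 6, 6, 7, 42]
print(iqr_outliers(measurements))
```

A rule like this only flags candidates; deciding whether a flagged value is invalid or merely unusual still takes domain knowledge.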
NEEDS
For example, would system-generated descriptive statistical visualizations be useful for whole data sets - or for smaller user-
selected groups of attributes?  
“With regards to your last question on visualization, we have put in significant effort to use visualization in our Endeca
installation.  We have built visualizations such as tree maps, flow diagrams, sun burst diagrams, scatter plots showing clusters,
and hierarchical edge bundling diagrams to explore our data sets. 
Would it be useful for the application to analyze and suggest possible distribution models it sees in the data; for the values of
individual attributes, and/or for larger sets of data?
Our data tends to be qualitative rather than quantitative so this drives much of our visualizations.
So yes, interactive descriptive statistical visualization would be helpful – on the complete data set and individual attributes.”
Discovery/Information Needs
Support longer term strategic planning:
• How can we decrease the time-to-install service for new customers?
• How can we decrease the time it takes to restore service after a storm causes widespread outages?
• How can we decrease operational cost for each department/line of business?
• How many call center representatives do I need in my call center?
• How much offsite technician headcount do we need based on historical/seasonal
trends balanced against current customer install base and ongoing sales/marketing
efforts?
Evaluate Success:
• How effective was a particular marketing campaign?
• How effective is a new training program for call center representatives?
• How effective is a self-install approach?
Understanding variables that impact KPIs. KPIs include:
• Call center volume
• % successful resolution by support staff
• Time-to-install
• Sales volume
• Sales revenue
Understanding & Explaining Variance using Retrospective Analyses
• Why does Connecticut have a shorter time-to-install than Rhode Island?
• Why did 2 identical marketing campaigns in 2 different markets have vastly
different impact on sales?
• Is the variance significant, or does it represent random deviation?
Ad-hoc Reporting
• How many calls to the call center needed to be escalated to tier 2 support last
month?
• How many new customers complained that a technician was late/didn't show up for
the install appointment?
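The question of whether a difference between two markets is significant or just random deviation can be approached with a permutation test. This is a minimal sketch using only the standard library; the `permutation_p_value` function and the time-to-install figures are hypothetical and do not come from the research described here.

```python
import random
import statistics


def permutation_p_value(a, b, rounds=10_000, seed=0):
    """Estimate how often a random relabeling of the pooled data produces
    a difference in means at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        if abs(statistics.mean(left) - statistics.mean(right)) >= observed:
            hits += 1
    return hits / rounds


# Made-up time-to-install samples (days) for two markets.
connecticut = [3, 4, 4, 5, 3, 4]
rhode_island = [7, 8, 6, 9, 7, 8]
print(permutation_p_value(connecticut, rhode_island, rounds=2000))
```

A small value suggests the gap is unlikely to be random deviation; it does not explain why the gap exists, which is the retrospective analysis the analysts still need to do.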
Analyst Profile: Scott – Operations Analyst
Summary
Education
BA Information Systems (Connecticut State College)
MBA  Org Leadership (Johnson & Wales)
Scott is a mid-level analyst with a background in Business
Information Systems and an MBA in Organizational
Leadership. He works in a 6-person team at Cox-New England
(Telecommunications). His current role involves conducting data mining
analysis to support operations research and organizational decision
making/strategic planning.
Scott's work supports both sides of the profit equation: operations
research/analysis to support internal cost-cutting and process innovation,
and formative/summative evaluation to help drive effective sales/
marketing efforts to increase revenue.  His group is also given target cost
savings goals that they need to help individual departments achieve to
fulfill a cost reduction organizational mandate.  His group accomplishes
this by discovering inefficiencies in process through data mining,
predictive modeling and retrospective data analysis.
Cox has highly attributed enterprise data on customers, marketing
campaigns, pricing variants and special offers, demographics, geography
of the area, building and home types, school schedules, weather events,
etc. that describe customer usage patterns, consumption of media
bandwidth, etc. Each of their products (data, cable, phone, wireless) has
different usage profiles that vary along many of the dimensions and
variables listed above. His group is focused on residential customers;
business customers are handled by a separate unit.
 
 
‘FIVE THINGS ANALYSTS DO WITH DATA’
▸ Clustering
▸ Dimension Reduction
▸ Anomaly Detection
▸ Characterization
▸ Testing probability model & validation
Source: Frontiers in Massive Data Analysis
http://www.nap.edu/openbook.php?record_id=18374
[Diagram: braces group the activities above under three headings: Structure of data, Profile of data, Validity of data]
Findings
Skillz
Nature of sense making activity
Business Analytics | Data Science
Intuitive | Empirical
Manual | Augmented
Gradual | Accelerated
Individual | Cooperative*
Data Scientist: Profile
Sense Maker Segment
Sense makers need to create and/or employ insights to accomplish
their business goals and satisfy their responsibilities.
These insights emerge from independent and collaborative discovery
efforts that involve direct interaction with discovery applications, and
participation in discovery environments.
Insight Consumer
Analyst
Casual Analyst
Data Scientist
Analytics Manager
Problem Solver
Creates data-driven insights, offerings, and resources to transform the organization
Work Experience 10 Years
Education Ph.D. Statistics, MS Bio-Informatics
Job Title Senior Data Scientist
Company LinkedIn
Summarize & Communicate
Review findings with colleagues;
summarize, visualize, and
communicate key findings to
Insight Consumers/decision makers
Prototype & Experiment
with data driven
feature:
How can we prototype/
evaluate this w/out
disrupting the site?
Gather Data &
Analyze Results
Use descriptive,
inferential, and
predictive statistics
to evaluate results
Analyze & Identify causal/
predictive factors:
Who are the best
candidates to contact for a
job based on recruiter
needs and profile content?
Dana Data Scientist
• Defining and capturing useful measures of
online attention
• Getting all the data analytic tools to work
together properly
• No current workflow support or tools for data
wrangling, analysis, experimentation, and
prototyping
• Effective tools to help experiment with and
evaluate value / utility of features and
activities for users
• Ability to rapidly prototype data-driven
features w/out risk of online service
disruptions
• Open source data manipulation, mining &
analysis tools including R, Pig, Hadoop, Python,
etc.
• Statistical packages such as SAS, SPSS, etc.
• Custom analytical tools built using open source
components and languages
• Leverage data to support the org mission
• Enhance products & services with data-driven
insights and features
• Use data to identify new opportunities and
prototype/drive new customer offerings
• Create useful data sets/streams, measures, &
resources (e.g., data models, algorithms, etc.)
Key Goals
Tools
Pain Points
Wish List
Sample Workflow
Dana is a Senior Data Scientist who has worked at LinkedIn for 5 years.
Dana’s education includes a Ph.D. in Statistics and an MS in Bio
Informatics. Dana’s previous work includes positions in academic research
groups as a doctoral candidate and post-doc, as well as software
engineering roles in the Internet & technology industries.
•Dana works with several other data scientists and her Analytics Manager
on a centralized team
•Dana and her colleagues aim to create data driven insights, features,
resources, and offerings that deliver strategic value to LinkedIn
•Dana works with Analysts on other teams to define and create discovery
tools, data sets, and methods for use by their groups at LinkedIn.
•Dana & team are visible & well established within LinkedIn, and have a
voice in product strategy and operational context; they have a high
degree of autonomy in defining data science projects
•Dana works with Insight Consumers to suggest and determine potential
new data driven offerings to prototype and evaluate.
• How can we leverage data to increase online engagement with LinkedIn?
•How should we measure engagement & what factors drive it?
•What aspects of a personal profile are most likely to encourage /
discourage new connections between people?
•How can we increase people’s activity and contributions to topical
discussion groups?
• What factors drive the effectiveness of our marketing campaigns?
•Why did one of our marketing campaigns work exceptionally well?
• How can we leverage data to help recruiters identify and communicate
effectively with qualified and potentially available candidates?
Typical Discovery Scenarios & Problems
Background
Work Context
• Mines, analyzes, & experiments with data to
identify patterns, trends, outliers, causal
factors, predictive models, & opportunities
• Defines and explains newly devised
measurements, predictive models, &
insights
• Compares effectiveness of operations at
achieving company goals for engagement,
growth, data quality
• Produces & explores new data sets
• Collaborates with other data scientists to
capture new data streams
• Prototypes new data driven site features/
offerings
• Runs data based experiments to test/
evaluate models, hypotheses & prototypes
• Communicates & explains analyses to
colleagues & Insight Consumers
“I’ll do whatever it takes – wrangle,
extract, manipulate, analyze,
experiment, prototype – to use
data to drive value & innovate”
Activities
Perspectives
Analytical
The analytical perspective is the center of definition for all
analytical roles. Contrast with engineers, who "make stuff".
Analytical roles figure things out for some purpose: whether a
model to inform a product prototype or provide insight.
Empirical
The empirical perspective is distinct from the analytical
perspective, and marks 'true' data scientists. This revolves
around framing and testing hypotheses formally and informally,
often requires validation and interrogation of experimental
methods and results by others, expects significant degree of
transparency at (all) stages of the analytical effort.
Empirical Method
Experiments
Hypotheses
Results
Questions or
beliefs
Predictions
Conclusions
Insights
Domain
Production
Models
Data Sets
Exploratory / Validation / Investigative / Training / Model Building
Analytical
Methods
Insight
Consumer
Data
Scientist
Articulates
Directs
& applies
Creates & refines
Effected by
Lead to
Tested by
Use / require
Motivate
Creates & refines
Generate
Achieves
Informed by & shares
Inform
Understands
Defines & evolves
Inform
Data
Engineer
Implements
Determines
Applied to validates
Data Sources
Used to define
Applied to
Development
Corpus
External
Sources
Production
Corpus
Mirrors
Applied to
Models
Reference Initial Interim New
Drawn from
Analytical Tool
Algorithm Script Test
Implemented as
Implements
Inform
What is the question?
How will we answer the question?
What data will we use?
What analytical method will we use?
What tools will we use?
What are the results?
What do the results mean?
What did we learn / discover?
Who should we inform?
What is the next question?
Manages
Data Products
Manages
EMPIRICAL DISCOVERY
“a hybrid, purposeful,
applied, augmented, iterative
and serendipitous method for
realizing novel insights for
business, through analysis of
large and diverse data sets.”
Data Science and Empirical Discovery: A New Discipline Pioneering a New
Analytical Method
https://blogs.oracle.com/serendipity/entry/data_science_and_empirical_discovery
Data Science
Insight
Model
Insight
Model
Data Product
Product
Analysts
Outcomes
Analysis Workflow & Activities
• Empirical analysis of subsets of data
–Understand topology of data, boundaries (sets / subsets, complete corpus,
totality of data)
• Outlier identification and profiling
–How significant are outliers to overall topology
»Comparative exclusion and profiling of resulting data subsets to understand their role,
discover principal components
• Find and analyze patterns, areas of interestingness / deserving attention
• Find and analyze central actors / factors (in existing model that produced
source data, in topology of working data, in patterns, etc.)
–ID and understand their impact on local and global data topology and primary metrics if in several ways
/ more than one axis / at the same time
• Discover and analyze relationships amongst central actors
–Understand cycles, trends, changes (dynamic characteristics) for core actors,
topology, patterns and structure
–Understand causal factors
• Codify / create new model reflecting insights & outcomes from experiments
Data Science Workflow
• Frame problem / goal of effort
• Identify and extract data to be used in effort from whole corpus / totality of available
data
–Exploratory identification and selection of working data for use in experiments
• Define experiment(s): hypothesis / null hypothesis, methods, success criteria
–Derive insight(s)
–Wrangle, process, visualize, interpret
• Codify / create new model reflecting insights & outcomes from experiments
• Validate new model(s)
• Provision training data
• Train new model
• Validation and outcome of training model
• Hand-off for implementation on production systems / as production code
THE ESSENCE
▸Empirical perspective
▸Business imperatives drive activities
▸Analytical approach
▸Recipe is always the same
▸Engineering always present
▸Data challenges are paramount
▸consume 60% - 80% of time and effort
▸Data volumes range from huge to moderate (PB > MB)
▸Domain often drives analysis
▸Data scientists already have self-service
▸Some new problems, many the same
▸Use ‘advanced’ analytics, not conventional BA
▸Innovate by applying known analyses to new data
▸Current workflow fragmented across tools and data stores
▸Success can be a model, product, insight, infrastructure, tool
Model of Analytical Workflow
Articulates common analytical activities
“realistic” - represents wrangling, some iterative dynamics
bounded - does not represent business perspective
Originated by Ben Lorica - O’Reilly
*consistent with our research*
UNDERSTAND & EMPATHIZE
WITH CUSTOMER PERSPECTIVES
>>ARTICULATE CUSTOMER VALUE SOURCES
OPPORTUNITY
ASSESSMENT
PRODUCT
DISCOVERY
INVEST…?
PORTFOLIO PLANNING
“…HOUSTON, WE'VE GOT A PROBLEM”
John is tasked with analyzing 30 years of crime data collected by three different authorities. Accordingly, the data arrive in three different formats: one source is a relational database, another is a comma-separated values (CSV)
file, and the third file contains data copied from various tables within a portable document format (PDF) report. Knowing the structure required for his visualization tool, John first reviews the different data sets to identify potential
problems (step 1 in Figure 1).
The relational database allows him to specify a query and generate a file in an acceptable format. For the comma delimited data, the column headings associated with the data were unclear. Using spreadsheet software he adds a
row of header information at the top to fit the format required by the visualization tool. While updating the header, John notices that the location of a given crime is encoded in one column (as ‘City, State’) in the CSV file and
encoded in two columns (one ‘City’ column and one ‘State’ column) in the relational database.
He decides to split the column in the CSV file into two separate columns. John then opens the text file in the spreadsheet but the spreadsheet does not parse the data as desired. After manually moving data fields to appropriate
columns and some other manipulation (step 2), John finally has consistent columns and now combines the three files into one, but then notices that some columns have inconsistently formatted cells.
The ‘Date’ column is formatted as ‘dd/mm/yy’ in some cells and as ‘mm/dd/yyyy’ in others. John returns to the original files, transforms all the dates to the same format, and recombines the files. John loads the merged data file in a
visualization tool (step 3). The tool immediately gives the error message ‘Empty cells in column 3’; it cannot cope with missing data. John returns to the spreadsheet to fill in missing values using a few spreadsheet formulas (back
to step 2). He edits the data by hand; sometimes he transforms the data (e.g. one state reports data only every other year so he uses an average for the missing years). At other times there is nothing he can do after diagnosing a
new problem (i.e. return to step 1). For example, he finds out that survey question 24 did not exist before 2000, and the most recent year of data from Ohio has not been delivered yet, so he tries to pick the best possible value (e.g.
1) to indicate missing values. John detects other, more nuanced, problems; for example, some cells have a blank space instead of being empty. It took hours to notice that difference. John tries to follow a systematic approach
when evaluating the data, but it is difficult to keep track of what he has inspected and how he has modified the data, especially because he discovers different issues across different files. Even after all of this work, he is not sure if
he has examined all of the variables or overlooked any outliers. After a while, the data file seems good enough and he decides to move on.
It took a few days so it is with a great sense of accomplishment that John finally loads the data for the second time into the visualization tool he wants to use (step 3 again). He constructs several views of the data, including a
geospatial representation of the crimes and a scatterplot of age against crime. As soon as he sees the visualized data he realizes that, unfortunately, data quality issues still persist. Extreme outliers appear in the visualization.
Some outliers seem to be valid data (e.g. data from the District of Columbia are very different from data from every other state).
Others seem suspicious (criminals may vary in age from teenagers to older adults, but apparently babies are also committing crimes in certain states). John iteratively removes those outliers he believes to be dirty data (e.g.
criminals under 7 and over 120 years old). Time series visualizations indicate that, in 1995, some causes of death disappear abruptly while new ones appear. Two days later, an email exchange with colleagues reveals that the
classification of causes of death was changed that year. John writes a transformation script to merge the data so he can analyze distinct terms referring to the same (or at least similar) cause of death.
Although the ‘real’ analysis is just about to start (step 4), John has made dozens of transformations, repeated the process several times, made important discoveries relating to the quality of the data, and made
many decisions impacting the quality of the final ‘clean’ data. He also used visualization repeatedly while walking through the process, but still does not have results to show to his boss. Finally, he is able to work
with the usable data, and useful insights come to the surface, but updated data sets arrive (step 5). Without proper documentation (step 6) of his transformations, John might be forced to repeat many of the
tedious tasks.
“Research directions in data wrangling: Visualizations and transformations for usable and credible data”
“a process of iterative data exploration and transformation that enables analysis.”
WRANGLING SCENARIO
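Several of the manual fixes in John's scenario (splitting the ‘City, State’ column, reconciling the two date formats, and catching cells that hold a blank space instead of being empty) can be captured as small repeatable functions. This is a minimal sketch using only the standard library; the function names, the sample row, and the rule that a two-digit year implies ‘dd/mm/yy’ are assumptions for illustration, not part of the cited paper.

```python
from datetime import datetime


def split_location(value):
    """Split a combined 'City, State' cell into (city, state)."""
    city, _, state = value.partition(",")
    return city.strip(), state.strip()


def normalize_date(value):
    """Convert 'dd/mm/yy' or 'mm/dd/yyyy' text to ISO 'YYYY-MM-DD'.

    Assumption: a two-digit year marks the 'dd/mm/yy' format and a
    four-digit year marks 'mm/dd/yyyy', as in the scenario above.
    """
    text = value.strip()
    year = text.rsplit("/", 1)[-1]
    fmt = "%d/%m/%y" if len(year) == 2 else "%m/%d/%Y"
    return datetime.strptime(text, fmt).date().isoformat()


def clean_cell(value):
    """Treat empty and whitespace-only cells as missing (None)."""
    if value is None:
        return None
    return value.strip() or None


# Hypothetical raw row resembling the CSV source in the scenario.
row = {"location": "Columbus, Ohio", "date": "31/12/95", "age": " "}
city, state = split_location(row["location"])
cleaned = {
    "city": city,
    "state": state,
    "date": normalize_date(row["date"]),
    "age": clean_cell(row["age"]),
}
print(cleaned)
```

Keeping each transformation as a named function is one way to get the documentation the scenario says is missing: when updated data sets arrive, the same steps can be rerun instead of repeated by hand.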
Although the ‘real’ analysis is just about to start (step 4), John has made
dozens of transformations, repeated the process several times, made
important discoveries relating to the quality of the data, and made many
decisions impacting the quality of the final ‘clean’ data.
He also used visualization repeatedly while walking through the process, but
still does not have results to show to his boss.
Finally, he is able to work with the usable data, and useful insights come to the
surface, but updated data sets arrive (step 5).
Without proper documentation (step 6) of his transformations, John might be
forced to repeat many of the tedious tasks.
WRANGLING SCENARIO
One or more initial data sets may be used and new versions may
come later. The wrangling and analysis phases overlap.
While wrangling tools tend to be separated from the visual
analysis tools, the ideal system would provide integrated tools
(light yellow). The purple line illustrates a typical iterative process
with multiple back and forth steps.
Much wrangling may need to take place before the data can be loaded
within visualization and analysis tools, which typically
immediately reveals new problems with the data.
Wrangling might take place at all the stages of analysis as users
sort out interesting insights from dirty data, or new data become
available or needed.
At the bottom we illustrate how the data evolves from raw data to
usable data that leads to new insights.
“a process of iterative data exploration and transformation that enables analysis.”
WRANGLING IN THE ANALYTICAL WORKFLOW
IT’S A CYCLE…
Discovery in the Analytical Workflow
• Commonly recognizable cycle and focus for discovery activities (subset)
• Explicitly iterative, ad-hoc, dynamic
• Goal = incremental / directional advance in understanding
• Core modes of engagement with data = Explore, Analyze
• Modeling phase does not involve exploration
Discovery
DEEP STRUCTURE
CHANGE VECTORS
EARLY SIGNALS
INFLECTION POINTS
EMERGING SPACES
HOLISTIC EXPERIENCES
Activity Centered Design
designed many
discovery solutions
scenario analysis
The Language of Discovery:
A concrete descriptive language for
human discovery activity in diverse
contexts.
A simple and consistent vocabulary that
is independent of domain, role,
information type, etc.
activity grammar
Enables understanding of
discovery needs and context
Generative tool for discovery
capability and experiences
DISCOVERY S
Discovery Modes
“a broad, but identifiable discovery activity that is not
tied exclusively to a particular context or domain.”
Locate
Verify
Monitor
Compare
Comprehend
Explore
Analyze
Evaluate
Synthesize
9 modes
Locate
To find a specific (possibly known) thing
e.g. I need to find a new part with particular technical attributes and then source it from the most qualified supplier - Engineering
Verify
To confirm or substantiate that an item or set of items meets some specific criterion
e.g. How can I determine if I am looking at the latest information for a part or supplier? - Supply Chain Specialist
Monitor
To maintain awareness of the status of an item or data set for purposes of management or control
e.g. I need to monitor at risk/failing customers/dealers so I can prompt my Account Reps to fix the problems - Sales Manager
Compare
To examine two or more things to identify similarities & differences
e.g. I need to compare our module set teardowns with competitive teardown information to see if we’re staying competitive for cost, quality and functionality - Engineering
Comprehend
To generate insight by understanding the nature or meaning of something
e.g. I need to analyze and understand consumer-customer-market trends to inform brand strategy & communications plan - Director, Brand Image
Explore
To proactively investigate or examine something for the purpose of knowledge discovery
e.g. I need to understand the cost drivers for this commodity so I can negotiate better terms with my suppliers and forecast business risk based on market indices - Procurement
Analyze
To critically examine the detail of something to identify patterns & relationships
e.g. I need to know the cost drivers for a part such as materials that impact cost. Is the relationship a correlation or step function for a part cost driver? - Engineering
Evaluate
To use judgement to determine the significance or value of something with respect to a specific benchmark or model
e.g. I need to determine my current state in my prints so I can evaluate if I have price variation to negotiate a better price - Procurement
Synthesize
To generate or communicate insight by integrating diverse inputs to create a novel artifact or composite view
e.g. I need to prepare a weekly report for my boss (sales mgr) of how things are going - Account Rep
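One way to make the nine modes usable as a shared vocabulary is to encode them as an enumeration and tag scenarios with ordered sequences of modes. The sketch below does that. The `Scenario` class and the specific mode tags assigned to each scenario are illustrative assumptions, not the coding used in the original research.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class DiscoveryMode(Enum):
    """The nine modes, with definitions condensed from the slides."""
    LOCATE = "find a specific (possibly known) thing"
    VERIFY = "confirm that an item or set of items meets some specific criterion"
    MONITOR = "maintain awareness of the status of an item or data set"
    COMPARE = "examine two or more things to identify similarities and differences"
    COMPREHEND = "generate insight by understanding the nature or meaning of something"
    EXPLORE = "proactively investigate something for knowledge discovery"
    ANALYZE = "critically examine detail to identify patterns and relationships"
    EVALUATE = "judge significance or value against a benchmark or model"
    SYNTHESIZE = "integrate diverse inputs into a novel artifact or composite view"


@dataclass
class Scenario:
    role: str
    need: str
    modes: tuple  # ordered DiscoveryMode values; the tagging here is hypothetical


scenarios = [
    Scenario("Engineering", "find a part, then source it from the most qualified supplier",
             (DiscoveryMode.LOCATE, DiscoveryMode.EVALUATE)),
    Scenario("Account Rep", "prepare a weekly report of how things are going",
             (DiscoveryMode.MONITOR, DiscoveryMode.SYNTHESIZE)),
    Scenario("Procurement", "understand cost drivers to negotiate better terms",
             (DiscoveryMode.EXPLORE, DiscoveryMode.ANALYZE, DiscoveryMode.EVALUATE)),
]

# Count how often each mode appears across the tagged scenarios.
mode_counts = Counter(mode for s in scenarios for mode in s.modes)
```

Counting modes across many scenarios is one simple way to see which discovery capabilities a product would need to support most often.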
HYPOTHESIS
“…FOUND ‘EM!”
Discovery Modes and Activity
Explore
Wrangle
Analyze
Augment
Sensemaking
Transformation
data quality computed / enriched data
New data triggers
new cycles
Cumulative Change
Direction & Momentum
Begin Conclude
Goal: Make data useful for
analysis
Goal: Understand the nature and
usefulness of data for analysis.
Goal: Accumulate insight through
iterative analysis
Goal: Achieve insights by
analyzing data.
Working with data
to effect outcomes
Explore
Wrangle
Analyze
Augment
Sensemaking
Transformation
data quality computed / enriched data
New data triggers
new cycles
Cumulative Change
Direction & Momentum
Begin Conclude
Advancing insight
Can’t do this…
…Without these
capabilities
Apparent Mode and Activity Affinities
Explore
Wrangle
Analyze
Augment
Sensemaking
Transformation
source data source & enriched data
New data triggers
new cycles
Cumulative incremental progress
Focus of attention: Organization
of the data and quality issues
Focus of attention: Actual &
potential insights
Real wrangling Real analysis
Actual Discovery Modes and Activity Affinities
CAPABILITIES FOR VISUAL DISCOVERY & ANALYSIS TOOLS
▸ Explore data corpus
  ▸ via effectively characterized catalog
▸ Explore individual data sets
  ▸ effective preview / sample / subset
▸ Analyze data
  ▸ within ad-hoc data sets, across ad-hoc data sets
▸ Wrangle data
  ▸ within ad-hoc data sets, across ad-hoc data sets
▸ Verify outcomes: insights, models, data products
▸ Synthesize outcomes
  ▸ distinct types = insights, model, data product (project)
▸ Publish outcomes
  ▸ distinct types = insight, data product, model (project)
▸ Integrate specialized / external analytical tools {augment}
  ▸ analysis tools (R, Python), reference models, validation tools
▸ Integrate external workflow tools {enhancing}
  ▸ e.g. figshare, model management, projects
▸ Support analytical workflow {enhancing}
Discovery Capabilities: Core
Explore
Wrangle
Analyze
Augment
Sensemaking
Transformation
data quality computed / enriched data
Core discovery capabilities
Discovery Capabilities: Enhancing
Explore
Wrangle
Analyze
Augment
Sensemaking
Transformation
data quality computed / enriched data
Publish &
operationalize
outcomes
Workflow, provenance, versioning, accelerators, collaboration
Acquire and
access data
Enhancing capabilities
DEEP STRUCTURE
CHANGE VECTORS
EARLY SIGNALS
INFLECTION POINTS
EMERGING SPACES
HOLISTIC EXPERIENCES
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 118
Activity Cycles & Capabilities
Core Capabilities: activity specific, progressive
Enhancing Capabilities: common, random access
• Core capabilities are necessary & primary to complete a given cycle
• Enhancing capabilities are secondary within a cycle
• Enhancing capabilities are necessary to accumulate assets(?)
• Enhancing capabilities are necessary to advance to next cycle(?)
Diagram labels: Precursor, Influencer, By-product, Successor, Publish, Import, asset types, Workflow, Collaboration, Publication, Accelerators, Versioning, Provenance, Metadata, Curation, Governance
Capability Evolution
HYPOTHESIS
Business
Analytics
(future)
Data
Science
(now)
=
OPPORTUNITY
“IS THERE ANY THERE, THERE?”
PRODUCT STRATEGY CHARTS A DESIRED
SET OF COURSES THROUGH THE SPACE
OF POSSIBLE PRODUCTS FOR A DOMAIN
Joe Lamantia
PRODUCT STRATEGY
OPPORTUNITY
ASSESSMENT
PRODUCT
DISCOVERY
INVEST…?
PORTFOLIO PLANNING
Tools on the Market Now
Cycle diagram: Explore, Wrangle, Analyze, Augment (Sensemaking, Transformation; data quality, computed / enriched data; Cumulative Change, Direction & Momentum; Begin, Conclude)
Paxata, Trifacta
Beyond Core?
OSS / hand rolled
EID 3.x
Wave 1 wrangling tools now in market
No good exploration tool in market
Tools on the Market Now
Alteryx
Datameer
Modest exploration capabilities
Tools on the Market Now
Alteryx
Modest exploration capabilities
Qlik
Tools on the Market Now
Tableau, Platfora
Wave 1 visual analysis tools now in market
Modest wrangling capabilities
BDD 1.x?
BDD Future 1.x?
‘Pluggable’ external tools
BDD Future 2.x?
VISUAL DISCOVERY AND ANALYSIS TOOLS: WAVE 1
Definition: traditional discovery & analysis made possible on Hadoop stores
Value prop = easy access to Hadoop stores for analysts without a data engineer
In / coming to market now: Platfora, Datameer, ClearStory, Sisense, etc.
Segment is viable (people understand the need & have the problem)
Tool maturity will increase incrementally, and in customary ways:
  alignment to workflow particulars
  nuanced and compelling UX
  broader footprint of supporting capabilities: provenance, publishing, collaboration
  integration with ecosystem of related tools for activity
This class of tools competes with, and may replace / displace, existing non-Hadoop-native tools that are still rising with the general analytics wave: Qlik, Tableau, MicroStrategy
Firms making new investments (for new stacks) will try / buy this new generation
Firms extending existing investments are less likely to buy new
Long view = tools in this segment could ‘eat’ BI market share by adding reporting and other structured analytical capabilities, capturing customers who do not have large BI stacks now, begin investing here, and subsequently need BI capability
OPPORTUNITY
ASSESSMENT
PRODUCT
DISCOVERY
INVEST…?
PORTFOLIO PLANNING
DEEP STRUCTURE
CHANGE VECTORS
EARLY SIGNALS
INFLECTION POINTS
EMERGING SPACES
HOLISTIC EXPERIENCES
DATA DISCOVERY PRODUCT
AN EXAMPLE
Oracle Confidential – Internal
Oracle Big Data Discovery
Overview
Richard Tomlinson
Director, Product Management
September 25, 2014
Hadoop Data Reservoir Concept Gaining Momentum
Data Warehouse Data Reservoir
Emerging
Sources
Existing Sources
Source: wikibon.org/wiki/v/Big_Data_Vendor_Revenue_and_Market_Forecast_2013-2017
Source: 451 Research – Total Data Warehousing: 2013-2018
Source: The Forrester Wave™: Big Data Hadoop Solutions, Q1 2014
Not Easy to Get Analytic Value from Hadoop
• Existing analytic tools fall short
– Fail to expose potential of data up front
– Rely on upstream ETL processes to cleanse and prepare data
– Optimized for SQL not unstructured data
– Not built for discovery (assume users know what questions to ask)
• Only point solutions emerging
– Leads to constant context switching
– Need end-to-end capabilities
• Early Hadoop tools complex
– Pig, Oozie, Sqoop, Hive, Spark, etc
• Specialized skills are scarce
– Programming languages (e.g. MapReduce, Python, Scala)
– Statistics and machine learning
– Command line interfaces
Requires a Fundamentally New Approach
A single intuitive, interactive and visual user interface for anyone to quickly find, explore, transform and analyze data in Hadoop, then share results for enterprise leverage
Explore
Transform
Discover
Find
Oracle Big Data Discovery. The Visual Face of Hadoop
• Navigate a rich catalog of all data in the Hadoop cluster
• Familiar search and guided navigation for ease of use
• Access data set summaries, annotation and recommendations
• Provision your own data through self-service upload
• Data is automatically enriched with extracted locations, terms, sentiment
• Browse personal big data projects and those shared by the community
Easily Find Relevant Data Sets
• Understand shape of the data. Visualize attributes by type
• Entropy based sorting by information potential
• View attribute statistics, data quality and outliers
• Use scratch pad to see statistical correlations between attribute combinations
• Evaluate whether a data set is worthy of further investment
Explore the Data and Understand Potential
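"Entropy based sorting by information potential" can be illustrated with Shannon entropy computed per attribute. The sketch below ranks the columns of a small, made-up data set from most to least varied. It is an assumption about what such sorting might look like, not a description of how Oracle Big Data Discovery computes it.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rank_attributes(rows):
    """Return attribute names sorted from highest to lowest entropy."""
    columns = {name: [row[name] for row in rows] for name in rows[0]}
    return sorted(columns, key=lambda name: shannon_entropy(columns[name]), reverse=True)


# Illustrative rows: 'country' never varies, 'segment' is different every time.
rows = [
    {"country": "US", "status": "open", "segment": "a"},
    {"country": "US", "status": "closed", "segment": "b"},
    {"country": "US", "status": "open", "segment": "c"},
    {"country": "US", "status": "open", "segment": "d"},
]
ranking = rank_attributes(rows)
```

A constant attribute scores zero and sinks to the bottom. Note that a unique identifier would score highest under this naive measure, so a real tool would likely need additional heuristics.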
• Intuitive user driven data wrangling
• Library of data transformations to replace values, convert types, collapse, reshape, pivot, group, custom tag, merge and much more
• Data enrichments for inferring location and language. Theme, entity and sentiment enrichments for text
• Preview results, undo, commit and replay transforms
• Run on sample data in memory or full data set in Hadoop
Transform and Enrich Data to Make it Ready
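The preview / undo / commit / replay pattern in the bullets above can be sketched as a small transform history that is tried on a sample and then replayed on the full data set. The `TransformScript` class and the two transforms are hypothetical and only illustrate the pattern; they are not the product's API.

```python
class TransformScript:
    """Hypothetical transform history: pending steps can be previewed or
    undone; committed steps can be replayed on another data set."""

    def __init__(self):
        self.pending = []
        self.committed = []

    def add(self, name, func):
        self.pending.append((name, func))

    def undo(self):
        # Drop the most recent uncommitted step, if any.
        if self.pending:
            self.pending.pop()

    def preview(self, rows):
        # Show the effect of committed plus pending steps without committing.
        return self._run(rows, self.committed + self.pending)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = []

    def replay(self, rows):
        return self._run(rows, self.committed)

    @staticmethod
    def _run(rows, steps):
        for _, func in steps:
            rows = [func(dict(row)) for row in rows]  # copy rows, keep input intact
        return rows


def upper_city(row):
    row["city"] = row["city"].upper()
    return row


def price_to_float(row):
    row["price"] = float(row["price"])
    return row


sample = [{"city": "boston", "price": "10.5"}]
full = sample + [{"city": "austin", "price": "3"}]

script = TransformScript()
script.add("upper_city", upper_city)
script.add("price_to_float", price_to_float)
script.undo()                       # discard the last pending step
previewed = script.preview(sample)  # only upper_city is applied
script.add("price_to_float", price_to_float)
script.commit()
result = script.replay(full)        # rerun the committed steps on all rows
```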
• Mash up different data sets for deeper perspectives
• Drag and drop from a rich library of interactive visualizations to compose discovery dashboards
• Filter through data with powerful search and intuitive guided navigation
• Share projects, bookmarks and snapshots with team members for collaboration
Analyze the Data to Discover New Insights
Share Results and Publish for Enterprise Leverage
• Share and collaborate with the team
– Share projects, bookmarks and snapshots then collaborate and iterate
• Publish back to Hadoop
– Transforms and enrichments may be applied to original data sets in Hadoop
– Publish blended data sets back to HDFS
• Leverage results in other tools
– Publish data to Hadoop in format optimized for advanced analytic tools (e.g. ORAAH)
– Hadoop compliant BI tools (e.g. OBIFS) can burst out to the masses
– Leverage any native Hadoop tooling (e.g. Pig, Hive, Impala, Python, etc)
– Integrate BDD data sets with DWH to secure, govern and optimize for query performance (e.g. Oracle Big Data SQL)
Oracle Big Data Discovery plays well with the big data ecosystem
Explore
Transform
Discover
Find
Share & Collaborate
raw data
transformed data
data reservoir (HDFS)
Publish
data warehouse
business intelligence
advanced analytics
other hadoop tools
Leverage
Oracle Big Data Discovery. Technical Innovation on Hadoop
Oracle Big Data Discovery Workloads
Hadoop Cluster
(BDA or Commodity)
data node
data node
data node
data node
data node
name node
Data Processing, Workflow & Monitoring
• Profiling: catalog entry creation, data type & language detection, schema configuration
• Sampling: dgraph (index) file creation
• Transforms: >100 functions
• Enrichments: location (geo), text (cleanup, sentiment, entity, key-phrase, whitelist tagging)
Self-Service Provisioning & Data Transfer
• Personal Data: Upload CSV, XLS and JSON to HDFS
• Enterprise Data: Provision from RDBMS to HDFS
In-Memory Discovery Indexes
• DGraph: Search, Guided Navigation, Analytics
Studio
• Web UI: Catalog, Explore, Transform, Analyze, Share
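To give a feel for the profiling workload listed above, the sketch below guesses a data type for each column of an uploaded file by trying progressively looser parsers. The function names, the single date format, and the sample rows are assumptions for illustration; the slides do not describe the product's actual detection logic.

```python
from datetime import datetime


def detect_type(values):
    """Guess a column type from string values: int, float, date, or string."""

    def all_parse(parser):
        for value in values:
            try:
                parser(value)
            except ValueError:
                return False
        return True

    if all_parse(int):
        return "int"
    if all_parse(float):
        return "float"
    # Assumption: only ISO-style dates (YYYY-MM-DD) are recognized here.
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "date"
    return "string"


def profile(rows):
    """Return a {column name: detected type} mapping for a list of row dicts."""
    columns = {name: [row[name] for row in rows] for name in rows[0]}
    return {name: detect_type(values) for name, values in columns.items()}


# Illustrative rows, as they might arrive from a CSV upload (all strings).
rows = [
    {"id": "1", "price": "2.5", "day": "2014-09-25", "name": "x"},
    {"id": "2", "price": "3", "day": "2014-11-01", "name": "y"},
]
schema = profile(rows)
```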
Hadoop 2.x
Filesystem
(HDFS)
Workload Mgmt
(YARN)
Metadata
(HCatalog)
Other Hadoop
Workloads
MapReduce
Spark
Hive
Pig
Oracle Big Data SQL
(BDA only)
DEEP STRUCTURE <> ANALYTICAL WORKFLOW
CHANGE VECTORS <> BIG DATA TECHNOLOGIES
EARLY SIGNALS <> RISE OF DATA SCIENCE
INFLECTION POINTS <> DATA SCIENCE MOMENT
EMERGING SPACES <> EMPIRICAL DISCOVERY
HOLISTIC EXPERIENCES <> VISUAL DISCOVERY TOOL
WHAT NEXT…?
VISUAL DISCOVERY & ANALYSIS TOOLS: WAVE 2
Definition: Augmented discovery & analysis across full business data corpus
Value prop = deeper insights from more diverse data, and faster insights, effected via a mixed toolkit of (semi)automated analytical techniques (clustering, machine learning, regression / correlation, etc.) that enhances and directs analyst attention
Vectors of augmentation: data types, degree of automation
  data = text / lingual, location / spatial, native graph, native stream
  automation = which specific activities are augmented, and to what degree
Wave 2 is at the ‘pioneer’ stage: specifics of capability, value, implementation unknown
Limiting factors:
  Domain specificity: value of general discovery analytics drops once domain boundaries are reached; need to align specifically to the domain view of the world
  Expect verticalization of all analytics
  Low / no tolerance for black boxes: deeper insights require transparency
  Analytical literacy: level is increasing, but orgs can’t benefit from advanced analytical techniques that are not understood & trusted
OPPORTUNITY
ASSESSMENT
PRODUCT
DISCOVERY
INVEST…?
PORTFOLIO PLANNING
Feature Selection
Joe Lamantia
Product Strategy(ist)
Oracle Big Data Discovery
November, 2014
Feature Selection
In machine learning and statistics, feature selection, also known as variable selection, attribute
selection or variable subset selection, is the process of selecting a subset of relevant features for use in
model construction.
The central assumption when using a feature selection technique is that the data contains
many redundant or irrelevant features.
Redundant features are those which provide no more information than the currently selected features, and
irrelevant features provide no useful information in any context. Feature selection techniques are a subset of
the more general field of feature extraction.
Feature extraction creates new features from functions of the original features, whereas feature selection
returns a subset of the features. Feature selection techniques are often used in domains where there are
many features and comparatively few samples (or data points).
Feature selection is also useful as part of the data analysis process, as it shows which features are
important for prediction, and how these features are related.
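A minimal filter-style sketch of the definition above: drop features that carry no information (constant columns, used here as a simple stand-in for "irrelevant") and features that are redundant with ones already kept (near-perfect correlation). The threshold, the column names, and the data are hypothetical, and practical feature selection would normally also score features against a prediction target.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def select_features(columns, threshold=0.95):
    """Greedy filter: skip constant columns, then skip any column whose
    absolute correlation with an already selected column exceeds threshold."""
    selected = []
    for name, values in columns.items():
        if len(set(values)) <= 1:
            continue  # constant column: provides no information
        if any(abs(pearson(values, columns[kept])) > threshold for kept in selected):
            continue  # redundant with a feature that is already kept
        selected.append(name)
    return selected


# Illustrative data: height in inches duplicates height in centimetres.
columns = {
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "height_in": [59.1, 63.0, 66.9, 70.9],
    "constant": [1.0, 1.0, 1.0, 1.0],
    "age": [30.0, 22.0, 41.0, 35.0],
}
selected = select_features(columns)
```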
BDD Feedback: Data Scientist Interviews
“Analysts don’t generally analyze the catalog per se - they analyze line
items, or actions, or histories, that kind of thing.”
“It’s generally actions that people are interested in.”
Data
Records
Catalog
Format
Entities
Product, location
Connections
Satisfaction
Goals
Acquire
Transform
Events
Purchase
Status change
Structures & Systems
User
centric
Data
centric
Networks
Business unit
Community
Loyalty factors
Themes
Profit
Efficiency
Plans
Balance budget
Launch product
Manage risks
Business Perspective
Progressive engagement
Complexity & difficulty
Value of outcome
Activities
Traffic logging
Address change
Processes
Fulfillment
Brand monitoring
Analysis Perspective
Data Perspective
Domains
Supply chain
Industry / market
Models
Conversion
Lifetime Customer Value (Decision tree)
Measures
Attrition rate
Unit cost of materials
Sensemaking Spectrum
How analysts have
to engage with data
Sensemaking Spectrum
How analysts want
to engage with data
BDD Feedback: Data Scientist Interviews
“The transforms are for feature engineering, right?”
“What other goals are there for the transforms?”
“I would assume that’s the only reason for the transforms…”
BDD Feedback: Data Scientist Interviews
“Getting the data right is the hard part. Once you get the data right…”
BDD Feedback: Data Scientist Interviews
“…feature engineering needs to be an iterative process”
“…this is an iterative process. Everything goes in a circle.”
“You’re going to do some data cleaning, you’re going to build a model,
you’re going to have to go back and look at what you’re missing and
what you’re not missing.”
IT’S A CYCLE…
Analytical Activity
Explore
Wrangle
Analyze
Augment
Sensemaking
Transformation
Features
Goals
Realize insights
Generate Models
Goals
Understand data
Make data useful
Cumulative incremental progress
Data quality
& Features
Feature
Extraction
Engineering
Generation
Selection
…
We can repurpose techniques used during the traditional feature selection stage of the analytical workflow to enhance other stages of the discovery and analysis workflow.
A likely candidate is exploration as it is coupled with wrangling.
…Allow analyst engagement and focus on more useful constructs like entities or business processes, instead of dealing only with raw values and attributes
Thesis
BDD 1.0 EID
EID
BDD ?
Workflow stages: Acquire, Ingest & Clean, Store & Manage, Featurize, Wrangle, Visual Analysis, Interactive Queries, Modeling, Story-telling, Build, Deploy, Monitor & Maintain, Present, Disseminate
Cycles: Insight cycle, Modeling cycle
How?
• Features are discovered and inferred
• statistical & other domain-independent methods
• Domain-based
• Known features used to train system
• Sources
• artifacts (scripts, models, dictionaries)
• analytical activities
• direct indication
Possible Manifestations
• Feature-based operations
• wrangling: transforms, joins,
• exploration: search, visualization,
• analysis
• Feature recognition: known features identified in new data
• Feature-based enrichment
• Interest graphs - Individual and group
• Modeling capabilities
• Movement toward user-centric engagement with data:
• Entity-centric navigation & event linkage across data sets (Platfora)
• Answerset (Paxata)
• Semantic search & enrichments (BDD)
• Thematic data lenses (Platfora)
• Data harmonization and data stories (ClearStory)
• Natural language interaction / cognitive computing (IBM)
• Expert network (Tamr)
What’s happening in this product space?
Capability Evolution
WHAT NEXT…?
OPPORTUNITY
ASSESSMENT
PRODUCT
DISCOVERY
INVEST…?
PORTFOLIO PLANNING
Data Science
Insight
Model
Insight
Model
Data Product
Product
Analysts
Outcomes
BDD Feedback: Data Scientist Interviews
“How do you know what changes you want to make until you build
your model?  Once you build your model, you know you want to take
the square root of this, or the log of this.  That doesn’t happen until
you start building a model…”
Discovery & Analysis Workflow
Workflow stages: Acquire, Ingest & Clean, Store & Manage, Featurize, Wrangle, Visual Analysis, Interactive Queries, Modeling, Story-telling, Build, Deploy, Monitor & Maintain, Present, Disseminate
Cycles: Insight cycle, Modeling cycle
Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’
http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html
Features
Insights
BDD 1.0 EID
EID
Analytical Workflow
Workflow stages: Acquire, Ingest & Clean, Store & Manage, Featurize, Wrangle, Visual Analysis, Interactive Queries, Modeling, Story-telling, Build, Deploy, Monitor & Maintain, Present, Disseminate
Cycles: Insight cycle, Modeling cycle, Data Ingest cycle
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery
Modeling
Old-school Modeling
• Compute is expensive
• Good (relevant) data is scarce
• All data is difficult to work with, requiring considerable time and attention just to get provisionally ready
• Human attention is limited at all levels: engineer, analyst, insight consumer
• ‘Experiments’ are small, planned, receive close attention
• Rely first on a library of well known methods (carefully vetted by years of practice)
• Don’t run the experiment unless you know you can evaluate the results
  – be sure you have the time
  – be sure you have the expertise
  – be confident the results will be meaningful / insightful
• Automation is only feasible in limited circumstances
• Humans interpret experimental results
• Complete experiments before evaluating them
• ‘Small’ infrastructure - data sets, compute source, evaluation tools, archiving
• Modeling is best done by the knowledgeable
  – can have negative consequences when done by novices
• Toolset aligned to: small / mid-sized data
  – requires a high quotient of human engagement, both directive / evaluative, and to enable execution
New-school Modeling
• Compute is cheap
• Data is abundant | good (relevant) data is often available
• Data is still challenging to work with, but tooling allows engagement with much greater quantities, of many types
• Run many experiments
• Try many approaches, using new and old methods
• Machines interpret experimental results, at least in part (batch eval for initial ranking of potential insight)
• ‘Big’ infrastructure - data sets, compute source, evaluation tools, archiving
• Automate where possible: selecting data, prepping data, choosing methods, setting parameters, executing experiments, evaluating results
• Modeling is better done by those with knowledge, but it can have utility for non-experts
• [forward-looking analogs: genomics, bioinformatics, computational neuroscience]
• Toolset wants to be aligned to big data
  – profile of human engagement varies over analytical lifecycle, seeking automation where possible in direction / evaluation, and execution
Practices
• Combine old-school and new school approaches at different stages
of the analytical cycle
• Starting points vary by practitioner maturity, understanding of
problem, available resources
• Experiments often alternate approaches
• Use automation where possible
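The "run many experiments" and "batch eval for initial ranking" ideas from new-school modeling can be sketched as a loop that evaluates every combination of a small parameter grid and ranks the results automatically, leaving a human to inspect only the top candidates. The threshold rule, the parameter grid, and the data are hypothetical, and a real workflow would score on held-out data rather than the rows used to choose the rule.

```python
from itertools import product


def accuracy(rows, feature, threshold):
    """Accuracy of the rule: predict 1 when row[feature] >= threshold."""
    hits = sum((row[feature] >= threshold) == bool(row["label"]) for row in rows)
    return hits / len(rows)


def run_experiments(rows, features, thresholds):
    """Evaluate every (feature, threshold) pair and rank by accuracy."""
    results = [
        {"feature": f, "threshold": t, "accuracy": accuracy(rows, f, t)}
        for f, t in product(features, thresholds)
    ]
    return sorted(results, key=lambda r: r["accuracy"], reverse=True)


# Illustrative labelled rows.
rows = [
    {"spend": 10, "visits": 1, "label": 0},
    {"spend": 20, "visits": 5, "label": 0},
    {"spend": 35, "visits": 2, "label": 1},
    {"spend": 50, "visits": 4, "label": 1},
]
ranked = run_experiments(rows, ["spend", "visits"], [2, 3, 30])
best = ranked[0]  # the machine's initial ranking; a person reviews from here
```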
Modeling
Exploratory Analysis
Identify features
Understand relations between features
Create new features
Characterize Dataset
Build Baseline Model
Build Complex Model
Feature Engineering & Model Tuning
New features
Straight-forward & well-known modeling methods
Explore & understand contents, distribution, quality, etc.
Iterative experimentation with several classes of modeling methods
Compare to baseline
Comparative / reference model
Iterative & experimental model & feature combination, tuning, evaluation
Recursive feature elimination
Modeling, Testing, Training, Evaluation data sets
Initial Predictive Model
Final Predictive Model
Explanatory Model
Explanatory Model
Discovery cycle Modeling cycle
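The "Compare to baseline" step in the workflow above can be sketched with a reference model that always predicts the training mean and a candidate model that is kept only if it beats that reference on held-out data. The one-feature least-squares line stands in for a more complex model, and all data values are made up for illustration.

```python
def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def fit_baseline(ys):
    """Reference model: always predict the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Candidate model: ordinary least squares on a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


# Separate training and evaluation data sets (illustrative values).
train_x, train_y = [1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]
test_x, test_y = [5, 6], [10.1, 11.9]

baseline = fit_baseline(train_y)
candidate = fit_line(train_x, train_y)

baseline_mse = mean_squared_error(test_y, [baseline(x) for x in test_x])
candidate_mse = mean_squared_error(test_y, [candidate(x) for x in test_x])
improved = candidate_mse < baseline_mse  # keep the candidate only if it wins
```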
Modeling & BDD
Initial capability…
Modeling & BDD
Subsequent capability…
OPPORTUNITY
ASSESSMENT
PRODUCT
DISCOVERY
INVEST…?
PORTFOLIO PLANNING
WHAT NEXT…?
Business Assets & Activity Cycles
Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’
http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html
[Diagram: four linked activity cycles and the assets they produce]
• Data cycle: Acquire, Ingest & Clean, Manage & Update, Store & Expose
• Discovery cycle: Wrangle, Featurize, Visual Analysis, Interactive Queries
• Modeling cycle: Model, Train, Evaluate, Update
• Application cycle: Build, Train, Deploy, Monitor
• Assets: Data, Insights, Models
• Also shown: Features, Vectors, Enrichments
• corpus: operational, analytical, archival
• insight stream: awareness, explanatory, prescriptive
• intelligence: machine, human, hybrid
• systems: transactional, engagement, insight
Tool Archetypes
[Diagram: tool archetypes positioned against the Data, Discovery, Modeling, and Application activity cycles]
• Wrangle / Featurize: Trifacta
• Visual Analysis: Platfora
• Interactive Queries: Datameer
• Data science workbenches: Sense, yhat
• Application Foundries: Azure ML, IBM
• Traditional app studios: Java
• Discovery Workbenches: BDD x
• Data Integrators: Clover
• Analysis Workbenches: Alteryx, Alpine
• Analytics Platforms: Teradata, Pivotal
• ML services: BigML, Wise.io, Skytree
• Business Intelligence Suite: OBIEE, Cognos
• Python notebooks: IPython, Jupyter
DEEP STRUCTURE
CHANGE VECTORS
EARLY SIGNALS
INFLECTION POINTS
EMERGING SPACES
HOLISTIC EXPERIENCES
VALUE CHAIN MAP (WARDLEY MAPPING)
ML
WORKING THE ECOSYSTEM
• Oracle = an ecosystem
• ML = commoditizing
• Someone will ‘generate the electricity’ = provide ML capability within the Oracle ecosystem
• Everyone’s going to need it…
OPPORTUNITY
ASSESSMENT
PRODUCT
DISCOVERY
INVEST…?
PORTFOLIO PLANNING
Oracle Machine Learning Service
Genesis
Offering
• Machine learning service exposed as:
  • Stand-alone productized service (public cloud)
  • ‘Product’ integrated with relevant Oracle cloud offerings
    • enables machine learning / analytics pipelines for data spanning service boundaries
  • ‘White-label’ ML capability within cloud offerings (SaaS, IaaS, PaaS, DaaS, etc.)
    • enables localized ML / analytics pipelines within service boundaries
• Collection of Oracle-specific ML accelerators
  • Data sets & streams, pipelines, algorithms, R / Python libs, project templates, etc.
Oracle Value Prop
• Provides ML capability across cloud offerings for expanded data landscape
  • Big data
  • Big data + Traditional Enterprise in combination
  • Streaming Data
  • IOT
• Reinforces ‘data gravity’ effect across Oracle cloud offerings
• Entry point for ‘new stack’ (cloud-only) customers needing ML capability
• ‘Missing link’ completes analytical pipelines across tool boundaries
Data Landscape
[Chart: data types plotted by Complexity and Quantity. Labels: Traditional Enterprise, Big Data, IOT, Stream / Real-time, Product-native ML, Oracle Machine Learning Service]
Customer Value Prop
• Easy machine learning within the ecosystem of Oracle cloud offerings
  • Turnkey
  • Elasticity and adaptivity: resources, pricing
  • Portability across Oracle product / service boundaries
• Manifests appropriately for product / service contexts
  • Application Developers
  • Analysts / Data Scientists
  • Business users
  • Machine consumers
ML For the Oracle Cloud Ecosystem
[Diagram: Oracle Machine Learning Service linked to SaaS, DaaS (Data Service), IaaS (Infrastructure Service), and PaaS (Platform Service) offerings, and, via the ‘Public’ OML product, to Customer Applications & Data sources. Links are labeled Data & Models]
White Label ML Capability
[Diagram: Machine Learning shown as ML Tools within the SaaS, DaaS (Data Service), IaaS (Infrastructure Service), and PaaS (Platform Service) offerings, alongside the Oracle Machine Learning Service, the ‘Public’ OML product, and Customer Applications & Data sources. Links are labeled Data & Models]
Oracle Ecosystem
• All cloud services can be:
  • data sources for ML service
  • consumers of published data & models from ML service
• OML can publish augmented datasets (e.g. pre-scored matrices) as part of multistep & multi-tool analytical pipelines
Initial Capability
• Core ML functions:
  • data upload (no transform - BDD integration) from Oracle sources
  • modeling / analysis via general-purpose, interpretable methods
  • model training
  • model evaluation
• Model publication
• Processed data publication
OPPORTUNITY
ASSESSMENT
PRODUCT
DISCOVERY
INVEST…?
PORTFOLIO PLANNING
[Diagram: the Data, Discovery, Modeling, and Application activity cycles, annotated with Oracle offerings]
• Discovery Workbenches: BDD (now)
• ML services: Oracle Machine Learning
• Discovery & Modeling Platform: BDD & ML (combined analysis offering?)
WHAT NEXT…?
Automation Potential
Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’
http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html
[Diagram: the Data, Discovery, Modeling, and Application activity cycles with the assets Data, Insights, and Models, plus Features, Vectors, and Enrichments]
Machine Intelligence Value Chain
Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’
http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html
[Diagram: the Data, Discovery, Modeling, and Application activity cycles and their asset types (corpus, insight stream, intelligence, systems), with two spans labeled Create Machine Intelligence and Operationalize Machine Intelligence. Additional labels: Process, operations?, Apps, Metric]
DEEP STRUCTURE <> PRODUCT DEVELOPMENT
CHANGE VECTORS <> ACQUISITION
EARLY SIGNALS <> MARKET ACTIVITY
INFLECTION POINTS <> INNOVATION MOMENTS
EMERGING SPACES <> PRODUCT STRATEGY GIG
HOLISTIC EXPERIENCES <> EXPERIENCE FOCUS
THANK YOU!
I’M HIRING…
TOOLS & FRAMEWORKS
AN EXAMPLE
The Language of Discovery
Category: Primary Research, Design Systems
Outcomes: Building on already-published original applied research into information retrieval and usage, the language of discovery posits a domain-independent framework describing the activity primitives of discovery in terms of ‘modes’. Succeeding professional and industry publications outline the application of this descriptive vocabulary in settings including product design and development, product strategy, and information management.
Reference:
• Russell-Rose, T., Lamantia, J. and Burrell, M. 2011. A Taxonomy of Enterprise Search and Discovery. Proceedings of EuroHCIR 2011, London, UK. http://ceur-ws.org/Vol-763/paper4.pdf
• Russell-Rose, T., Lamantia, J. and Burrell, M. 2011. A Taxonomy of Enterprise Search and Discovery. Proceedings of HCIR 2011, California, USA. https://docs.google.com/a/kent.edu/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxoY2lyd29ya3Nob3B8Z3g6NzdmYjc3OWY2ZjQ2Zjg4MQ
• Russell-Rose, T. and Makri, S. 2012. A Model of Consumer Search Behavior. Proceedings of EuroHCIR 2012, Nijmegen, NL.
• Designing the Search Experience: http://www.amazon.com/Designing-Search-Experience-Information-Architecture/dp/0123969816
• Presentation - Strata: http://conferences.oreilly.com/strata/stratany2012/public/schedule/detail/25411
• Presentation - UX Lisbon conference: http://www.joelamantia.com/user-experience-ux/slides-for-uxlx-talk-the-language-of-discovery-a-grammar-for-designing-big-data-interactions
Domain & Market Study: Data Science
Outcomes: Comprehensive portrait of all major facets of a new analytical discipline, including its practices, roles, methodology, tools and technologies, workflows, organizational models, skillsets, alignment with business, areas of innovation, and relation to the landscape of business analytics.
Research outcomes and synthesized insights guided product design, management, and strategy efforts including: opportunity identification and profiling, landscape / competitive modeling, technology lifecycle and evolution models, product discovery, concept creation and evaluation, and prototyping.
Notable aspects: Consistently delivered insights twelve or more months ahead of leading industry analysts pursuing similar agendas.
Artifacts & Synthesis
• Data Science Highlights: http://www.joelamantia.com/user-research/data-science-highlights-an-investigation-of-the-discipline
• Empirical Discovery Concept and Workflow Model: https://blogs.oracle.com/serendipity/entry/empirical_discovery_concept_and_workflow
• Empirical Discovery: A New Discipline: https://blogs.oracle.com/serendipity/entry/data_science_and_empirical_discovery
• Defining Discovery: Core Concepts: https://blogs.oracle.com/serendipity/entry/defining_discovery_core_concepts
• Discovery and the Age of Insight: http://www.joelamantia.com/language-of-discovery/discovery-and-the-age-of-insight
• Big Data Is Not Enough: http://www.joelamantia.com/user-experience-ux/big-data-is-not-the-insight-slides-from-enterprise-search-europe
DEEP STRUCTURE
CHANGE VECTORS
EARLY SIGNALS
INFLECTION POINTS
EMERGING SPACES
HOLISTIC EXPERIENCES
DEEP STRUCTURES
ENTERPRISE / B2B
• Business process
• Activity
• Social structure: Organizational model
• Boundaries
• Regulation
• IT / Systems architecture
• Lifecycle
• Flows: capital, information, people
• Frame: shareholder value, social enterprise
CONSUMER / B2C
• Value scheme: wealth, love, knowledge, safety
• Demographics
• Boundaries
• Mores
• Culture
• Social structure: community / group
• Frame: active lifestyle, sustainability
OPPORTUNITY
ASSESSMENT
PRODUCT
DISCOVERY
INVEST…?
PORTFOLIO PLANNING
CONTINUOUS LEARNING
UNDERSTAND & EMPATHIZE
WITH CUSTOMER PERSPECTIVES
>> ARTICULATE CUSTOMER VALUE SOURCES
IDENTIFY BUSINESS IMPLICATIONS
>> INFORM ALL STAGES OF PRODUCT & SERVICE DEVELOPMENT
INVESTIGATING CUSTOMERS
EXPLORING HYPOTHESES ABOUT VALUE
Activity Cycles [Structural View]
[Diagram: a cycle running from an Initial Activity through Interim Activities to a Final Activity. Labels: Input, Precursor, Influencer, By-product, Outcome, Cycle Successor, asset types]
• Cycles are iterative
• Activities are progressive
• Can begin w/ any activity
• Best to begin w/ initial activity
• Impact of activity increases with ‘distance’ - can span cycles
• Inputs are necessary
• Precursors can be incomplete (?)
• Influencers are ‘from the future’
• Influencers enhance the local cycle
• By-products enhance the precursor
• Assets are cumulative
• Assets depend on precursor cycles
• Assets communicate via cycles
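The structural rules on this slide can be read as a small data model. The sketch below is one possible reading, not a definition from the deck: the `Cycle` class, its field names, and the example activity names are hypothetical, and influencers and by-products are left out to keep it short. It covers three of the rules: a cycle can begin with any activity, assets are cumulative, and assets depend on precursor cycles.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cycle:
    """One activity cycle: an ordered loop of activities that accumulates assets."""
    name: str
    activities: List[str]                             # initial, interim..., final
    assets: List[str] = field(default_factory=list)   # cumulative outcomes
    precursor: Optional["Cycle"] = None               # cycle this one builds on

    def walk(self, start: int = 0) -> List[str]:
        """One full pass through the activities, beginning at any of them."""
        n = len(self.activities)
        return [self.activities[(start + i) % n] for i in range(n)]

    def complete(self, outcome: str) -> None:
        """Finish a pass; outcomes accumulate rather than replace each other."""
        self.assets.append(outcome)

    def inputs(self) -> List[str]:
        """Assets inherited from every precursor cycle, oldest cycle first."""
        inherited: List[str] = []
        cycle = self.precursor
        while cycle is not None:
            inherited = cycle.assets + inherited
            cycle = cycle.precursor
        return inherited


discovery = Cycle("Discovery", ["initial", "interim A", "interim B", "final"])
modeling = Cycle("Modeling", ["initial", "interim", "final"], precursor=discovery)

discovery.complete("insight 1")
discovery.complete("insight 2")
```

Here `discovery.walk(start=2)` begins at the third activity and wraps around, and `modeling.inputs()` returns both insights produced by the precursor cycle.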
Business Assets & Activity Cycles
Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’
http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html
[Diagram repeated from the earlier ‘Business Assets & Activity Cycles’ slide: the Data, Discovery, Modeling, and Application activity cycles with their assets (Data, Insights, Models) and asset types (corpus, insight stream, intelligence, systems)]
Activity Integration Points / Interfaces
[Diagram: an activity cycle with the same elements as the structural view: Initial, Interim, and Final Activities; Input, Precursor, Influencer, By-product, Outcome, Cycle Successor; asset types]
• Integration necessary for individual activities to communicate w/ one another w/in a cycle
• Gaps = demand for enhancing capabilities
• Integration is made possible by enhancing capabilities
• Cycles = accelerated by good integration
• Cycles = slowed by poor integration
• Activity speed is not affected by integration?
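A small worked calculation makes the last three bullets concrete. It assumes, purely for illustration, that the time for one pass through a cycle is the work done inside each activity plus the handoff time between activities; the hour figures and the `cycle_time` helper are hypothetical.

```python
def cycle_time(activity_hours, handoff_hours):
    """Total time for one pass: work inside activities plus handoffs between them."""
    if len(handoff_hours) != len(activity_hours):
        raise ValueError("expected one handoff per activity (the cycle wraps around)")
    return sum(activity_hours) + sum(handoff_hours)


activities = [4, 6, 3]          # hypothetical hours of work per activity
poor_integration = [5, 5, 5]    # e.g. manual export / import between tools
good_integration = [1, 1, 1]    # e.g. shared formats, automated handoffs

slow = cycle_time(activities, poor_integration)   # 28 hours
fast = cycle_time(activities, good_integration)   # 16 hours
```

In this reading the activity hours are identical in both cases; only the handoffs change, which is why integration speeds up or slows down the cycle without changing the speed of any single activity.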
Data Pipeline
Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’
http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html
[Diagram: the Data, Discovery, Modeling, and Application activity cycles with the assets Data, Insights, and Models, plus Features, Vectors, and Enrichments]
Machine Intelligence Value Chain
Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’
http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html
[Diagram repeated from the earlier ‘Machine Intelligence Value Chain’ slide: the four activity cycles and their asset types, with two spans labeled Create Machine Intelligence and Operationalize Machine Intelligence. Additional labels: Process, operations?, Apps, Metric]
Tool Archetypes
[Diagram repeated from the earlier ‘Tool Archetypes’ slide: tool archetypes and example products positioned against the Data, Discovery, Modeling, and Application activity cycles]
Activity Cycles & Capabilities
[Diagram: an activity cycle and its capabilities. Cycle labels: Precursor, Successor, Influencer, By-product, Import, Publish, asset types]
• Core Capabilities: activity specific, progressive
• Enhancing Capabilities: common, random access
  • Workflow, Collaboration, Publication, Accelerators, Versioning, Provenance, Metadata, Curation, Governance, Import
• Core capabilities are necessary & primary to complete a given cycle
• Enhancing capabilities are secondary within a cycle
• Enhancing capabilities are necessary to accumulate assets(?)
• Enhancing capabilities are necessary to advance to next cycle(?)
Assets & Capabilities
[Diagram: core capabilities (asset specific) and common enhancing capabilities: Workflow, Collaboration, Publication, Accelerators, Versioning, Provenance, Metadata, Curation, Governance, Import]
Asset Scope
[Diagram: three levels of scope: Enterprise, Line of Business, Localized]
• Scope determines / implies boundaries, metrics
• Distinct systems (IT) and processes (biz) for each asset, at each level of scope
• Each distinct system and process = integration point, creates barrier to flow, requires interface
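The last two bullets imply a simple count. As a rough sketch, assume the three assets named elsewhere in the deck (Data, Insights, Models), the three levels of scope on this slide, and one distinct system plus one distinct process for each combination; the variable names and the `points_for_scope` helper are illustrative only.

```python
from itertools import product

ASSETS = ["Data", "Insights", "Models"]
SCOPES = ["Enterprise", "Line of Business", "Localized"]
KINDS = ["system (IT)", "process (biz)"]

# One integration point per distinct system and per distinct process,
# for each asset at each level of scope.
integration_points = list(product(ASSETS, SCOPES, KINDS))


def points_for_scope(scope):
    """Integration points that exist at a single level of scope."""
    return [p for p in integration_points if p[1] == scope]


total = len(integration_points)   # 3 assets x 3 scopes x 2 kinds = 18
```

Under these assumptions even a small organization has 18 places where an interface is required, six of them at each level of scope.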
Asset Communication
[Diagram: the Enterprise, Line of Business, and Localized scopes, each shown with common enhancing capabilities]
• Scope determines / implies boundaries, metrics
• Distinct systems (IT) and processes (biz) for each asset, at each level of scope
• Each distinct system and process = integration point, creates barrier to flow, requires interface
Capability Evolution
[Diagram and notes repeated from the ‘Activity Cycles & Capabilities’ slide: core capabilities (activity specific, progressive) and enhancing capabilities (common, random access) around an activity cycle]
• Core capabilities are necessary & primary to complete a given cycle
• Enhancing capabilities are secondary within a cycle
• Enhancing capabilities are necessary to accumulate assets(?)
• Enhancing capabilities are necessary to advance to next cycle(?)
VALUE CHAIN MAP (WARDLEY MAPPING)

Discovery and the Age of Insight: Walmart EIM Open House 2013Discovery and the Age of Insight: Walmart EIM Open House 2013
Discovery and the Age of Insight: Walmart EIM Open House 2013Joe Lamantia
 
Big Data Is Not the Insight: The Language Of Discovery:
Big Data Is Not the Insight: The Language Of Discovery: Big Data Is Not the Insight: The Language Of Discovery:
Big Data Is Not the Insight: The Language Of Discovery: Joe Lamantia
 
Designing Big Data Interactions Using the Language of Discovery
Designing Big Data Interactions Using the Language of DiscoveryDesigning Big Data Interactions Using the Language of Discovery
Designing Big Data Interactions Using the Language of DiscoveryJoe Lamantia
 
Designing Big Data Interactions Using the Language of Discovery
Designing Big Data Interactions Using the Language of DiscoveryDesigning Big Data Interactions Using the Language of Discovery
Designing Big Data Interactions Using the Language of DiscoveryJoe Lamantia
 
The Language of Discovery: Designing Big Data Interactions
The Language of Discovery: Designing Big Data InteractionsThe Language of Discovery: Designing Big Data Interactions
The Language of Discovery: Designing Big Data InteractionsJoe Lamantia
 
User Experience Architecture For Discovery Applications
User Experience Architecture For Discovery ApplicationsUser Experience Architecture For Discovery Applications
User Experience Architecture For Discovery ApplicationsJoe Lamantia
 
Social Interaction Design For Augmented Reality: Patterns and Principles for ...
Social Interaction Design For Augmented Reality: Patterns and Principles for ...Social Interaction Design For Augmented Reality: Patterns and Principles for ...
Social Interaction Design For Augmented Reality: Patterns and Principles for ...Joe Lamantia
 
Understanding Frameworks: Beyond Findability IA Summit 2010
Understanding Frameworks: Beyond Findability IA Summit 2010Understanding Frameworks: Beyond Findability IA Summit 2010
Understanding Frameworks: Beyond Findability IA Summit 2010Joe Lamantia
 
Design Principles for Social Augmented Experiences: Next Wave of AR Panel | W...
Design Principles for Social Augmented Experiences: Next Wave of AR Panel | W...Design Principles for Social Augmented Experiences: Next Wave of AR Panel | W...
Design Principles for Social Augmented Experiences: Next Wave of AR Panel | W...Joe Lamantia
 
Personal Finance On-line: New Models & Opportunities
Personal Finance On-line: New Models & OpportunitiesPersonal Finance On-line: New Models & Opportunities
Personal Finance On-line: New Models & OpportunitiesJoe Lamantia
 
Designing Goal-based Experiences
Designing Goal-based ExperiencesDesigning Goal-based Experiences
Designing Goal-based ExperiencesJoe Lamantia
 
Social Media: Strategic Overview & Business Implications
Social Media: Strategic Overview & Business ImplicationsSocial Media: Strategic Overview & Business Implications
Social Media: Strategic Overview & Business ImplicationsJoe Lamantia
 
Digital Music Services (Strategic Review & Options)
Digital Music Services (Strategic Review & Options)Digital Music Services (Strategic Review & Options)
Digital Music Services (Strategic Review & Options)Joe Lamantia
 
Search Me: Designing Information Retrieval Experiences
Search Me: Designing Information Retrieval ExperiencesSearch Me: Designing Information Retrieval Experiences
Search Me: Designing Information Retrieval ExperiencesJoe Lamantia
 
Designing Frameworks For Interaction and User Experience
Designing Frameworks For Interaction and User Experience Designing Frameworks For Interaction and User Experience
Designing Frameworks For Interaction and User Experience Joe Lamantia
 
Massively Social Games: Next Generation Experiences
Massively Social Games: Next Generation ExperiencesMassively Social Games: Next Generation Experiences
Massively Social Games: Next Generation ExperiencesJoe Lamantia
 
Waves of Change Shaping Digital Experiences
Waves of Change Shaping Digital ExperiencesWaves of Change Shaping Digital Experiences
Waves of Change Shaping Digital ExperiencesJoe Lamantia
 
Frameworks Are The Future of Design
Frameworks  Are The Future of DesignFrameworks  Are The Future of Design
Frameworks Are The Future of DesignJoe Lamantia
 

Plus de Joe Lamantia (20)

Iterative Discovery and Analysis: Workflow / Activity and Capability Model
Iterative Discovery and Analysis: Workflow / Activity and Capability ModelIterative Discovery and Analysis: Workflow / Activity and Capability Model
Iterative Discovery and Analysis: Workflow / Activity and Capability Model
 
Empirical discovery concept model
Empirical discovery concept modelEmpirical discovery concept model
Empirical discovery concept model
 
Discovery and the Age of Insight: Walmart EIM Open House 2013
Discovery and the Age of Insight: Walmart EIM Open House 2013Discovery and the Age of Insight: Walmart EIM Open House 2013
Discovery and the Age of Insight: Walmart EIM Open House 2013
 
Big Data Is Not the Insight: The Language Of Discovery:
Big Data Is Not the Insight: The Language Of Discovery: Big Data Is Not the Insight: The Language Of Discovery:
Big Data Is Not the Insight: The Language Of Discovery:
 
Designing Big Data Interactions Using the Language of Discovery
Designing Big Data Interactions Using the Language of DiscoveryDesigning Big Data Interactions Using the Language of Discovery
Designing Big Data Interactions Using the Language of Discovery
 
Designing Big Data Interactions Using the Language of Discovery
Designing Big Data Interactions Using the Language of DiscoveryDesigning Big Data Interactions Using the Language of Discovery
Designing Big Data Interactions Using the Language of Discovery
 
The Language of Discovery: Designing Big Data Interactions
The Language of Discovery: Designing Big Data InteractionsThe Language of Discovery: Designing Big Data Interactions
The Language of Discovery: Designing Big Data Interactions
 
User Experience Architecture For Discovery Applications
User Experience Architecture For Discovery ApplicationsUser Experience Architecture For Discovery Applications
User Experience Architecture For Discovery Applications
 
Social Interaction Design For Augmented Reality: Patterns and Principles for ...
Social Interaction Design For Augmented Reality: Patterns and Principles for ...Social Interaction Design For Augmented Reality: Patterns and Principles for ...
Social Interaction Design For Augmented Reality: Patterns and Principles for ...
 
Understanding Frameworks: Beyond Findability IA Summit 2010
Understanding Frameworks: Beyond Findability IA Summit 2010Understanding Frameworks: Beyond Findability IA Summit 2010
Understanding Frameworks: Beyond Findability IA Summit 2010
 
Design Principles for Social Augmented Experiences: Next Wave of AR Panel | W...
Design Principles for Social Augmented Experiences: Next Wave of AR Panel | W...Design Principles for Social Augmented Experiences: Next Wave of AR Panel | W...
Design Principles for Social Augmented Experiences: Next Wave of AR Panel | W...
 
Personal Finance On-line: New Models & Opportunities
Personal Finance On-line: New Models & OpportunitiesPersonal Finance On-line: New Models & Opportunities
Personal Finance On-line: New Models & Opportunities
 
Designing Goal-based Experiences
Designing Goal-based ExperiencesDesigning Goal-based Experiences
Designing Goal-based Experiences
 
Social Media: Strategic Overview & Business Implications
Social Media: Strategic Overview & Business ImplicationsSocial Media: Strategic Overview & Business Implications
Social Media: Strategic Overview & Business Implications
 
Digital Music Services (Strategic Review & Options)
Digital Music Services (Strategic Review & Options)Digital Music Services (Strategic Review & Options)
Digital Music Services (Strategic Review & Options)
 
Search Me: Designing Information Retrieval Experiences
Search Me: Designing Information Retrieval ExperiencesSearch Me: Designing Information Retrieval Experiences
Search Me: Designing Information Retrieval Experiences
 
Designing Frameworks For Interaction and User Experience
Designing Frameworks For Interaction and User Experience Designing Frameworks For Interaction and User Experience
Designing Frameworks For Interaction and User Experience
 
Massively Social Games: Next Generation Experiences
Massively Social Games: Next Generation ExperiencesMassively Social Games: Next Generation Experiences
Massively Social Games: Next Generation Experiences
 
Waves of Change Shaping Digital Experiences
Waves of Change Shaping Digital ExperiencesWaves of Change Shaping Digital Experiences
Waves of Change Shaping Digital Experiences
 
Frameworks Are The Future of Design
Frameworks  Are The Future of DesignFrameworks  Are The Future of Design
Frameworks Are The Future of Design
 

Dernier

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Dernier (20)

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

UX STRAT 2018 | Flying Blind On a Rocket Cycle: Pioneering Experience Centered Product Strategy For Emerging Spaces

  • 1. FLYING BLIND ON A ROCKET CYCLE PIONEERING EXPERIENCE-CENTERED PRODUCT STRATEGY FOR EMERGING SPACES
  • 2. JOE LAMANTIA Currently: VP Design & Development @ Bottomline Technologies Previous 20 years: end-to-end customer experience, all stages of product and service development, and digital / business transformation, focusing on emerging business and technology. Archetype(s): Sometime Entrepreneur / Proto-academic / Arm-chair Pro Cyclist https://www.linkedin.com/in/digitaljoelamantia/ @mojoe JoeLamantia.com [joelamantia.net]
  • 3. Businesses around the world depend on Bottomline Technologies (NASDAQ: EPAY) solutions to help them make complex business payments simple, smart and secure, including some of the world’s largest banks, and private and publicly traded companies.
  • 4. This case study describes building a learning-driven strategy capability to guide an adventurous product development group focused on the new domains of big data analytics and machine intelligence. I’ll share the outcomes of our efforts to launch new products chartered directly around customer experience value; outline the methods, tools, and perspectives that powered product discovery and strategic planning; share a framework and patterns for identifying and understanding emerging domains; and review the application of this toolkit to new situations.
  • 6. ROADS?! WHERE WE’RE GOING, WE DON’T NEED ROADS…!
  • 12. BUSINESS STRATEGY IS ABOUT IDENTIFYING YOUR BUSINESS OBJECTIVES AND DECIDING WHERE TO INVEST TO BEST ACHIEVE THOSE OBJECTIVES. Marty Cagan http://svpg.com/business-strategy-vs-product-strategy/
  • 13. THE PRODUCT STRATEGY SPEAKS TO HOW YOU HOPE TO DELIVER ON THE BUSINESS STRATEGY. Marty Cagan http://svpg.com/business-strategy-vs-product-strategy/
  • 18. OPPORTUNITY ASSESSMENT “I ASK PRODUCT MANAGERS TO ANSWER TEN FUNDAMENTAL QUESTIONS”
    1. Exactly what problem will this solve? (value proposition)
    2. For whom do we solve that problem? (target market)
    3. How big is the opportunity? (market size)
    4. What alternatives are out there? (competitive landscape)
    5. Why are we best suited to pursue this? (our differentiator)
    6. Why now? (market window)
    7. How will we get this product to market? (go-to-market strategy)
    8. How will we measure success/make money from this product? (metrics/revenue strategy)
    9. What factors are critical to success? (solution requirements)
    10. Given the above, what’s the recommendation? (go or no-go)
    Assessing Product Opportunities, by Marty Cagan, Dec 13, 2006. http://svpg.com/assessing-product-opportunities/
  • 19. PRODUCT DISCOVERY MODERN PRODUCT DISCOVERY
    • Introduction [:26]
    • Modern Product Discovery [:54]
    • The Evolution of Modern Product Discovery [4:15]
    • The Agile Manifesto [7:06]
    • The Rise of User Experience Design [8:47]
    • The Lean Startup: Eric Ries [9:49]
    • The Jobs-To-Be-Done Framework: Clayton Christensen and Anthony Ulwick [10:42]
    • OKRs and Design Sprints [12:12]
    • The Goal of Modern Product Discovery [14:27]
    • Putting Discovery Practices Into Context: The Opportunity Solution Tree [21:32]
    • The Future of Product Discovery [29:42]
    The Evolution of Modern Product Discovery, by Teresa Torres, February 8, 2017. https://www.producttalk.org/2017/02/evolution-product-discovery/
  • 21. WHY ARE YOU HERE…?
  • 23. PRODUCT STRATEGY CHARTS A DESIRED SET OF COURSES THROUGH THE SPACE OF POSSIBLE PRODUCTS FOR A DOMAIN Joe Lamantia http://svpg.com/business-strategy-vs-product-strategy/
  • 27. WHAT AM I LOOKING FOR…?
  • 29. DEEP STRUCTURE CHANGE VECTORS EARLY SIGNALS INFLECTION POINTS EMERGING SPACES HOLISTIC EXPERIENCES
  • 30. EACH ASPECT = POTENTIAL LEVERAGE POINT FOR STRATEGIC ENGAGEMENT
  • 31. DEEP STRUCTURE CHANGE VECTORS EARLY SIGNALS INFLECTION POINTS EMERGING SPACES HOLISTIC EXPERIENCES
  • 32. DEEP STRUCTURE
    ENTERPRISE / B2B: Business process; Activity; Social structure: Organizational model; Boundaries; Regulation; IT / Systems architecture; Lifecycle; Flows: capital, information, people; Frame: shareholder value, social enterprise
    CONSUMER / B2C: Value scheme: wealth, love, knowledge, safety; Demographics; Boundaries; Mores; Culture; Social structure: community / group; Frame: active lifestyle, sustainability
  • 33. ONCE UPON A TIME…
  • 34. Information Visibility through Endeca Discovery Applications MDEX Engine. Rapidly changing data and content; large volumes of highly attributed records; structured and unstructured information. Discovery Applications: Intuitive user experience guides untrained users to discover relationships in data. Specialized Database: High performance database purpose built for data-driven search, navigation, and analytics. Flexible Data Integration: Consolidate structured and unstructured data to bridge whitespace between enterprise systems.
  • 35. $$$$
  • 41. 1. GET IN THE HEADS OF DATA SCIENTISTS 2. BE THE SPIRIT OF THE PRODUCT
  • 45. UNDERSTAND & EMPATHIZE WITH CUSTOMER PERSPECTIVES >>ARTICULATE CUSTOMER VALUE SOURCES
  • 46. IDENTIFY BUSINESS IMPLICATIONS >> INFORM ALL STAGES OF PRODUCT & SERVICE DEVELOPMENT
  • 48. INVESTIGATING CUSTOMERS: “WHAT DO AP MANAGERS NEED (TO BE MORE EFFECTIVE (AT IMPROVING RECONCILIATION PROCESSES))? WHY?”
  • 49. OUTCOMES VALUE CHAINS MAP, CUSTOMER LANDSCAPE / SEGMENTS, PERSONAS, CAPABILITY MODELS, DOMAIN MODELS
  • 50. EXPLORING HYPOTHESES ABOUT VALUE: “AUTOMATION OF RECONCILIATION ACTIVITIES WILL ENABLE ACCOUNTS PAYABLE GROUPS IN MID-MARKET COMPANIES TO HANDLE 30% MORE TRANSACTIONS.”
  • 51. PRODUCT DEVELOPMENT IMPACT INNOVATION OPPORTUNITIES PRODUCT HYPOTHESES FOR VALIDATION PRODUCT CONCEPTS FOR PROTOTYPING PLANNING GUIDANCE (ROADMAP > EPIC > QA) DELIVERY GUIDANCE: FEATURES AND FUNCTIONS
  • 53. DUAL-TRACK AGILE 1. Hypothesis A “Lorem ipsum…” 2. Hypothesis B 3. Investigate A 4. Hypothesis C 5. Investigate B 6. Investigate C
  • 56. Data Scientist Square - San Francisco Bay Area Job Description Square is hiring a Data Scientist on our Risk team. The Risk team at Square is responsible for enabling growth while mitigating financial loss associated with transactions. We work closely with our Product and Growth teams to craft a fantastic experience for our buyers and sellers. Desired Skills & Experience As a Data Scientist on our Risk team, you will use machine learning and data mining techniques to assess and mitigate the risk of every entity and event in our network. You will sift through a growing stream of payments, settlements, and customer activities to identify suspicious behavior with high precision and recall. You will explore and understand our customer base deeply, become an expert in Risk, and contribute to a world-class underwriting system that helps Square provide delightful service to both buyers and sellers.
 
 To accomplish this, you are comfortable writing production code in Java and conducting exploratory data analysis in R and Python. You can take statistical and engineering ideas from prototype to production. You excel in a small team setting and you apply expert knowledge in engineering and statistics.
 
 Responsibilities 1. Investigate, prototype and productionize features and machine learning models to identify good and bad behavior. 2. Design, build, and maintain robust production machine learning systems. 3. Create visualizations that enable rapid detection of suspicious activity in our user base. 4. Become a domain expert in Risk. 5. Participate in the engineering life-cycle. 6. Work closely with analysts and engineers. Requirements 1. Ability to find a needle in the haystack. With data. 2. Extensive programming experience in Java and Python or R. 3. Knowledge of one or more of the following: classification techniques in machine learning, data mining, applied statistics, data visualization. 4. Concise verbal and written articulation of complex ideas. Even Better 1. Contagious passion for Square’s mission. 2. Data mining or machine learning competition experience. Company Description Square is a revolutionary service that enables anyone to accept credit cards anywhere. Square offers an easy to use, free credit card reader that plugs into a phone or iPad. It's simple to sign up. There is no extra equipment, complicated contracts, monthly fees or merchant account required.
 
 Co-founded by Jim McKelvey and Jack Dorsey in 2009, the company is headquartered in San Francisco.
  • 57. The Conway Model The ‘Subway’ Model
  • 59. WHAT SORT OF PERSON? ▸ They seem different than analysts: ▸ problem set ▸ relationship to discovery tools ▸ skills and professional profile ▸ discovery / analytical methods ▸ perspective ▸ workflow and collaboration ▸ Are they? How?
  • 60. AREAS OF INVESTIGATION ▸ Workflow ▸ Environment ▸ Organizational model ▸ Pain points ▸ Tools ▸ Data landscape ▸ Analytical practices ▸ Project structure ▸ Unmet needs
  • 61. TEXT
  • 62. DISCUSSION GUIDE
    Can you please walk me through a recent or current project?
    a. How was the project initiated?
    b. How defined was the business problem in the beginning? Did the problem change?
    c. Where, or from whom, did you obtain data sets? How did you make the decision?
    d. Describe the data you used: What did the data sets look like? How big were they? Were they structured or unstructured?
    e. What tools or techniques did you use to do the analyses? Did they map to the specific steps you mentioned just now?
    f. How did you decide these were the tools/techniques to use? To what extent were these decisions made by yourself and to what extent were they standardized by your group/team?
    g. How did you present the results of your analyses? What tools did you use? What do you like and dislike about your current tool set?
    h. Which stage of this project was the most challenging? To what extent did the tools satisfy what you intended to do? What features were lacking?
    i. How much collaboration was there during each stage of the project?
       i. Background and role of collaborators
       ii. Collaboration modes
       iii. Types of information shared
    Thinking about the projects you have worked on, is there a common approach you take to address these problems? How did you decide on this approach/tools?
  • 63. NEEDS
    What are the most common and useful statistical techniques you use during discovery and analysis efforts?
    “(1) The most commonly used statistical techniques used to date (in our strategic planning work) are: dimensionality reduction (partition clustering, multiple correspondence analysis), factor analysis, partition clustering (k-means, k-medoids, fuzzy clustering), cluster validation techniques (silhouette, Dunn’s index, connectivity), multivariate outlier detection, linear regression, and logistic regression.”
    What statistical capabilities or functions would be very useful if provided within Endeca discovery applications, and where would they be useful?
    “(2) Techniques that would assist with identifying outliers or invalid data. Much of this work seems to be done by hand. I believe that we are also getting to the point where we could start using linear regression and splines (for showing trends).”
  • 65. NEEDS
    For example, would system-generated descriptive statistical visualizations be useful for whole data sets, or for smaller user-selected groups of attributes?
    “With regards to your last question on visualization, we have put in significant effort to use visualization in our Endeca installation. We have built visualizations such as tree maps, flow diagrams, sun burst diagrams, scatter plots showing clusters, and hierarchical edge bundling diagrams to explore our data sets.”
    Would it be useful for the application to analyze and suggest possible distribution models it sees in the data; for the values of individual attributes, and/or for larger sets of data?
    “Our data tends to be qualitative rather than quantitative so this drives much of our visualizations. So yes, interactive descriptive statistical visualization would be helpful – on the complete data set and individual attributes.”
  • 66. Discovery/Information Needs
Support longer term strategic planning:
• How can we decrease the time-to-install service for new customers?
• How can we decrease the time it takes to restore service after a storm causes widespread outages?
• How can we decrease operational cost for each department/line of business?
• How many call center representatives do I need in my call center?
• How much offsite technician headcount do we need based on historical/seasonal trends balanced against current customer install base and ongoing sales/marketing efforts?
Evaluate success:
• How effective was a particular marketing campaign?
• How effective is a new training program for call center representatives?
• How effective is a self-install approach?
Understanding variables that impact KPIs. KPIs include:
• Call center volume
• % successful resolution by support staff
• Time-to-install
• Sales volume
• Sales revenue
Understanding & explaining variance using retrospective analyses:
• Why does Connecticut have a shorter time-to-install than Rhode Island?
• Why did 2 identical marketing campaigns in 2 different markets have vastly different impact on sales?
• Is the variance significant, or does it represent random deviation?
Ad-hoc reporting:
• How many calls to the call center needed to be escalated to tier 2 support last month?
• How many new customers complained that a technician was late/didn't show up for the install appointment?
Analyst Profile: Scott – Operations Analyst
Education: BA Information Systems (Connecticut State College); MBA Org Leadership (Johnson & Wales)
Summary: Scott is a mid-level analyst with a background in Business Information Systems and an MBA in Organizational Leadership. He works in a 6-person team at Cox-New England (Telecommunications). His current role involves conducting data mining analysis to support operations research and organizational decision making/strategic planning.
Scott's work supports both sides of the profit equation: operations research/analysis to support internal cost-cutting and process innovation, and formative/summative evaluation to help drive effective sales/marketing efforts to increase revenue. His group is also given target cost savings goals that they need to help individual departments achieve to fulfill a cost reduction organizational mandate. His group accomplishes this by discovering inefficiencies in process through data mining, predictive modeling and retrospective data analysis. Cox has highly attributed enterprise data on customers, marketing campaigns, pricing variants and special offers, demographics, geography of the area, building and home types, school schedules, weather events, etc. that describe customer usage patterns, consumption of media bandwidth, etc. Each of their products (data, cable, phone, wireless) has different usage profiles that vary along many of the dimensions and variables listed above. His group is focused on residential customers; business customers are handled by a separate unit.
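One of the retrospective questions in this profile ("Is the variance significant, or does it represent random deviation?") can be made concrete with a permutation test. The sketch below is only an illustration of that kind of check, not a method attributed to the analyst: the function name `permutation_test` and the time-to-install figures are invented for the example, and the test compares group means under the assumption that observations are exchangeable.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=5000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Returns (observed_difference, p_value). The p-value estimates how often a
    random relabelling of the pooled observations yields a difference at least
    as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    p_value = (at_least_as_extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Invented time-to-install figures in days, for illustration only.
connecticut = [3, 4, 3, 5, 4, 3, 4, 5]
rhode_island = [6, 7, 5, 8, 6, 7, 6, 8]

difference, p = permutation_test(connecticut, rhode_island)
print(f"difference in means: {difference:.2f} days, p ~ {p:.4f}")
```

A small p-value here would suggest the gap between the two markets is unlikely to be random deviation; explaining why it exists is still a separate, domain-driven step.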
  • 68. ‘FIVE THINGS ANALYSTS DO WITH DATA’ ▸ Clustering ▸ Dimension Reduction ▸ Anomaly Detection ▸ Characterization ▸ Testing probability model & validation. Bracketed on the slide into three groups: Structure of data; Profile of data; Validity of data. Source: Frontiers in Massive Data Analysis http://www.nap.edu/openbook.php?record_id=18374
  • 71. Nature of sense making activity
Business Analytics: Intuitive, Manual, Gradual, Individual
Data Science: Empirical, Augmented, Accelerated, Cooperative*
  • 73. Sense Maker Segment
Sense makers need to create and/or employ insights to accomplish their business goals and satisfy their responsibilities. These insights emerge from independent and collaborative discovery efforts that involve direct interaction with discovery applications, and participation in discovery environments.
Roles: Insight Consumer, Analyst, Casual Analyst, Data Scientist, Analytics Manager, Problem Solver
  • 74. Dana, Data Scientist
“I’ll do whatever it takes – wrangle, extract, manipulate, analyze, experiment, prototype – to use data to drive value & innovate”
Creates data-driven insights, offerings, and resources to transform the organization
Job Title: Senior Data Scientist
Company: LinkedIn
Work Experience: 10 Years
Education: Ph.D. Statistics, MS Bio-Informatics
Background
Dana is a Senior Data Scientist who has worked at LinkedIn for 5 years. Dana’s education includes a Ph.D. in Statistics and an MS in Bio Informatics. Dana’s previous work includes positions in academic research groups as a doctoral candidate and post-doc, as well as software engineering roles in the Internet & technology industries.
Work Context
• Dana works with several other data scientists and her Analytics Manager on a centralized team
• Dana and her colleagues aim to create data driven insights, features, resources, and offerings that deliver strategic value to LinkedIn
• Dana works with Analysts on other teams to define and create discovery tools, data sets, and methods for use by their groups at LinkedIn
• Dana & team are visible & well established within LinkedIn, and have a voice in product strategy and operational context; they have a high degree of autonomy in defining data science projects
• Dana works with Insight Consumers to suggest and determine potential new data driven offerings to prototype and evaluate
Typical Discovery Scenarios & Problems
• How can we leverage data to increase online engagement with LinkedIn?
  – How should we measure engagement & what factors drive it?
  – What aspects of a personal profile are most likely to encourage / discourage new connections between people?
  – How can we increase people’s activity and contributions to topical discussion groups?
• What factors drive the effectiveness of our marketing campaigns?
  – Why did one of our marketing campaigns work exceptionally well?
• How can we leverage data to help recruiters identify and communicate effectively with qualified and potentially available candidates?
Activities
• Mines, analyzes, & experiments with data to identify patterns, trends, outliers, causal factors, predictive models, & opportunities
• Defines and explains newly devised measurements, predictive models, & insights
• Compares effectiveness of operations at achieving company goals for engagement, growth, data quality
• Produces & explores new data sets
• Collaborates with other data scientists to capture new data streams
• Prototypes new data driven site features/offerings
• Runs data based experiments to test/evaluate models, hypotheses & prototypes
• Communicates & explains analyses to colleagues & Insight Consumers
Key Goals
• Leverage data to support the org mission
• Enhance products & services with data-driven insights and features
• Use data to identify new opportunities and prototype/drive new customer offerings
• Create useful data sets/streams, measures, & resources (e.g., data models, algorithms, etc.)
Tools
• Open source data manipulation, mining & analysis tools including R, Pig, Hadoop, Python, etc.
• Statistical packages such as SAS, SPSS, etc.
• Custom analytical tools built using open source components and languages
Pain Points
• Defining and capturing useful measures of online attention
• Getting all the data analytic tools to work together properly
• No current workflow support or tools for data wrangling, analysis, experimentation, and prototyping
Wish List
• Effective tools to help experiment with and evaluate value/utility of features and activities for users
• Ability to rapidly prototype data-driven features w/out risk of online service disruptions
Sample Workflow
• Analyze & Identify causal/predictive factors: Who are the best candidates to contact for a job based on recruiter needs and profile content?
• Prototype & Experiment with data driven feature: How can we prototype/evaluate this w/out disrupting the site?
• Gather Data & Analyze Results: Use descriptive, inferential, and predictive statistics to evaluate results
• Summarize & Communicate: Review findings with colleagues; summarize, visualize, and communicate key findings to Insight Consumers/decision makers
  • 75. Perspectives
Analytical: The analytical perspective is the center of definition for all analytical roles. Contrast with engineers, who "make stuff". Analytical roles figure things out for some purpose, whether to build a model that informs a product prototype or to provide insight.
Empirical: The empirical perspective is distinct from the analytical perspective, and marks 'true' data scientists. It revolves around framing and testing hypotheses formally and informally, often requires validation and interrogation of experimental methods and results by others, and expects a significant degree of transparency at (all) stages of the analytical effort.
  • 76. Empirical Method
[Figure: concept map linking three roles (Insight Consumer, Data Scientist, Data Engineer) to questions or beliefs, hypotheses, predictions, experiments, results, conclusions, insights, analytical methods, analytical tools (algorithm, script, test), models, data sets, data sources (development corpus, production corpus, external sources), and data products.]
Guiding questions: What is the question? How will we answer the question? What data will we use? What analytical method will we use? What tools will we use? What are the results? What do the results mean? What did we learn / discover? Who should we inform? What is the next question?
EMPIRICAL DISCOVERY: “a hybrid, purposeful, applied, augmented, iterative and serendipitous method for realizing novel insights for business, through analysis of large and diverse data sets.”
Source: Data Science and Empirical Discovery: A New Discipline Pioneering a New Analytical Method https://blogs.oracle.com/serendipity/entry/data_science_and_empirical_discovery
  • 78. Analysis Workflow & Activities
• Empirical analysis of subsets of data
  – Understand topology of data, boundaries (sets / subsets, complete corpus, totality of data)
• Outlier identification and profiling
  – How significant are outliers to overall topology
    » Comparative exclusion and profiling of resulting data subsets to understand their role, discover principal components
• Find and analyze patterns, areas of interestingness / deserving attention
• Find and analyze central actors / factors (in existing model that produced source data, in topology of working data, in patterns, etc.)
  – ID and understand their impact on local and global data topology and primary metrics if in several ways / more than one axis / at the same time
• Discover and analyze relationships amongst central actors
  – Understand cycles, trends, changes (dynamic characteristics) for core actors, topology, patterns and structure
  – Understand causal factors
• Codify / create new model reflecting insights & outcomes from experiments
  • 79. Data Science Workflow
• Frame problem / goal of effort
• Identify and extract data to be used in effort from whole corpus / totality of available data
  – Exploratory identification and selection of working data for use in experiments
• Define experiment(s): hypothesis / null hypothesis, methods, success criteria
  – Derive insight(s)
  – Wrangle, process, visualize, interpret
• Codify / create new model reflecting insights & outcomes from experiments
• Validate new model(s)
• Provision training data
• Train new model
• Validation and outcome of training model
• Hand-off for implementation on production systems / as production code
  • 80. THE ESSENCE ▸Empirical perspective ▸Business imperatives drive activities ▸Analytical approach ▸Recipe is always the same ▸Engineering always present ▸Data challenges are paramount ▸consume 60% - 80% of time and effort ▸Data volumes range huge to moderate (PB > MB) ▸Domain often drives analysis ▸Data scientists already have self-service ▸Some new problems, many the same ▸Use ‘advanced’ analytics, not conventional BA ▸Innovate by applying known analyses to new data ▸Current workflow fragmented across tools and data stores ▸Success can be a model, product, insight, infrastructure, tool
  • 81. Model of Analytical Workflow Articulates common analytical activities “realistic” - represents wrangling, some iterative dynamics bounded - does not represent business perspective Originated by Ben Lorica - O’Reilly *consistent with our research*
  • 82. UNDERSTAND & EMPATHIZE WITH CUSTOMER PERSPECTIVES >>ARTICULATE CUSTOMER VALUE SOURCES
  • 85. THE ESSENCE ▸Empirical perspective ▸Business imperatives drive activities ▸Analytical approach ▸Recipe is always the same ▸Engineering always present ▸Data challenges are paramount ▸consume 60% - 80% of time and effort ▸Data volumes range huge to moderate (PB > MB) ▸Domain often drives analysis ▸Data scientists already have self-service ▸Some new problems, many the same ▸Use ‘advanced’ analytics, not conventional BA ▸Innovate by applying known analyses to new data ▸Current workflow fragmented across tools and data stores ▸Success can be a model, product, insight, infrastructure, tool
  • 86. “…HOUSTON, WE'VE GOT A PROBLEM”
  • 87. John is tasked with analyzing 30 years of crime data collected by three different authorities. Accordingly, the data arrive in three different formats: one source is a relational database, another is a comma-separated values (CSV) file, and the third file contains data copied from various tables within a portable document format (PDF) report. Knowing the structure required for his visualization tool, John first reviews the different data sets to identify potential problems (step 1 in Figure 1). The relational database allows him to specify a query and generate a file in an acceptable format. For the comma delimited data, the column headings associated with the data were unclear. Using spreadsheet software he adds a row of header information at the top to fit the format required by the visualization tool. While updating the header, John notices that the location of a given crime is encoded in one column (as ‘City, State’) in the CSV file and encoded in two columns (one ‘City’ column and one ‘State’ column) in the relational database. He decides to split the column in the CSV file into two separate columns. John then opens the text file in the spreadsheet but the spreadsheet does not parse the data as desired. After manually moving data fields to appropriate columns and some other manipulation (step 2), John finally has consistent columns and now combines the three files into one, but then notices that some columns have inconsistently formatted cells. The ‘Date’ column is formatted as ‘dd/mm/yy’ in some cells and as ‘mm/dd/yyyy’ in others. John returns to the original files, transforms all the dates to the same format, and recombines the files. John loads the merged data file in a visualization tool (step 3). The tool immediately gives the error message ‘Empty cells in column 3’; it cannot cope with missing data. John returns to the spreadsheet to fill in missing values using a few spreadsheet formulas (back to step 2). 
He edits the data by hand; sometimes he transforms the data (e.g. one state reports data only every other year so he uses an average for the missing years). At other times there is nothing he can do after diagnosing a new problem (i.e. return to step 1). For example, he finds out that survey question 24 did not exist before 2000, and the most recent year of data from Ohio has not been delivered yet, so he tries to pick the best possible value (e.g. 1) to indicate missing values. John detects other, more nuanced, problems; for example, some cells have a blank space instead of being empty. It took hours to notice that difference. John tries to follow a systematic approach when evaluating the data, but it is difficult to keep track of what he has inspected and how he has modified the data, especially because he discovers different issues across different files. Even after all of this work, he is not sure if he has examined all of the variables or overlooked any outliers. After a while, the data file seems good enough and he decides to move on. It took a few days so it is with a great sense of accomplishment that John finally loads the data for the second time into the visualization tool he wants to use (step 3 again). He constructs several views of the data, including a geospatial representation of the crimes and a scatterplot of age against crime. As soon as he sees the visualized data he realizes that, unfortunately, data quality issues still persist. Extreme outliers appear in the visualization. Some outliers seem to be valid data (e.g. data from the District of Columbia are very different from data from every other state). Others seem suspicious (criminals may vary in age from teenagers to older adults, but apparently babies are also committing crimes in certain states). John iteratively removes those outliers he believes to be dirty data (e.g. criminals under 7 and over 120 years old). 
Time series visualizations indicate that, in 1995, some causes of death disappear abruptly while new ones appear. Two days later, an email exchange with colleagues reveals that the classification of causes of death was changed that year. John writes a transformation script to merge the data so he can analyze distinct terms referring to the same (or at least similar) cause of death. Although the ‘real’ analysis is just about to start (step 4), John has made dozens of transformations, repeated the process several times, made important discoveries relating to the quality of the data, and made many decisions impacting the quality of the final ‘clean’ data. He also used visualization repeatedly while walking through the process, but still does not have results to show to his boss. Finally, he is able to work with the usable data, and useful insights come to the surface, but updated data sets arrive (step 5). Without proper documentation (step 6) of his transformations, John might be forced to repeat many of the tedious tasks. “Research directions in data wrangling: Visualizations and transformations for usable and credible data” “a process of iterative data exploration and transformation that enables analysis.” WRANGLING SCENARIO
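Several of the fixes in this scenario (splitting a combined ‘City, State’ column, reconciling ‘dd/mm/yy’ and ‘mm/dd/yyyy’ dates, treating a lone blank space as a missing value, dropping implausible ages, and documenting each transformation) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the tooling described in the cited paper: the column names, the sample rows, and the rule that a two-digit year always means day-first are all invented for the example.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw extract mixing the problems described in the scenario.
RAW = "\n".join([
    "Location,Date,Age",
    '"Columbus, Ohio",25/12/95,34',
    '"Hartford, Connecticut",03/14/2001,',
    '"Providence, Rhode Island",07/04/1999, ',
    '"Columbus, Ohio",14/02/98,2',
])

log = []  # provenance: one entry per transformation applied


def record(step, row_number, detail):
    log.append({"step": step, "row": row_number, "detail": detail})


def parse_date(text):
    """Normalise 'dd/mm/yy' and 'mm/dd/yyyy' cells to ISO 8601.

    Assumes a two-digit year always means dd/mm/yy and a four-digit year
    always means mm/dd/yyyy; real data may need more care.
    """
    day_first = len(text.rsplit("/", 1)[1]) == 2
    fmt = "%d/%m/%y" if day_first else "%m/%d/%Y"
    return datetime.strptime(text, fmt).date().isoformat()


def wrangle(raw_text, min_age=7, max_age=120):
    rows = []
    reader = csv.DictReader(io.StringIO(raw_text))
    for number, row in enumerate(reader, start=1):
        city, state = [part.strip() for part in row["Location"].split(",", 1)]
        record("split_location", number, f"{row['Location']!r} -> city/state")

        date = parse_date(row["Date"].strip())
        record("normalise_date", number, f"{row['Date']} -> {date}")

        age_text = row["Age"].strip()  # a lone blank space counts as missing
        age = int(age_text) if age_text else None
        if age is None:
            record("missing_age", number, "kept as None")
        elif not min_age <= age <= max_age:
            record("drop_outlier", number, f"age {age} outside {min_age}-{max_age}")
            continue

        rows.append({"city": city, "state": state, "date": date, "age": age})
    return rows


clean = wrangle(RAW)
for entry in log:
    print(entry)
```

Keeping the `log` alongside the cleaned rows is the point of the sketch: when updated data sets arrive, the recorded steps show what was done and can guide a rerun instead of a manual repeat.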
  • 88. Although the ‘real’ analysis is just about to start (step 4), John has made dozens of transformations, repeated the process several times, made important discoveries relating to the quality of the data, and made many decisions impacting the quality of the final ‘clean’ data. He also used visualization repeatedly while walking through the process, but still does not have results to show to his boss. Finally, he is able to work with the usable data, and useful insights come to the surface, but updated data sets arrive (step 5). Without proper documentation (step 6) of his transformations, John might be forced to repeat many of the tedious tasks. “Research directions in data wrangling: Visualizations and transformations for usable and credible data” “a process of iterative data exploration and transformation that enables analysis.” WRANGLING SCENARIO
  • 89. One or more initial data sets may be used and new versions may come later. The wrangling and analysis phases overlap. While wrangling tools tend to be separated from the visual analysis tools, the ideal system would provide integrated tools (light yellow). The purple line illustrates a typical iterative process with multiple back and forth steps. Much wrangling may need to take place before the data can be loaded within visualization and analysis tools, which typically immediately reveals new problems with the data. Wrangling might take place at all the stages of analysis as users sort out interesting insights from dirty data, or new data become available or needed. At the bottom we illustrate how the data evolves from raw data to usable data that leads to new insights. “a process of iterative data exploration and transformation that enables analysis.” WRANGLING IN THE ANALYTICAL WORKFLOW
  • 91. Discovery in the Analytical Workflow • Commonly recognizable cycle and focus for discovery activities (subset) • Explicitly iterative, ad-hoc, dynamic • Goal = incremental / directional advance in understanding • Core modes of engagement with data = Explore, Analyze • Modeling phase does not involve exploration Discovery
  • 92. DEEP STRUCTURE CHANGE VECTORS EARLY SIGNALS INFLECTION POINTS EMERGING SPACES HOLISTIC EXPERIENCES
  • 96. The Language of Discovery: A concrete descriptive language for human discovery activity in diverse contexts. A simple and consistent vocabulary that is independent of domain, role, information type, etc.
  • 99. Generative tool for discovery capability and experiences
  • 101. Discovery Modes “a broad, but identifiable discovery activity that is not tied exclusively to a particular context or domain.”
  • 103. Locate: To find a specific (possibly known) thing. E.g. I need to find a new part with particular technical attributes and then source it from the most qualified supplier. (Engineering)
Verify: To confirm or substantiate that an item or set of items meets some specific criterion. E.g. How can I determine if I am looking at the latest information for a part or supplier? (Supply Chain Specialist)
Monitor: To maintain awareness of the status of an item or data set for purposes of management or control. E.g. I need to monitor at risk/failing customers/dealers so I can prompt my Account Reps to fix the problems. (Sales Manager)
  • 104. Compare: To examine two or more things to identify similarities & differences. E.g. I need to compare our module set teardowns with competitive teardown information to see if we’re staying competitive for cost, quality and functionality. (Engineering)
Comprehend: To generate insight by understanding the nature or meaning of something. E.g. I need to analyze and understand consumer-customer-market trends to inform brand strategy & communications plan. (Director, Brand Image)
Explore: To proactively investigate or examine something for the purpose of knowledge discovery. E.g. I need to understand the cost drivers for this commodity so I can negotiate better terms with my suppliers and forecast business risk based on market indices. (Procurement)
  • 105. Analyze: To critically examine the detail of something to identify patterns & relationships. E.g. I need to know the cost drivers for a part such as materials that impact cost. Is the relationship a correlation or step function for a part cost driver? (Engineering)
Evaluate: To use judgement to determine the significance or value of something with respect to a specific benchmark or model. E.g. I need to determine my current state in my prints so I can evaluate if I have price variation to negotiate a better price. (Procurement)
Synthesize: To generate or communicate insight by integrating diverse inputs to create a novel artifact or composite view. E.g. I need to prepare a weekly report for my boss (sales mgr) of how things are going. (Account Rep)
  • 111. Discovery Modes and Activity Explore Wrangle Analyze Augment Sensemaking Transformation data quality computed / enriched data New data triggers new cycles Cumulative Change Direction & Momentum Begin Conclude Goal: Make data useful for analysis Goal: Understand the nature and usefulness of data for analysis. Goal: Accumulate insight through iterative analysis Goal: Achieve insights by analyzing data.
  • 112. Working with data to effect outcomes Explore Wrangle Analyze Augment Sensemaking Transformation data quality computed / enriched data New data triggers new cycles Cumulative Change Direction & Momentum Begin Conclude Advancing insight Can’t do this… …Without these capabilities Apparent Mode and Activity Affinities
  • 113. Explore Wrangle Analyze Augment Sensemaking Transformation source data source & enriched data New data triggers new cycles Cumulative incremental progress Focus of attention: Organization of the data and quality issues Focus of attention: Actual & potential insights Real wrangling Real analysis Actual Discovery Modes and Activity Affinities
  • 114. CAPABILITIES FOR VISUAL DISCOVERY & ANALYSIS TOOLS ▸ Explore data corpus ▸via effectively characterized catalog ▸ Explore individual data sets ▸effective preview / sample / subset ▸ Analyze data ▸within ad-hoc data sets, across ad-hoc data sets ▸ Wrangle data ▸within ad-hoc data sets, across ad-hoc data sets ▸ Verify outcomes: insights, models, data products ▸ Synthesize outcomes ▸ distinct types = insights, model, data product (project) ▸ Publish outcomes ▸ distinct types = insight, data product, model (project) ▸ Integrate specialized / external analytical tools {augment} ▸ analysis tools (R, Python), reference models, validation tools ▸ Integrate external workflow tools {enhancing} ▸ e.g. figshare, model management, projects ▸ Support analytical workflow {enhancing}
  • 115. Discovery Capabilities: Core Explore Wrangle Analyze Augment Sensemaking Transformation data quality computed / enriched data Core discovery capabilities
  • 116. Discovery Capabilities: Enhancing Explore Wrangle Analyze Augment Sensemaking Transformation data quality computed / enriched data Publish & operationalize outcomes Workflow, provenance, versioning, accelerators, collaboration Acquire and access data Enhancing capabilities
  • 117. DEEP STRUCTURE CHANGE VECTORS EARLY SIGNALS INFLECTION POINTS EMERGING SPACES HOLISTIC EXPERIENCES
  • 118. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 118 Activity Cycles & Capabilities Core Capabilities activity specific progressive Influencer By-product PublishImport Precursor • Core capabilities are necessary & primary to complete a given cycle • Enhancing capabilities are secondary within a cycle • Enhancing capabilities are necessary to accumulate assets(?) • Enhancing capabilities are necessary to advance to next cycle(?) asset types Workflow Collaboration PublicationAccelerators Enhancing Capabilities common random access Versioning Successor Provenance Metadata PublishImport Curation Governance Import
  • 119. Capability Evolution Core Capabilities activity specific progressive Influencer By-product PublishImport Precursor • Core capabilities are necessary & primary to complete a given cycle • Enhancing capabilities are secondary within a cycle • Enhancing capabilities are necessary to accumulate assets(?) • Enhancing capabilities are necessary to advance to next cycle(?) asset types Workflow Collaboration PublicationAccelerators Enhancing Capabilities common random access Versioning Successor Provenance Metadata PublishImport Curation Governance Import
  • 122. OPPORTUNITY “IS THERE ANY THERE, THERE?”
  • 123. PRODUCT STRATEGY CHARTS A DESIRED SET OF COURSES THROUGH THE SPACE OF POSSIBLE PRODUCTS FOR A DOMAIN Joe Lamantia PRODUCT STRATEGY
  • 128. Tools on the Market Now Explore Wrangle Analyze Augment Sensemaking Transformation data quality computed / enriched data Cumulative Change Direction & Momentum Begin Conclude Paxata, Trifacta Beyond Core? OSS / hand rolled EID 3.x Wave 1 wrangling tools now in market No good exploration tool in market
  • 129. Tools on the Market Now Explore Wrangle Analyze Augment Sensemaking Transformation data quality computed / enriched data Cumulative Change Direction & Momentum Begin Conclude Alteryx Datameer Modest exploration capabilities
  • 130. Tools on the Market Now Explore Wrangle Analyze Augment Sensemaking Transformation data quality computed / enriched data Cumulative Change Direction & Momentum Begin Conclude Alteryx Modest exploration capabilities Qlik
  • 131. Tools on the Market Now Explore Wrangle Analyze Augment Sensemaking Transformation data quality computed / enriched data Cumulative Change Direction & Momentum Begin Conclude Tableau, Platfora Wave 1 visual analysis tools now in market Modest wrangling capabilities
  • 132. BDD 1.x? Explore Wrangle Analyze Augment Sensemaking Transformation data quality computed / enriched data Cumulative Change Direction & Momentum Begin Conclude
  • 133. BDD Future 1.x? Explore Wrangle Analyze Augment Sensemaking Transformation data quality computed / enriched data Cumulative Change Direction & Momentum Begin Conclude ‘Pluggable’ external tools
  • 134. BDD Future 2.x? Explore Wrangle Analyze Augment Sensemaking Transformation data quality computed / enriched data Cumulative Change Direction & Momentum Begin Conclude
  • 135. VISUAL DISCOVERY AND ANALYSIS TOOLS: WAVE 1
Definition: traditional discovery & analysis possible on Hadoop stores
Value prop = easy access to Hadoop stores for analysts w/out data engineer
In / coming to market now: Platfora, Datameer, ClearStory, Sisense, etc.
Segment is viable (people understand the need & have the problem)
Tool maturity will increase incrementally, and in customary ways:
– alignment to workflow particulars
– nuanced and compelling UX
– broader footprint of supporting capabilities: provenance, publishing, collaboration
– integration with ecosystem of related tools for activity
This class of tools competes with & may replace / displace existing non-Hadoop native tools that are still rising with the general analytics wave: Qlik, Tableau, MicroStrategy
Firms making new investments (for new stacks) will try / buy this new generation
Firms extending existing investments less likely to buy new
Long view = tools in this segment could ‘eat’ BI marketshare by adding reporting and other structured analytical capabilities that capture customers who do not have large BI stacks now, begin investing here, and subsequently need BI capability
  • 139. DEEP STRUCTURE CHANGE VECTORS EARLY SIGNALS INFLECTION POINTS EMERGING SPACES HOLISTIC EXPERIENCES
  • 141. Oracle Confidential – Internal Oracle Big Data Discovery Overview Richard Tomlinson Director, Product Management September 25, 2014
  • 142. Hadoop Data Reservoir Concept Gaining Momentum
[Diagram labels: Data Warehouse, Data Reservoir, Emerging Sources, Existing Sources]
Source: wikibon.org/wiki/v/Big_Data_Vendor_Revenue_and_Market_Forecast_2013-2017
Source: 451 Research – Total Data Warehousing: 2013-2018
Source: The Forrester Wave™: Big Data Hadoop Solutions, Q1 2014
  • 143. Not Easy to Get Analytic Value from Hadoop
• Existing analytic tools fall short
  – Fail to expose potential of data up front
  – Rely on upstream ETL processes to cleanse and prepare data
  – Optimized for SQL not unstructured data
  – Not built for discovery (assume users know what questions to ask)
• Only point solutions emerging
  – Leads to constant context switching
  – Need end-to-end capabilities
• Early Hadoop tools complex
  – Pig, Oozie, Sqoop, Hive, Spark, etc
• Specialized skills are scarce
  – Programming languages (e.g. Map Reduce, Python, Scala)
  – Statistics and machine learning
  – Command line interfaces
  • 144. Requires a Fundamentally New Approach
A single intuitive, interactive and visual user interface for anyone to quickly find, explore, transform and analyze data in Hadoop then share results for enterprise leverage
[Diagram labels: Find, Explore, Transform, Discover]
  • 145. Oracle Big Data Discovery. The Visual Face of Hadoop
[Diagram labels: Find, Explore, Transform, Discover]
  • 146. Easily Find Relevant Data Sets • Navigate a rich catalog of all data in the Hadoop cluster • Familiar search and guided navigation for ease of use • Access data set summaries, annotation and recommendations • Provision your own data through self-service upload • Data is automatically enriched with extracted locations, terms, sentiment • Browse personal big data projects and those shared by the community
  • 147. Explore the Data and Understand Potential • Understand shape of the data. Visualize attributes by type • Entropy-based sorting by information potential • View attribute statistics, data quality and outliers • Use scratch pad to see statistical correlations between attribute combinations • Evaluate whether a data set is worthy of further investment
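The "entropy-based sorting" bullet can be illustrated with a minimal sketch. The deck does not say how BDD scores information potential, so this assumes plain Shannon entropy over each attribute's value distribution; the function names and sample records are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the observed value distribution."""
    counts = Counter(values)
    total = len(values)
    # Sum p * log2(1/p) over the distinct values.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def rank_attributes_by_entropy(records):
    """Return (attribute, entropy) pairs, highest entropy first."""
    columns = {key for row in records for key in row}
    scores = {
        col: shannon_entropy([row.get(col) for row in records])
        for col in columns
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample: 'status' never varies, so it carries no information.
records = [
    {"id": 1, "city": "Boston", "status": "active"},
    {"id": 2, "city": "Boston", "status": "active"},
    {"id": 3, "city": "Austin", "status": "active"},
    {"id": 4, "city": "Austin", "status": "active"},
]
ranking = rank_attributes_by_entropy(records)
for name, bits in ranking:
    print(f"{name}: {bits:.2f} bits")
```

A constant attribute scores zero and sinks to the bottom of the list. Raw entropy also ranks unique identifiers highest, so a real tool would likely need to discount them; that refinement is left out here.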
  • 148. Transform and Enrich Data to Make it Ready • Intuitive user-driven data wrangling • Library of data transformations to replace values, convert types, collapse, reshape, pivot, group, custom tag, merge and much more • Data enrichments for inferring location and language. Theme, entity and sentiment enrichments for text • Preview results, undo, commit and replay transforms • Run on sample data in memory or full data set in Hadoop
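The preview / undo / commit / replay loop described above can be sketched as an ordered script of row-level transforms. This is an illustrative assumption, not BDD's API: the `TransformScript` class, its method names and the sample rows are all hypothetical.

```python
class TransformScript:
    """Ordered row-level transforms that can be previewed, undone and replayed."""

    def __init__(self):
        self.steps = []      # pending (name, function) pairs
        self.committed = []  # steps frozen by commit()

    def add(self, name, fn):
        self.steps.append((name, fn))

    def undo(self):
        """Drop the most recent pending step, if any."""
        if self.steps:
            self.steps.pop()

    def commit(self):
        """Freeze pending steps so they can be replayed later."""
        self.committed.extend(self.steps)
        self.steps = []

    def _run(self, rows, steps):
        out = [dict(row) for row in rows]  # never mutate the input rows
        for _, fn in steps:
            out = [fn(row) for row in out]
        return out

    def preview(self, sample_rows):
        """Apply committed and pending steps to a small in-memory sample."""
        return self._run(sample_rows, self.committed + self.steps)

    def replay(self, all_rows):
        """Apply only committed steps to the full data set."""
        return self._run(all_rows, self.committed)


rows = [{"city": " boston ", "amount": "12"}, {"city": "AUSTIN", "amount": "7"}]

script = TransformScript()
script.add("clean_city", lambda r: {**r, "city": r["city"].strip().title()})
script.add("amount_to_int", lambda r: {**r, "amount": int(r["amount"])})
script.add("drop_city", lambda r: {k: v for k, v in r.items() if k != "city"})
script.undo()                              # discard the last step

sample_preview = script.preview(rows[:1])  # check the result on a sample first
script.commit()
full_result = script.replay(rows)          # then run on everything
```

Keeping the script separate from the data is what makes "run on sample, then on the full set" cheap: the same committed steps are simply replayed against a larger input.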
  • 149. Analyze the Data to Discover New Insights • Mash up different data sets for deeper perspectives • Drag and drop from a rich library of interactive visualizations to compose discovery dashboards • Filter through data with powerful search and intuitive guided navigation • Share projects, bookmarks and snapshots with team members for collaboration
  • 150. Share Results and Publish for Enterprise Leverage • Share and collaborate with the team – Share projects, bookmarks and snapshots then collaborate and iterate • Publish back to Hadoop – Transforms and enrichments may be applied to original data sets in Hadoop – Publish blended data sets back to HDFS • Leverage results in other tools – Publish data to Hadoop in format optimized for advanced analytic tools (e.g. ORAAH) – Hadoop compliant BI tools (e.g. OBIFS) can burst out to the masses – Leverage any native Hadoop tooling (e.g. Pig, Hive, Impala, Python, etc.) – Integrate BDD data sets with DWH to secure, govern and optimize for query performance (e.g. Oracle Big Data SQL) • Oracle Big Data Discovery plays well with the big data ecosystem [Diagram: Find, Explore, Transform, Discover; Share & Collaborate; raw data and transformed data in the data reservoir (HDFS); Publish; Leverage: data warehouse, business intelligence, advanced analytics, other Hadoop tools]
  • 151. Oracle Big Data Discovery: Technical Innovation on Hadoop. Oracle Big Data Discovery Workloads: Data Processing, Workflow & Monitoring • Profiling: catalog entry creation, data type & language detection, schema configuration • Sampling: dgraph (index) file creation • Transforms: >100 functions • Enrichments: location (geo), text (cleanup, sentiment, entity, key-phrase, whitelist tagging); Self-Service Provisioning & Data Transfer • Personal Data: Upload CSV, XLS and JSON to HDFS • Enterprise Data: Provision from RDBMS to HDFS; In-Memory Discovery Indexes • DGraph: Search, Guided Navigation, Analytics Studio • Web UI: Catalog, Explore, Transform, Analyze, Share. Hadoop Cluster (BDA or Commodity): name node, data nodes. Hadoop 2.x: Filesystem (HDFS), Workload Mgmt (YARN), Metadata (HCatalog). Other Hadoop Workloads: MapReduce, Spark, Hive, Pig, Oracle Big Data SQL (BDA only)
  • 152.
  • 153.
  • 154. DEEP STRUCTURE <> ANALYTICAL WORKFLOW CHANGE VECTORS <> BIG DATA TECHNOLOGIES EARLY SIGNALS <> RISE OF DATA SCIENCE INFLECTION POINTS <> DATA SCIENCE MOMENT EMERGING SPACES <> EMPIRICAL DISCOVERY HOLISTIC EXPERIENCES <> VISUAL DISCOVERY TOOL
  • 156. VISUAL DISCOVERY & ANALYSIS TOOLS: WAVE 2 • Definition: augmented discovery & analysis across full business data corpus • Value prop = deeper insights from more diverse data, faster insights, effected via a mixed toolkit of (semi)automated analytical techniques (clustering, machine learning, regression / correlation, etc.) that enhances and directs analyst attention • Vectors of augmentation: data types, degree of automation – data = text / lingual, location / spatial, native graph, native stream – automation = which specific activities are augmented, and to what degree • Wave 2 is at the ‘pioneer’ stage: specifics of capability, value, implementation unknown • Limiting factors: – Domain specificity: value of general discovery analytics drops once domain boundaries are reached; need to align specifically to domain view of world. Expect verticalization of all analytics – Low / no tolerance for black boxes: deeper insights require transparency – Analytical literacy: level increasing, but orgs can’t benefit from advanced analytical techniques if not understood & trusted
  • 158. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |Oracle Confidential – Internal Feature Selection Joe Lamantia Product Strategy(ist) Oracle Big Data Discovery November, 2014
  • 159. Feature Selection: In machine learning and statistics, feature selection, also known as variable selection, attribute selection or variable subset selection, is the process of selecting a subset of relevant features for use in model construction. The central assumption when using a feature selection technique is that the data contains many redundant or irrelevant features. Redundant features are those which provide no more information than the currently selected features, and irrelevant features provide no useful information in any context. Feature selection techniques are a subset of the more general field of feature extraction. Feature extraction creates new features from functions of the original features, whereas feature selection returns a subset of the features. Feature selection techniques are often used in domains where there are many features and comparatively few samples (or data points). Feature selection is also useful as part of the data analysis process, as it shows which features are important for prediction, and how these features are related.
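The distinction between irrelevant and redundant features can be made concrete with a small filter-style sketch. It is only one of many feature selection techniques and is not drawn from the deck: the Pearson-correlation criterion, the thresholds, the function names and the sample columns are all illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, min_relevance=0.1, max_redundancy=0.95):
    """Keep features that relate to the target and add new information."""
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], target)),
        reverse=True,
    )
    selected = []
    for name in ranked:
        # Irrelevant: (almost) no linear relationship with the target.
        if abs(pearson(columns[name], target)) < min_relevance:
            continue
        # Redundant: nearly duplicates a feature that is already selected.
        if any(
            abs(pearson(columns[name], columns[kept])) > max_redundancy
            for kept in selected
        ):
            continue
        selected.append(name)
    return selected


# Hypothetical data set with one redundant and one irrelevant column.
target = [1, 2, 3, 4, 5]
columns = {
    "spend": [2, 4, 6, 8, 10],
    "spend_cents": [200, 400, 600, 800, 1000],  # same signal as 'spend'
    "visits": [3, 1, 4, 1, 5],
    "region_code": [7, 7, 7, 7, 7],             # constant, so uninformative
}
selected = select_features(columns, target)
print(selected)
```

The function returns a subset of the original columns rather than constructing new ones, which is the difference between selection and extraction drawn in the definition.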
  • 160. BDD Feedback: Data Scientist Interviews “Analysts don’t generally analyze the catalog per se - they analyze line items, or actions, or histories, that kind of thing.” “It’s generally actions that people are interested in.”
  • 161. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 161 Data Records Catalog Format Entities Product, location Connections Satisfaction Goals Acquire Transform Events Purchase Status change Structures & Systems User centric Data centric Networks Business unit Community Loyalty factors Themes Profit Efficiency Plans Balance budget Launch product Manage risks Business Perspective Progressive engagement Complexity & difficulty Value of outcome Activities Traffic logging Address change Processes Fulfillment Brand monitoring Analysis PerspectiveData Perspective Domains Supply chain Industry / market Models Conversion Lifetime Customer Value (Decision tree) Measures Attrition rate Unit cost of materials Sensemaking Spectrum How analysts have to engage with data
  • 162. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 162 Data Records Catalog Format Entities Product, location Connections Satisfaction Goals Acquire Transform Events Purchase Status change Structures & Systems User centric Data centric Networks Business unit Community Loyalty factors Themes Profit Efficiency Plans Balance budget Launch product Manage risks Business Perspective Progressive engagement Complexity & difficulty Value of outcome Activities Traffic logging Address change Processes Fulfillment Brand monitoring Analysis PerspectiveData Perspective Domains Supply chain Industry / market Models Conversion Lifetime Customer Value (Decision tree) Measures Attrition rate Unit cost of materials Sensemaking Spectrum How analysts want to engage with data
  • 163. BDD Feedback: Data Scientist Interviews “The transforms are for feature engineering, right?” “What other goals are there for the transforms?” “I would assume that’s the only reason for the transforms…”
  • 164. BDD Feedback: Data Scientist Interviews “Getting the data right is the hard part. Once you get the data right…”
  • 165. BDD Feedback: Data Scientist Interviews “…feature engineering needs to be an iterative process” “…this is an iterative process. Everything goes in a circle.” “You’re going to do some data cleaning, you’re going to build a model, you’re going to have to go back and look at what you’re missing and what you’re not missing.”
  • 167. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 167 Analytical Activity Explore Wrangle Analyze Augment Sensemaking Transformation Features Goals Realize insights Generate Models Goals Understand data Make data useful Cumulative incremental progress Data quality & Features
  • 168. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 168 Feature Extraction Engineering Generation Selection …
  • 169. Thesis: We can repurpose techniques used during the traditional feature selection stage of the analytical workflow to enhance other stages of the discovery and analysis workflow. A likely candidate is exploration, as it is coupled with wrangling. … Allow analyst engagement and focus on more useful constructs like entities or business processes, instead of dealing only with raw values and attributes
  • 170. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | BDD 1.0 EID EID 170 BDD ? Acquire Ingest & Clean Store & Manage Featurize Wrangle Visual Analysis Interactive Queries Modeling Story-telling Build Deploy Monitor & Maintain Present Disseminate Insight cycle Modeling cycle
  • 171. How? • Features are discovered and inferred • statistical & other domain-independent methods • Domain-based • Known features used to train system • Sources • artifacts (scripts, models, dictionaries) • analytical activities • direct indication
  • 172. Possible Manifestations • Feature-based operations • wrangling: transforms, joins • exploration: search, visualization • analysis • Feature recognition: known features identified in new data • Feature-based enrichment • Interest graphs - Individual and group • Modeling capabilities
  • 173. What’s happening in this product space? • Movement toward user-centric engagement with data: – Entity-centric navigation & event linkage across data sets (Platfora) – Answerset (Paxata) – semantic search & enrichments (BDD) – thematic data lenses (Platfora) – data harmonization and data stories (ClearStory) – natural language interaction / cognitive computing (IBM) – expert network (Tamr)
  • 174. Capability Evolution • Core capabilities are necessary & primary to complete a given cycle • Enhancing capabilities are secondary within a cycle • Enhancing capabilities are necessary to accumulate assets(?) • Enhancing capabilities are necessary to advance to next cycle(?) [Diagram labels: Core Capabilities (activity specific, progressive); Enhancing Capabilities (common, random access): Workflow, Collaboration, Publication, Accelerators, Versioning, Provenance, Metadata, Curation, Governance; Precursor, Successor, Influencer, By-product, Import, Publish; asset types]
  • 178. BDD Feedback: Data Scientist Interviews “How do you know what changes you want to make until you build your model? Once you build your model, you know you want to take the square root of this, or the log of this. That doesn’t happen until you start building a model…”
  • 179. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 179 Discovery & Analysis Workflow Acquire Ingest & Clean Store & Manage Featurize Wrangle Visual Analysis Interactive Queries Modeling Story-telling Build Deploy Monitor & Maintain Present Disseminate Insight cycle Modeling cycle Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’ http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html Features Insights
  • 180. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | BDD 1.0 EID EID 180 Analytical Workflow Acquire Ingest & Clean Store & Manage Featurize Wrangle Visual Analysis Interactive Queries Modeling Story-telling Build Deploy Monitor & Maintain Present Disseminate Insight cycle Modeling cycleData Ingest cycle
  • 181. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery Modeling
  • 182. Old-school Modeling • Compute is expensive • Good (relevant) data is scarce • All data is difficult to work with, requiring considerable time and attention just to get provisionally ready • Human attention is limited at all levels: engineer, analyst, insight consumer • ‘Experiments’ are small, planned, receive close attention • Rely first on a library of well-known methods (carefully vetted by years of practice) • Don’t run the experiment unless you know you can evaluate the results – be sure you have the time – be sure you have the expertise – be confident the results will be meaningful / insightful • Automation is only feasible in limited circumstances • Humans interpret experimental results • Complete experiments before evaluating them • ‘Small’ infrastructure: data sets, compute source, evaluation tools, archiving • Modeling is best done by the knowledgeable – can have negative consequences when done by novices • Toolset aligned to small / mid-sized data – requires a high quotient of human engagement, both directive / evaluative, and to enable execution
  • 183. New-school Modeling • Compute is cheap • Data is abundant | good (relevant) data is often available • Data is still challenging to work with, but tooling allows engagement with much greater quantities, of many types • Run many experiments • Try many approaches, using new and old methods • Machines interpret experimental results, at least in part (batch eval for initial ranking of potential insight) • ‘Big’ infrastructure: data sets, compute source, evaluation tools, archiving • Automate where possible: selecting data, prepping data, choosing methods, setting parameters, executing experiments, evaluating results • Modeling is better done by those with knowledge, but it can have utility for non-experts • [forward-looking analogs: genomics, bioinformatics, computational neuroscience] • Toolset wants to be aligned to big data – profile of human engagement varies over analytical lifecycle, seeking automation where possible in direction / evaluation, and execution
  • 184. Practices • Combine old-school and new-school approaches at different stages of the analytical cycle • Starting points vary by practitioner maturity, understanding of problem, available resources • Experiments often alternate approaches • Use automation where possible
  • 185.
  • 186. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery Modeling 186 Exploratory Analysis Identify features Understand relations between features Create new features Characterize Dataset Build Baseline Model Build Complex Model Feature Engineering & Model Tuning New features Straight-forward & well-known modeling methods Explore & understand contents, distribution, quality, etc. Iterative experimentation with several classes of modeling methods Compare to baseline Comparative / reference model Iterative & experimental model & feature combination, tuning, evaluation Recursive feature elimination Modeling, Testing, Training, Evaluation data sets Initial Predictive Model Final Predictive Model Explanatory Model Explanatory Model Discovery cycle Modeling cycle
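The "build baseline model, engineer features, compare to baseline" loop in the diagram above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the workflow's actual tooling: the data is synthetic, the model is a one-variable least-squares line, the helper names are hypothetical, and error is measured on the training data for brevity (the diagram calls for separate training, testing and evaluation sets).

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(predictions, ys):
    """Mean squared error between predictions and observed values."""
    return sum((p - y) ** 2 for p, y in zip(predictions, ys)) / len(ys)


def linear_model_error(xs, ys):
    """Fit a line on (xs, ys) and report its error on the same data."""
    slope, intercept = fit_line(xs, ys)
    return mse([slope * x + intercept for x in xs], ys)


# Synthetic data: the target depends on the logarithm of the raw attribute.
raw = [1, 2, 4, 8, 16, 32, 64]
target = [3 * math.log2(x) + 1 for x in raw]

# Baseline: always predict the mean of the target.
mean_target = sum(target) / len(target)
baseline_error = mse([mean_target] * len(target), target)

# First model on the raw attribute, then on an engineered (log) feature.
raw_error = linear_model_error(raw, target)
engineered_error = linear_model_error([math.log2(x) for x in raw], target)

print(f"baseline={baseline_error:.2f} raw={raw_error:.2f} "
      f"engineered={engineered_error:.2f}")
```

The residuals of the raw-attribute model are what reveal that a log transform is worth trying, which is why the feature engineering step sits inside the modeling cycle rather than before it.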
  • 187. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery Modeling & BDD 187 Exploratory Analysis Identify features Understand feature relationships Create new features Characterize Dataset Build Baseline Model Build Complex Model Feature Engineering & Model Tuning New features Straight-forward & well-known modeling methods Explore & understand contents, distribution, quality, etc. Iterative experimentation with several classes of modeling methods Compare to baseline Comparative / reference model Iterative & experimental model & feature combination, tuning, evaluation Recursive feature elimination Modeling, Testing, Training, Evaluation data sets Initial Predictive Model Final Predictive Model Explanatory Model Explanatory Model Initial capability…
  • 188. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery Modeling & BDD 188 Exploratory Analysis Identify features Understand feature relationships Create new features Characterize Dataset Build Baseline Model Build Complex Model Feature Engineering & Model Tuning New features Straight-forward & well-known modeling methods Explore & understand contents, distribution, quality, etc. Iterative experimentation with several classes of modeling methods Compare to baseline Comparative / reference model Iterative & experimental model & feature combination, tuning, evaluation Recursive feature elimination Modeling, Testing, Training, Evaluation data sets Initial Predictive Model Final Predictive Model Explanatory Model Explanatory Model Subsequent capability…
  • 191. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 191 Business Assets & Activity Cycles Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’ http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html Featurize Wrangle Visual Analysis Interactive Queries Discovery Modeling Features Data Application VectorsEnrichments Acquire Ingest & Clean Manage & Update Model Train EvaluateUpdate Build MonitorStore & Expose Insights ModelsData Train Deploy corpus operational analytical archival insight stream awareness explanatory prescriptive intelligence machine human hybrid systems transactional engagement insight
  • 192. Tool Archetypes [Diagram: tool archetypes mapped onto the Data, Discovery, Modeling and Application cycles; activities shown: Acquire, Ingest & Clean, Manage & Update, Store & Expose, Featurize, Wrangle, Visual Analysis, Interactive Queries, Model, Train, Evaluate, Update, Build, Train, Deploy, Monitor] • Wrangle: Trifacta • Visual Analysis: Platfora • Interactive Queries: Datameer • Data science workbenches: Sense, yhat • Application Foundries: Azure ML, IBM • Traditional app studios: Java • Discovery Workbenches: BDD x • Data Integrators: Clover • Analysis Workbenches: Alteryx, Alpine • Analytics Platforms: Teradata, Pivotal • ML services: BigML, Wise.io, Skytree • Business Intelligence Suite: OBIEE, Cognos • Python notebooks: IPython, Jupyter
  • 193. DEEP STRUCTURE CHANGE VECTORS EARLY SIGNALS INFLECTION POINTS EMERGING SPACES HOLISTIC EXPERIENCES
  • 194. VALUE CHAIN MAP (WARDLEY MAPPING)
  • 195.
  • 196.
  • 197. VALUE CHAIN MAP (WARDLEY MAPPING) ML
  • 198. WORKING THE ECOSYSTEM • Oracle = an ecosystem • ML = commoditizing • Someone will ‘generate the electricity’ = provide ML capability within the Oracle ecosystem • Everyone’s going to need it…
  • 200.
  • 201.
  • 202. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery Oracle Machine Learning Service
  • 203.
  • 204. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery Genesis 204
  • 205. Offering • Machine learning service exposed as – Stand-alone productized service (public cloud) – ‘Product’ integrated with relevant Oracle cloud offerings: enables machine learning / analytics pipelines for data spanning service boundaries – ‘White-label’ ML capability within cloud offerings (SaaS, IaaS, PaaS, DaaS, etc.): enables localized ML / analytics pipelines w/in service boundaries • Collection of Oracle-specific ML accelerators – Data sets & streams, pipelines, algorithms, R / Python libs, project templates, etc.
  • 206. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery Oracle Value Prop • Provides ML capability across cloud offerings for expanded data landscape • Big data • Big data + Traditional Enterprise in combination • Streaming Data • IOT • Reinforces ‘data gravity’ effect across Oracle cloud offerings • Entry point for ‘new stack’ (cloud-only) customers needing ML capability • ‘Missing link’ completes analytical pipelines across tool boundaries 206
  • 207. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery Data Landscape 207 Complexity Quantity Traditional Enterprise Big Data IOT Oracle Machine Learning Service Product-native ML Stream / Real-time
  • 208. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery Customer Value Prop • Easy machine learning w/in ecosystem of Oracle cloud offerings • Turnkey • Elasticity and adaptivity: resources, pricing, • Portability across Oracle product / service boundaries • Manifests appropriately for product / service contexts • Application Developers • Analysts / Data Scientists • Business users • Machine consumers 208
  • 209. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery SaaS ML For the Oracle Cloud Ecosystem 209 Oracle Machine Learning Service DaaS Data Service IaaS Infrastructure Service PaaS Platform Service Data & Models Data & Models Data & Models ‘Public’ OML product Customer Applications & Data sources Data & Models
  • 210. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery SaaS White Label ML Capability 210 DaaS Data Service IaaS Infrastructure Service PaaS Platform Service Machine Learning ML ToolsML Tools ML Tools ML Tools ML Tools ML Tools ML Tools ML Tools ML Tools Customer Applications & Data sources Oracle Machine Learning Service ‘Public’ OML product Data & Models Data & Models Data & Models Data & Models
  • 211. Oracle Ecosystem • All cloud services can be – data sources for ML service – consumers of published data & models from ML service • OML can publish augmented datasets (e.g. pre-scored matrices) as part of multistep & multi-tool analytical pipelines
  • 212. Initial Capability • Core ML functions: – data upload (no transform - BDD integration) from Oracle sources – modeling / analysis via general purpose, interpretable methods – model training – model evaluation • Model publication • Processed data publication
  • 214. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Joe Lamantia | Product Strategist: Oracle Endeca Big Data Discovery214 Featurize Wrangle Visual Analysis Interactive Queries Discovery ModelingData Application Acquire Ingest & Clean Manage & Update Model Train EvaluateUpdate Build Train Deploy MonitorStore & Expose Discovery Workbenches BDD (now) ML services Oracle Machine Learning Discovery & Modeling Platform BDD & ML (combined analysis offering ?)
  • 215.
  • 217. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 217 Automation Potential Featurize Wrangle Visual Analysis Interactive Queries Discovery Modeling Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’ http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html Features Data Application VectorsEnrichments Acquire Ingest & Clean Manage & Update Model Train EvaluateUpdate Build Train Deploy MonitorStore & Expose Insights ModelsData
  • 218. Machine Intelligence Value Chain. Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’ http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html Featurize Wrangle Visual Analysis Interactive Queries Discovery Modeling Features Data Application Vectors Enrichments Acquire Ingest & Clean Manage & Update Model Train Evaluate Update Build Monitor Store & Expose Insights Models Data Train Deploy corpus operational analytical archival insight stream awareness intelligence machine human hybrid systems transactional engagement insight Process operations? transactional engagement insight Apps Metric Create Machine Intelligence Operationalize Machine Intelligence
  • 219. DEEP STRUCTURE <> PRODUCT DEVELOPMENT CHANGE VECTORS <> ACQUISITION EARLY SIGNALS <> MARKET ACTIVITY INFLECTION POINTS <> INNOVATION MOMENTS EMERGING SPACES <> PRODUCT STRATEGY GIG HOLISTIC EXPERIENCES <> EXPERIENCE FOCUS
  • 222. The Language of Discovery Category: Primary Research, Design Systems Outcomes: Building on already-published original applied research into information retrieval and usage, the language of discovery posits a domain-independent framework describing the activity primitives of discovery in terms of ‘modes’. Succeeding professional and industry publications outline the application of this descriptive vocabulary in settings including product design and development, product strategy, and information management. Reference: • Russell-Rose, T., Lamantia, J. and Burrell, M. 2011. A Taxonomy of Enterprise Search and Discovery. Proceedings of EuroHCIR 2011, London, UK. http://ceur-ws.org/Vol-763/paper4.pdf • Russell-Rose, T., Lamantia, J. and Burrell, M. 2011. A Taxonomy of Enterprise Search and Discovery. Proceedings of HCIR 2011, California, USA. https://docs.google.com/a/kent.edu/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxoY2lyd29ya3Nob3B8Z3g6NzdmYjc3OWY2ZjQ2Zjg4MQ • Russell-Rose, T. and Makri, S. 2012. A Model of Consumer Search Behavior. Proceedings of EuroHCIR 2012, Nijmegen, NL. • Designing the Search Experience: http://www.amazon.com/Designing-Search-Experience-Information-Architecture/dp/0123969816 • Presentation - Strata: http://conferences.oreilly.com/strata/stratany2012/public/schedule/detail/25411 • Presentation - UX Lisbon conference: http://www.joelamantia.com/user-experience-ux/slides-for-uxlx-talk-the-language-of-discovery-a-grammar-for-designing-big-data-interactions
  • 223. Domain & Market Study: Data Science Outcomes: Comprehensive portrait of all major facets of a new analytical discipline, including its practices, roles, methodology, tools and technologies, workflows, organizational models, skillsets, alignment with business, areas of innovation, and relation to the landscape of business analytics. Research outcomes and synthesized insights guided product design, management, and strategy efforts including: opportunity identification and profiling, landscape / competitive modeling, technology lifecycle and evolution models, product discovery, concept creation and evaluation, prototyping. Notable aspects: Consistently delivered insights twelve or more months ahead of leading industry analysts pursuing similar agendas. Artifacts & Synthesis • Data Science Highlights: http://www.joelamantia.com/user-research/data-science-highlights-an-investigation-of-the-discipline • Empirical Discovery Concept and Workflow Model: https://blogs.oracle.com/serendipity/entry/empirical_discovery_concept_and_workflow • Empirical Discovery: A New Discipline https://blogs.oracle.com/serendipity/entry/data_science_and_empirical_discovery • Defining Discovery: Core Concepts: https://blogs.oracle.com/serendipity/entry/defining_discovery_core_concepts • Discovery and the Age of Insight http://www.joelamantia.com/language-of-discovery/discovery-and-the-age-of-insight • Big Data Is Not Enough http://www.joelamantia.com/user-experience-ux/big-data-is-not-the-insight-slides-from-enterprise-search-europe
  • 224.
  • 225. DEEP STRUCTURE CHANGE VECTORS EARLY SIGNALS INFLECTION POINTS EMERGING SPACES HOLISTIC EXPERIENCES
  • 226. DEEP STRUCTURES ENTERPRISE / B2B • Business process • Activity • Social structure: Organizational model • Boundaries • Regulation • IT / Systems architecture • Lifecycle • Flows: capital, information, people • Frame: shareholder value, social enterprise CONSUMER / B2C • Value scheme: wealth, love, knowledge, safety • Demographics • Boundaries • Mores • Culture • Social structure: community / group • Frame: active lifestyle, sustainability
  • 229. UNDERSTAND & EMPATHIZE WITH CUSTOMER PERSPECTIVES >> ARTICULATE CUSTOMER VALUE SOURCES
  • 230. IDENTIFY BUSINESS IMPLICATIONS >> INFORM ALL STAGES OF PRODUCT & SERVICE DEVELOPMENT
  • 232. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 232 Activity Cycles [Structural View] Initial Activity Final Activity Cycle Successor InfluencerBy-product OutcomeInput Precursor Interim Activity Interim Activity • Cycles are iterative • Activities are progressive • Can begin w/ any activity • Best to begin w/ initial activity • Impact of activity increases with ‘distance’ - can span cycles • Inputs are necessary • Precursors can be incomplete (?) • Influencers are ‘from the future’ • Influencers enhance the local cycle • By-products enhance the precursor • Assets are cumulative • Assets depend on precursor cycles • Assets communicate via cycles asset types
  • 233. Business Assets & Activity Cycles Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’ http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html Featurize Wrangle Visual Analysis Interactive Queries Discovery Modeling Features Data Application Vectors Enrichments Acquire Ingest & Clean Manage & Update Model Train Evaluate Update Build Monitor Store & Expose Insights Models Data Train Deploy corpus operational analytical archival insight stream awareness explanatory prescriptive intelligence machine human hybrid systems transactional engagement insight
  • 234. Activity Integration Points / Interfaces Initial Activity Final Activity Cycle Successor Influencer By-product Outcome Input Precursor Interim Activity Interim Activity • Integration necessary for individual activities to communicate w/ one another w/in a cycle • Gaps = demand for enhancing capabilities • Integration is made possible by enhancing capabilities • Cycles = accelerated by good integration • Cycles = slowed by poor integration • Activity speed is not affected by integration? asset types
  • 235. Data Pipeline Featurize Wrangle Visual Analysis Interactive Queries Discovery Modeling Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’ http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html Features Data Application Vectors Enrichments Acquire Ingest & Clean Manage & Update Model Train Evaluate Update Build Train Deploy Monitor Store & Expose Insights Models Data
  • 236. Machine Intelligence Value Chain Adapted from ‘Data Analysis Just One Component of the Data Science Workflow’ http://radar.oreilly.com/2013/09/data-analysis-just-one-component-of-the-data-science-workflow.html Featurize Wrangle Visual Analysis Interactive Queries Discovery Modeling Features Data Application Vectors Enrichments Acquire Ingest & Clean Manage & Update Model Train Evaluate Update Build Monitor Store & Expose Insights Models Data Train Deploy corpus operational analytical archival insight stream awareness intelligence machine human hybrid systems transactional engagement insight Process operations? transactional engagement insight Apps Metric Create Machine Intelligence Operationalize Machine Intelligence
  • 237. Tool Archetypes Featurize Wrangle: Trifacta; Visual Analysis: Platfora; Interactive Queries: Datameer; Discovery Modeling Data Application Acquire Ingest & Clean Manage & Update Model Train Evaluate Update Build Train Deploy Monitor Store & Expose Data science workbenches: Sense, yhat; Application Foundries: Azure ML, IBM; Traditional app studios: Java; Discovery Workbenches: BDD x; Data Integrators: Clover; Analysis Workbenches: Alteryx, Alpine; Analytics Platforms: Teradata, Pivotal; ML services: BigML, Wise.io, Skytree; Business Intelligence Suite: OBIEE, Cognos; Python notebooks: IPython, Jupyter
  • 238. Activity Cycles & Capabilities Core Capabilities activity specific progressive Influencer By-product Publish Import Precursor • Core capabilities are necessary & primary to complete a given cycle • Enhancing capabilities are secondary within a cycle • Enhancing capabilities are necessary to accumulate assets(?) • Enhancing capabilities are necessary to advance to next cycle(?) asset types Workflow Collaboration Publication Accelerators Enhancing Capabilities common random access Versioning Successor Provenance Metadata Publish Import Curation Governance Import
  • 239. Assets & Capabilities core capabilities asset specific enhancing capabilities common Workflow Collaboration Publication Accelerators Versioning Provenance Metadata Curation Governance Import
  • 240. Asset Scope Enterprise Line of Business Enterprise Localized Line of Business Localized • Scope determines / implies boundaries, metrics • Distinct systems (IT) and processes (biz) for each asset, at each level of scope • Each distinct system and process = integration point, creates a barrier to flow, requires an interface
  • 241. Asset Communication Enterprise Line of Business Localized • Scope determines / implies boundaries, metrics • Distinct systems (IT) and processes (biz) for each asset, at each level of scope • Each distinct system and process = integration point, creates a barrier to flow, requires an interface enhancing capabilities common enhancing capabilities common enhancing capabilities
  • 242. Capability Evolution Core Capabilities activity specific progressive Influencer By-product Publish Import Precursor • Core capabilities are necessary & primary to complete a given cycle • Enhancing capabilities are secondary within a cycle • Enhancing capabilities are necessary to accumulate assets(?) • Enhancing capabilities are necessary to advance to next cycle(?) asset types Workflow Collaboration Publication Accelerators Enhancing Capabilities common random access Versioning Successor Provenance Metadata Publish Import Curation Governance Import
  • 243. VALUE CHAIN MAP (WARDLEY MAPPING)