Data-driven companies making intelligent products must design for security and privacy to be competitive globally. The EU General Data Protection Regulation (GDPR), enforceable since May 2018, is the benchmark against which global data privacy will be measured.
This presentation outlines the basic tenets of personal data and details the high-level changes that GDPR-compliant businesses face. It translates the current and near-future impact for teams designing products driven by machine learning and artificial intelligence, and shares use cases showing how SAP Concur is designing to meet this challenge while still delivering services driven by advanced algorithms to its end users.
Presented at The AI Conference, San Francisco, September 2018
When Privacy Scales - Intelligent Product Design under GDPR
1. When Privacy Scales
Intelligent Product Design Under GDPR
Amanda Casari
Principal Product Manager + Data Scientist
Concur Labs @ SAP Concur
@amcasari #TheAIConf
2. privacy paradox #1
• Users growing more savvy + cautious about technology
• Users demand higher levels of personalization + content transfer across ecosystems
CREDITS: NASA EARTH OBSERVATORY IMAGES BY JOSHUA STEVENS, USING SUOMI NPP VIIRS DATA FROM MIGUEL ROMÁN, NASA'S GODDARD SPACE FLIGHT CENTER
3. privacy paradox #2
• Data flywheels drive our ability to deploy AI products at scale
• “Bias flywheels” negatively, unevenly + unfairly impact communities at scale
4. privacy paradox #3
• Enterprise software maximizing new market growth must be able to repeatably scale technology solutions
• Regulatory standards for privacy widely vary across geographic regions, even within countries
6. data: beyond the bits
…a “personal documentary” rather than a quantified-self project which is a subtle – but important – distinction. Instead of using data just to become more efficient, we argue we can use data to become more humane and to connect with ourselves and others at a deeper level.
Dear Data
Giorgia Lupi + Stefanie Posavec
HTTP://WWW.DEAR-DATA.COM/THEPROJECT
7. data: beyond the bits
…based on people’s online expressions, capitalizing on data-rich social media, and we’re measuring how people present themselves to the outside world.
Hedonometer
UVM’s Computational Story Lab
HEDONOMETER.ORG
8. data: beyond the bits
Automated systems are not inherently neutral. They reflect the priorities, preferences, and prejudices - the coded gaze - of those who have the power to mold artificial intelligence.
Gender Shades
Algorithmic Justice League
10. privacy: a primer
• US: “right to privacy” cobbled together via case law (Supreme Court)
• “The right to privacy refers to the concept that one's personal information is protected from public scrutiny.”
…so what could be personal information?
…does this apply equally across all forms of information?
11. privacy: a primer
*PII in US
Any representation of information that permits the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means. Further, PII is defined as information: (i) that directly identifies an individual (e.g., name, address, social security number or other identifying number or code, telephone number, email address, etc.) or (ii) by which an agency intends to identify specific individuals in conjunction with other data elements, i.e., indirect identification. (These data elements may include a combination of gender, race, birth date, geographic indicator, and other descriptors). Additionally, information permitting the physical or online contacting of a specific individual is the same as personally identifiable information. This information can be maintained in either paper, electronic or other media.
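The indirect identification described in this definition can be illustrated with a small check. This is a minimal sketch, assuming made-up records and a hypothetical helper named `unique_combinations`; it counts how many rows are singled out by a combination of fields that are not direct identifiers on their own.

```python
from collections import Counter

# Hypothetical records with no direct identifiers (no name, email, or SSN).
records = [
    {"gender": "F", "birth_date": "1984-03-12", "zip": "98101"},
    {"gender": "F", "birth_date": "1984-03-12", "zip": "98101"},
    {"gender": "M", "birth_date": "1990-07-01", "zip": "98101"},
    {"gender": "F", "birth_date": "1975-11-30", "zip": "94103"},
]


def unique_combinations(rows, quasi_identifiers):
    """Return the value combinations that appear in exactly one row."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


# Rows that are unique on the combined fields could be tied to one person.
singled_out = unique_combinations(records, ["gender", "birth_date", "zip"])
print(len(singled_out), "of", len(records), "records are unique on these fields")
```

In this toy dataset, two of the four rows are unique once gender, birth date, and ZIP code are combined, which is the kind of combination the definition calls indirect identification.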
12. privacy: a primer
Race (Civil Rights Act of 1964)
Color (Civil Rights Act of 1964)
Sex (Equal Pay Act of 1963; Civil Rights Act of 1964)
Religion (Civil Rights Act of 1964)
National origin (Civil Rights Act of 1964)
Citizenship (Immigration Reform and Control Act)
Age (Age Discrimination in Employment Act of 1967)
Pregnancy (Pregnancy Discrimination Act)
Familial status (Civil Rights Act of 1968)
Disability status (Rehabilitation Act of 1973; Americans with Disabilities Act of 1990)
Veteran status (Vietnam Era Veterans' Readjustment Assistance Act of 1974; Uniformed Services Employment and Reemployment Rights Act)
Genetic information (Genetic Information Nondiscrimination Act)
*Legally recognized ‘protected classes’ in US
13. privacy: a primer
…does this apply equally across all forms* of information?
No… And this is still evolving.
* e.g. papers on your desk at work, your journals at home, your logins at work, data from cloud services, data stored on your phone
14. privacy: a primer
EU: General Data Protection Regulation (GDPR)
defines personal data…
…creates a general law to protect it
(That’s really it. ¯\_(ツ)_/¯ )
15. Personal data is any information that relates to an identified or identifiable living individual. Different pieces of information, which collected together can lead to the identification of a particular person, also constitute personal data.
- European Commission
16. privacy: a primer
…okay, so what exactly is personal data under GDPR?
• a name and surname
• a home address
• an email address such as name.surname@company.com
• an identification card number
• location data (for example the location data function on a mobile phone)
• an Internet Protocol (IP) address
• a cookie ID*
• the advertising identifier of your phone
etc….
18. General Data Protection Regulation
Data Subject Rights
• Breach Notification
• Right to Access
• Right to Be Forgotten
• Data Portability
• Privacy by Design
• Data Protection Officers
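Three of the rights listed above (access, erasure, portability) map onto operations a data store has to support. The following is a minimal sketch under stated assumptions: `PersonalDataStore` and its method names are hypothetical, the store is in-memory only, and a real system would also have to cover backups, logs, and data derived from the records.

```python
import json


class PersonalDataStore:
    """Toy in-memory store keyed by a data subject identifier (hypothetical)."""

    def __init__(self):
        self._records = {}

    def save(self, subject_id, data):
        self._records[subject_id] = dict(data)

    def access(self, subject_id):
        # Right to access: return a copy of everything held about the subject.
        return dict(self._records.get(subject_id, {}))

    def export(self, subject_id):
        # Data portability: a structured, machine-readable format (JSON here).
        return json.dumps(self.access(subject_id), sort_keys=True)

    def erase(self, subject_id):
        # Right to be forgotten: delete the record and report if anything was removed.
        return self._records.pop(subject_id, None) is not None


store = PersonalDataStore()
store.save("user-1", {"email": "name.surname@company.com", "home_city": "Seattle"})
portable = store.export("user-1")
erased = store.erase("user-1")
```

Keeping all three operations keyed on the same subject identifier is a design choice for the sketch; it makes it easy to see that a product which cannot locate a person's data cannot honor any of these requests.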
19. privacy + intelligent products
The AI Hierarchy of Needs
Think of AI as the top of a pyramid of needs. Yes, self-actualization (AI) is great, but you first need food, water and shelter (data literacy, collection and infrastructure).
- Monica Rogati (@MROGATI)
20. privacy + intelligent products
right to access
privacy by design
data portability
right to be forgotten
@MROGATI
21. privacy + intelligent products
right to access
privacy by design
data portability
right to be forgotten
@MROGATI
23. Personal data that has been de-identified, encrypted or pseudonymised but can be used to re-identify a person remains personal data and falls within the scope of the law.
Personal data that has been rendered anonymous in such a way that the individual is not or no longer identifiable is no longer considered personal data. For data to be truly anonymised, the anonymisation must be irreversible.
- European Commission
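The distinction in this quote can be sketched in code. The example below is illustrative only: the key, the function names `pseudonymize` and `anonymize_counts`, and the sample rows are assumptions. A keyed hash keeps one stable token per person, so records remain linkable and the data stays personal data; aggregation drops the per-person key entirely, although whether aggregation is sufficient in practice depends on the data, since very small groups can still single someone out.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret held by the data controller. Anyone holding it can
# recompute tokens from known emails, so the tokens are pseudonyms only.
SECRET_KEY = b"hypothetical-key-kept-by-the-data-controller"


def pseudonymize(email, key=SECRET_KEY):
    """Keyed hash: stable per person, so records stay linkable to one individual."""
    return hmac.new(key, email.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_counts(rows):
    """Keep only aggregate counts per city; no per-person key survives."""
    return dict(Counter(row["city"] for row in rows))


rows = [
    {"email": "a@example.com", "city": "Seattle"},
    {"email": "b@example.com", "city": "Seattle"},
    {"email": "a@example.com", "city": "Paris"},
]

pseudonymized = [{"person": pseudonymize(r["email"]), "city": r["city"]} for r in rows]
aggregated = anonymize_counts(rows)
```

In `pseudonymized`, the first and third rows share a token, so one person's trips can still be followed; in `aggregated`, only city totals remain.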
24. right to be forgotten
KI-Protect
…secure your data by letting you enable pseudonymization and anonymization of data fields on the fly
25. right to be forgotten
Concur Labs
Washing Machine
Anonymization of personal data in natural language
Privacy engineering at scale
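The slides do not describe how Washing Machine works internally, so the following is not its implementation. It is a minimal sketch of one common first step for scrubbing personal data from free text: pattern-based redaction. The regular expressions and the `scrub` function are assumptions for illustration, and they only catch emails and phone-like numbers; names and addresses in natural language generally need more than regular expressions.

```python
import re

# Simplified, illustrative patterns; real-world formats vary far more widely.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def scrub(text):
    """Replace email addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text


note = "Dinner with client, contact name.surname@company.com or +1 (206) 555-0100."
print(scrub(note))
```

Redaction like this masks obvious identifiers, but the remaining text may still allow re-identification, so it should not be assumed to meet the irreversibility bar for anonymisation on its own.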
26. right to be forgotten
Concur Labs
ML Experimentation in Hackathons
Synthetically generated datasets to statistically represent customer data
No access to customer data platforms needed for AI/ML experimentation + innovation
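A synthetic dataset of this kind can be sketched as sampling from summary statistics instead of copying real rows. The example below is a sketch under stated assumptions: the function `synthetic_expenses`, the category shares, and the mean and standard deviation are all made up for illustration and do not come from Concur data or tooling.

```python
import random
import statistics


def synthetic_expenses(n, categories, mean_amount, stdev_amount, seed=0):
    """Generate n fake expense rows from aggregate statistics only."""
    rng = random.Random(seed)  # seeded so hackathon teams get the same dataset
    names = list(categories)
    weights = [categories[name] for name in names]
    rows = []
    for i in range(n):
        # Clamp to a small positive value so no expense is zero or negative.
        amount = max(0.01, rng.gauss(mean_amount, stdev_amount))
        rows.append({
            "expense_id": f"synthetic-{i}",
            "category": rng.choices(names, weights=weights, k=1)[0],
            "amount": round(amount, 2),
        })
    return rows


# Hypothetical category shares and amount distribution.
rows = synthetic_expenses(1000, {"hotel": 0.3, "meals": 0.5, "taxi": 0.2}, 80.0, 25.0)
observed_mean = statistics.mean([row["amount"] for row in rows])
```

Because the rows are drawn from distributions, experimentation can proceed without touching a customer data platform; how closely the distributions should follow the real data is a separate privacy decision.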
31. “Privacy is not something that one has, but something that one seeks to achieve. It requires constant negotiation as information flows and contexts shift.
To achieve privacy in a networked world, people must actively try to manage the various social situations in which information is accessed, consumed, interpreted, and shared. They cannot simply focus on restricting the flow of information; they must also account for the ways in which information is inferred and used.”
- Reframing Privacy, Data & Society
32. privacy: a primer
Data Privacy Project
[teaches]… how information travels and is shared online, what risks users commonly encounter online, and how libraries can better protect patron privacy.
Its trainings help support libraries’ increasing role in empowering their communities in a digital world.
Partnering organizations
New America's Open Technology Institute
Brooklyn Public Library
Metropolitan New York Library Council
Data & Society
34. privacy by design
Concur Labs
MapIt
Overt user data collection
Privacy engineering on-device
Basic map app with location obscuring for anonymized data options
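The slide does not say how MapIt obscures location, so the following is only one simple possibility: coarsening coordinates on the device before they are shared. The function name `obscure_location` and the default precision are assumptions for illustration.

```python
def obscure_location(lat, lon, decimals=2):
    """Round coordinates; 2 decimal places is roughly a 1 km grid in latitude."""
    return round(lat, decimals), round(lon, decimals)


# Two nearby points (hypothetical coordinates) collapse to the same coarse cell.
precise_a = (47.60621, -122.33207)
precise_b = (47.60510, -122.33490)
coarse_a = obscure_location(*precise_a)
coarse_b = obscure_location(*precise_b)
print(coarse_a, coarse_b)
```

Doing the rounding on the device means the precise coordinates never need to leave it. Coarse locations collected repeatedly over time can still reveal patterns, so rounding alone should not be treated as full anonymization.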
35. privacy by design
Concur Labs
PolicyBot
Enterprise reciprocal data application
Human-in-the-loop evaluation process
Constrained Q+A bot to answer travel + expense policy questions