Does Synthetic Data Hold The Secret To Artificial Intelligence?
© 2018 Bernard Marr, Bernard Marr & Co. All rights reserved
Introduction
Could synthetic data be the key to rapidly training artificial intelligence (AI)
algorithms?
There are advantages and disadvantages to synthetic data; however, many
technology experts believe that synthetic data is the key to democratizing
machine learning and to accelerating the testing and adoption of artificial
intelligence algorithms in our daily lives.
What Is Synthetic Data?
Synthetic data is data that a computer manufactures artificially rather than
measuring and collecting it from real-world situations. The data is anonymized
and generated from user-specified parameters so that its properties are as
close as possible to those of data from real-world scenarios.
One way to create synthetic data is to take real-world data and strip the
identifying aspects, such as names, emails, social security numbers and
addresses, from the data set so that it is anonymized.
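The stripping step described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete anonymization procedure: the field names and sample records are hypothetical, and removing direct identifiers alone may not be enough to prevent re-identification in a real data set.

```python
# Hypothetical field names; a real data set will have its own schema.
IDENTIFYING_FIELDS = {"name", "email", "ssn", "address"}


def strip_identifiers(records, identifying_fields=IDENTIFYING_FIELDS):
    """Return copies of the records with the identifying fields removed."""
    return [
        {key: value for key, value in record.items()
         if key not in identifying_fields}
        for record in records
    ]


# Made-up example records, for illustration only.
records = [
    {"name": "Alice Example", "email": "alice@example.com",
     "ssn": "000-00-0000", "address": "1 Example Street",
     "age": 34, "purchase_total": 120.5},
    {"name": "Bob Example", "email": "bob@example.com",
     "ssn": "000-00-0001", "address": "2 Example Street",
     "age": 51, "purchase_total": 89.0},
]

anonymized = strip_identifiers(records)
print(anonymized)
```

The original records are left untouched, so the sensitive version can stay in a protected store while only the stripped copy is shared.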
A generative model, one that can learn from real data, can also create a data set
that closely resembles the properties of authentic data. As technology gets
better, the gap between synthetic data and real data diminishes.
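As a rough sketch of the generative idea, the simplest possible "model" learns only the mean and standard deviation of a real sample and then draws new values from a normal distribution with those parameters. The sample values, function names and the choice of a normal distribution are assumptions made for illustration; real generative models are far richer than this.

```python
import random
import statistics


def fit_gaussian(values):
    """Learn the two parameters of a normal distribution from real values."""
    return statistics.mean(values), statistics.stdev(values)


def sample_synthetic(mean, stdev, count, seed=0):
    """Draw `count` synthetic values from the fitted normal distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.gauss(mean, stdev) for _ in range(count)]


# Made-up "real" measurements, for illustration only.
real_ages = [23, 35, 41, 29, 52, 38, 47, 31]

mean, stdev = fit_gaussian(real_ages)
synthetic_ages = sample_synthetic(mean, stdev, count=1000)

print("real mean:", mean)
print("synthetic mean:", statistics.mean(synthetic_ages))
```

None of the synthetic values belongs to a real person, yet the sample as a whole follows the overall shape the model learned from the real data.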
Synthetic data is useful in many situations. Similar to how a research scientist
might use synthetic material to complete experiments at low risk, data scientists
can leverage synthetic data to minimize time, cost and risk.
In some cases, there isn’t a large enough data set available to train a machine
learning algorithm effectively for every possible scenario, so creating a
synthetic data set can help make the training more comprehensive.
In other cases, real-world data cannot be used for testing, training or
quality-assurance purposes because of privacy concerns, because the data is
sensitive, or because it belongs to a highly regulated industry.
The Advantages of Synthetic Data
Huge data sets are what power the deep learning and artificial intelligence
algorithms that are expected to help solve very challenging problems.
Companies such as Google, Facebook and Amazon have had a competitive
advantage due to the amount of data they create daily as part of their business.
Synthetic data gives organizations of every size and resource level the
chance to capitalize on learning powered by deep data sets, which can
ultimately democratize machine learning.
Creating synthetic data is more efficient and cost-effective than collecting
real-world data in many cases. It can also be created on demand to a given
specification, rather than waiting to collect data once events occur in reality.
Synthetic data can also complement real-world data so that testing can cover
every imaginable variable, even if there isn’t a good example in the real data
set. This allows organizations to accelerate the testing of system performance
and the training of new systems.
The limitations of using real data for learning and testing are reduced when
using fabricated data sets. Recent research suggests that it is possible to get
the same results using synthetic data as you would with authentic data sets.
The Disadvantages of Synthetic Data
It can be challenging to create high-quality synthetic data, especially if the
system is complex. The generative model creating the synthetic data must be
excellent; otherwise the quality of the data it generates will suffer.
If synthetic data isn’t nearly identical to a real-world data set, it can compromise
the quality of decision-making that is being done based on the data.
Even if synthetic data is very good, it is still a replica of specific properties
of a real data set. A model looks for trends to replicate, so some of the random
behaviours might be missed.
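A small sketch can make this risk concrete. In the hypothetical numbers below, the synthetic sample reproduces the average of the real sample exactly but contains nothing like its one rare spike. The values and the function name are made up for illustration, and a serious fidelity check would compare far more than three summary statistics.

```python
import statistics


def compare_summaries(real, synthetic):
    """Return absolute gaps between simple summary statistics of two samples."""
    return {
        "mean_gap": abs(statistics.mean(real) - statistics.mean(synthetic)),
        "stdev_gap": abs(statistics.stdev(real) - statistics.stdev(synthetic)),
        "max_gap": abs(max(real) - max(synthetic)),
    }


# Made-up values: the real sample contains one rare spike (95).
real = [10, 12, 11, 13, 12, 11, 95]
# Made-up synthetic sample that matches the average but not the spike.
synthetic = [22, 24, 23, 25, 24, 23, 23]

gaps = compare_summaries(real, synthetic)
print(gaps)
```

Here the mean gap is zero while the gap between the largest values is 70, so a check that looked only at averages would rate this synthetic sample as a perfect match.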
Applications of Synthetic Data
Whenever privacy is a concern, such as in the financial and healthcare
industries, or an enormous data set is required to train machine learning
algorithms, synthetic data sets can propel progress.
Here are just a few applications of synthetic data:
• Record-level synthetic data from healthcare organizations can be used to
inform care protocols while protecting patient confidentiality. Simulated
X-rays are also combined with actual X-rays to train AI algorithms to
identify conditions.
• Fraudulent activity detection systems can be tested and trained without
exposing personal financial records.
• DevOps teams use synthetic data to test software and ensure quality.
• Machine learning algorithms are often trained with synthetic data.
• Waymo tested its autonomous vehicles by driving 8 million miles on real
roads plus another 5 billion miles on simulated roadways. Other automakers
are using video games such as Grand Theft Auto to aid their self-driving
technology.
While synthetic data isn’t foolproof, it is an important tool to augment machine
learning algorithms when real data is too expensive to collect, inaccessible due
to privacy concerns, or incomplete.
Bernard Marr is an internationally best-selling author, popular keynote speaker, futurist, and a
strategic business & technology advisor to governments and companies. He helps
organisations improve their business performance, use data more intelligently, and
understand the implications of new technologies such as artificial intelligence, big data,
blockchains, and the Internet of Things.
LinkedIn has ranked Bernard as one of the world’s top 5 business influencers. He is a frequent
contributor to the World Economic Forum and writes a regular column for Forbes. Every day
Bernard actively engages his 1.5 million social media followers and shares content that
reaches millions of readers.