Comprehensive Guide on the use of Big Data in Transportation Services from the International Transport Forum.
2. What is Big Data?
• www.amadeus.com “At the Big Data Crossroads: turning towards a smarter travel experience”, viewed 22 Aug 2013
• http://www.gartner.com/it-glossary/big-data/, viewed 15 Oct 2013
• http://www.csmonitor.com/USA/Society/2013/0811/The-new-age-of-algorithms-How-it-affects-the-way-we-live/(page)/3 viewed 9 Sep 2013
• http://ec.europa.eu/commission_2010-2014/kroes/en/blog/open-data-agreement viewed 30 Sep 2013
• http://www-03.ibm.com/press/us/en/pressrelease/41068.wss viewed 22 Aug 2013
Definitions:
• A vast collection of structured and unstructured data sets
which have become difficult to process using traditional data
processing tools due to the sheer volume and complexity of
the data
• High-volume, high-velocity and high-variety information
assets that demand cost-effective, innovative forms of
information processing for enhanced insight and decision
making
The Three V’s
Big data is not only about the volume of data but also its
velocity and variety
Why so much data?
• Digitisation of our everyday activities, including travel,
shopping, downloading music, billing etc.
• Increasing dependence on electronic devices, all of which
leave digital footprints every time they are used.
What to do with big data?
Digitalisation demands a focus on big data as a new way to
convey knowledge
• Gather the data sets
• Mine the data to discover what is relevant
• Discover patterns and relationships
• Structure, organise, analyse and employ
It is estimated that people create as
much data in 48 hours (1.8 zettabytes i.e.
1,800,000,000,000,000,000,000 bytes) as
humans gathered from “the dawn of
civilization to the year 2003”
- Eric Schmidt, Google Executive
Chairman
"More data crosses the Internet
every second than were stored in
the entire Internet 20 years ago”
- Andrew McAfee and Erik
Brynjolfsson, "Race Against the
Machine.”
3. What is Big Data? (cont.)
Major criticisms of Big Data:
1. Hidden bias - the “Signal Problem”
2. Erodes privacy, threat of “Big Brother” behaviour
3. Promotes inequality
What is the Signal Problem?
There can be hidden bias in big data - the ‘Signal Problem’:
“Data is assumed to accurately reflect the social world but
there are significant gaps, with little or no signal coming from
particular communities”¹.
How can we address the Signal Problem?
For each data set, we need to ask:
1. Which people are excluded?
2. Which places are less visible?
3. What happens if you live in the shadow of big data sets?
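One way to put the three questions above into practice is to compare each area's share of the records in a data set with its share of the population. The sketch below is illustrative only: the function names, area names and all figures are hypothetical, and the 0.5 threshold is an arbitrary assumption rather than an established standard.

```python
def coverage_ratios(records_by_area, population_by_area):
    """Return each area's share of records divided by its share of population.

    A ratio near 1.0 means the area is represented roughly in proportion
    to its population; a ratio well below 1.0 suggests a weak "signal".
    """
    total_records = sum(records_by_area.values())
    total_population = sum(population_by_area.values())
    ratios = {}
    for area, population in population_by_area.items():
        record_share = records_by_area.get(area, 0) / total_records
        population_share = population / total_population
        ratios[area] = record_share / population_share
    return ratios


def under_represented(ratios, threshold=0.5):
    """List areas whose coverage ratio falls below the chosen threshold."""
    return sorted(area for area, ratio in ratios.items() if ratio < threshold)


# Hypothetical figures, invented purely for illustration.
records = {"centre": 9000, "suburb": 900, "outskirts": 100}
population = {"centre": 50000, "suburb": 30000, "outskirts": 20000}

ratios = coverage_ratios(records, population)
low_signal_areas = under_represented(ratios)
```

Areas that appear in `low_signal_areas` are candidates for the "shadow" of the data set and may need supplementary data collection before decisions are based on it.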
1. http://blogs.hbr.org/cs/2013/04/the_hidden_biases_in_big_data.html viewed 6 Sep 2013
• http://www.csmonitor.com/USA/Society/2013/0811/The-new-age-of-algorithms-How-it-affects-the-way-we-live/(page)/6 viewed 9 Sep 2013
• http://www.fastcoexist.com/3017102/a-new-underclass-the-people-who-big-data-leaves-behind viewed 30 Sep 2013
• http://forbesindia.com/blog/technology/the-big-problem-with-big-data/, viewed 18 Oct 2013
Big data enhances our knowledge of what exists, not
necessarily of what the ‘right’ response is.
Benefits of using big data
• More informed decision making – for government,
business, and individuals
• Assist in identification of trends
• Gain competitive advantage
• Support greater innovation
• Increase productivity
• Leverage technology opportunities
Challenges of using big data
• Separating the signal from the noise
• Data fragmentation across multiple systems
• Recruiting skilled workers
• Privacy and security
• Limitations of data - risks of responding to problems
using data alone
• Access and leveraging its full potential
4. Using Big Data
Three positive changes Big Data brings to research:
• Size, not sample: Allows a focus on size, not sample,
improving accuracy of studies and responses to needs of
governments, companies and people. New big data
technology means studies will not have to rely on sample
sizes because the amount of data collected will be vast.
• Messy, not meticulous: Accepts messiness in data. The
benefits of more data outweigh our obsession with precision
of small amounts of data.
• Correlation, not cause: While knowing the cause is
desirable, we don’t always need to understand how
something functions to make it work to our benefit.
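To make "correlation, not cause" concrete, the sketch below computes a Pearson correlation coefficient between two series. The data are hypothetical and the variable names are invented for illustration. A high coefficient shows only that the two series move together, not that one causes the other.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(variance_x * variance_y)


# Hypothetical daily observations, invented purely for illustration.
rainfall_mm = [0, 2, 5, 8, 12]
bus_delay_min = [3, 4, 6, 9, 12]

r = pearson_correlation(rainfall_mm, bus_delay_min)
```

A value of `r` close to 1 would justify planning for delays on wet days even without a model of why the delays occur.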
Strengthening the application of Big Data:
1. Consider more than just the numbers: Build on
information created from big data to address known
weaknesses/limitations from ‘signal problems’, to make it
meaningful/usable/relevant.
2. Visualise the data: Look at the data in visual form to
enhance understanding of what and how to process the
data.
• http://www.csmonitor.com/USA/Society/2013/0811/The-new-age-of-algorithms-How-it-affects-the-way-we-live viewed 9 Sep 2013
• http://blogs.hbr.org/cs/2013/08/visualizing_how_online_word-of.html viewed 6 Sep 2013
• http://blogs.hbr.org/cs/2013/08/a_better_way_to_tackle_all_tha.html viewed 9 Sep 2013
• http://blogs.hbr.org/cs/2013/07/five_roles_you_need_on_your_bi.html viewed 10 Sep 2013
3. “Machine learning”: Algorithms learn from and react to
data like humans, identifying and using patterns, etc.
• Reduces ‘time to decision’.
• Optimises function of complex systems in real-time e.g.
commuter train services.
4. What skills do I need in the workforce?
a) Data Hygienists - Ensure consistently clean and accurate
data.
b) Data Explorers - Sift through data to discover that which
you need.
c) Business Solution Architects - Compile and structure data
for analysis.
d) Data Scientists - Create analytic models.
e) Campaign Experts - Analyse and execute models for
optimal results.
Big Data gives us a more holistic understanding of
problems and systems, thus enhancing our ability to make
better decisions.
5. Visualisation of Data
Visualisation of data is paramount for its successful use:
1. Provides insight into ‘where to look’ and ‘what questions to
ask’ of the data.
2. Confirmation: Enables us to check our assumptions about
systems and to better assess the risks that follow from
those assumptions when making decisions.
3. Education: Enhances reporting and develops intuition about
specific data sets.
4. Exploration: Through visual exploration, helps users
identify an effective analytical model with which to predict
and better manage a system.
Risks to success of data visualisation:
1. Data quality.
2. Context: the source of insight allows for a holistic
understanding of the data.
3. Biases: syntax and semantics of visualised data can
influence a viewer’s understanding and interpretation of the
data. It is important to be aware of this in order to provide
an impartial visualisation.
• http://blogs.hbr.org/2013/03/when-data-visualization-works-and/ viewed 30 Sep 2013
• http://oliverobrien.co.uk/2012/04/the-london-data-table/ viewed 30 Sep 2013
Case Study
London’s Data Table – CASA, University College London
2012
Description: A table cut into the outline of London with an
overhead projector portraying various “Processing sketches”,
providing a visualisation of real-time transport data including
buses, cars, trains, shared bikes, flights.
• Provided near-real-time broadcasts of location, speed and
aircraft ID of flights over London, including QR codes for each
plane, allowing smartphone users to scan it to access further
flight information.
The London Data Table
6. Moving toward Open Data
Open Data
Open data is the idea that data should be freely available to
everyone to use as they wish. Open data supports and
enhances big data’s availability and potential. It is already
changing the way governments address issues
domestically and internationally.
Benefits of Open Data
• Open data becomes actionable intelligence.
• Could provide an economic boost and increased job creation
(e.g. the EU’s move toward an open data directive is expected
to create 58,000 jobs in the UK through 2017 and add £216
billion to the country’s economy).
Challenges of Open Data
• Enabling ‘mass mobilisers’ (training journalists and civic
groups) to disseminate and make data understandable by
the general public, not just statisticians.
• Data format: Presenting the data in a way which makes it
accessible to all users (especially the public, which is often
left behind in both access to the data and the ability to use it).
• Finding skilled workers, educating the workforce.
• http://blogs.hbr.org/2013/03/we-need-open-data-to-change-th/ viewed 30 Sep 2013
• http://blogs.hbr.org/2013/03/open-data-has-little-value-if/ viewed 30 Sep 2013
• http://www.govdata.eu/en/europeanopen.aspx viewed 30 Sep 2013
• http://www.computerweekly.com/feature/EU-open-data-promotion-could-benefit-UK-economy-says-CEBR viewed 1 Oct 2013
Case study
European Open Government Data Initiative
(EU OGDI)
Description: A free, open-source, cloud-based collection of
software assets that government organisations can take
advantage of. They can load and store public data using the
Microsoft Cloud.
• Aims to increase Availability, Transparency, Added Value,
Non-discrimination and Non-exclusivity of data for the
betterment of practices, policies, and enhanced job creation
across EU member countries.
• EU OGDI also held a public consultation to understand more
about the barriers to Open Government Data. Results
included: Cost of provisioning and delivery, the availability of
data in all languages, the governance of data classification
and the potential reuse of data.
7. Using Big Data in the Transport sector
How are Governments using big data?
• Traffic Controlling
• Transport Planning and Modeling
• Route Planning
• Congestion Management
• Intelligent Transport Systems
How is the Private Sector using big data?
• Travel Industry
• Route Planning and Logistics
• Revenue Management
• Competitive Advantage
• Technological Enhancements
How are Individuals using big data?
• Route Planning (save time/increase fuel-efficiency)
• Travel (tourism)
• http://blog.rmi.org/blog_how_big_data_drives_intelligent_transportation viewed 22 Aug 2013
• http://www.oecd.org/sti/ieconomy/Session_5_Letouz%C3%A9.pdf viewed 30 Sep 2013
• http://www.omnitrans-international.com/en/general/news/2013-07-04-using-big-data-in-transport-modelling- viewed 22 Aug 2013
GSM and Transport Modeling
Global System for Mobile Communications (GSM) data is
location-based information retrieved from mobile phones.
GSM data is used to extract Origin-Destination (O-D)
matrices:
• Decreased cost of data collection.
• Improved accuracy of transport models and their
validation.
• Allows more frequent/easier updates of ‘base year’
matrices.
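As a minimal sketch of what an O-D matrix is, the example below counts trips between zones. It assumes the GSM data has already been reduced to (origin zone, destination zone) pairs. The zone labels and trips are hypothetical.

```python
from collections import Counter


def build_od_counts(trips):
    """Count trips for every (origin, destination) pair."""
    return Counter((origin, destination) for origin, destination in trips)


def as_matrix(od_counts, zones):
    """Lay the counts out as rows (origins) by columns (destinations)."""
    return [[od_counts.get((o, d), 0) for d in zones] for o in zones]


# Hypothetical trips between three zones, invented for illustration.
trips = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C"), ("C", "A"), ("B", "C")]
zones = ["A", "B", "C"]

od_counts = build_od_counts(trips)
matrix = as_matrix(od_counts, zones)
```

Because the matrix is just a count over observed trips, it can be rebuilt whenever new location data arrives, which is what makes more frequent updates of ‘base year’ matrices feasible.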
Case study
Orange Telecom’s ‘Data for Development Challenge’
2012
Goudappel Coffeng, Omnitrans International and KDD-Lab
responded to the challenge to build the best transport model of
Ivory Coast using only publicly-available data.
• GSM analysis tools were used to process location of
callers/recipients and tie them to a region (region defined by
GSM cell site antenna’s reception area)
• Used departure/arrival times and origins and destinations
combined with frequency of trips to show approximate
home/work locations and create average O-D matrices for the
region to be used as a transport model
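The home/work step can be sketched as follows. This is not the challenge team's actual method. It is a simplified illustration that assumes each record is a (user, hour of day, cell) tuple. It takes the most frequent night-time cell as "home" and the most frequent working-hours cell as "work". The hour windows, user IDs and cell IDs are all hypothetical.

```python
from collections import Counter, defaultdict

# Assumed time windows; a real study would calibrate these.
NIGHT_HOURS = set(range(0, 6)) | set(range(22, 24))
WORK_HOURS = set(range(9, 17))


def infer_home_work(records):
    """Map each user to an approximate (home_cell, work_cell) pair."""
    night = defaultdict(Counter)
    day = defaultdict(Counter)
    for user, hour, cell in records:
        if hour in NIGHT_HOURS:
            night[user][cell] += 1
        elif hour in WORK_HOURS:
            day[user][cell] += 1
    locations = {}
    for user in set(night) | set(day):
        home = night[user].most_common(1)[0][0] if night[user] else None
        work = day[user].most_common(1)[0][0] if day[user] else None
        locations[user] = (home, work)
    return locations


# Hypothetical call records: (user, hour of day, cell site).
records = [
    ("u1", 23, "cell_1"), ("u1", 2, "cell_1"),
    ("u1", 10, "cell_7"), ("u1", 14, "cell_7"), ("u1", 15, "cell_3"),
    ("u2", 1, "cell_4"), ("u2", 11, "cell_7"),
]

locations = infer_home_work(records)
# Home-to-work pairs form a simple commute O-D count per cell pair.
commute_od = Counter(pair for pair in locations.values() if None not in pair)
```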
8. Examples of where Government and the
Private Sector are using Big Data
Mode          | Name                         | Project Type                             | Year         | Value                                                                  | Technology/Consulting Partner
--------------|------------------------------|------------------------------------------|--------------|------------------------------------------------------------------------|------------------------------
Road          | City of Dublin               | Congestion & Traffic Management          | 2010         | €66 million                                                            | IBM
Road          | City of Stockholm            | Traffic Patterns & Congestion            | 2006-2011    | €218 million                                                           | IBM
Road/Maritime | City of Da Nang, Vietnam     | Congestion & Traffic Management          | 2013-ongoing | Smart Cities Challenge worth €37 million                               | IBM
Air           | Lufthansa                    | Revenue Management                       | 2013         |                                                                        | SAP/HANA
Air           | Air France-KLM               | Revenue Management                       |              |                                                                        |
Air           | Swiss International Airlines | Revenue Management                       |              |                                                                        |
Air           | Frontier Airlines            | Revenue Management                       |              |                                                                        |
Air           | British Airways              | Competitive Advantage                    | 2012         | “Significant amount” of €7b investment in new products, technology, etc. | Opera Solutions
Road          | Munich Airport               | Competitive Advantage & Tech Enhancement | 2013         |                                                                        | Lufthansa & Amadeus
• www.amadeus.com “At the Big Data Crossroads: turning towards a smarter travel experience”, viewed 22 Aug 2013
• http://www.ibmbigdatahub.com/blog/travel-and-transportation-age-big-data viewed 22 Aug 2013
9. Examples: IGOs and Big Data
• http://oecdeducationtoday.blogspot.fr/2013/07/big-data-and-pisa.html viewed 30 Sep 2013
• http://search.oecd.org/officialdocuments/publicdisplaydocumentpdf/?cote=DSTI/ICCP(2012)9/FINAL&docLanguage=En viewed 30 Sep 2013
• https://datakindworldbank.eventbrite.com/ viewed 3 Oct 2013
• http://blogs.worldbank.org/category/tags/big-data viewed 3 Oct 2013
• http://www.scribd.com/doc/142012481/DC-Big-Data-Exploration-Final-Report?cid=CTR_TwitterWBopenfinances_D_EXT viewed 3 Oct 2013
OECD
Education sector - The PISA Global Survey (July 2013)
• The Education sector is exploring how to maximise its
creation of big data through the PISA global survey, which
examines the skills of 15-year-olds in ways that are
comparable across countries.
OECD Report: Exploring Data-Driven Innovation as a New
Source of Growth: Mapping the Policy Issues Raised by ‘Big
Data’ (June 2013)
• Describes how big data can be a source of growth for
countries and outlines the policy opportunities and
challenges it presents.
• Includes options to increase the use and value of big data
across the transport and logistics sectors.
World Bank
The Big Data Exploration Initiative (2013)
• Joint initiative organised by the World Bank, United Nations
Development Programme (UNDP), UN Development
Business, UN Global Pulse and Qatar Computing Research
Institute.
• Focuses on International Development Policy, particularly
reducing poverty and addressing fraud and corruption
through data.
• Hosts and participates in ‘DataDives’ (see example on
right).
• Regular blog posts on the World Bank’s Data Blog.
• Contributes to reports and papers on big data’s impact on
international development policy.
Case Study: DC DataDive
World Bank, Big Data Exploration
15-17 March 2013
Over 150 topic experts, data scientists, development
practitioners and others worked with World Bank experts from the
Poverty and Fraud & Corruption teams to explore new ways of
using big data to maximise its impact on poverty, fraud and
corruption.
Process:
The WB and partner organisations defined six key projects for the
event. Projects were designed to address the WB’s needs and
generate tangible insights within a 24-48 hour period.
Project examples:
o Analysing World Bank Data for Signs of Fraud and Corruption
o Predicting Small-Scale Poverty Measures from Night
Illuminations
At the event, data was provided by the WB and contributing
organisations. Data scientists then processed the data in real-time
using big data processing programmes. The analysis was
displayed on video screens in the room. Data scientists
collaborated with the topic experts and development practitioners
to ensure a quality process for optimum results.
Lastly, the entire group discussed outcomes and developed key
recommendations on using big data sources to monitor poverty
and corruption. Additionally, entirely new streams of data were
created that the WB and partners can use in future research.
10. Individuals are using big data via websites
and mobile phone applications
• http://siliconangle.com/blog/2012/01/25/big-data-means-big-success-for-embarks-iphone-app/ viewed 2 Sep 2013
• http://finance.yahoo.com/news/parkme-launches-real-time-parking-130000830.html viewed 2 Sep 2013
• http://blogs.hbr.org/cs/2013/04/the_hidden_biases_in_big_data.html viewed 6 Sep 2013
• Embark: Uses publicly available data from transit companies and government, as well as data from its own users, to provide the
best real-time route for commuters. Especially popular in urban areas. (UK and USA)
• ParkMe: Uses publicly available data from partnerships with parking operators to give real-time parking information, including
on- and off-street parking as well as best-value parking. Aims to reduce parking frustration, especially in urban areas. (Global:
approximately 32 countries)
• StreetBump: Uses accelerometer and GPS data from drivers’ smartphones to detect potholes and report them to the city. (USA)
• Spothero: Uses a mixture of city data and business partnerships to display nearby parking spots to drivers. (USA)
• SweepAround.us (website): Provides a free online database that indicates when street sweepers will approach users’
homes, so they can move their cars and avoid tickets. (USA)
11. Case Study: City of Dublin, Public Transit
System
Background
Began: 2010 for 3+ years
Value: €66 million (Jointly funded by IBM and Industrial
Development Agency of Ireland)
Problem
Traffic congestion in public transport network throughout city,
especially buses
Goals
-Reduce congestion and improve traffic flow
-Better mobility for commuters
• http://www-03.ibm.com/press/us/en/pressrelease/41068.wss viewed 22 Aug 2013
• http://www-03.ibm.com/press/us/en/pressrelease/29745.wss viewed 23 Aug 2013
• http://www.theguardian.com/local-government-network/2013/jun/05/dublin-city-smart-approach-data viewed 10 Sep 2013
• http://www.thestreet.com/story/11926701/1/big-data-helps-city-of-dublin-improve-its-public-bus-transportation-network-and-reduce-congestion.html viewed 10 Sep 2013
How?
In collaboration with IBM:
1. Advanced analytics on data collected from each bus’s
journey
2. Improved reporting and monitoring: Created a digital
map of city overlaid with real-time positions of Dublin’s
buses using stream computing and geospatial data
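The real-time map described above depends on always knowing each bus's most recent position. The sketch below shows that one idea in isolation and is not IBM's or Dublin's implementation. The `BusReport` fields, bus IDs and coordinates are hypothetical, and a production stream-computing system would handle far more (windowing, fault tolerance, map matching).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusReport:
    bus_id: str
    timestamp: int      # seconds since an arbitrary start, for illustration
    latitude: float
    longitude: float


def latest_positions(reports):
    """Keep only the most recent report seen so far for each bus."""
    latest = {}
    for report in reports:
        current = latest.get(report.bus_id)
        # Ignore late, out-of-order reports that are older than what we hold.
        if current is None or report.timestamp > current.timestamp:
            latest[report.bus_id] = report
    return latest


# Hypothetical stream of position reports, invented for illustration.
stream = [
    BusReport("bus_1", 100, 53.3498, -6.2603),
    BusReport("bus_2", 105, 53.3440, -6.2672),
    BusReport("bus_1", 160, 53.3521, -6.2610),
    BusReport("bus_2", 90, 53.3401, -6.2700),   # arrives late, is discarded
]

positions = latest_positions(stream)
```

Each value in `positions` can then be plotted as a single marker on the digital map, refreshed as new reports arrive.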
Result
Examples of project benefits include:
• Journey information is released and updated by Dublin city
council every minute, allowing residents to find online the
quickest route to their destination
• Due to improved reporting, the city can identify optimal
traffic-calming measures to reduce congestion and can
identify the best place(s) to add additional bus lanes and
bus-only traffic systems
12. Case Study: British Airways, Competitive
Advantage - The ‘Know Me’ programme
Background
Began: Early 2012, in development (some aspects have been
rolled out already and data has been collected for years)
Value: Unknown
Problem
Competition: from low-cost carriers on the low end and
country carriers backed by sovereign wealth on the high end
Goals
Achieve competitive advantage by:
1. Understanding customers better than any competitor
2. Using accumulated customer knowledge for each
individual customer’s benefit
How?
Support from big data analytics firm Opera Solutions.
Also through use of Google Image search to help staff
recognize “captains of industry” upon entering
airports/lounges to provide tailored attention.
Using customer insight via customer information from BA’s
Executive Club loyalty programme and BA’s website.
Apply big data to customer decision points in BA’s Know Me
programme:
1. Personal recognition
2. Service excellence and recovery
3. Offers that inspire and motivate.
Results
Examples of project benefits include:
• Improved in-flight service: Outfitted crew with iPads
(approx. 2000 front line employees) for identification of
high spending passengers, resulting in higher quality
service to customers
• Successfully addressing prior difficulties: If regular
customers have experienced delays/problems
on previous flights, the Know Me programme informs
current crew so they can apologise for those issues and
pay special attention to those customers
• www.amadeus.com “At the Big Data Crossroads: turning towards a smarter travel experience”, viewed 22 Aug 2013
• http://blog.operasolutions.com/bid/311798/Big-Data-Takes-the-Travel-Industry-in-New-Direction viewed 23 Aug 2013
• http://www.tnooz.com/2012/07/09/news/british-airways-and-the-know-me-saga-should-companies-run-image-checks-on-customers/ viewed 9 Sep 2013
• http://abcnews.go.com/Travel/airline-google-spot-customers/story?id=16740530 viewed 10 Sep 2013
13. Case Study: City of Da Nang, Vietnam, Traffic
Management System
Background
Began: 2013- ongoing
Value: €37 million (Part of IBM’s Smart Cities Challenge)
Problem
Traffic congestion throughout the city with a fast-growing
population
Goals
-Reduce congestion
-Create a sustainable traffic system to manage long-term
effects of high growth in population
-Better, more efficient mobility for commuters
• http://qz.com/115427/vietnam-taps-big-data-to-avoid-chinas-traffic-catastrophe/#115427/vietnam-taps-big-data-to-avoid-chinas-traffic-catastrophe viewed 22 Aug 2013
• http://www-03.ibm.com/press/us/en/pressrelease/41754.wss viewed 23 Aug 2013
• http://businesstoday.intoday.in/story/lessons-in-big-data-vietnam-apac-big-data-and-cloud-summit/1/197954.html viewed 10 Sep 2013
How?
In collaboration with IBM and its Smarter Cities Technology:
1. Big data technologies (including apps) and predictive
analytics to create a new traffic control centre
a. Able to monitor traffic and control the city’s
traffic light system through a dashboard
b. Tools that will forecast and prevent potential
congestion and better coordinate city responses
to issues like accidents and weather
2. Software and sensors embedded in roads, highways,
and buses. Synchronize stop lights to minimize traffic
jams
Results
Examples of project benefits include:
• 135 e-government services added covering everything
from school admission to registration of property.
• Successful implementation of sensors that monitor traffic
on roads as well as water levels in the flood-prone Han River
(helps regulate Da Nang’s port).
• Successful implementation of Intel’s Intelligent Power
Node (supports power management and energy efficiency)
14. Additional Useful Links
1. OECD Report (June 2013): Mapping the Policy Issues Raised by Big Data: Report in which five sectors’ connections to big data
are discussed including transport and logistics:
http://search.oecd.org/officialdocuments/publicdisplaydocumentpdf/?cote=DSTI/ICCP(2012)9/FINAL&docLanguage=En
2. UN Global Pulse (UN’s Big Data Initiative): http://www.unglobalpulse.org/sites/default/files/BigDataforDevelopment-UNGlobalPulseJune2012.pdf
3. European Union Open Data Portal: http://open-data.europa.eu/
4. World Bank Report, City of Stockholm’s Congestion Charging project:
http://siteresources.worldbank.org/INTTRANSPORT/Resources/StockholmcongestionCBAEliassonn.pdf
5. The Economist, The multiplexed metropolis: on cities and data: http://www.economist.com/news/briefing/21585002-enthusiasts-think-data-services-can-change-cities-century-much-electricity
6. IBM White Paper, Big data and analytics in travel and transportation:
http://public.dhe.ibm.com/common/ssi/ecm/en/gbw03215usen/GBW03215USEN.PDF
7. Harvard Business Review Blog Network: What the Companies Winning at Big Data Do Differently:
http://blogs.hbr.org/cs/2013/06/what_the_companies_winning_at.html
8. Ireland’s (2013 EU Presidency) Policy Priorities within the Transport, Telecommunications and Energy Council (TTE):
http://eu2013.ie/ireland-and-the-presidency/the-eu-and-policy-areas/transport,-telecommunications-and-energy/
9. How automotive companies use Big Data: http://www.livemint.com/Specials/P6e4ijI7XVxKKhyEEzzqMO/Auto-makers-bet-on-big-data-for-business-insights.html?ref=mr
10. People who do not generate data: http://www.fastcoexist.com/3017102/a-new-underclass-the-people-who-big-data-leaves-behind