Hadoop in Education
- 1. Hadoop In Education: The advent of data-driven
applications
© 2010 Apollo Group – Confidential & Proprietary
- 2. Online Learning is in high demand
Adults learn online at the University of Phoenix on their own schedules, and the number who prefer that modality to the ground ("traditional") equivalent is on the rise.
Online students and faculty do not have to be geographically co-located as in traditional settings, which allows richer and more diverse interactions across geographical boundaries and time zones.
As people spend more time online, it is only natural to expect that learners will want their education online as well.
Recent press on huge enrolment in MOOCs (Massive Open Online Courses) again suggests that there is great latent demand for online courses.
- 3. What should online learning look like?
- 4. Online Education challenges
Every learner is unique in aptitude, preparation, and motivation.
A good teacher continuously observes and intervenes appropriately to keep the learners engaged and learning.
–If we just take the traditional classroom online, all the visual and audio feedback is taken away from the trained teacher!
- 5. Online Education Opportunities
What if, instead,
–We collected detailed interaction data sets and converted them into actionable insights for the teacher, so that (s)he can focus only where (s)he is needed and not exhaust her/himself by being the filter?
–We used algorithms to harness the practices that are working for students and teachers, recommended them in appropriate contexts, and took away unnecessary and inefficient guessing?
Wait, would that not be Web 2.0 in Education?
With top-name universities, start-up companies, learning
platform or learning content companies … this innovation
race is already on!
- 6. Data driven learner guidance
[Diagram. Labels shown:
Data Driven Apps: Assignments, Discussions, Faculty Guidance, Recommendations
Logs: Faculty Interaction Logs, Student Interaction Logs, Student Assessment Logs, Content Usage Logs, Student/Faculty Interaction, Processed Logs
Data Driven Apps for Effectiveness/Recommendation of Content & Assessments
Users: Instruction Designer, Faculty]
- 8. New Learning System Architecture
[Architecture diagram. Labels shown:
Browser Client, Mobile Client
RESTful Services
Curriculum, Class, Quizzes & Discussions, Content
Log data
Data Collection and Log Processing Pipeline]
- 9. Considerations
Enable logging without much effort from developers
• Built GWT/JavaScript framework to automatically enable client-side logging.
• Automatically enable server-side logging using servlet filters.
Common pipelines for processing log data
• Used canonical log records with Avro as the serialization format.
• Service-specific information logged as JSON and processed using Hive UDFs.
Time-sync clients and server to simplify log ordering
• Server responds with its timestamp on every call.
• Client includes this information in the next call.
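The time-sync consideration above can be illustrated with a small sketch. It is not the deck's implementation: the `Server` and `Client` classes, the field names `server_ts` and `last_server_ts`, and the counter standing in for a server clock are all assumptions made for illustration. The point is only that each logged call carries the server timestamp the client saw last, which gives the log processor a server-side reference for ordering client events.

```python
import itertools


class Server:
    """Toy server that stamps every response with its own clock."""

    def __init__(self, start=100):
        # Stand-in for a real server clock (assumption for illustration).
        self._clock = itertools.count(start)
        self.log = []

    def handle(self, request):
        now = next(self._clock)
        # Log both the server time and the timestamp echoed by the client.
        self.log.append({
            "action": request["action"],
            "server_ts": now,
            "last_server_ts": request.get("last_server_ts"),
        })
        return {"server_ts": now}


class Client:
    """Toy client that echoes the last server timestamp on its next call."""

    def __init__(self, server):
        self.server = server
        self.last_server_ts = None

    def call(self, action):
        response = self.server.handle(
            {"action": action, "last_server_ts": self.last_server_ts}
        )
        self.last_server_ts = response["server_ts"]
        return response


server = Server()
client = Client(server)
client.call("view_question")
client.call("submit_answer")
```

After the two calls, the second log entry records the timestamp issued for the first call, so the two entries can be chained without trusting the client's own clock.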
- 10. Client/Server – Built for Log Collection
[Diagram. Labels shown:
Client: View, Controller, Model, Event Bus
API Calls, Instrumentation Filter, Log Data
RESTful Services
Canonical Log File (Avro)
Log and Data Processing Pipeline]
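A rough sketch of what an instrumentation filter writing canonical log records might look like. The deck names servlet filters and Avro. This sketch instead uses a Python decorator and a plain dataclass so that it stays self-contained, and every name in it (`LogRecord`, `instrumented`, `submit_answer`, the field names) is hypothetical. The common fields are the same for every service, while service-specific details travel as a JSON string.

```python
import json
import time
from dataclasses import dataclass
from functools import wraps


@dataclass
class LogRecord:
    """Hypothetical canonical record: common fields plus a JSON payload."""
    timestamp_ms: int
    service: str
    event: str
    payload_json: str  # service-specific information, serialized as JSON


LOG = []  # stand-in for the canonical log file


def instrumented(service):
    """Wrap a handler so every call is logged without developer effort."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(**params):
            result = handler(**params)
            LOG.append(LogRecord(
                timestamp_ms=int(time.time() * 1000),
                service=service,
                event=handler.__name__,
                payload_json=json.dumps(params, sort_keys=True),
            ))
            return result
        return wrapper
    return decorator


@instrumented("quiz")
def submit_answer(question_id, answer):
    # The handler itself contains no logging code.
    return {"accepted": True}


submit_answer(question_id="q1", answer="B")
```

Keeping the payload as an opaque JSON string is one way to let the canonical schema stay fixed while individual services log whatever they need.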
- 11. Connecting the Data and Processing Pipeline
[Pipeline diagram. Labels shown:
Application Server & Log Collection Servers
S3
Log Processing Pipeline Server
Oozie Workflows
HBase Tables, Hive Tables
Services, Dashboards, M/L Tools
RDBMS, Traditional BI Tools
Volume annotations: ~7 TB/Week/Class and ~700 GB/Week/Class]
- 12. Considerations
User session split across multiple devices
• User in a discussion forum in a browser
• User receives grade notifications on mobile phone
• User views notification
Merging and ordering of events
• Only partial ordering of events possible without application-specific information
• Full ordering required to extract features from logs
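To make the merging problem concrete, here is a minimal sketch that merges per-device event streams by timestamp and reports where the result is still ambiguous. The tuple layout, the event names, and the helper functions are assumptions for illustration. The deck does not describe its merge step at this level of detail.

```python
import heapq
from typing import Iterable, List, Tuple

# (server-synced timestamp, device, event name) -- hypothetical layout
Event = Tuple[int, str, str]


def merge_device_streams(*streams: Iterable[Event]) -> List[Event]:
    """Merge streams that are each already sorted by timestamp."""
    return list(heapq.merge(*streams, key=lambda e: e[0]))


def ambiguous_timestamps(events: Iterable[Event]) -> List[int]:
    """Timestamps shared by events from more than one device.

    At these points the timestamp alone gives only a partial order.
    """
    devices_by_ts = {}
    for ts, device, _ in events:
        devices_by_ts.setdefault(ts, set()).add(device)
    return sorted(ts for ts, devs in devices_by_ts.items() if len(devs) > 1)


browser = [(10, "browser", "open_discussion_forum"),
           (30, "browser", "post_reply")]
mobile = [(20, "mobile", "receive_grade_notification"),
          (30, "mobile", "view_notification")]

merged = merge_device_streams(browser, mobile)
ties = ambiguous_timestamps(merged)
```

In this example the two events at timestamp 30 come from different devices, so their relative order cannot be decided from the timestamps. That is the gap application-specific information has to fill.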
- 13. Feature Extraction after Joins – Some challenges
User Interaction: View Question, Select Answer, View Hint, Select Another Answer, Submit Answer, Receive Feedback
Partial Event Order: Get Question, Submit Answer, Request Question, Display Question, Select Answer, View Hint, Select Answer, Submit Answer, Question Feedback
Reordered Events: Request Question, Get Question, Display Question, Select Answer, View Hint, Select Answer, Submit Answer, Submit Answer, Question Feedback
Exploring generic alignment algorithms that use declared application
semantics
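One very simple way to use declared application semantics is to break timestamp ties with a declared event order. The sketch below does only that and should not be read as the generic alignment algorithm being explored. The `DECLARED_ORDER` list, the `Event` type, and the coarse integer timestamps are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical declaration by the application: when events share a
# timestamp, they are expected to have happened in this order.
DECLARED_ORDER = [
    "Request Question",
    "Get Question",
    "Display Question",
    "Select Answer",
    "View Hint",
    "Submit Answer",
    "Question Feedback",
]
RANK = {name: i for i, name in enumerate(DECLARED_ORDER)}


@dataclass(frozen=True)
class Event:
    timestamp: int  # coarse, so several events can tie
    name: str


def reorder(events: List[Event]) -> List[Event]:
    """Sort by timestamp, breaking ties with the declared order."""
    return sorted(events, key=lambda e: (e.timestamp, RANK[e.name]))


partial = [
    Event(1, "Get Question"),
    Event(1, "Request Question"),
    Event(1, "Display Question"),
    Event(2, "Select Answer"),
]
full = reorder(partial)
```

A tie-break like this cannot repair events whose timestamps are actually skewed across client and server, which is one reason a more general alignment approach would be needed.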
- 15. A Data Driven Application: The Faculty
Dashboard for Action
- 16. A Data Driven Application: The Faculty
Dashboard for Action
- 17. Story 1: How detailed logs help
- 18. Story 1: How detailed logs help
- 19. Story 2: The Carnegie Learning Math Tutor
Enhanced Activities: Adaptive
Carnegie Learning's (CL) Cognitive Tutor provides an adaptive online curriculum in high school and middle school math.
– Interactive lessons
– Practice problems
– Response-sensitive feedback and support (e.g. hints, examples)
– Intelligent guidance through curricular units, with detailed tracking of skill proficiency
– Personalized preferences
- 20. Example Features from Detailed Logs from the
Math Tutor
Baker et al., Towards Sensor-Free Affect Detection in Cognitive Tutor Algebra, retrieved from http://users.wpi.edu/~rsbaker/publications.html
Frustration
• The percent of past actions on the skills involved in the clip that were incorrect.
• Were there any actions in the clip where the student made a wrong answer rather than requesting help when their probability of knowing the skill was under 0.7?
Engaged Concentration
• The minimum number of previous incorrect actions and help requests for any skill in the clip.
• Among the skills involved in the clip, the minimum value for previous incorrect actions and help requests for that skill.
• The duration (in seconds) of the fastest action in the clip.
• The percentage of clip actions involving a hint followed by an error.
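Three of the features listed above are simple enough to sketch over an ordered list of actions. The `Action` record and its fields are hypothetical, and the exact definitions (for example, counting a hint as "followed by an error" only when the very next action is an incorrect attempt) are one interpretation of the slide text, not the definitions used in the cited paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Action:
    kind: str                # "attempt" or "hint" (assumed vocabulary)
    correct: Optional[bool]  # None for hint requests
    duration_s: float


def is_error(action: Action) -> bool:
    return action.kind == "attempt" and action.correct is False


def pct_past_incorrect(past: List[Action]) -> float:
    """Percent of past actions that were incorrect attempts."""
    if not past:
        return 0.0
    return 100.0 * sum(is_error(a) for a in past) / len(past)


def fastest_action_seconds(clip: List[Action]) -> float:
    """Duration (in seconds) of the fastest action in the clip."""
    return min(a.duration_s for a in clip)


def pct_hint_then_error(clip: List[Action]) -> float:
    """Percent of clip actions that are a hint immediately followed by an error."""
    if not clip:
        return 0.0
    hits = sum(
        1 for cur, nxt in zip(clip, clip[1:])
        if cur.kind == "hint" and is_error(nxt)
    )
    return 100.0 * hits / len(clip)


clip = [
    Action("hint", None, 4.0),
    Action("attempt", False, 2.5),
    Action("attempt", True, 6.0),
    Action("hint", None, 3.0),
]
features = {
    "fastest_action_s": fastest_action_seconds(clip),
    "pct_hint_then_error": pct_hint_then_error(clip),
}
```

Each function depends on the order of actions within the clip, which is why a full event ordering has to be established before features like these can be extracted.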
- 21. Why the features matter
From Stephen Fancsali, Variable Construction and Causal Discovery for
Cognitive Tutor Log Data: Initial Results, Educational Data Mining 2012
Helps design “intervention” features in the data driven math product to help the
learner
- 22. Questions?