4. Dynamic Scene Analysis
• Interaction of multiple agents in a specific
context and particular environment
• Activities reoccur over time and co-occur in
time
• Scene analysis gives an understanding of:
– where objects are located,
– what is happening,
– how they interact over a
period of time
6/6/2012 4
7. Related Work
• All approaches begin with feature extraction.
• Existing works in the literature are:
1. Trajectory-based
o Many require object detection
o Difficulties in handling occlusions
2. Optical-flow-based
o Tracks motion between frames
o Preferable for complex videos as
it is fast and robust
8. Video Scene Understanding Using Multi-scale Analysis [Yang et al.]
─ Uses optical flow and a bag-of-words representation
─ Each pixel is assigned a codeword
─ Uses diffusion maps: clustering, done with a spectral analysis technique, reveals the motion patterns
Trajectory Series Analysis based Event Rule Induction for Visual Surveillance [Zhang et al.]
• Trajectories are used to find a set of behavior rules, followed by clustering
• Hidden Markov Models are used to detect primitive events
• Event rule representation is based on Stochastic Context-Free Grammar, extended with temporal logic
• Event rule induction discovers the hidden temporal structures between primitive events using the Minimum Description Length algorithm
9. Random Field Topic Model for Semantic Region Analysis in Crowded Scenes from
Tracklets [Zhou et al.]
• Tracklets are observed within a short period
• The Random Field Topic Model integrates a Markov Random Field (MRF) to enforce spatial and temporal coherence during the learning process
• Tracklets are grouped into one topic
• Pairwise MRF: connects neighboring tracklets
• Tracklets that are spatially and temporally close have similar distributions over semantic regions
10. General Steps
(1) Feature Extraction
(2) Event Modeling
(3) Event Recognition
Atomic Event
• Involves a single object
• Represented by motion patterns
• Indicates the spatial properties
Composite Event
• Multiple atomic events taking
place in space & time: complex
activities
• Behavioral interaction: results in
spatio-temporal patterns
11. Problem Statement
Given a video of a scene acquired by a static camera:
– Identify regions of different dynamics
– Learn spatio-temporal patterns in the scene and
interpret the semantics within
– Detect abnormal events based on a normalcy
model
17. Feature Extraction: Mean-shift Tracking
1. Detect and track the objects of interest, i.e. vehicles.
2. Target characterization
a) By a circular region in the image, i.e. the color PDF of the target pixels
3. Target localization
a) Update the model in frame
21. Spectral Clustering: Data Representation
o Nodes (1, 2, ..., n): trajectories
o Edge weights (w): similarity measure (Dynamic Time Warping distance)
[Figure: graph with nodes 1, 2, ..., n joined by edges of weight w]
Adjacency matrix:
      t1     t2     …    tn
t1    0      0.5    …    0.75
t2    0.5    0      …    0.66
…     …      …      …    …
tn    0.75   0.66   …    0
We aim to cluster trajectories into distinct events in the scene.
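The matrix above can be filled in with a textbook Dynamic Time Warping routine. The following is a minimal sketch, assuming each trajectory is a list of (x, y) points and that the point-to-point cost is the Euclidean distance; the names `dtw_distance`, `distance_matrix` and the toy trajectories are illustrative and do not come from the slides.

```python
import math

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])  # assumed local cost
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def distance_matrix(trajectories):
    """Symmetric pairwise DTW matrix with a zero diagonal."""
    n = len(trajectories)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = dtw_distance(trajectories[i], trajectories[j])
    return w

# Hypothetical toy trajectories
t1 = [(0, 0), (1, 0), (2, 0), (3, 0)]
t2 = [(0, 0), (1, 0), (1, 0), (2, 0), (3, 0)]  # same path, one repeated sample
t3 = [(0, 2), (1, 2), (2, 2), (3, 2)]          # parallel path, 2 units away
W = distance_matrix([t1, t2, t3])
```

Because DTW warps the time axis, `t1` and `t2` end up at distance 0 even though they differ in length, while `t3` stays clearly separated from both.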
22. Spectral Clustering: Steps
Pipeline: (n x n) affinity matrix → eigenvector matrix → K-means clustering
• Form the Laplacian matrix and compute the eigenvectors of its K largest eigenvalues
• K is estimated from the distortion score
• Cluster the rows of the eigenvector matrix with K-means
• Assign each trajectory to its corresponding cluster
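These steps can be sketched with NumPy as follows. The slides do not state which normalization or kernel is used, so this sketch assumes the Ng-Jordan-Weiss variant (normalized affinity D^(-1/2) A D^(-1/2)) and a Gaussian kernel with a hypothetical width `sigma`. K is passed in directly rather than estimated from the distortion score, and the K-means step uses a simple deterministic initialization.

```python
import numpy as np

def spectral_clustering(dist, k, sigma=1.0, iters=50):
    """Cluster items from a pairwise distance matrix (assumed NJW-style sketch)."""
    dist = np.asarray(dist, dtype=float)
    # Gaussian kernel turns distances into affinities (assumed choice).
    affinity = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    np.fill_diagonal(affinity, 0.0)
    # Normalized matrix D^(-1/2) A D^(-1/2); assumes every row sum is > 0.
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    normalized = affinity * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so keep the last k columns.
    _, vecs = np.linalg.eigh(normalized)
    emb = vecs[:, -k:]
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    # K-means on the rows, with farthest-point initialization.
    centers = [emb[0]]
    for _ in range(1, k):
        d2 = np.min([((emb - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(emb[int(np.argmax(d2))])
    centers = np.array(centers)
    for _ in range(iters):
        gaps = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(gaps, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = emb[labels == c].mean(axis=0)
    return labels

# Hypothetical distances: items 0-1 are close, items 2-3 are close.
dist = [
    [0.0, 0.5, 8.0, 9.0],
    [0.5, 0.0, 8.5, 9.0],
    [8.0, 8.5, 0.0, 0.4],
    [9.0, 9.0, 0.4, 0.0],
]
labels = spectral_clustering(dist, k=2, sigma=2.0)
```

Each row of the eigenvector matrix stands for one trajectory, so the K-means labels map straight back to trajectories.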
25. Video Association Mining
• We want to uncover unknown patterns in the
scene
• We want to focus on relationships occurring within time intervals rather than just at points in time
• Temporal Pattern Mining: Used to discover
interesting patterns in the scene
• Association Rule Mining: Helps predict future
scene dynamics
27. What is a Frequent Pattern?
• Frequent Temporal Pattern (FTP): Occurs many
times in the data; indicates co-occurring and
recurring activities in the scene
• A temporal pattern composed of k events is called
a k-pattern
• Relationships amongst events are encoded using
Allen’s temporal logic
• Each temporal pattern is appended with its time
duration
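To make the relationship encoding concrete, the sketch below classifies the Allen relation between two intervals. It assumes each interval is a (start, end) pair with start < end. The function name and the "inverse of ..." wording for the six inverse relations are illustrative choices, not the notation used in the slides.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a with respect to interval b."""
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e1 == s2:
        return "meets"
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2 and e1 < e2:
        return "starts"
    if e1 == e2 and s1 > s2:
        return "finishes"
    if s1 > s2 and e1 < e2:
        return "during"
    if s1 < s2 < e1 < e2:
        return "overlaps"
    # Every remaining case is the inverse of one of the seven relations above.
    return "inverse of " + allen_relation(b, a)

# Intervals taken from the event-sequence example used later in the deck
r1 = allen_relation((0, 5), (0, 9))    # A and B in sequence 1
r2 = allen_relation((0, 7), (3, 11))   # C and A in sequence 2
r3 = allen_relation((0, 9), (9, 11))   # B and C in sequence 1
```

Here `r1` is "starts", `r2` is "overlaps" and `r3` is "meets".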
[Figure: a 3-pattern over events A, B and C, annotated with its events, relationships and duration]
29. Interval-Based Event Miner: Algorithm
Level-by-Level Discovery Process
• IEMiner: based on the Apriori principle of item-set
mining
• Apriori principle: every sub-pattern of a frequent k-pattern must also be frequent
[Figure: loop between (1) Candidate Generation and (2) Support Counting; frequent k-patterns are extended into candidate (k+1)-patterns, which are then counted]
30. Input: List of Event Sequences
• Each event sequence consists of a
sequence of triplets:
{event_label,start_time,end_time}
No   Event sequence
1    (A, 0, 5)  (B, 0, 9)  (C, 9, 11)
2    (C, 0, 7)  (A, 3, 11) (B, 9, 11)
3    (A, 0, 11) (C, 1, 6)  (D, 1, 5)
4    (A, 0, 4)  (C, 0, 3)  (E, 6, 7)  (G, 7, 11)

Obtain single frequent events:
Event   Count
A       4
B       2
C       4
D       1
E       1
G       1
Events whose count meets the minimum support are labeled FREQUENT.
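The count table can be reproduced in a few lines. This sketch stores each event sequence as a list of (event_label, start_time, end_time) triplets and counts, for each label, the number of sequences that contain it. The threshold `min_count=2` is a hypothetical value chosen for illustration.

```python
from collections import Counter

# The four event sequences from the slide
sequences = [
    [("A", 0, 5), ("B", 0, 9), ("C", 9, 11)],
    [("C", 0, 7), ("A", 3, 11), ("B", 9, 11)],
    [("A", 0, 11), ("C", 1, 6), ("D", 1, 5)],
    [("A", 0, 4), ("C", 0, 3), ("E", 6, 7), ("G", 7, 11)],
]

def event_support(seqs):
    """Number of sequences in which each event label appears at least once."""
    counts = Counter()
    for seq in seqs:
        counts.update({label for label, _, _ in seq})
    return counts

def frequent_events(seqs, min_count):
    """Labels whose support reaches min_count (hypothetical threshold)."""
    support = event_support(seqs)
    return sorted(label for label, n in support.items() if n >= min_count)

support = event_support(sequences)
frequent = frequent_events(sequences, min_count=2)
```

With this threshold the frequent single events are A, B and C.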
31. (1) Candidate Generation
Bottom-up approach
FIRST STEP:
GENERATE SET OF 2-PATTERNS
Form composite events from the event sequences, e.g.:
A starts B (sequence 1)
C overlaps A (sequence 2)
...
32. (1) Candidate Generation
Bottom-up approach
SECOND STEP: GENERATE (K+1)-PATTERNS FROM FREQUENT K-PATTERNS AND 2-PATTERNS
LEVEL 2: K = 2
[Figure: frequent 2-patterns (A starts B, C overlaps A, ...) are combined with 2-patterns (A equals C, A overlaps B, ...) to form candidate 3-patterns, e.g. (C overlaps A) overlaps B]
33. (2) Support Counting
Single-pass Procedure
• The support of a temporal pattern (TP) is the number of event sequences in which the pattern occurs
• For a pattern to be classified as frequent, it
should have a support value higher than
the user-specified min. support threshold
Determine frequency of
candidate patterns by
counting occurrences
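Support counting for 2-patterns can be sketched as below. The sketch assumes that events in a sequence are ordered by (start, end), so only seven relations are needed. A 2-pattern is a (label, relation, label) tuple and is counted at most once per sequence. The function names and the relation names "finished-by" and "contains" are illustrative, not taken from the slides.

```python
from collections import Counter
from itertools import combinations

def relation(a, b):
    """Allen relation of a to b, assuming a sorts before b by (start, end)."""
    (s1, e1), (s2, e2) = a, b
    if s1 == s2:
        return "equals" if e1 == e2 else "starts"
    if e1 < s2:
        return "before"
    if e1 == s2:
        return "meets"
    if e1 < e2:
        return "overlaps"
    return "finished-by" if e1 == e2 else "contains"

def two_pattern_support(sequences):
    """Count, for every 2-pattern, the number of sequences containing it."""
    support = Counter()
    for seq in sequences:
        events = sorted(seq, key=lambda ev: (ev[1], ev[2]))
        found = {(x[0], relation(x[1:], y[1:]), y[0])
                 for x, y in combinations(events, 2)}
        support.update(found)  # the set ensures one count per sequence
    return support

# The four event sequences from the input slide
sequences = [
    [("A", 0, 5), ("B", 0, 9), ("C", 9, 11)],
    [("C", 0, 7), ("A", 3, 11), ("B", 9, 11)],
    [("A", 0, 11), ("C", 1, 6), ("D", 1, 5)],
    [("A", 0, 4), ("C", 0, 3), ("E", 6, 7), ("G", 7, 11)],
]
support = two_pattern_support(sequences)
```

On this four-sequence toy input every 2-pattern occurs in exactly one sequence, so none would survive a minimum count of 2. Real inputs contain many more sequences.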
34. (1) Candidate Generation
Bottom-up approach
SECOND STEP: GENERATE (3+1)-PATTERNS FROM FREQUENT 3-PATTERNS AND 2-PATTERNS
LEVEL 3: K = 3
[Figure: frequent 3-patterns, e.g. (B overlaps A) meets C, are combined with 2-patterns (C equals D, A overlaps B, ...) to form candidate 4-patterns, e.g. ((B overlaps A) meets C) equals D]
35. (2) Support Counting
Single-pass Procedure
Determine frequency of
candidate patterns by
counting occurrences
• At each iteration: Increment the level
• Terminates when the Candidate Set is EMPTY
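The level-by-level loop and its stopping condition can be illustrated on plain item sets. This is a deliberate simplification: IEMiner mines temporal patterns with Allen relations, whereas the sketch below only tracks which event labels occur together. The function name `apriori` and the threshold `min_count` are assumptions made for illustration.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Level-wise frequent item-set mining (labels only, no temporal relations)."""
    transactions = [frozenset(t) for t in transactions]
    level = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while level:  # terminates when the candidate set is empty
        # (2) Support counting: one pass over the data for this level.
        counts = {c: sum(c <= t for t in transactions) for c in level}
        kept = {c for c, n in counts.items() if n >= min_count}
        frequent.update({c: counts[c] for c in kept})
        # (1) Candidate generation: join frequent k-sets into (k+1)-sets.
        candidates = {a | b for a in kept for b in kept if len(a | b) == k + 1}
        # Apriori pruning: every k-subset of a candidate must be frequent.
        level = {c for c in candidates
                 if all(frozenset(s) in kept for s in combinations(c, k))}
        k += 1  # increment the level
    return frequent

# Event labels of the four example sequences
label_sets = [{"A", "B", "C"}, {"C", "A", "B"}, {"A", "C", "D"}, {"A", "C", "E", "G"}]
result = apriori(label_sets, min_count=2)
```

With a minimum count of 2 the loop stops after level 3, where {A, B, C} is the only frequent set and no 4-set candidate can be built from it.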
36. Minimum Support Threshold vs. Number of Frequent Patterns
Junction dataset: minimum support 0.02 gives 92 patterns
Roundabout dataset: minimum support 0.02 gives 29 patterns
37. Pruning Redundant Patterns
• Our pruning criteria:
CASE 1
Relation_1   Relation_2
overlaps     overlaps
during       during
equals       equals
CASE 2
Relation_1   Relation_2
overlaps     starts
during
equals       finishes
38. Pruning Redundant Patterns
CASE 3: (C overlaps A) overlaps A
Number of patterns before and after pruning:
JUNCTION
k-patterns   before   after
2-patterns   55       40
3-patterns   33       26
4-patterns   4        3
ROUNDABOUT
k-patterns   before   after
2-patterns   23       17
3-patterns   5        4
4-patterns   1        1
40. Learning Association Rules
• Temporal association rules (TAR) describe time-
dependent correlations
• TARs are constructed from pairs of FTPs: The
left-hand side is a sub-pattern of the right-hand
pattern
k-pattern (X) → (k+1)-pattern (Y)
• A rule's strength is measured by its confidence, and rules are retained if the confidence value is above a threshold
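The formula on the slide did not survive extraction. The sketch below assumes the standard association-rule definition, confidence(X → Y) = support(Y) / support(X), which fits the requirement that X is a sub-pattern of Y. The pattern strings, support counts and threshold are hypothetical.

```python
def rule_confidence(support_x, support_y):
    """confidence(X -> Y) = support(Y) / support(X), X being a sub-pattern of Y."""
    if support_x <= 0:
        raise ValueError("support of the left-hand pattern must be positive")
    return support_y / support_x

def retain_rules(rules, supports, min_conf):
    """Keep the (X, Y) rules whose confidence reaches the threshold."""
    kept = []
    for x, y in rules:
        conf = rule_confidence(supports[x], supports[y])
        if conf >= min_conf:
            kept.append((x, y, conf))
    return kept

# Hypothetical support counts for three frequent temporal patterns
supports = {
    "A overlaps B": 8,
    "(A overlaps B) before C": 6,
    "(A overlaps B) meets D": 2,
}
rules = [
    ("A overlaps B", "(A overlaps B) before C"),
    ("A overlaps B", "(A overlaps B) meets D"),
]
kept = retain_rules(rules, supports, min_conf=0.5)
```

Under these assumed numbers only the first rule is retained, with confidence 0.75.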
49. (0) Trajectory Classification
• The classification problem entails classifying
trajectories from test sequences to event categories:
{A,B,C,…}
• Classification is based on the nearest-neighbor scheme
[Figure: labeled training trajectories from clusters A, B, C and D, with three unlabeled test trajectories (1, 2, 3) to be assigned to a cluster]
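A nearest-neighbor assignment can be sketched as follows. It assumes one prototype trajectory per event category. The distance used here, the mean gap between same-index points, is only a stand-in so that the example stays short; any trajectory distance, such as DTW, can be passed in instead. All names and data are illustrative.

```python
import math

def mean_point_distance(a, b):
    """Stand-in distance: mean Euclidean gap between same-index points."""
    n = min(len(a), len(b))
    return sum(math.dist(a[i], b[i]) for i in range(n)) / n

def classify(test_traj, prototypes, distance=mean_point_distance):
    """Return the label of the closest prototype trajectory (1-NN)."""
    return min(prototypes, key=lambda label: distance(test_traj, prototypes[label]))

# Hypothetical prototypes for two event categories
prototypes = {
    "A": [(0, 0), (1, 0), (2, 0)],
    "B": [(0, 5), (1, 5), (2, 5)],
}
label_1 = classify([(0, 1), (1, 1), (2, 1)], prototypes)
label_2 = classify([(0, 4), (1, 4.5), (2, 4)], prototypes)
```

The first test trajectory lies one unit from prototype A and four units from B, so it is labeled A. The second is labeled B.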
50. (1) Spatial Outliers
• In the physical scene layout, these events deviate
from the normal direction-of-flow
• The trajectory direction is computed as:
• The direction of the test trajectory is compared with the direction of the cluster prototypes using the DTW distance measure
• Abnormal trajectories exceed the threshold defined
per event cluster
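The direction formula is missing from the slide, so the sketch below makes two explicit assumptions. First, the direction of a trajectory is the sequence of heading angles between consecutive points. Second, the comparison with the prototype uses a mean angular gap rather than the DTW distance named above. The threshold value is hypothetical.

```python
import math

def directions(traj):
    """Heading angle (radians) of each step of the trajectory (assumed definition)."""
    return [math.atan2(y2 - y1, x2 - x1)
            for (x1, y1), (x2, y2) in zip(traj, traj[1:])]

def angular_gap(a, b):
    """Smallest absolute difference between two angles, in [0, pi]."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)

def is_spatial_outlier(test_traj, prototype, threshold):
    """Flag the test trajectory if its mean heading gap exceeds the threshold."""
    test_dir, proto_dir = directions(test_traj), directions(prototype)
    n = min(len(test_dir), len(proto_dir))
    mean_gap = sum(angular_gap(test_dir[i], proto_dir[i]) for i in range(n)) / n
    return mean_gap > threshold

prototype = [(0, 0), (1, 0), (2, 0), (3, 0)]   # normal flow: left to right
same_way = [(0, 1), (1, 1), (2, 1), (3, 1)]
wrong_way = [(3, 0), (2, 0), (1, 0), (0, 0)]   # against the flow
normal_flag = is_spatial_outlier(same_way, prototype, threshold=0.5)
abnormal_flag = is_spatial_outlier(wrong_way, prototype, threshold=0.5)
```

A vehicle driving against the flow has a heading gap of about pi radians and is flagged, whereas a parallel trajectory in the same direction is not.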
53. (2) Spatio-temporal Anomaly Detection
• Abnormal activities at this stage violate both
spatial and temporal constraints
• Hierarchical pattern matching
(level 1 to level k): Patterns from test sequence
are matched against the trained sets of FTPs
– Level 1: Single Frequent Events
– Level 2: 2-patterns
– Level 3: 3-patterns
– Level 4: 4-patterns
• Next…
54. (2) Spatio-temporal Anomaly Detection
• The law of transitivity has to be incorporated in the pattern-matching process to reduce false positives
• If the duration of a test pattern exceeds a threshold relative to the duration of the trained frequent patterns, this indicates the presence of a rare event
[Figure: A before B and C equals A together imply C before B]
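The inference in the figure can be sketched as a small closure over relation facts. Only three rules are implemented here, as an illustration: "equals" is symmetric, an event equal to another inherits its relations, and "before" is transitive. A full Allen composition table covers many more cases. The function name and fact format are assumptions.

```python
def close_under_transitivity(facts):
    """Extend a set of (x, relation, y) facts with the facts they imply."""
    facts = set(facts)
    while True:
        new = set()
        for a, r1, b in facts:
            if r1 == "equals":
                new.add((b, "equals", a))       # equals is symmetric
            for c, r2, d in facts:
                if b != c:
                    continue
                if r1 == "equals":              # a equals b, b r2 d => a r2 d
                    new.add((a, r2, d))
                elif r2 == "equals":            # a r1 b, b equals d => a r1 d
                    new.add((a, r1, d))
                elif r1 == r2 == "before":      # before is transitive
                    new.add((a, "before", d))
        new = {f for f in new if f[0] != f[2]}  # drop trivial self-relations
        if new <= facts:
            return facts
        facts |= new

trained = {("A", "before", "B"), ("C", "equals", "A")}
closed = close_under_transitivity(trained)
matches = ("C", "before", "B") in closed
```

A test pattern "C before B" matches the trained patterns once the closure is taken, so it is not reported as an anomaly.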
55. Anomaly Detection: Accuracy
• Based on the ground truth:
– True Positives (TP): normal test sequence is classified as
normal
– True Negatives (TN): abnormal test sequence is classified
as abnormal
– False Positives (FP): abnormal behavior classified as
normal
– False Negatives (FN): normal behavior classified as
abnormal
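With these four counts, accuracy is the fraction of test sequences classified correctly, (TP + TN) / (TP + TN + FP + FN). Note that "positive" here means normal, following the definitions above. The counts in the example are hypothetical and are not results from the experiments.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of test sequences that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no test sequences")
    return (tp + tn) / total

# Hypothetical counts, for illustration only
acc = accuracy(tp=40, tn=8, fp=1, fn=1)
```

Here 48 of the 50 sequences are classified correctly, giving an accuracy of 0.96.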
58. Contributions
• Clustering of motion trajectories using a spectral technique and the DTW measure
• Utilizing interval-based
temporal mining techniques
for event recognition in
dynamic scenes
• Hierarchical spatio-temporal
anomaly detection based on
quantitative measures
[Figure: point-based vs. interval-based representation of events A, B, C and D; the interval-based form carries a duration]
59. Future Directions
• Using a fully unsupervised robust visual
surveillance tracking system
• Performing motion segmentation and anomaly
detection in real-time
• Applying this approach to more complex
scenarios as well as other domains
60. Conclusion
• The goal is to organize the video into different
event groups and find their temporal
dependencies
• Single-agent events are modeled by trajectories
• Multi-agent interactions are represented by
temporal patterns
• Association rules are useful in predicting future
activities
• The ability to model the individual behavior of vehicles in the scene helps in localizing anomalies
61. Motion Segmentation: Spectral Clustering
• We aim to cluster trajectories into distinct events in the scene.
• Spectral clustering
– maps data points into a low-dimensional space
– can deal with non-convex clusters
62. Mean-shift Tracking
• Mean-shift theory: find the center of mass of the region of interest (ROI), move the circle to that center of mass, and repeat until convergence
1) Obtain target model and location
2) Minimize the distance between the target
and candidate model
3) Kernel is moved from previous location
to current location until convergence
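The core iteration can be sketched on a set of 2-D points. This is a simplified illustration with uniform weights inside a circular window. The tracker described above instead weights pixels by how well their color matches the target model. The function name, parameters and data are hypothetical.

```python
import math

def mean_shift(points, start, radius, tol=1e-3, max_iter=100):
    """Move a circular window to the center of mass of the points inside it."""
    cx, cy = start
    for _ in range(max_iter):
        inside = [(x, y) for x, y in points
                  if math.hypot(x - cx, y - cy) <= radius]
        if not inside:
            break  # empty window: nothing to move towards
        nx = sum(x for x, _ in inside) / len(inside)
        ny = sum(y for _, y in inside) / len(inside)
        shift = math.hypot(nx - cx, ny - cy)
        cx, cy = nx, ny
        if shift < tol:  # converged
            break
    return cx, cy

# Hypothetical samples: a cluster centered on (5, 5) plus one distant point
points = [(4, 4), (4, 6), (6, 4), (6, 6), (5, 5), (0, 0)]
center = mean_shift(points, start=(3.5, 3.5), radius=3.0)
```

Starting off-center, the window first moves to (4.75, 4.75), then to (5, 5), where it stops because the next shift is zero.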
Editor's notes
template
No need
Put what’s going on video
The aim is to organize the video into sets of events with associated temporal dependencies
All applications
Automatically segment the scene into semantic regions and learn their models
Event - The occurrence of an activity in a particular place during a time interval
Primitive events
Composite events – combination of atomic events
And (2) are not the problem: modify
OUTLINE: highlight and put everywhere
London: busy traffic intersection
London
OUTLINE
Color == feature space (using histograms)
Vehicle trajectories
More than 25 fps
Smoothen using a moving average filter
OUTLINE
-Learned from tracking
-Model – category of activities with similar semantic meaning
-CENTROID TRAJECTORIES
Interesting == recurrent
Temporal pattern mining (TPM) is better than sequence mining because it offers a rich set of relations (Allen) rather than just "follows"
OUTLINE
K >=2
Mention the TIME DURATION here
Support missing here
IEMiner details…illustrate steps with event sequences
Each event sequence is 12 seconds long
Extend frequent k-pattern sets, one item at a time
OUTLINE
OUTLINE:
The traffic light sequence is accurately modeled by the forward TARs
OUTLINE
Put this as AD
Explain in detail
5 different anomalies were detected (name these)
Having extracted trajectories from this sequence
Traffic violations
Put the comparison results table here
Infrequent temporal pattern
Horizontal and vertical together!