How to Come Up With New
    Research Ideas?
          Jia-Bin Huang
     jbhuang0604@gmail.com


            Taiwan
           May 2010




                             1 / 94
What is this talk about?
   Five approaches to coming up with new ideas in computer vision.
   Extensive case studies (i.e., more than one hundred papers).
   A common-sense talk. No complicated theories or equations.
       I wish someone had told me this before.

Reference
   The content of this talk is greatly inspired by the "Raskar Idea
   Hexagon".




                                                                   2 / 94
Outline


1   Introduction

2   Five ways to come up with new ideas
       Seek different dimensions                        neXt = X^d
       Combine two or more topics                    neXt = X + Y
       Re-think the research directions              neXt = \bar{X}
       Use powerful tools, find suitable problems       neXt = X ↑
       Add an appropriate adjective                neXt = Adj + X

3   What is a bad idea?



                                                                3 / 94
Active Topics in Computer Vision
[Szeliski Computer Vision: Algorithms and Applications 2010]

     Digital image processing                Blocks world, line labeling
       Generalized cylinders                    Pictorial structures
      Stereo correspondence                       Intrinsic images
            Optical flow                        Structure from motion
          Image pyramids                      Scale-space processing
           Shape from X                      Physically-based modeling
           Regularization                     Markov Random Fields
           Kalman filters                     3D range data processing
        Projective invariants                       Factorization
       Physics-based vision                          Graph cuts
          Particle filtering                 Energy-based segmentation
   Face recognition and detection               Subspace methods
  Image-based modeling/rendering            Texture synthesis/inpainting
    Computational photography                Feature-based recognition
     MRF inference algorithms                         Learning

                                                                           5 / 94
What can we learn from the past?
   The topics are diverse and evolve over time.




   The ways to come up with new ideas are similar. There are
   patterns to follow.




                                                               6 / 94
Seek different dimensions   neXt = X^d




   The only difference between a rut
   and a grave is their dimensions. -
            Ellen Glasgow



                                         9 / 94
Seek different dimensions                     neXt = X^d




Idea
     Can we increase/replace/transform the dimensions of the original
     problem to get new problems/solutions?

What kind of dimensions can we work on?
 1   Concrete dimensions (e.g., space, time, frequency)
 2   Abstract dimensions (e.g., properties)




                                                                   10 / 94
EX 1-1. Content-Aware Media Resizing
[Avidan et al. SIGGRAPH 07] [Rubinstein et al. SIGGRAPH 08]




Ideas
     Extend dimensions from 2D image to 3D video: image re-targeting
     ⇒ video re-targeting
     Other dimensions? E.g., 4D light field, infrared image, range
     image.
                                                                    11 / 94
EX 1-2. Video Stitching
[Rav-Acha et al. CVPR 05]




         Input video                Dynamic Panorama
Ideas
     Extend dimensions from image to video, i.e., Image Panorama ⇒
     Video Mosaics with Non-Chronological Time
     Increase the time dimension in both input and output



                                                                12 / 94
EX 1-3. Multi-Image Fusion
[Agarwala et al. SIGGRAPH 04]




Ideas
     Extend from single input image to multiple input images ⇒ Digital
     Photomontage
     Increase the dimension in input only.
                                                                    13 / 94
EX 1-4. Computational Photography (Coded Photography)
[Raskar et al. SIGGRAPH 04, 06, 08] [Levin et al. SIGGRAPH 07]




Ideas
     Coded Photography: reversibly encode information about the
     scene in a single photograph
     Coding in Time (Exposure), Coded Illumination, Coding in Space
     (aperture), and Coded Wavelength
     Replace the dimension used to encode information about the
     light field


                                                                   14 / 94
EX 1-1. Photography in Low Light Conditions




       Flash                 Blurred                  Noisy

What can we do?
   Flash → Changes the overall scene appearance (cold and gray)
   Long exposure time (hand shake) → Blurred image
   Short exposure time (insufficient light) → Noisy image




                                                              15 / 94
EX 1-1-1. Flash/non-Flash Photography
[Petschnigg et al. SIGGRAPH 2004]




        Flash                No flash      Detail transfer with denoising

Ideas
     The original problem (taking a good photo in a low-light
     environment from a single image) is difficult.
     Increasing the dimension of the input (a flash/no-flash image
     pair) makes the problem much easier.

                                                                     16 / 94
EX 1-1-2. Image Deblurring with Blurred/Noisy Image
Pairs
[Yuan et al. SIGGRAPH 2007]




     Blurred            Noisy      Enhanced noisy    Deblurred result

Ideas
     The original problem (taking a good photo in a low-light,
     flash-prohibited environment from a single image) is difficult.
     Increasing the dimension of the input (a blurred/noisy image
     pair) makes the problem much easier.


                                                                       17 / 94
EX 1-1-3. Robust Flash Deblurring
[Zhou et al. CVPR 2010]




Ideas
     The original problem (taking a good photo in a low-light
     environment from a single image) is difficult.
     Increasing the dimension of the input (a blurred/flash image
     pair) makes the problem much easier.



                                                                   18 / 94
EX 1-1-4. Dark Flash Photography
[Krishnan et al. SIGGRAPH 2009]




Ideas
     The original problem (taking a good photo in a low-light
     environment from a single image) is difficult.
     Increasing the dimension of the input (a dark-flash/noisy image
     pair) makes the problem much easier.
                                                                     19 / 94
EX 1-2. Brute-Force Vision
[Hays and Efros SIGGRAPH 07] [Dale et al. ICCV 09] [Agarwal et al. ICCV 09]
[Furukawa et al. ICCV 09]




Ideas
     Utilize a large collection of photos.
                                                                              20 / 94
EX 2-1. X Alignment/Registration (pixel, object, scene)
[Liu et al. CVPR 08, ECCV 08] [Berg et al. CVPR 05]




                                                      21 / 94
EX 2-2. Shape from X (shading, texture, specular)
[Lobay and Forsyth IJCV 06] [Fleming et al. JOV 04] [Adato et al. ICCV 07]




                shading                                 specular




                texture                               specular flow
                                                                           22 / 94
EX 2-3. Depth from X (stereo, (de-)focus, coded
aperture, diffusion, occlusion, semantic label)
[Levin et al. SIGGRAPH 07] [Hoiem et al. ICCV 07] [Liu et al. CVPR 10] [Zhou et al.
CVPR 10]




         Coded Aperture                           Semantic Labels




           Occlusion                                 Diffusion

                                                                                  23 / 94
EX 2-4. Infer X from a single image (geometric,
geography, illumination)
[Hoiem et al. ICCV 05] [Hays and Efros CVPR 08] [Lalonde et al. ICCV 09]




                                                                     Geometric




                                                                    Geography




                                                                    Illumination
                                                                              24 / 94
Combine two or more topics   neXt = X + Y




   To steal ideas from one person is
   plagiarism. To steal from many is
       research. - Wilson Mizner



                                            26 / 94
Combine two or more topics                 neXt = X + Y



Idea
     Can we combine two or more topics to get new problems or
     solutions?

What kind of topics can we combine?
 1   X, Y are methods
 2   X, Y are problems
 3   X, Y are areas




                                                                27 / 94
EX 1-1. Viola-Jones Object Detection Framework
[Viola and Jones CVPR 2001]




 Simple feature      Integral img   Boosting      Cascade structure

Ideas
     Paper title: Rapid Object Detection using a Boosted Cascade of
     Simple Features
     Viola-Jones object detection framework = Integral Images (simple
     features) (1984) + AdaBoost (1997) + Cascade Architecture (long
     time ago)
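
The integral-image ingredient can be sketched in a few lines. This is a minimal illustration, assuming NumPy and a toy 4x4 image; the function names (`integral_image`, `box_sum`, `two_rect_feature`) are hypothetical and not taken from the paper. It shows why the simple rectangle features are cheap: once the summed-area table is built, any box sum costs four lookups.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row and column prepended."""
    img = np.asarray(img, dtype=np.float64)
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Horizontal two-rectangle feature: left half minus right half."""
    half = width // 2
    return (box_sum(ii, top, left, height, half)
            - box_sum(ii, top, left + half, height, half))

# Toy image (an assumption for illustration only).
img = np.arange(16, dtype=np.float64).reshape(4, 4)
ii = integral_image(img)
inner = box_sum(ii, 1, 1, 2, 2)            # sum of the central 2x2 block
feature = two_rect_feature(ii, 0, 0, 4, 4)  # left two columns minus right two
```

Boosting then selects a small set of such features, and the cascade rejects most windows after evaluating only a few of them.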


                                                                      28 / 94
EX 1-2. SIFT Flow = SIFT + Optical Flow
[Liu et al. ECCV 08, CVPR 09]




    Motion hallucination
                                   Label transfer
Ideas
     Dense sampling in time : optical flow :: dense sampling in world
     images : SIFT flow
                                                                       29 / 94
EX 1-3. Visual Tracking with Online Multiple Instance
Boosting
[Babenko et al. CVPR 09]




Ideas
     MILTrack = Multiple Instance Boosting (2005) + Online Boosting
     Tracking (2006)
                                                                      30 / 94
EX 2-1. High Dynamic Range Image Reconstruction
from Hand-held Cameras
[Lu et al. CVPR 2009]




Ideas
     HDR from Hand-held Cameras = High Dynamic Range
     Image Reconstruction + Image Deblurring
                                                            31 / 94
EX 2-2. Human Body Understanding
[Guan et al. ICCV 09]




Ideas
     Human Body Understanding = Shape Reconstruction + Pose
     Estimation

                                                              32 / 94
EX 2-3. Image Understanding
detection, tracking, recognition, segmentation, reconstruction, scene classification,
event recognition




                                                                                       33 / 94
EX 2-3-1. Detection + Tracking
[Andriluka et al. CVPR 08]




Ideas
      People detection and people tracking are highly correlated
      problems.
      Combining the two problems can potentially improve
      performance on each individual task.



                                                                   34 / 94
EX 2-3-2. Object Attribute + Recognition
[Farhadi et al. CVPR 09] [Lampert et al. CVPR 09]




Ideas
     Describe images by attributes
     Enable knowledge transfer to recognize classes with no visual
     examples
                                                                     35 / 94
EX 2-3-2. Object Recognition + Detection
[Yeh et al. CVPR 09]




Ideas
     Concurrent object localization and recognition
                                                      36 / 94
EX 2-3-3. Image Segmentation + Object Recognition
+ Event Recognition
[Li et al. CVPR 09]




Ideas
      Combine scene classification, image segmentation, image
      annotation
      All three tasks are mutually beneficial
                                                               37 / 94
EX 3-1. SixthSense - A Wearable Gestural Interface
[Mistry and Maes TED 2009]




Ideas
     SixthSense = Computer Vision (e.g., tracking, recognition) +
     Internet
                                                                    38 / 94
EX 3-2. Sikuli: Picture-driven computing
[Yeh et al. UIST 09] [Chang et al. CHI 10]




Ideas
      1. Readability/usability, 2. GUI serialization, 3. Computer vision
      on computer-generated figures
                                                                           39 / 94
Re-think the research directions   neXt = \bar{X}




If at first, the idea is not absurd, then
         there is no hope for it -
              Albert Einstein



                                              41 / 94
Re-think the research directions                  neXt = \bar{X}



Ideas
     Do the current research directions really make sense? What is the
     key problem?

What could we do?
 1   Re-formulate the original problem.
 2   Analyze and compare existing approaches. Provide insight into the
     problems.




                                                                    42 / 94
EX 1-1. Beyond Sliding Windows
[Lampert et al. CVPR 08]




             Rectangle set              Branch and bound search

Ideas
     Sliding window search ⇔ branch-and-bound search
     Represent a set of rectangles with 4 intervals
     Use branch-and-bound to find the optimal rectangle (object
     localization) efficiently
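
The branch-and-bound idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: the quality of a rectangle is assumed to be the sum of per-pixel scores inside it, and the names `best_rectangle`, `box_sum`, and `integral_image` are hypothetical. A set of rectangles is stored as four intervals (top, bottom, left, right), and its upper bound is the positive scores inside the largest rectangle of the set plus the negative scores inside the smallest one.

```python
import heapq
import numpy as np

def integral_image(a):
    ii = np.zeros((a.shape[0] + 1, a.shape[1] + 1))
    ii[1:, 1:] = a.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, bottom, left, right):
    """Sum over rows top..bottom and columns left..right (inclusive); 0 if empty."""
    if top > bottom or left > right:
        return 0.0
    return float(ii[bottom + 1, right + 1] - ii[top, right + 1]
                 - ii[bottom + 1, left] + ii[top, left])

def best_rectangle(score):
    """Best-first branch-and-bound over rectangle sets given as 4 intervals."""
    h, w = score.shape
    pos = integral_image(np.maximum(score, 0.0))
    neg = integral_image(np.minimum(score, 0.0))

    def upper_bound(state):
        (t0, t1), (b0, b1), (l0, l1), (r0, r1) = state
        # Positives of the largest rectangle plus negatives of the smallest one.
        return box_sum(pos, t0, b1, l0, r1) + box_sum(neg, t1, b0, l1, r0)

    start = ((0, h - 1), (0, h - 1), (0, w - 1), (0, w - 1))
    heap = [(-upper_bound(start), start)]
    while heap:
        neg_bound, state = heapq.heappop(heap)
        widths = [hi - lo for lo, hi in state]
        i = widths.index(max(widths))
        if widths[i] == 0:
            # All four intervals are single values: the bound is exact here.
            top, bottom, left, right = (lo for lo, _ in state)
            return -neg_bound, (top, bottom, left, right)
        lo, hi = state[i]
        mid = (lo + hi) // 2
        for part in ((lo, mid), (mid + 1, hi)):  # split the widest interval
            child = state[:i] + (part,) + state[i + 1:]
            heapq.heappush(heap, (-upper_bound(child), child))

# Toy score map (an assumption): one positive block on a negative background.
score = -np.ones((6, 7))
score[2:4, 3:6] = 2.0
best_score, rect = best_rectangle(score)  # rect = (top, bottom, left, right)
```

Because the bound never underestimates any rectangle in the set and is exact for a single rectangle, the first single rectangle popped from the queue is a global optimum, without scoring every window.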

                                                                  43 / 94
EX 1-2. Beyond Categories
[Malisiewicz and Efros CVPR 08, NIPS 09]




Ideas
     Explicit categorization ⇔ Implicit categorization
     Ask "what is this like?" (association), instead of "what is it?"
     (categorization)
                                                                        44 / 94
EX 1-3. Motion-Invariant Photography
[Levin et al. SIGGRAPH 08] [Cho et al. ICCP 10]




Ideas
     Still camera ⇔ Moving camera (parabolic exposures)
     Enable the use of spatially invariant blur kernel estimation


                                                                  45 / 94
EX 1-4. Super-resolution from Single Image
[Glasner et al. ICCV 09]




Ideas
      Classical multi-image SR / Example-based SR ⇔ Single SR
      framework
                                                             46 / 94
EX 2-1. In Defense of ...
[Boiman et al. CVPR 08] [Hartley PAMI 97]



Nearest-Neighbor Based Image Classification
     Two common practices degrade nearest-neighbor classifiers:
     quantization of local image descriptors (used to generate
     "bags-of-words", codebooks), and
     computation of "Image-to-Image" distance instead of
     "Image-to-Class" distance.
     Without them, the performance ranks among the top leading
     learning-based image classifiers.
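
A minimal sketch of an "Image-to-Class" distance, assuming NumPy and synthetic 8-dimensional descriptors; the class names, the data, and the function names are illustrative assumptions. Each query descriptor is matched to its nearest unquantized descriptor pooled over all training images of a class, and the squared distances are summed.

```python
import numpy as np

def image_to_class_distance(query, class_descriptors):
    """Sum, over query descriptors, of the squared distance to the
    nearest descriptor pooled from all training images of one class."""
    diffs = query[:, None, :] - class_descriptors[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=-1)
    return sq_dists.min(axis=1).sum()

def nn_classify(query, descriptors_by_class):
    """Pick the class with the smallest image-to-class distance."""
    costs = {label: image_to_class_distance(query, d)
             for label, d in descriptors_by_class.items()}
    return min(costs, key=costs.get)

# Synthetic descriptors (an assumption; real ones would be e.g. SIFT).
rng = np.random.default_rng(0)
descriptors_by_class = {
    "cat": rng.normal(0.0, 1.0, size=(200, 8)),
    "dog": rng.normal(3.0, 1.0, size=(200, 8)),
}
query = rng.normal(3.0, 1.0, size=(30, 8))
predicted = nn_classify(query, descriptors_by_class)
```

No codebook is built and no image-to-image comparison is made; the query is compared against each class as a whole.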

The 8-point Algorithm for the fundamental matrix
     Normalization, Normalization, Normalization!
     Performs almost as well as the best iterative algorithm


                                                                  47 / 94
EX 2-2. Understanding blind deconvolution
[Levin et al. CVPR 2009]




Ideas
     Blind deconvolution: recover the sharp image x from the blurred one
     (y = k ⊗ x + n).
     MAP_{x,k} estimation often favors no-blur explanations.
     MAP_k can be estimated accurately since the kernel size is often
     smaller than the image size.
     Blind deconvolution should be addressed in this way: MAP_k +
     non-blind deconvolution.
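
For readers unfamiliar with the notation, the two estimators can be written out as below. This is a hedged restatement using the symbols x, k, y from the bullets; the posterior notation p(· | y) is an assumption about how the estimators are usually defined.

```latex
\begin{aligned}
\mathrm{MAP}_{x,k}:&\quad (\hat{x}, \hat{k}) = \arg\max_{x,\,k}\; p(x, k \mid y) \\
\mathrm{MAP}_{k}:&\quad \hat{k} = \arg\max_{k}\; p(k \mid y),
\qquad p(k \mid y) = \int p(x, k \mid y)\, dx
\end{aligned}
```

MAP_k marginalizes over the many unknown pixels of x and estimates only the few unknowns of k; the sharp image is then recovered with a non-blind deconvolution given the estimated kernel.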




                                                                       48 / 94
EX 2-3. Understanding camera trade-offs
[Levin et al. ECCV 08]




Ideas
      Traditional optics evaluation: 2D image sharpness (e.g., Modulation
      Transfer Function)
      Modern camera evaluation: How well does the recorded data
      allow us to estimate the visual world, i.e., the light field?
                                                                    49 / 94
EX 2-4. What is a good image segment?
[Bagon et al. ECCV 08]




Ideas
     A good image segment is one that can be easily composed using
     its own pieces, but is difficult to compose using pieces from other
     parts of the image
                                                                    50 / 94
EX 2-5. Lambertian Reflectance and Linear
Subspaces
[Basri and Jacobs PAMI 03]




Ideas
     The set of all Lambertian reflectance functions (the mapping from
     surface normals to intensities) obtained with arbitrary distant light
     sources lies close to a 9D linear subspace.
     Explains prior empirical results obtained with linear subspace methods.

                                                                        51 / 94
Use powerful tools, find suitable problems neXt = X ↑




If the only tool you have is a hammer,
  you tend to see every problem as a
        nail. - Abraham Maslow



                                                  53 / 94
Use powerful tools, find suitable problems neXt = X ↑


What kinds of tools should we understand?
   Calculus of Variations
   Dimensionality Reduction
   Spectral Methods (specifically, spectral clustering)
   Probabilistic Graphical Models
   Structured Prediction
   Bilateral Filtering
   Sparse Representation
   and more ... spectral method/theory, information theory, (convex)
   optimization, etc.



                                                                   54 / 94
EX 1. Calculus of Variations (1/2)
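From Calculus to Calculus of Variations
      Calculus  |  Calculus of Variations
      Functions  |  Functionals (functions of functions)
      $f: \mathbb{R}^n \to \mathbb{R}$  |  $f: \mathcal{F} \to \mathbb{R}$, $f(u) = \int_{x_1}^{x_2} L(x, u(x), u'(x))\,dx$
      Derivative $\frac{df(x)}{dx}$  |  Variation $\frac{df(u)}{du}$
      $\lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$  |  $\lim_{\epsilon \to 0} \frac{f(u + \epsilon\,\delta u) - f(u)}{\epsilon} = \left.\frac{\partial}{\partial \epsilon} f(u + \epsilon\,\delta u)\right|_{\epsilon = 0}$
      Local extremum: $\frac{df(x)}{dx} = 0$  |  Local extremum: Euler-Lagrange equation, $\frac{\partial L}{\partial u} - \frac{d}{dx}\frac{\partial L}{\partial u'} = 0$

Total Variation (TV)
      $\mathrm{TV}(y) = \int_{x_0}^{x_1} |y'|\,dx$: the "oscillation strength" of y(x)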
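
A discrete version of the total variation makes the "oscillation strength" reading concrete. This is a minimal sketch assuming NumPy and a uniformly sampled 1D signal; the test signals are illustrative assumptions.

```python
import numpy as np

def total_variation(y):
    """Discrete TV: sum of absolute differences between neighbouring samples."""
    return np.abs(np.diff(y)).sum()

x = np.linspace(0.0, 1.0, 101)
ramp = x                                     # monotone rise from 0 to 1
wiggly = x + 0.05 * np.sin(40 * np.pi * x)   # same trend plus oscillation

tv_ramp = total_variation(ramp)
tv_wiggly = total_variation(wiggly)
```

Both signals start at 0 and end at 1, but the oscillating one has a larger total variation, which is why penalizing TV suppresses noise while still allowing a single large jump such as an edge.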




                                                                           55 / 94
EX 1. Calculus of Variations (2/2)
Total Variation Denoising/Inpainting




Applications in computer vision
    Optical flow [Horn and Schunck AI 81]
    Shape from shading [Horn and Brooks CVGIP 86]
    Edge detection [PAMI 87]
    Anisotropic diffusion [Perona and Malik PAMI 90]
    Active contours model [Kass et al. IJCV 88]
    Image segmentation [Morel and Solimini 95]
    Image restoration [Aubert and Vese SIAM Journal on NA 97]
                                                                56 / 94
EX 2. Dimensionality Reduction (1/2)

Why do we need dimensionality reduction?
Since high-dimensional data is everywhere (e.g., images, human gene
distributions, weather prediction), we need dimensionality reduction for
 1   processing data efficiently
 2   estimating the distributions of data accurately (curse of
     dimensionality)
 3   finding meaningful representations of data

Classification of dimensionality reduction methods
               Global structure preserved    Local structure preserved
  Linear               PCA, LDA                      LPP, NPE
 Nonlinear     ISOMAP, Kernel PCA, DM              LLE, LE, HE
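
As an illustration of the linear, global-structure-preserving cell of the table, PCA can be sketched via the SVD of the centered data. This assumes NumPy and synthetic 3D points lying near a line; the function name `pca` and the data are illustrative assumptions.

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD of the centered data matrix (rows are samples)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]            # principal directions (rows)
    explained = (S ** 2) / (S ** 2).sum()     # fraction of variance per direction
    return mean, components, explained[:n_components]

# Synthetic data (an assumption): 3D points close to a 1D line.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
direction = np.array([[3.0, 4.0, 0.0]]) / 5.0
X = t @ direction + 0.01 * rng.normal(size=(200, 3))

mean, components, explained = pca(X, 1)
Z = (X - mean) @ components.T    # 1D representation
X_rec = Z @ components + mean    # reconstruction back in 3D
```

One coordinate per point is enough here because almost all the variance lies along a single direction; the subspace-as-constraint applications rely on the same observation.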


                                                                     57 / 94
EX 2. Dimensionality Reduction (2/2)
Applications in computer vision
    Subspace as constraints
        Structure from motion [Tomasi and Kanade IJCV 92], Optical flow
        [Irani IJCV 02], Layer extraction [Ke and Kanade CVPR 01], Face
        alignment [Saragih et al. ICCV 09]
    Face recognition (e.g., PCA, LDA, LPP)
        PCA [Turk and Pentland CVPR 91], LDA [Belhumeur et al. PAMI 97],
        LPP [He et al. PAMI 05], Random [Wright et al. PAMI 09]
    Motion segmentation
        subspace separation [Kanatani ICCV 01] [Yan and Pollefeys ECCV
        06] [Rao et al. CVPR 08] [Lauer and Schnorr ICCV 09]
    Lighting
        linear subspace [Belhumeur and Kriegman IJCV 98] [Georghiades
        et al. PAMI 01] [Lee et al. PAMI 05] [Basri and Jacobs PAMI 03]
    Visual tracking
        incremental subspace learning [Ross et al. IJCV 08] [Li et al. CVPR
        08]
                                                                        58 / 94
EX 3. Spectral Clustering (1/3)
Why is spectral clustering popular?
      Can be solved efficiently by standard linear algebra software
      Very often outperforms traditional clustering algorithms

Spectral clustering algorithm
Input: a set of data points
  1   Construct a similarity graph, e.g., ε-neighborhood, k-nearest neighbor,
      fully connected
  2   Construct the graph Laplacian, e.g., unnormalized (L) or normalized (L_rw, L_sym)
  3   Compute the first k eigenvectors of L (those with the smallest eigenvalues),
      v_1, ..., v_k
  4   Let V ∈ R^{n×k} be the matrix containing v_1, ..., v_k as columns
  5   Cluster the row vectors y_i of V with the k-means algorithm into clusters
      C_1, ..., C_k
Output: Clusters A_1, ..., A_k with A_i = {j | y_j ∈ C_i}
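
The five steps can be sketched with NumPy alone. This is a minimal illustration under stated assumptions: a fully connected graph with Gaussian similarity (width `sigma`), the unnormalized Laplacian L = D - W, and a small k-means with deterministic farthest-point initialization. The function names and the toy data are hypothetical.

```python
import numpy as np

def kmeans(Y, k, iters=50):
    """Plain k-means with farthest-point initialization (deterministic)."""
    centers = [Y[0]]
    for _ in range(1, k):
        d = np.min([((Y - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        d = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d.argmin(axis=1)
        new = np.array([Y[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def spectral_clustering(X, k, sigma=1.0):
    # Step 1: fully connected similarity graph with Gaussian weights.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Step 2: unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # Step 3: eigenvectors with the k smallest eigenvalues (eigh sorts ascending).
    _, vecs = np.linalg.eigh(L)
    # Step 4: V has those eigenvectors as columns; row i is y_i.
    V = vecs[:, :k]
    # Step 5: k-means on the rows of V.
    return kmeans(V, k)

# Toy data (an assumption): two well-separated blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
               rng.normal(10.0, 0.5, size=(20, 2))])
labels = spectral_clustering(X, k=2)
```

The normalized Laplacians or a k-nearest-neighbor graph would change only steps 1 and 2; the eigenvector and k-means steps stay the same.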
                                                                         59 / 94
EX 3. Spectral Clustering (1/3)
Why spectral clustering is popular?
      Can be solved efficiently by standard linear algebra software
      Very often outperform traditional clustering algorithms

Spectral clustering algorithm
Input: a set of data points
  1   Construct a similarity graph, e.g., -neighbor, k-nearest neighbor,
      fully connected
  2   Construct graph Laplacian, e.g., (un)normalized (L, Lrw , Lsym )
  3   Compute the first k (with smallest eigenvalues) eigenvectors of L,
      v1 , · · · , vk
  4   Let V ∈ Rn×k be a matrix contains v1 , ·, vk as columns
  5   Cluster the row vectors yi with the k-means algorithm into cluster
      C1 , · · · , Ck
Output: Clusters A1 , · · · , Ak with Ai = {j|yj ∈ Ci }
                                                                         59 / 94
EX 3. Spectral Clustering (1/3)
Why spectral clustering is popular?
      Can be solved efficiently by standard linear algebra software
      Very often outperform traditional clustering algorithms

Spectral clustering algorithm
Input: a set of data points
  1   Construct a similarity graph, e.g., -neighbor, k-nearest neighbor,
      fully connected
  2   Construct graph Laplacian, e.g., (un)normalized (L, Lrw , Lsym )
  3   Compute the first k (with smallest eigenvalues) eigenvectors of L,
      v1 , · · · , vk
  4   Let V ∈ Rn×k be a matrix contains v1 , ·, vk as columns
  5   Cluster the row vectors yi with the k-means algorithm into cluster
      C1 , · · · , Ck
Output: Clusters A1 , · · · , Ak with Ai = {j|yj ∈ Ci }
                                                                         59 / 94
EX 3. Spectral Clustering (2/3)


Why does it work?
   Graph Cut Point of View: Construct a partition that minimizes the
   weight across the cut (the well-known mincut problem) while
   balancing the clusters (e.g., RatioCut, Normalized Cut).
   Random Walks Point of View: When minimizing Ncut, we
   actually look for a cut through the graph such that a random walk
   seldom transitions from one cluster to another.
   Perturbation Theory Point of View: The distance between the
   eigenvectors of the ideal and the nearly ideal graph Laplacian is
   bounded by a constant times a norm of the error matrix. If the
   perturbations are small enough, then the k-means algorithm
   will still separate the groups from each other.
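
To make the graph-cut view concrete, the sketch below compares plain cut weight with the balanced objectives RatioCut and Ncut on a toy graph. The graph, its weights, and the helper names are all invented for illustration. The term `cut / vol(A)` is also the probability that a stationary random walk currently in A steps into B, which is the link to the random-walk view.

```python
def build_weights(n, edges):
    """Symmetric weight matrix from a dict {(i, j): weight}."""
    W = [[0.0] * n for _ in range(n)]
    for (i, j), w in edges.items():
        W[i][j] = W[j][i] = w
    return W

def cut(W, A, B):
    """Total weight of the edges running between A and B."""
    return sum(W[i][j] for i in A for j in B)

def vol(W, A):
    """Sum of the degrees of the vertices in A."""
    return sum(sum(W[i]) for i in A)

def ratio_cut(W, A, B):
    return cut(W, A, B) / len(A) + cut(W, A, B) / len(B)

def ncut(W, A, B):
    return cut(W, A, B) / vol(W, A) + cut(W, A, B) / vol(W, B)

# Two triangles {0,1,2} and {3,4,5} joined by a weak edge, plus a vertex 6
# that hangs off vertex 5 by an even weaker edge.
edges = {(0, 1): 1, (0, 2): 1, (1, 2): 1, (2, 3): 0.1,
         (3, 4): 1, (3, 5): 1, (4, 5): 1, (5, 6): 0.08}
W = build_weights(7, edges)

balanced = ([0, 1, 2], [3, 4, 5, 6])
isolated = ([6], [0, 1, 2, 3, 4, 5])
```

On this graph the plain mincut criterion prefers splitting off the single vertex 6, while both RatioCut and Ncut prefer the balanced split between the two triangles.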



                                                                  60 / 94
EX 3. Spectral Clustering (3/3)
[Shi and Malik PAMI 02]

                  Eigenvectors carry contour information.




                                                            61 / 94
EX 4. Probabilistic Graphical Model (1/2)




What are probabilistic graphical models?
    A marriage between probability theory and graph theory.
    A natural tool for dealing with uncertainty and complexity.
    They provide a way to view all probabilistic systems (e.g., mixture
    models, factor analysis, hidden Markov models, Kalman filters, and
    Ising models) as instances of a common underlying formalism.
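
As a small illustration of how the graph structure turns into a factorized distribution, here is a hypothetical three-node chain Cloudy -> Rain -> WetGrass. The variables and all probability values are made up; inference is done by brute-force enumeration, which only works for tiny models.

```python
from itertools import product

# Conditional probability tables of the (invented) chain, all variables binary.
p_cloudy = {True: 0.5, False: 0.5}
p_rain_given_cloudy = {True: {True: 0.8, False: 0.2},
                       False: {True: 0.1, False: 0.9}}
p_wet_given_rain = {True: {True: 0.9, False: 0.1},
                    False: {True: 0.2, False: 0.8}}

def joint(c, r, w):
    """The chain structure gives the factorization p(c) p(r|c) p(w|r)."""
    return p_cloudy[c] * p_rain_given_cloudy[c][r] * p_wet_given_rain[r][w]

def posterior_rain_given_wet():
    """p(Rain=True | WetGrass=True), summing the joint over hidden variables."""
    num = sum(joint(c, True, True) for c in (True, False))
    den = sum(joint(c, r, True) for c, r in product((True, False), repeat=2))
    return num / den

total = sum(joint(c, r, w) for c, r, w in product((True, False), repeat=3))
```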




                                                                 62 / 94
EX 4. Probabilistic Graphical Model (2/2)




                                            63 / 94
EX 5. Structured Prediction (1/2)


What is structured prediction?
    Structured prediction is a framework for solving problems of
    classification or regression in which the output variables are
    mutually dependent or constrained.
    Lots of examples
        Natural language parsing
        Machine translation
        Object segmentation
        Gene prediction
        Protein alignment
        Numerous tasks in computational linguistics, speech, vision,
        biology.
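
A minimal sketch of what "mutually dependent outputs" means in practice: labelling a sequence where neighbouring labels interact. The scores below are invented, and Viterbi decoding on a chain is just one simple instance of structured prediction, not the method of any paper cited in these slides.

```python
def viterbi(unary, pairwise):
    """unary[t][y]: score of label y at position t.
    pairwise[a][b]: score of label a being followed by label b.
    Returns the label sequence with the highest total score."""
    n_labels = len(unary[0])
    score = list(unary[0])
    back = []
    for t in range(1, len(unary)):
        prev = [max(range(n_labels), key=lambda a: score[a] + pairwise[a][b])
                for b in range(n_labels)]
        score = [score[prev[b]] + pairwise[prev[b]][b] + unary[t][b]
                 for b in range(n_labels)]
        back.append(prev)
    y = max(range(n_labels), key=lambda b: score[b])
    path = [y]
    for prev in reversed(back):   # follow the back-pointers to the start
        y = prev[y]
        path.append(y)
    return path[::-1]

unary = [[2.0, 0.0], [0.0, 0.5], [2.0, 0.0]]   # weak evidence for label 1 in the middle
pairwise = [[1.0, -1.0], [-1.0, 1.0]]          # neighbouring labels prefer to agree

independent = [max(range(2), key=lambda y: u[y]) for u in unary]
structured = viterbi(unary, pairwise)
```

Predicting each position on its own flips the middle label, whereas the joint prediction keeps the sequence consistent because the pairwise term outweighs the weak local evidence.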




                                                                       64 / 94
EX 5. Structured Prediction (2/2)
Applications [Lampert et al. ECCV 08] [Desai et al. ICCV 09]




                                                               65 / 94
EX 6. Bilateral Filtering (1/3)
What’s Bilateral Filtering?
    A technique to smooth images while preserving edges
    Ubiquitous in image processing, computational photography
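
A minimal 1-D sketch of the idea, assuming Gaussian spatial and range kernels. The parameter names `sigma_s`, `sigma_r`, and `radius` are illustrative choices, not taken from the slides.

```python
import math

def bilateral_filter_1d(signal, sigma_s=2.0, sigma_r=0.1, radius=5):
    """Each output sample is a weighted mean of its neighbours. A neighbour's
    weight drops with spatial distance (sigma_s) and with intensity
    difference (sigma_r), so samples across an edge contribute almost nothing."""
    out = []
    for i, center in enumerate(signal):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w_space = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2))
            w_range = math.exp(-((signal[j] - center) ** 2) / (2 * sigma_r ** 2))
            num += w_space * w_range * signal[j]
            den += w_space * w_range
        out.append(num / den)
    return out

# A noisy step edge: the noise is smoothed, the step is preserved.
noisy_step = [0.0, 0.02, -0.02, 0.01, 0.0, 1.0, 0.98, 1.02, 1.0, 0.99]
smoothed = bilateral_filter_1d(noisy_step)
```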




                                                                66 / 94
EX 6. Bilateral Filtering (2/3)
[Bennett and McMillan SIGGRAPH 05] [Eisemann and Durand SIGGRAPH 04] [Jones
et al. SIGGRAPH 03] [Winnemöller et al. SIGGRAPH 06] [Bae et al. SIGGRAPH 02]




                                                                            67 / 94
EX 6. Bilateral Filtering (3/3)
How does the bilateral filter relate to other methods?




Interpretation
    The bilateral filter is equivalent to mode filtering in local histograms
    The bilateral filter can be interpreted in terms of robust statistics, since
    it is related to a cost function
    The bilateral filter is a discretization of a particular kind of
    PDE-based anisotropic diffusion
                                                                         68 / 94
EX 7. Sparse Representation (1/4)

Ideas
   Natural signals (e.g., audio, images) usually admit a sparse
   representation (i.e., they can be well represented by a linear
   combination of a few atom signals)
   Successfully applied in various areas of signal/image processing,
   vision, and graphics.
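
To show what "a linear combination of a few atoms" looks like computationally, here is a sketch of matching pursuit, one simple greedy sparse-coding method. It is not the algorithm of the papers cited on the following slides. The tiny dictionary and the signal are invented, and the atoms are assumed to have unit norm.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse coding: repeatedly pick the atom most correlated with
    the residual and subtract its contribution (atoms must be unit-norm)."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        k = max(range(len(atoms)), key=lambda i: abs(dot(residual, atoms[i])))
        c = dot(residual, atoms[k])
        coeffs[k] += c
        residual = [r - c * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual

s = 1 / math.sqrt(2)
atoms = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (s, s, 0)]      # overcomplete: 4 atoms in R^3
signal = [2 * a + b for a, b in zip(atoms[3], atoms[2])]  # 2 * atom 3 + 1 * atom 2
coeffs, residual = matching_pursuit(signal, atoms, n_nonzero=2)
```

Two atoms suffice here, although describing the same signal in the standard basis alone would need all three coordinates.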




                                                                  69 / 94
EX 7. Sparse Representation (2/4)
Image Restoration [Aharon et al. TSP 06] [Mairal et al. TIP 08]




        Figure panels: denoising, inpainting, demosaicing, inpainting

                                                                               70 / 94
EX 7. Sparse Representation (3/4)
Classification [Wright et al. PAMI 09] [Mairal et al. CVPR ECCV NIPS 08]




        Figure panels: face recognition, edge detection, texture classification,
        pixel classification



                                                                          71 / 94
EX 7. Sparse Representation (4/4)
Compressive sensing [Donoho TIT 06] [Candes and Tao TIT 05 06]




        and more (e.g., low-rank matrix completion, robust PCA)
                                                                  72 / 94
Outline


1   Introduction

2   Five ways to come up with new ideas
       Seek different dimensions                     neXt = $X^d$
       Combine two or more topics                    neXt = $X + Y$
       Re-think the research directions              neXt = $\bar{X}$
       Use powerful tools, find suitable problems    neXt = $X\uparrow$
       Add an appropriate adjective                  neXt = Adj + $X$

3   What is a bad idea?




                                                               73 / 94
Add an appropriate adjective   neXt = Adj + X




  There is only one religion, though
 there are a hundred versions of it. -
       George Bernard Shaw



                                                74 / 94
Add an appropriate adjective                    neXt = Adj + X

What kinds of adjectives can we use?
   linear ⇔ non-linear
   generative / reconstructive ⇔ discriminative
   rule-based / hand-designed ⇔ learning-based
   single scale ⇔ multi-scale
   single step ⇔ progressive
   batch processing ⇔ incremental / online processing
   fixed ⇔ adaptive / dynamic to data
   parametric ⇔ non-parametric
   Z - invariant (Z = translation / scale / rotation / noise / facial
   expression / pose / lighting / occlusion)
   Z - aware (Z = motion / content / semantic / context / occlusion)


                                                                       75 / 94
EX 1. Linear ⇔ Non-linear




    Hard to find a straight line that separates them into two clusters?

Ideas
   Linear methods may not capture the nonlinear structure in the
   original data representation
   Nonlinear methods
        Kernel tricks (e.g., Kernel PCA, Kernel LDA, Kernel SVM, etc.)
        Manifold learning (e.g., ISOMAP, LLE, Laplacian eigenmap, etc.)
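
A small sketch of the kernel idea, using invented data: two concentric rings cannot be split by a straight line in the input plane, but become linearly separable after the feature map of the degree-2 polynomial kernel. The function names and the separating threshold are illustrative assumptions.

```python
import math

def phi(x):
    """Explicit feature map of the degree-2 polynomial kernel in 2-D."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def poly_kernel(x, y):
    """k(x, y) = (x . y)^2, computed without ever forming phi."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Two concentric rings (radius 1 and radius 3).
inner = [(math.cos(t), math.sin(t)) for t in (0.0, 1.5, 3.0, 4.5)]
outer = [(3 * math.cos(t), 3 * math.sin(t)) for t in (0.5, 2.0, 3.5, 5.0)]

def side(x):
    """Linear decision function in feature space: z1 + z3 - 4,
    i.e. the plane where the squared radius equals 4."""
    z = phi(x)
    return z[0] + z[2] - 4.0
```

The kernel trick is that a method written purely in terms of inner products can use `poly_kernel` and never compute `phi` explicitly.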

                                                                         76 / 94
EX 2. Generative ⇔ Discriminative


Classification task: X → Y
    Generative classifiers estimate the class-conditional pdfs $P(X \mid Y)$ and
    the prior probabilities $P(Y)$
        Naive Bayes, Mixtures of Gaussians, Mixtures of experts, Hidden
        Markov Models (HMM), Sigmoidal belief networks, Bayesian
        networks, Markov random fields (MRF)
    Discriminative classifiers estimate the posterior probabilities $P(Y \mid X)$
        Logistic regression, SVMs, Traditional neural networks, Nearest
        neighbor, Conditional Random Fields (CRF)
    Bayes' rule
        $$P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{P(X)}$$
    Two different perspectives on the same problem
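
The generative route can be sketched as follows: estimate the priors and a Gaussian class-conditional density per class from labelled 1-D samples, then apply Bayes' rule to obtain the posterior. The data and the Gaussian assumption are invented for illustration. A discriminative model such as logistic regression would instead fit the posterior directly.

```python
import math

def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def fit_generative(xs, ys):
    """Estimate P(Y) and a Gaussian P(X|Y) for every class label."""
    model = {}
    for label in set(ys):
        pts = [x for x, y in zip(xs, ys) if y == label]
        mean = sum(pts) / len(pts)
        std = math.sqrt(sum((p - mean) ** 2 for p in pts) / len(pts))
        model[label] = (len(pts) / len(xs), mean, std)
    return model

def posterior(model, x):
    """Bayes' rule: P(Y|X) = P(X|Y) P(Y) / P(X)."""
    joint = {y: prior * gaussian_pdf(x, mean, std)
             for y, (prior, mean, std) in model.items()}
    evidence = sum(joint.values())   # P(X), the normalizing constant
    return {y: p / evidence for y, p in joint.items()}

xs = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
ys = [0, 0, 0, 1, 1, 1]
model = fit_generative(xs, ys)
```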


                                                                          77 / 94
EX 3. Rule-based / Hand-designed ⇔ Learning-based




               Hard to find rules to recognize digits?

Ideas
   It may be difficult to design a set of rules for certain tasks, such as
   handwritten digit recognition
   Turn to machine learning methods instead


                                                                      78 / 94
EX 4. Single scale ⇔ Multi-scale
[Zelnik-Manor and Perona NIPS 04]




Ideas
     We live in a multi-scale world (atom ↔ universe)
     Image pyramids / scale-space theory / wavelet representations →
     all attempt to capture the multi-scale properties of signals and images.
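
A minimal 1-D sketch of an image-pyramid construction, assuming the common binomial kernel [1, 2, 1] / 4 for smoothing and a downsampling factor of two. Both are illustrative choices, not prescribed by the slide.

```python
def downsample(signal):
    """Smooth with the kernel [1, 2, 1] / 4 (borders replicated),
    then keep every other sample."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    smooth = [(padded[i] + 2 * padded[i + 1] + padded[i + 2]) / 4
              for i in range(len(signal))]
    return smooth[::2]

def gaussian_pyramid(signal, levels):
    """Level 0 is the input; each further level is a coarser copy."""
    pyramid = [list(signal)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

pyramid = gaussian_pyramid([0, 0, 0, 0, 8, 8, 8, 8], levels=3)
```

An algorithm can then be run on every level, or from the coarsest level to the finest, so that structures of different sizes are all seen at a convenient resolution.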



                                                                       79 / 94
EX 5. Single step ⇔ Progressive
[Yuan et al. SIGGRAPH 08]




Ideas
     Some problems are difficult to solve in one step → solve them
     progressively
                                                                  80 / 94
EX 6. Batch processing ⇔ Incremental / Online
processing
Ideas
   Online methods can handle a potentially infinite number of data samples
   and time-varying data

Examples
   PCA → Incremental PCA (many variants)
   LDA → Incremental LDA (many variants)
   SVM → Incremental and decremental SVM [Cauwenberghs and
   Poggio NIPS 01]
   Dictionary learning (e.g., K-SVD) [Aharon and Elad TSP 06] →
   Online dictionary learning [Mairal et al. ICML/JMLR 09]
   AdaBoost → Online boosting [Grabner and Bischof CVPR 06]
   Multiple instance boosting → Online multiple instance boosting
   [Babenko et al. CVPR 09]
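
The batch-to-online pattern can be shown on a much simpler statistic than the methods listed above: a running mean and variance updated one sample at a time (Welford's algorithm). This is only an analogy for the idea and is not taken from any of the cited papers. The class name is hypothetical.

```python
class RunningStats:
    """Mean and variance updated per sample, without storing the data."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0   # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of the samples seen so far."""
        return self.m2 / self.n if self.n else 0.0

def batch_stats(data):
    """Batch counterpart: needs the whole data set at once."""
    mean = sum(data) / len(data)
    return mean, sum((x - mean) ** 2 for x in data) / len(data)

stream = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = RunningStats()
for sample in stream:
    stats.update(sample)
```

Both versions agree on the result, but the online one uses constant memory and can report an estimate after every sample.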
                                                                    81 / 94
EX 7. Fixed ⇔ Adaptive / Dynamic
[Elad and Aharon TIP 06]




Ideas
     Adaptive approaches usually outperform the predefined/fixed
     ones.
                                                                 82 / 94
EX 8. Parametric ⇔ Non-parametric
Probability density estimation
    Parametric
        Assumes a specific functional form with parameter $\theta$
             e.g., a Gaussian distribution with unknown mean and variance, or a
             mixture of Gaussians
        Parameter estimation
             Estimative approach: $p(x) = p(x \mid \theta_{\text{best}})$
             Bayesian approach: $p(x) = \int a(\theta)\, p(x \mid \theta)\, d\theta$
    Non-parametric
        Does not assume a specific form for the probability distribution
             e.g., histogram, kernel density estimation (Parzen window method)
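
A small sketch contrasting the two families on invented bimodal data: a single Gaussian fitted by maximum likelihood (the estimative approach) versus a Parzen window estimate with Gaussian kernels. The `bandwidth` value is an arbitrary illustrative choice.

```python
import math

def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def fit_gaussian(samples):
    """Parametric: maximum-likelihood mean and standard deviation."""
    mean = sum(samples) / len(samples)
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
    return mean, std

def parzen_density(x, samples, bandwidth):
    """Non-parametric: average of Gaussian kernels centred on the samples."""
    return sum(gaussian_pdf(x, s, bandwidth) for s in samples) / len(samples)

samples = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]   # two clumps, nothing near 0
mean, std = fit_gaussian(samples)

parametric_at_0 = gaussian_pdf(0.0, mean, std)
parametric_at_2 = gaussian_pdf(2.0, mean, std)
parzen_at_0 = parzen_density(0.0, samples, bandwidth=0.3)
parzen_at_2 = parzen_density(2.0, samples, bandwidth=0.3)
```

The single Gaussian puts its peak at 0, where there is no data, because its functional form cannot express two modes. The Parzen estimate follows the samples but needs a bandwidth and keeps all samples around.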




                                                                              83 / 94
EX 9. Z - invariant

Make your method robust to potential sources of performance degradation
   noise, such as additive Gaussian, impulse, or non-uniform noise (e.g., image
   restoration)
   translation shift (e.g., near-duplicate image/video detection, image
   search)
   scale change (e.g., object detection, feature extraction)
   perspective distortion (e.g., feature extraction)
   deformation (e.g., non-rigid registration, part-based object
   detection)
   pose variation (e.g., human pose estimation)
   lighting variation (e.g., face recognition)
   partial occlusion (e.g., object detection and recognition)
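
One simple way to obtain invariance, sketched for translation and scale on a 2-D point set: normalize the nuisance factor away before comparing. The shape and the transform below are invented, and real systems typically need far more elaborate constructions for the other factors in the list.

```python
import math

def normalize_shape(points):
    """Remove translation (centre on the centroid) and scale
    (divide by the root-mean-square distance to the centroid)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    scale = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    return [(x / scale, y / scale) for x, y in centered]

shape = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
moved = [(3 * x + 10, 3 * y - 4) for x, y in shape]   # scaled by 3 and shifted

a = normalize_shape(shape)
b = normalize_shape(moved)
```

After normalization the two point sets coincide, so any descriptor computed from them is translation- and scale-invariant by construction.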


                                                                    84 / 94
EX 10. Z - aware
[Wang et al. SIGGRAPH Asia 09] [Wang et al. SIGGRAPH 10]




[Figure: motion-aware video resizing]
Make your method aware of potential failure cases
     Motion (e.g., video processing)
     Content (e.g., image processing)
     Semantics (e.g., image and video indexing/retrieval)
     Context (e.g., image understanding)
     Occlusion (e.g., detection/tracking)
                                                           85 / 94
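A toy sketch of a "motion-aware" operation, assuming two consecutive grayscale frames stored as lists of rows. This is not the method of Wang et al.; it is a deliberately simple stand-in in which columns with little frame-to-frame change are dropped first when narrowing a frame. The function names `column_motion` and `motion_aware_shrink` are hypothetical.

```python
def column_motion(prev_frame, next_frame):
    """Sum of absolute frame differences in each column (a crude motion cue)."""
    width = len(prev_frame[0])
    return [
        sum(abs(n[x] - p[x]) for p, n in zip(prev_frame, next_frame))
        for x in range(width)
    ]

def motion_aware_shrink(prev_frame, next_frame, new_width):
    """Keep the new_width columns with the most motion, in their original order."""
    energy = column_motion(prev_frame, next_frame)
    width = len(energy)
    # Rank columns by motion energy (ties keep the leftmost column first).
    ranked = sorted(range(width), key=lambda x: (-energy[x], x))
    keep = sorted(ranked[:new_width])
    resized = [[row[x] for x in keep] for row in next_frame]
    return resized, keep

# Assumed toy data: a bright pixel moves from column 1 to column 3.
prev_frame = [
    [5, 5, 5, 5, 5, 5],
    [5, 9, 5, 5, 5, 5],
    [5, 5, 5, 5, 5, 5],
]
next_frame = [
    [5, 5, 5, 5, 5, 5],
    [5, 5, 5, 9, 5, 5],
    [5, 5, 5, 5, 5, 5],
]
resized, kept = motion_aware_shrink(prev_frame, next_frame, 4)
print("kept columns:", kept)
```

A motion-blind resize (for example, always cropping the same side) could discard the moving pixel; using the motion cue keeps both columns it touches.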
Outline


1   Introduction

2   Five ways to come up with new ideas
       Seek different dimensions                        neXt = X^d
       Combine two or more topics                    neXt = X + Y
       Re-think the research directions                 neXt = \bar{X}
       Use powerful tools, find suitable problems       neXt = X↑
       Add an appropriate adjective                neXt = Adj + X

3   What is a bad idea?




                                                               86 / 94
What is a bad idea?



   Naive combination of two or more methods
       Avoid a pipeline system paper
   Blind application of tools
       Use X feature and Y classifier without motivation and justification
   Follow the hype
       Too many competitors
   Doing something just because it can be done
       Do the right things; don't just do things right




                                                                           87 / 94
Thank you for your kind attention.
             Questions?
For more complete materials, please visit my blog
http://jbhuang0604.blogspot.com/




                                                    94 / 94

Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docxPoojaSen20
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterMateoGardella
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 

Dernier (20)

Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 

How to come up with new research ideas

  • 1. How to Come Up With New Research Ideas? Jia-Bin Huang jbhuang0604@gmail.com Taiwan May , 2010 1 / 94
  • 2. What is this talk about? Five approaches to come up with new ideas in computer vision. Extensive case studies (i.e., more than one hundred papers). A common-sense talk: no complicated theories or equations. I wish someone had told me this before. Reference: The content of this talk is greatly inspired by the "Raskar Idea Hexagon". 2 / 94
  • 5. Outline
      1 Introduction
      2 Five ways to come up with new ideas
          Seek different dimensions: neXt = X^d
          Combine two or more topics: neXt = X + Y
          Re-think the research directions: neXt = X̄
          Use powerful tools, find suitable problems: neXt = X↑
          Add an appropriate adjective: neXt = Adj + X
      3 What is a bad idea?
  • 7. Active Topics in Computer Vision [Szeliski Computer Vision: Algorithms and Applications 2010]: Digital image processing; Blocks world, line labeling; Generalized cylinders; Pictorial structures; Stereo correspondence; Intrinsic images; Optical flow; Structure from motion; Image pyramids; Scale-space processing; Shape from X; Physically-based modeling; Regularization; Markov Random Fields; Kalman filters; 3D range data processing; Projective invariants; Factorization; Physics-based vision; Graph cuts; Particle filtering; Energy-based segmentation; Face recognition and detection; Subspace methods; Image-based modeling/rendering; Texture synthesis/inpainting; Computational photography; Feature-based recognition; MRF inference algorithms; Learning. 5 / 94
  • 8. What can we learn from the past? The topics are diverse and evolve over time. The ways to come up with new ideas are similar. There are patterns to follow. 6 / 94
  • 11. Seek different dimensions neXt = X^d "The only difference between a rut and a grave is their dimensions." - Ellen Glasgow 9 / 94
  • 12. Seek different dimensions neXt = X^d Idea: Can we increase, replace, or transform the dimensions of the original problem to get new problems or solutions? What kind of dimensions can we work on? (1) Concrete dimensions (e.g., space, time, frequency). (2) Abstract dimensions (e.g., properties). 10 / 94
  • 13. EX 1-1. Content-Aware Media Resizing [Avidan et al. SIGGRAPH 07] [Rubinstein et al. SIGGRAPH 08] Ideas Extend dimensions from 2D image to 3D video: image re-targeting ⇒ video re-targeting Other dimensions? E.g., 4D light field, infrared image, range image. 11 / 94
  • 14. EX 1-2. Video Stitching [Rav-Acha et al. CVPR 05] Input video Dynamic Panorama Ideas Extend dimensions from image to video, i.e., Image Panorama ⇒ Video Mosaics with Non-Chronological Time Increase the time dimension in both input and output 12 / 94
  • 15. EX 1-3. Multi-Image Fusion [Agarwala et al. SIGGRAPH 04] Ideas Extend from single input image to multiple input images ⇒ Digital Photomontage Increase the dimension in input only. 13 / 94
  • 16. EX 1-4. Computational Photography (Coded Photography) [Raskar et al. SIGGRAPH 04, 06, 08] [Levin et al. SIGGRAPH 07] Ideas Coded Photography: reversibly encode information about the scene in a single photograph. Coding in Time (Exposure), Coded Illumination, Coding in Space (Aperture), and Coded Wavelength. Replace the dimension used to code information about the light field. 14 / 94
  • 17. EX 1-1. Photography in Low Light Conditions. What can we do? Flash → changes the overall scene appearance (cold and gray). Long exposure time (hand shake) → blurred image. Short exposure time (insufficient light) → noisy image. 15 / 94
  • 18. EX 1-1-1. Flash/No-Flash Photography [Petschnigg et al. SIGGRAPH 2004] Panels: flash; no flash; detail transfer with denoising. Ideas The original problem (taking a good photo in a low-light environment from a single image) is difficult. Increasing the dimension of the input (a flash/no-flash image pair) makes the problem much easier. 16 / 94
  • 19. EX 1-1-2. Image Deblurring with Blurred/Noisy Image Pairs [Yuan et al. SIGGRAPH 2007] Panels: blurred; noisy; enhanced noisy; deblurred result. Ideas The original problem (taking a good photo from a single image in a low-light environment where flash is prohibited) is difficult. Increasing the dimension of the input (a blurred/noisy image pair) makes the problem much easier. 17 / 94
  • 20. EX 1-1-3. Robust Flash Deblurring [Zhou et al. CVPR 2010] Ideas The original problem (taking a good photo in a low-light environment from a single image) is difficult. Increasing the dimension of the input (a blurred/flash image pair) makes the problem much easier. 18 / 94
  • 21. EX 1-1-4. Dark Flash Photography [Krishnan et al. SIGGRAPH 2009] Ideas The original problem (taking a good photo in a low-light environment from a single image) is difficult. Increasing the dimension of the input (a dark-flash/noisy image pair) makes the problem much easier. 19 / 94
  • 22. EX 1-2. Brute-Force Vision [Hays and Efros SIGGRAPH 07] [Dale et al. ICCV 09] [Agarwal et al. ICCV 09] [Furukawa et al. ICCV 09] Ideas Utilize a large collection of photos. 20 / 94
  • 23. EX 2-1. X Alignment/Registration (pixel, object, scene) [Liu et al. CVPR 08, ECCV 08] [Berg et al. CVPR 05] 21 / 94
  • 24. EX 2-2. Shape from X (shading, texture, specular) [Lobay and Forsyth IJCV 06] [Fleming et al JOV 04] [Adato et al ICCV 07] shading specular texture specular flow 22 / 94
  • 25. EX 2-3. Depth from X (stereo, (de-)focus, coded aperture, diffusion, occlusion, semantic label) [Levin et al. SIGGRAPH 07] [Hoiem et al. ICCV 07] [Liu et al. CVPR 10] [Zhou et al. CVPR 10] Coded Aperture Semantic Labels Occlusion Diffusion 23 / 94
  • 26. EX 2-4. Infer X from a single image (geometric, geography, illumination) [Hoiem et al. ICCV 05] [Hays and Efros CVPR 08] [Lalonde et al. ICCV 09] Geometric Geography Illumination 24 / 94
  • 28. Combine two or more topics neXt = X + Y To steal ideas from one person is plagiarism. To steal from many is research. - Wilson Mizner 26 / 94
  • 29. Combine two or more topics neXt = X + Y Idea Can we combine two or more topics to get new problems or solutions? What kind of topics can we combine? 1 X, Y are methods 2 X, Y are problems 3 X, Y are areas 27 / 94
  • 30. EX 1-1. Viola-Jones Object Detection Framework [Viola and Jones CVPR 2001] Simple feature Integral img Boosting Cascade structure Ideas Paper title: Rapid Object Detection using a Boosted Cascade of Simple Features Viola-Jones object detection framework = Integral Images (simple feature)(1984) + AdaBoost(1997) + Cascade Architecture(long time ago) 28 / 94
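The integral-image ingredient named on this slide can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the detector from the paper: the function names (`integral_image`, `box_sum`, `two_rect_feature`) and the particular left-minus-right rectangle feature are chosen here for illustration. It shows why the "simple features" are cheap: once the summed-area table is built, any rectangle sum costs four lookups.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] from four table lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Hypothetical Haar-like feature: left half of a window minus right half."""
    mid = left + width // 2
    left_sum = box_sum(ii, top, left, top + height, mid)
    right_sum = box_sum(ii, top, mid, top + height, left + width)
    return left_sum - right_sum

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
print(box_sum(ii, 1, 1, 3, 3))           # same value as img[1:3, 1:3].sum()
print(two_rect_feature(ii, 0, 0, 4, 4))  # left two columns minus right two
```

Boosting and the cascade would then select and chain many such features; that part is not sketched here.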
  • 31. EX 1-2. SIFT Flow = SIFT + Optical Flow [Liu et al. ECCV 08 CVPR 09] Motion hallucination Label transfer Ideas Dense sampling in time : optical flow :: dense sampling in world images : SIFT flow 29 / 94
  • 32. EX 1-3. Visual Tracking with Online Multiple Instance Boosting [Babenko et al. CVPR 09] Ideas MILTrack = Multiple Instance Boosting (2005) + Online Boosting Tracking (2006) 30 / 94
  • 33. EX 2-1. High Dynamic Range Image Reconstruction from Hand-held Cameras [Lu et al. CVPR 2009] Ideas HDR from Hand-held Cameras = High Dynamic Range Image Reconstruction + Image Deblurring 31 / 94
  • 34. EX 2-2. Human Body Understanding [Guan et al. ICCV 09] Ideas Human Body Understanding = Shape Reconstruction + Pose Estimation 32 / 94
  • 35. EX 2-3. Image Understanding detection, tracking, recognition, segmentation, reconstruction, scene classification, event recognition 33 / 94
  • 36. EX 2-3-1. Detection + Tracking [Andriluka et al. CVPR 08] Ideas People detection and people tracking are highly correlated problems. Combining the two problems can potentially improve performance on each individual task. 34 / 94
  • 37. EX 2-3-2. Object Attribute + Recognition [Farhadi et al. CVPR 09] [Lampert et al. CVPR 09] Ideas Describe image by attributes Enable knowledge transfer to recognition class with no visual examples 35 / 94
  • 38. EX 2-3-2. Object Recognition + Detection [Yeh et al. CVPR 09] Ideas Concurrent object localization and recognition 36 / 94
  • 39. EX 2-3-3. Image Segmentation + Object Recognition + Event Recognition [Li et al. CVPR 09] Ideas Combine scene classification, image segmentation, image annotation All three tasks are mutually beneficial 37 / 94
  • 40. EX 3-1. SixthSense - A Wearable Gestural Interface [Mistry and Maes TED 2009] Ideas SixthSense = Computer Vision (e.g., tracking, recognition) + Internet 38 / 94
  • 41. EX 3-2. Sikuli:Picture-driven computing [Yeh et al. UIST 09] [Chang et al. CHI 10] Ideas 1. Readability/usability, 2. GUI serialization, 3. Computer vision on computer-generated figures 39 / 94
  • 43. Re-think the research directions neXt = X̄ "If at first the idea is not absurd, then there is no hope for it." - Albert Einstein 41 / 94
  • 44. Re-think the research directions neXt = X̄ Ideas Do the current research directions really make sense? What is the key problem? What could we do? (1) Re-formulate the original problem. (2) Analyze and compare existing approaches; provide insight into the problems. 42 / 94
  • 45. EX 1-1. Beyond Sliding Windows [Lampert et al. CVPR 08] Panels: rectangle set; branch-and-bound search. Ideas Sliding window search ⇔ branch-and-bound search. Represent a set of rectangles with 4 intervals. Use branch-and-bound to find the optimal rectangle (object localization) efficiently. 43 / 94
  • 46. EX 1-2. Beyond Categories [Malisiewicz and Efros CVPR 08, NIPS 09] Ideas Explicit categorization ⇔ Implicit categorization Ask "what is this like?" (association), instead of "what is it?" (categorization) 44 / 94
  • 47. EX 1-3. Motion-Invariant Photography [Levin et al. SIGGRAPH 08] [Cho et al. ICCP 10] Ideas Still camera ⇔ Moving camera (parabolic exposures). Enables the use of spatially invariant blur kernel estimation. 45 / 94
  • 48. EX 1-4. Super-resolution from Single Image [Glasner et al. ICCV 09] Ideas Classical multi-image SR / Example-based SR ⇔ A single SR framework 46 / 94
  • 49. EX 2-1. In Defense of ... [Boiman et al. CVPR 08] [Hartley PAMI 97]
      In Defense of Nearest-Neighbor Based Image Classification: two common practices hurt NN classifiers, (1) quantization of local image descriptors (used to generate "bags-of-words" and codebooks) and (2) computation of "Image-to-Image" distance instead of "Image-to-Class" distance. With both avoided, the performance ranks among the top learning-based image classifiers.
      In Defense of the 8-point Algorithm (for the fundamental matrix): Normalization, Normalization, Normalization! Performs almost as well as the best iterative algorithm. 47 / 94
  • 50. EX 2-2. Understanding blind deconvolution [Levin et al. CVPR 2009] Ideas Blind deconvolution: recover the sharp image $x$ from the blurred one, $y = k \otimes x + n$. $\mathrm{MAP}_{x,k}$ estimation often favors no-blur explanations. $\mathrm{MAP}_k$ can be accurately estimated since the kernel size is often much smaller than the image size. Blind deconvolution should be addressed in this way: $\mathrm{MAP}_k$ + non-blind deconvolution. 48 / 94
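To make the observation model on this slide concrete, here is a minimal 1D sketch of $y = k \otimes x + n$. It is an illustration only: the helper name `blur_observation`, the 1D signal, and the Gaussian noise are assumptions made here, not part of the paper. It also shows why the problem is ambiguous: the pair (x = y, k = delta) reproduces the observation exactly, which is the kind of "no-blur explanation" the slide refers to.

```python
import numpy as np

def blur_observation(x, k, noise_std=0.0, seed=0):
    """Forward model y = k (*) x + n for a 1D signal (illustrative helper)."""
    rng = np.random.default_rng(seed)
    y = np.convolve(x, k, mode="same")  # blur with kernel k
    return y + noise_std * rng.normal(size=y.shape)

x = np.array([0.0, 0.0, 1.0, 0.0, 0.0])  # sharp signal: a single spike
k = np.array([0.25, 0.5, 0.25])          # small blur kernel
y = blur_observation(x, k)                # noise-free blurred observation

# The trivial explanation: treat y itself as the "sharp" signal with a delta kernel.
delta = np.array([0.0, 1.0, 0.0])
print(np.allclose(blur_observation(y, delta), y))  # True: (y, delta) also explains y
```

Both (x, k) and (y, delta) fit the data equally well, so the data term alone cannot prefer the sharp solution.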
  • 51. EX 2-3. Understanding camera trade-offs [Levin et al. ECCV 08] Ideas Traditional optics evaluation: 2D image sharpness (e.g., Modulation Transfer Function). Modern camera evaluation: how well does the recorded data allow us to estimate the visual world, i.e., the light field? 49 / 94
  • 52. EX 2-4. What is a good image segment? [Bagon et al. ECCV 08] Ideas A good image segment is one that can be easily composed using its own pieces but is difficult to compose using pieces from other parts of the image. 50 / 94
  • 53. EX 2-5. Lambertian Reflectance and Linear Subspaces [Basri and Jacobs PAMI 03] Ideas The set of all Lambertian reflectance functions (the mapping from surface normals to intensities) obtained with arbitrary distant light sources lies close to a 9D linear subspace. Explain prior empirical results using linear subspace methods. 51 / 94
  • 55. Use powerful tools, find suitable problems neXt = X ↑ If the only tool you have is a hammer, you tend to see every problem as a nail. - Abraham Maslow 53 / 94
  • 56. Use powerful tools, find suitable problems neXt = X ↑ What kinds of tools should we understand? Calculus of variations; dimensionality reduction; spectral methods (specifically, spectral clustering); probabilistic graphical models; structured prediction; bilateral filtering; sparse representation; and more: spectral theory, information theory, (convex) optimization, etc. 54 / 94
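  • 57. EX 1. Calculus of Variations (1/2). From calculus to calculus of variations:
      Objects: calculus studies functions, $f: \mathbb{R}^n \to \mathbb{R}$; calculus of variations studies functionals (functions of functions), $f: F \to \mathbb{R}$, $f(u) = \int_{x_1}^{x_2} L(x, u(x), u'(x))\,dx$.
      Derivative vs. variation: the derivative $\frac{df(x)}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$ corresponds to the variation $\frac{df(u)}{du}$, obtained from $\lim_{\epsilon \to 0} \frac{f(u + \epsilon\,\delta u) - f(u)}{\epsilon} = \left.\frac{\partial}{\partial \epsilon} f(u + \epsilon\,\delta u)\right|_{\epsilon = 0}$.
      Local extremum: $\frac{df(x)}{dx} = 0$ corresponds to the Euler-Lagrange equation, $\frac{\partial L}{\partial u} - \frac{d}{dx}\frac{\partial L}{\partial u'} = 0$.
      Total Variation (TV): $\mathrm{TV}(y) = \int_{x_0}^{x_1} |y'(x)|\,dx$, the "oscillation strength" of $y(x)$.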
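As a small numerical companion to the total variation definition, the sketch below approximates $\int |y'|\,dx$ for a sampled signal by summing absolute differences between neighbouring samples. The function name `total_variation` and the toy signals are assumptions for illustration. It shows why TV is read as "oscillation strength": a clean step and an oscillating signal with the same range get very different values.

```python
import numpy as np

def total_variation(y):
    """Discrete TV: sum of |y[i+1] - y[i]|, approximating the integral of |y'| dx."""
    return float(np.abs(np.diff(y)).sum())

step = np.array([0.0, 0.0, 1.0, 1.0])         # one clean edge
oscillating = np.array([0.0, 1.0, 0.0, 1.0])  # same range, but oscillating

print(total_variation(step))         # 1.0
print(total_variation(oscillating))  # 3.0
```

TV denoising exploits exactly this gap: penalising TV suppresses oscillations while keeping a single sharp edge inexpensive.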
  • 58. EX 1. Calculus of Variations (2/2). Total variation denoising/inpainting. Applications in computer vision:
      Optical flow [Horn and Schunck AI 81]
      Shape from shading [Horn and Brooks CVGIP 86]
      Edge detection [PAMI 87]
      Anisotropic diffusion [Perona and Malik PAMI 90]
      Active contour models [Kass et al. IJCV 88]
      Image segmentation [Morel and Solimini 95]
      Image restoration [Aubert and Vese SIAM Journal on NA 97]
  • 60. EX 2. Dimensionality Reduction (1/2). Why do we need dimensionality reduction? Since high-dimensional data is everywhere (e.g., images, human gene distributions, weather prediction), we need dimensionality reduction for (1) processing data efficiently, (2) estimating the distributions of data accurately (curse of dimensionality), and (3) finding meaningful representations of data.
      Classification of dimensionality reduction methods:
                   Global structure preserved    Local structure preserved
      Linear       PCA, LDA                      LPP, NPE
      Nonlinear    ISOMAP, Kernel PCA, DM        LLE, LE, HE
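The linear, global-structure-preserving corner of the table (PCA) can be sketched directly with a singular value decomposition. This is a minimal sketch, assuming a data matrix with one sample per row; the function name `pca` and the synthetic data are illustrative choices, not taken from the talk.

```python
import numpy as np

def pca(X, n_components):
    """Project rows of X onto the top principal directions (via SVD)."""
    mean = X.mean(axis=0)
    Xc = X - mean                          # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]         # rows are principal directions
    return Xc @ components.T, components, mean

# Toy data: 3D points that actually lie on a 1D line (plus an offset).
rng = np.random.default_rng(0)
t = rng.normal(size=50)
direction = np.array([1.0, 2.0, 2.0]) / 3.0
X = np.outer(t, direction) + 5.0

Z, W, mu = pca(X, n_components=1)
X_rec = Z @ W + mu                         # map back to 3D
print(Z.shape)                             # (50, 1)
print(np.allclose(X_rec, X))               # True: one component suffices here
```

The nonlinear and local methods in the table replace this single global projection with kernels or neighbourhood graphs.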
  • 62. EX 2. Dimensionality Reduction (2/2). Applications in computer vision:
      Subspace as constraints: structure from motion [Tomasi and Kanade IJCV 92], optical flow [Irani IJCV 02], layer extraction [Ke and Kanade CVPR 01], face alignment [Saragih et al. ICCV 09]
      Face recognition: PCA [Turk and Pentland PAMI 91], LDA [Belhumeur et al. PAMI 97], LPP [He et al. PAMI 05], Random [Wright et al. PAMI 09]
      Motion segmentation (subspace separation): [Kanatani ICCV 01] [Yan and Pollefeys ECCV 06] [Rao et al. CVPR 08] [Lauer and Schnorr ICCV 09]
      Lighting (linear subspace): [Belhumeur and Kriegman IJCV 98] [Georghiades et al. PAMI 01] [Lee et al. PAMI 05] [Basri and Jacobs PAMI 03]
      Visual tracking (incremental subspace learning): [Ross et al. IJCV 08] [Li et al. CVPR 08]
  • 67. EX 3. Spectral Clustering (1/3). Why is spectral clustering popular? It can be solved efficiently by standard linear algebra software, and it very often outperforms traditional clustering algorithms.
      Spectral clustering algorithm. Input: a set of n data points and the number of clusters k.
      1 Construct a similarity graph, e.g., ε-neighborhood, k-nearest neighbor, or fully connected.
      2 Construct the graph Laplacian, e.g., unnormalized ($L$) or normalized ($L_{rw}$, $L_{sym}$).
      3 Compute the first k eigenvectors $v_1, \dots, v_k$ of $L$ (those with the smallest eigenvalues).
      4 Let $V \in \mathbb{R}^{n \times k}$ be the matrix containing $v_1, \dots, v_k$ as columns, and let $y_i$ be its i-th row.
      5 Cluster the row vectors $y_i$ with the k-means algorithm into clusters $C_1, \dots, C_k$.
      Output: clusters $A_1, \dots, A_k$ with $A_i = \{ j \mid y_j \in C_i \}$.
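The five steps above can be sketched with NumPy alone. This is one possible instantiation under stated assumptions, not a reference implementation: it picks the fully connected graph with a Gaussian similarity (the bandwidth `sigma` is an assumed parameter), the unnormalized Laplacian $L = D - W$, and a small hand-written k-means with a deterministic farthest-point initialisation. The helper names `kmeans` and `spectral_clustering` are illustrative.

```python
import numpy as np

def kmeans(Y, k, n_iter=50):
    """Tiny Lloyd's k-means with farthest-point initialisation (sketch only)."""
    centers = [Y[0]]
    for _ in range(1, k):
        d = np.min([((Y - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Y[d.argmax()])      # next center: farthest from chosen ones
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)          # assign each row to nearest center
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

def spectral_clustering(X, k, sigma=1.0):
    # Step 1: fully connected similarity graph with Gaussian weights.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Step 2: unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # Step 3: eigh returns eigenvalues in ascending order; keep the first k.
    _, eigvecs = np.linalg.eigh(L)
    # Step 4: V has the k eigenvectors as columns; row i is y_i.
    V = eigvecs[:, :k]
    # Step 5: cluster the rows with k-means.
    return kmeans(V, k)

rng = np.random.default_rng(0)
blob_a = 0.3 * rng.normal(size=(20, 2))
blob_b = 0.3 * rng.normal(size=(20, 2)) + np.array([5.0, 5.0])
labels = spectral_clustering(np.vstack([blob_a, blob_b]), k=2)
print(labels)
```

Swapping in a k-nearest-neighbor graph or a normalized Laplacian changes only steps 1 and 2; the eigenvector and k-means steps stay the same.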
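The five steps of the spectral clustering algorithm can be sketched in a few lines. This is only a minimal illustration under stated assumptions: it uses NumPy, a fully connected graph with Gaussian weights, the unnormalized Laplacian, and k = 2, where the sign of the second eigenvector stands in for the k-means step. The function name `spectral_clustering_2way` and the `sigma` parameter are hypothetical and not from the talk.

```python
import numpy as np

def spectral_clustering_2way(points, sigma=1.0):
    """Split points into two clusters with unnormalized spectral clustering."""
    X = np.asarray(points, dtype=float)
    # Step 1: fully connected similarity graph with Gaussian weights.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Step 2: unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # Step 3: eigh returns eigenvectors sorted by ascending eigenvalue.
    _, eigvecs = np.linalg.eigh(L)
    # Steps 4-5, simplified for k = 2: the first eigenvector is constant on a
    # connected graph, so we threshold the second one instead of running k-means.
    second = eigvecs[:, 1]
    return (second > 0).astype(int)

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(spectral_clustering_2way(points, sigma=5.0))
```

Which group receives label 0 and which receives label 1 depends on the arbitrary sign of the eigenvector. For more than two clusters, the rows of the first k eigenvectors would be passed to k-means as in step 5.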
  • 74. EX 3. Spectral Clustering (2/3) Why does it work? Graph cut point of view: construct a partition that minimizes the weight across the cut (the well-known mincut problem) while balancing the clusters (e.g., RatioCut, Normalized cut). Random walks point of view: when minimizing Ncut, we actually look for a cut through the graph such that a random walk seldom transitions from one cluster to another. Perturbation theory point of view: the distance between the eigenvectors of the ideal and the nearly ideal graph Laplacian is bounded by a constant times a norm of the error matrix. If the perturbations are small enough, the k-means algorithm will still separate the groups from each other. 60 / 94
  • 77. EX 3. Spectral Clustering (3/3) [Shi and Malik PAMI 02] Eigenvectors carry contour information. 61 / 94
  • 78. EX 4. Probabilistic Graphical Model (1/2) What are probabilistic graphical models? A marriage between probability theory and graph theory. A natural tool for dealing with uncertainty and complexity. They provide a way to view many probabilistic systems (e.g., mixture models, factor analysis, hidden Markov models, Kalman filters and Ising models) as instances of a common underlying formalism. 62 / 94
  • 79. EX 4. Probabilistic Graphical Model (2/2) 63 / 94
  • 80. EX 5. Structured Prediction (1/2) What is structured prediction? Structured prediction is a framework for solving problems of classification or regression in which the output variables are mutually dependent or constrained. Lots of examples: natural language parsing, machine translation, object segmentation, gene prediction, protein alignment, and numerous other tasks in computational linguistics, speech, vision, and biology. 64 / 94
  • 82. EX 5. Structured Prediction (2/2) Applications [Lampert et al. ECCV 08] [Desai et al. ICCV 09] 65 / 94
  • 83. EX 6. Bilateral Filtering (1/3) What is bilateral filtering? A technique to smooth images while preserving edges. Ubiquitous in image processing and computational photography. 66 / 94
  • 84. EX 6. Bilateral Filtering (2/3) [Bennett and McMillan SIGGRAPH 05] [Eisemann and Durand SIGGRAPH 04] [Jones et al. SIGGRAPH 03] [Winnemöller et al. SIGGRAPH 06] [Bae et al. SIGGRAPH 02] 67 / 94
  • 85. EX 6. Bilateral Filtering (3/3) How does the bilateral filter relate to other methods? Interpretations: the bilateral filter is equivalent to mode filtering in local histograms; it can be interpreted in terms of robust statistics, since it is related to a cost function; and it is a discretization of a particular kind of PDE-based anisotropic diffusion. 68 / 94
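To make "smoothing while preserving edges" concrete, here is a minimal sketch of a bilateral filter on a 1D signal, written in plain Python. The function name and the parameters `radius`, `sigma_space`, and `sigma_range` are illustrative assumptions, and real image implementations work in 2D and are heavily optimized. Each output sample is a weighted average of its neighbors, where the weight is the product of a spatial Gaussian and a range (value-difference) Gaussian.

```python
import math

def bilateral_filter_1d(signal, radius=2, sigma_space=1.0, sigma_range=0.1):
    """Smooth a 1D signal while keeping large jumps (edges) sharp."""
    out = []
    n = len(signal)
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Spatial weight: nearby samples count more.
            w_space = math.exp(-((i - j) ** 2) / (2.0 * sigma_space ** 2))
            # Range weight: samples with similar values count more.
            diff = signal[i] - signal[j]
            w_range = math.exp(-(diff ** 2) / (2.0 * sigma_range ** 2))
            num += w_space * w_range * signal[j]
            den += w_space * w_range
        out.append(num / den)
    return out

noisy_step = [0.0, 0.02, -0.02, 0.01, 1.0, 0.98, 1.02, 1.0]
print(bilateral_filter_1d(noisy_step))
```

Samples on the other side of the step differ by about 1.0 in value, so their range weight is negligible and the jump between the two plateaus survives, while the small fluctuations within each plateau are averaged out.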
  • 87. EX 7. Sparse Representation (1/4) Ideas: natural signals (e.g., audio, images) usually admit a sparse representation (i.e., they can be well represented by a linear combination of a few atom signals). Successfully applied to various areas in signal/image processing, vision and graphics. 69 / 94
  • 88. EX 7. Sparse Representation (2/4) Image Restoration [Aharon et al. TSP 06] [Julien et al. TIP 08] Examples: denoising, inpainting, demosaicing. 70 / 94
  • 89. EX 7. Sparse Representation (3/4) Classification [Wright et al. PAMI 09] [Julien et al. CVPR ECCV NIPS 08] face recognition edge detection texture classification pixel classification 71 / 94
  • 90. EX 7. Sparse Representation (4/4) Compressive sensing [Donoho TIT 06] [Candes and Tao TIT 05 06] and more (e.g., low-rank matrix completion, robust PCA) 72 / 94
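The idea that a signal is "a linear combination of a few atom signals" can be illustrated with matching pursuit, a standard greedy sparse coding method. The sketch below is not the method of any paper cited in these slides. It assumes unit-norm atoms, and the names `matching_pursuit`, `atoms`, and `n_nonzero` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse coding: pick one unit-norm atom per iteration."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with what is still unexplained.
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        coeffs[best] += scores[best]
        # Remove the chosen atom's contribution from the residual.
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual

# Overcomplete dictionary: the standard basis of R^4 plus one extra atom.
atoms = [
    (1.0, 0.0, 0.0, 0.0),
    (0.0, 1.0, 0.0, 0.0),
    (0.0, 0.0, 1.0, 0.0),
    (0.0, 0.0, 0.0, 1.0),
    (0.5, 0.5, 0.5, 0.5),
]
signal = (3.0, 0.0, -2.0, 0.0)  # built from only two atoms
coeffs, residual = matching_pursuit(signal, atoms, n_nonzero=2)
print(coeffs, residual)
```

With two iterations the residual drops to zero here, because the signal was constructed from two atoms of the dictionary. On real data the dictionary is usually learned and the residual only shrinks approximately.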
  • 91. Outline 1 Introduction 2 Five ways to come up with new ideas: Seek different dimensions (neXt = X^d); Combine two or more topics (neXt = X + Y); Re-think the research directions (neXt = X̄); Use powerful tools, find suitable problems (neXt = X↑); Add an appropriate adjective (neXt = Adj + X) 3 What is a bad idea? 73 / 94
  • 92. Add an appropriate adjective neXt = Adj + X There is only one religion, though there are a hundred versions of it. - George Bernard Shaw 74 / 94
  • 93. Add an appropriate adjective neXt = Adj + X What kinds of adjectives can we use? linear ⇔ non-linear; generative/reconstructive ⇔ discriminative; rule-based / hand-designed ⇔ learning-based; single scale ⇔ multi-scale; single step ⇔ progressive; batch processing ⇔ incremental / online processing; fixed ⇔ adaptive / dynamic to data; parametric ⇔ non-parametric; Z-invariant (Z = translation / scale / rotation / noise / facial expression / pose / lighting / occlusion); Z-aware (Z = motion / content / semantic / context / occlusion) 75 / 94
  • 103. EX 1. Linear ⇔ Non-linear Hard to find a straight line that separates them into two clusters? Ideas: linear methods may not capture the nonlinear structure in the original data representation. Nonlinear methods: kernel tricks (e.g., Kernel PCA, Kernel LDA, Kernel SVM, etc.); manifold learning (e.g., ISOMAP, LLE, Laplacian eigenmap, etc.) 76 / 94
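A toy example of why a nonlinear representation helps: two concentric rings cannot be split by a straight line in (x, y), but after adding the feature x^2 + y^2 a plain linear classifier separates them. This sketch uses an explicit feature map, which is the idea that kernel methods exploit implicitly. It is not an implementation of Kernel PCA or Kernel SVM, and all names and the hand-picked weights are assumptions for illustration.

```python
import math

def make_rings(n_per_ring=20, radii=(1.0, 3.0)):
    """Return (points, labels) for two concentric rings; label = ring index."""
    points, labels = [], []
    for label, r in enumerate(radii):
        for k in range(n_per_ring):
            angle = 2.0 * math.pi * k / n_per_ring
            points.append((r * math.cos(angle), r * math.sin(angle)))
            labels.append(label)
    return points, labels

def lift(point):
    """Explicit nonlinear feature map (x, y) -> (x, y, x^2 + y^2)."""
    x, y = point
    return (x, y, x * x + y * y)

def linear_classify(features, weights, bias):
    """Plain linear classifier: 1 if w . f + b > 0, else 0."""
    return int(sum(w * f for w, f in zip(weights, features)) + bias > 0)

points, labels = make_rings()
# In the lifted space the plane (x^2 + y^2) = 4 separates the two rings.
predictions = [linear_classify(lift(p), (0.0, 0.0, 1.0), -4.0) for p in points]
print(predictions == labels)
```

The weights are chosen by hand to keep the example short. In practice they would be learned, and a kernel would avoid computing the lifted features explicitly.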
  • 105. EX 2. Generative ⇔ Discriminative Classification task: X → Y. Generative classifiers estimate the class-conditional pdfs P(X|Y) and the prior probabilities P(Y): Naive Bayes, Mixtures of Gaussians, Mixtures of experts, Hidden Markov Models (HMM), Sigmoidal belief networks, Bayesian networks, Markov random fields (MRF). Discriminative classifiers estimate the posterior probabilities P(Y|X): Logistic regression, SVMs, Traditional neural networks, Nearest neighbor, Conditional Random Fields (CRF). Bayes’ rule: P(Y|X) = P(X|Y) P(Y) / P(X). Two different perspectives in viewing a problem. 77 / 94
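The generative route through Bayes' rule can be sketched as follows. The example assumes 1D Gaussian class-conditional densities with known parameters. The function names and the `class_params` and `priors` dictionaries are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posterior(x, class_params, priors):
    """Bayes' rule: P(Y=c | x) = p(x | Y=c) P(Y=c) / p(x)."""
    joint = {c: gaussian_pdf(x, *class_params[c]) * priors[c] for c in priors}
    evidence = sum(joint.values())  # p(x) = sum over c of p(x | c) P(c)
    return {c: joint[c] / evidence for c in joint}

class_params = {"a": (0.0, 1.0), "b": (2.0, 1.0)}  # (mean, std) per class
priors = {"a": 0.5, "b": 0.5}
print(posterior(0.0, class_params, priors))
```

For two Gaussian classes with equal variance, this posterior is a logistic function of x, which is exactly the form that logistic regression fits directly. That is one way to see the two approaches as different perspectives on the same problem.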
  • 108. EX 3. Rule-based / Hand-designed ⇔ Learning-based Hard to find rules to recognize digits? Ideas: it may be difficult to design a set of rules for certain tasks, such as handwritten digit recognition. Turn to machine learning methods instead. 78 / 94
  • 109. EX 4. Single scale ⇔ Multi-scale [Zelnik-Manor and Perona NIPS 04] Ideas: we live in a multi-scale world (atom ↔ universe). Image pyramids / scale-space theory / wavelet representations → all attempt to capture the multi-scale properties of signals/images. 79 / 94
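A pyramid is the simplest of these multi-scale representations: blur, subsample, repeat. The sketch below builds one for a 1D signal in plain Python. The [1, 2, 1] / 4 blur kernel, the edge handling by repeating the border sample, and the function names are assumptions chosen for brevity. Image pyramids apply the same idea along both axes.

```python
def downsample(signal):
    """Blur with a [1, 2, 1] / 4 kernel, then keep every second sample."""
    n = len(signal)
    blurred = []
    for i in range(n):
        left = signal[max(i - 1, 0)]        # repeat the border sample
        right = signal[min(i + 1, n - 1)]
        blurred.append((left + 2 * signal[i] + right) / 4.0)
    return blurred[::2]

def build_pyramid(signal, levels):
    """Return [finest, ..., coarsest], halving the resolution at each level."""
    pyramid = [list(signal)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

pyramid = build_pyramid([0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0], levels=3)
for level in pyramid:
    print(level)
```

A method that runs on every level of such a pyramid sees both fine detail and coarse structure, which is the "single scale ⇔ multi-scale" move in its most basic form.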
  • 110. EX 5. Single step ⇔ Progressive [Yuan et al. SIGGRAPH 08] Ideas: some problems are difficult to solve in one step → solve them progressively. 80 / 94
  • 111. EX 6. Batch processing ⇔ Incremental / Online processing Ideas: online methods can handle potentially infinite data samples and time-varying data. Examples: PCA → Incremental PCA (many variants); LDA → Incremental LDA (many variants); SVM → Incremental and decremental SVM [Cauwenberghs and Poggio NIPS 01]; Dictionary learning (e.g., K-SVD) [Aharon and Elad TSP 06] → Online dictionary learning [Mairal et al. ICML/JMLR 09]; AdaBoost → Online boosting [Grabner and Bischof CVPR 06]; Multiple instance boosting → Online multiple instance boosting [Babenko et al. CVPR 09] 81 / 94
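The batch-to-online move is easiest to see on the simplest statistics. The sketch below tracks mean and variance one sample at a time with Welford's algorithm, so no past data has to be stored. It is not incremental PCA or any of the methods cited above, only the kind of running update such methods build on. The class name `RunningStats` is hypothetical.

```python
class RunningStats:
    """Incrementally track mean and variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new sample into the statistics in O(1) time and memory."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of the samples seen so far."""
        return self._m2 / self.n if self.n else 0.0

stats = RunningStats()
for sample in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(sample)
print(stats.mean, stats.variance)
```

The result matches what a batch computation over the full list would give, but the stream could be arbitrarily long because each update touches only three numbers.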
  • 117. EX 7. Fixed ⇔ Adaptive / Dynamic [Elad and Aharon TIP 06] Ideas Adaptive approaches usually outperform the predefined/fixed ones. 82 / 94
  • 118. EX 8. Parametric ⇔ Non-parametric Probability density estimation. Parametric: assumes a specific functional form with parameter θ, e.g., a Gaussian distribution with unknown mean and variance, or a mixture of Gaussians. Parameter estimation: estimative approach, p(x) = p(x|θ_best); Bayesian approach, p(x) = ∫ a(θ) p(x|θ) dθ. Non-parametric: does not assume a specific form of the probability distribution, e.g., histogram, kernel density estimation (or Parzen window method). 83 / 94
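The difference shows up clearly on bimodal data. In this sketch, the parametric estimate fits a single Gaussian by maximum likelihood, while the non-parametric estimate is a Parzen window (kernel density) estimate with a Gaussian kernel. The function names, the sample values, and the bandwidth of 0.5 are assumptions for illustration.

```python
import math

def gaussian_density(x, mean, std):
    """Density of a 1D Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def fit_gaussian(samples):
    """Parametric: maximum-likelihood mean and standard deviation."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, math.sqrt(var)

def kde_density(x, samples, bandwidth):
    """Non-parametric: Parzen window estimate with a Gaussian kernel."""
    total = sum(gaussian_density(x, s, bandwidth) for s in samples)
    return total / len(samples)

samples = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]  # two well-separated modes
mean, std = fit_gaussian(samples)
for x in (0.0, 2.0):
    print(x, gaussian_density(x, mean, std), kde_density(x, samples, 0.5))
```

The single Gaussian puts its peak near 0, where there is no data, because its functional form cannot represent two modes. The kernel estimate follows the samples and is higher near 2 than near 0, at the cost of having to choose a bandwidth.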
  • 120. EX 9. Z - invariant Make your method robust to potential sources of performance degradation: noise such as additive Gaussian noise, impulse noise, or non-uniform noise (e.g., image restoration); translation shift (e.g., near-duplicate image/video detection, image search); scale change (e.g., object detection, feature extraction); perspective distortion (e.g., feature extraction); deformation (e.g., non-rigid registration, part-based object detection); pose variation (e.g., human pose estimation); lighting variation (e.g., face recognition); partial occlusion (e.g., object detection and recognition). 84 / 94
  • 128. EX 10. Z - aware [Wang et al. SIGGRAPH Asia 09] [Wang et al. SIGGRAPH 10] Motion-aware video resizing. Make your method aware of potential failure cases: motion (e.g., video processing); content (e.g., image processing); semantics (e.g., image and video indexing/retrieval); context (e.g., image understanding); occlusion (e.g., detection/tracking). 85 / 94
  • 133. Outline 1 Introduction 2 Five ways to come up with new ideas: Seek different dimensions (neXt = X^d); Combine two or more topics (neXt = X + Y); Re-think the research directions (neXt = X̄); Use powerful tools, find suitable problems (neXt = X↑); Add an appropriate adjective (neXt = Adj + X) 3 What is a bad idea? 86 / 94
  • 134. What is a bad idea? Naive combination of two or more methods: avoid a pipeline system paper. Blind application of tools: using X feature and Y classifier without motivation and justification. Following the hype: too many competitors. Doing something just because it can be done: do the right things, not just do things right. 87 / 94
  • 141. Thank you for your kind attention. Questions? For more complete materials, please visit my blog http://jbhuang0604.blogspot.com/ 94 / 94