PhD Seminar - Masud Rahman, University of Saskatchewan
1. SUPPORTING SOURCE CODE SEARCH WITH
CONTEXT-AWARE AND SEMANTICS-DRIVEN
QUERY REFORMULATION
Masud Rahman, PhD Candidate
Department of Computer Science
University of Saskatchewan, Canada
Advisor: Dr. Chanchal Roy
@masud233
3. MASUD RAHMAN: ACADEMICS
2019: PhD Candidate, University of Saskatchewan (Award: Dr. Keith Geddes Award)
2014: MSc, University of Saskatchewan (Award: Best MSc Thesis Nomination)
2009: BSc, Khulna University, Bangladesh (Award: President Gold Medal)
Masud Rahman, PhD Candidate, U of S
4. TALK OUTLINE
Part 1: Research Problem
Part 2: PhD Thesis
Part 3: Future Works
Part 4: Q&A + Discussions
9. BUG REPORT & CHANGE REQUEST
10. BUG LOCALIZATION & CONCEPT LOCATION
Q1: How can we fix software bugs using the bug reports? (Bug Localization)
Q2: How can we add/improve features in the existing software? (Concept Location)
11. THREE TYPES OF CODE SEARCH
(1) Bug Localization
(2) Concept Location
(3) Internet-scale Code Search
12. SEARCH FOR THE BUGGY CODE
[Diagram: a customer using the software submits a bug report; a software developer performs code search over the software codebase, reformulating the query as needed]
* Kevic & Fritz, ICSE 2014
13. SEARCH FOR THE RELEVANT CODE
[Diagram: a developer searches an Internet-scale codebase with a reformulated query]
* Bajracharya and Lopes, EMSE 2012
19. QUIZ TEST: QUERIES FROM A CHANGE REQUEST
[Change request with Title and Description; corpus of ~11K documents]

ID  Query                                                      QE
1   Custom search results view iresource                       1331
2   Custom search results search results view                  636
3   element iresource provider level tree                      01
4   Custom search results hierarchically java search results   570
20. TF-IDF: TERM IMPORTANCE (TRADITIONAL)
University of Saskatchewan
The Saskatchewan Huskies football team
represents the University of Saskatchewan
in U Sports football that competes in the
Canada West Universities Athletic
Association conference of U Sports. The
program has won the Vanier Cup national
championship three times, in 1990, 1996
and 1998.
The Saskatchewan Huskies
became only the second U Sports team to
advance to three consecutive Vanier Cup
games, after the Saint Mary's Huskies, but
lost all three games from 2004-2006. The
team has won the most Hardy Trophy
titles in Canada West, having won a total
of 20 times. The 2006 Saskatchewan
Huskies became only the third team to
play in a Vanier Cup that their school was
hosting, when the University of
Saskatchewan hosted the 42nd Vanier
Cup. The Toronto Varsity Blues were the
first when they won two Vanier Cups in
1965 and 1993. Saskatchewan also
became the first western school to host
the national championship game.
Term           TF    IDF     TF x IDF
Saskatchewan    6    0.01    0.06
Vanier          5    0.1     0.5
Won             4    0.1     0.4
Huskies         4    0.1     0.4
Cup             4    0.01    0.04
Team            4    0.01    0.04
Sports          3    0.02    0.06
Times           2    0.03    0.06
School          2    0.05    0.1
Championship    2    0.03    0.06

IDF = log (N / DF)
Query: Saskatchewan Huskies
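The TF-IDF scoring above can be sketched in a few lines; the toy corpus and tokenization below are invented for illustration, not taken from the talk's dataset.

```python
import math
from collections import Counter

def tf_idf(term, doc_tokens, corpus):
    """Score one term: raw term frequency times inverse document frequency."""
    tf = Counter(doc_tokens)[term]
    df = sum(1 for d in corpus if term in d)           # documents containing the term
    idf = math.log10(len(corpus) / df) if df else 0.0  # IDF = log(N / DF)
    return tf * idf

# Toy corpus: three "documents" as token lists (invented example).
corpus = [
    ["saskatchewan", "huskies", "won", "vanier", "cup"],
    ["saskatchewan", "university", "campus"],
    ["saskatchewan", "football", "team"],
]
doc = corpus[0]
# "saskatchewan" appears in every document, so its IDF (and TF-IDF) is 0;
# "vanier" appears in only one document, so it scores higher despite equal TF.
```

This is why a frequent but ubiquitous term like "Saskatchewan" ranks low while a rarer term like "Vanier" ranks high.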
21. TEXTRANK: TERM IMPORTANCE USING CO-OCCURRENCES (MIHALCEA ET AL., EMNLP 2004)
IResource … IJavaElement
IResource … IJavaElement
(Term Co-occurrence)
22. POSRANK: TERM IMPORTANCE USING SYNTACTIC DEPENDENCE (BLANCO & LIOMA, INF. RETR. 2012)
Jespersen Rank Theory
(Syntactic Dependence)
Noun Verb Adjective
Element …reported, element …plain
(Syntactic Dependence)
23. STRICT: QUERY KEYWORD SELECTION WITH PAGERANK (BRIN & PAGE, 1998)

S(V_i) = (1 - d) + d * Σ_{V_j ∈ In(V_i)} [ S(V_j) / |Out(V_j)| ]
•Element
•Iresource
•Provider
•Level
•Tree
Candidate
Query 1
Candidate
Query 2
PageRank
Algorithm
Best Query
24. EXPERIMENT, DATASET & METRICS
~3K Change Requests + Version History → Ground Truth
Metrics: 1. Hit@K, 2. MAP@K, 3. MRR@K, 4. QE
7 RQs
29. BUG REPORT QUALITY: A CLOSER LOOK
5000+ bug reports
30. ONE SIZE DOES NOT FIT ALL
Traditional Idea vs. Proposed Idea
[Illustration: equality vs. equity at a ballgame. Can everybody watch the game? With the traditional (equal) treatment: No. With the proposed (equitable) treatment: Yes.]
32. STEP I: REFORMULATING NOISY BUG REPORT
33. STEP II: REFORMULATING NOISY BUG REPORT
[Trace graph: consecutive trace entries i and j each contain a package (P), a class (C), and a method (M); class and method are statically connected within an entry and hierarchically connected across entries]
34. STEP III: REFORMULATING NOISY BUG REPORT
S(V_i) = (1 - d) + d * Σ_{V_j ∈ In(V_i)} [ S(V_j) / |Out(V_j)| ]
[Trace graph with class and method nodes: Ci, Cj, Mk, Mn, Cp]
35. SEARCH QUERY FROM NOISY BUG REPORT
Bug 31637 – should be able to cast null
NullPointerException
Suggested query keywords: Ci Cj Mk Mn Cp
QE: 53 (baseline) → 01 (ours)
36. SEARCH QUERY FOR POOR BUG REPORT
Poor Bug Report
Suggested query: compliance create preference add configuration field dialog annotation
QE: 30 (baseline) → 01 (ours)
37. SEARCH QUERY FROM RICH BUG REPORT
Rich Bug Report
Suggested query: astvisitor post postvisit previsit pre file post pre astnode visitor
QE: 27 (baseline) → 01 (ours)
38. EXPERIMENT, DATASET & METRICS
5K+ Bug Reports + Version History → Ground Truth
Metrics: 1. Hit@K, 2. MAP@K, 3. MRR@K, 4. QE
4 RQs
44. STEP I: CONSTRUCTION OF SEMANTIC HYPERSPACE USING STACK OVERFLOW
Pipeline: Stack Overflow corpus → Data preprocessing → FastText (a neural text classifier) trained with skip-gram → Semantic hyperspace
45. STEP I: SEMANTIC HYPERSPACE EXPLAINED
[Illustration: food terms as points in the hyperspace, e.g., Coffee → P(1, 5, 6, 7, …, N), Tea → P(2, 4, 6, 9, …, N), Pasta → P(7, 9, 0, 1, …, N)]
Semantic distance = Cosine distance
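The "semantic distance = cosine distance" idea can be sketched as follows; the vectors are hypothetical food-term coordinates in the spirit of the slide, not real FastText embeddings.

```python
import math

def cosine_distance(u, v):
    """Cosine distance = 1 - cosine similarity between two term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

# Hypothetical embeddings: tea and coffee should sit closer together
# than tea and pasta in the semantic hyperspace.
coffee = [1.0, 5.0, 6.0, 7.0]
tea    = [2.0, 4.0, 6.0, 9.0]
pasta  = [7.0, 9.0, 0.0, 1.0]
```

With these vectors, cosine_distance(tea, coffee) is much smaller than cosine_distance(tea, pasta), which is the model's notion of semantic closeness.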
54. STEP I: KEYWORD-API MAPPING DATABASE
Pipeline: Question title → Preprocessing → NL keywords; Accepted answer → Code segment extraction → API parsing → API classes; Keyword-API linking → Keyword-API mapping database
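A minimal sketch of the keyword-API linking step, assuming each Q&A pair has already been reduced to title keywords and to the API classes parsed from the accepted answer's code segment; the pairs below are invented for illustration.

```python
from collections import defaultdict

def build_mapping(qa_pairs):
    """Link each preprocessed title keyword to the API classes found in
    the accepted answer's code segment (the keyword-API linking step)."""
    mapping = defaultdict(set)
    for title_keywords, api_classes in qa_pairs:
        for kw in title_keywords:
            mapping[kw].update(api_classes)
    return mapping

# Invented Stack Overflow pairs: (title keywords, API classes in accepted answer).
qa_pairs = [
    (["parse", "html", "java"], ["Jsoup", "Document", "Element"]),
    (["read", "html", "file"], ["BufferedReader", "Document"]),
]
mapping = build_mapping(qa_pairs)
```

Each natural-language keyword then indexes the API classes it has co-occurred with, which is the raw material for the relevance ranking in the next step.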
55. STEP II: API RELEVANCE RANKING
Example query: "How to parse HTML in Java?"
Keywords: parse, HTML, Java; candidate APIs: Element, Document, Jsoup
Ranking signals: Keyword-API Co-occurrence, Keyword Pair-API Co-occurrence, Keyword-Keyword Coherence
56. STEP II: SEARCH QUERY REFORMULATION
Initial query: HTML parser in Java

Technique   Reformulated Query
Baseline    HTML parser Java
RACK        {HTML parser Java} + {Document Element File IOException Jsoup}
57. EXPERIMENT: DATASET COLLECTION
Java2s
175 Queries & Ground truth
769K Code segments
Metrics: 1. Hit@K, 2. MAP@K, 3. MRR@K, 4. QE, 5. MR@K, 6. NDCG
63. T1: BUG REPRODUCIBILITY & BUG FIXING
Bug Reproduction → Bug Localization → Bug Understanding → Bug Fixing
64. T2: IMPROVED TEXT RETRIEVAL IN SOFTWARE ENGINEERING
S(V_i) = (1 - d) + d * Σ_{V_j ∈ In(V_i)} [ S(V_j) / |Out(V_j)| ]
TF(t, d) = 1 + log f(t, d);  IDF(t) = log(D / n_t)
20+ SE tasks
P3
65. T3: GENETIC ALGORITHM FOR QUERIES
Method       Search Query                                          QE
Baseline     {title + description}                                  25
STRICT[140]  {tab classpath enabled buttons user entry}             86
TF-IDF       {button entry bootstrap enabled incorrectly moving}   177
GA           {open reflect tab bottom entry classpath}              01
(from a change request with Title and Description; lower QE is better)
72. EXPERIMENT, DATASET & METRICS
Java2s, CodeJava
310 Queries & Ground truth
769K Code segments
Metrics: Hit@K, MAP@K, MRR@K, MR@K, QE, NDCG
73. WHAT IS A GOOD SEARCH QUERY?
[Illustration: the baseline query (Title + Description), a worse query, and a better query each return the correct result at different ranks]
75. TWO WORKING CONTEXTS: LOCAL & GLOBAL
Local code search (e.g., bug localization) targets a local codebase (e.g., Boeing's); Internet-scale code search targets GitHub.
76. S2: KEYWORD SELECTION FROM SOURCE CODE WITH CODERANK
resolveRuntimeClasspathEntry
Resolve Runtime Classpath Entry
S(V_i) = (1 - d) + d * Σ_{V_j ∈ In(V_i)} [ S(V_j) / |Out(V_j)| ]
RQ1 [Source Code]: Keywords selected by PageRank are more effective for local code searches (e.g., concept location) than those selected by TF-IDF.
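The identifier splitting shown above (resolveRuntimeClasspathEntry → Resolve Runtime Classpath Entry) can be sketched with a regular expression; this is an illustrative splitter, not the exact implementation inside CodeRank.

```python
import re

def split_identifier(identifier):
    """Split a camelCase source code identifier into natural-language tokens."""
    # Match acronym runs followed by a new word, ordinary capitalized or
    # lowercase words, remaining uppercase runs, and digit runs.
    parts = re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", identifier)
    return [p.capitalize() for p in parts]
```

For example, split_identifier("resolveRuntimeClasspathEntry") yields the four tokens Resolve, Runtime, Classpath, Entry, which can then feed the term graph.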
77. HOW DID WE DO?
RQ3: Appropriate query keywords can be delivered for code search using Stack Overflow and FastText.
78. R3: SOLVE VOCABULARY MISMATCH ISSUE
[Diagram: the customer who writes the bug report, the past developer who wrote the codebase, and the current developer each use their own vocabulary]
80. R4: GENETIC ALGORITHM FOR QUERIES
Method       Search Query                                          QE
Baseline     {title + description}                                  25
STRICT[140]  {tab classpath enabled buttons user entry}             86
TF-IDF       {button entry bootstrap enabled incorrectly moving}   177
GA           {open reflect tab bottom entry classpath}              01
(from a change request with Title and Description; lower QE is better)
81. SEARCH QUERY FROM NOISY BUG REPORT
Bug 31637 – should be able to cast null
NullPointerException
Suggested query keywords: Ci Cj Mk Mn Cp
QE: 53 (baseline) → 01 (ours)
84. KEYWORDS FROM A BUG REPORT
[Bug report with Title and Description]

ID  Query                                                      QE
1   Custom search results view iresource                       1331
2   Custom search results search results view                  636
3   element iresource provider level tree                      01
4   Custom search results hierarchically java search results   570

Lower QE is better
Hello everyone! Good afternoon! Thanks for attending this meeting.
My name is Masud Rahman. I am a PhD Candidate from Software Research Lab.
I work with Dr. Chanchal K. Roy.
Today, I will be talking about automated query reformulations for code search.
So, Who am I?
I came to Canada back in 2012 as a young graduate student. That was my first winter.
I got these two lovely young ladies by my side 24/7.
I am a member of Software Research Lab, University of Saskatchewan.
I work with Dr. Chanchal Roy.
We have a pretty big group there, and we do a lot of things besides research.
A little bit of background about Me:
Currently, I am a PhD Candidate at USASK.
I completed my MSc in Software Engineering from the same university in 2014.
Before that, I completed my BSc in Computer Science & Engineering from Khulna University, back in 2009.
Today, my talk will be divided into four sections.
In the first section, I will discuss the research problem I am trying to solve in my PhD.
In the second section, I will discuss about my PhD Thesis proposals to solve that research problem.
In the third section, I will summarize my PhD contributions.
Finally, we will have a Q&A session and interesting discussions.
Part 1: Research Problem
Now, let's say a Boeing customer has submitted a bug report.
Now, a Boeing developer is responsible for locating and repairing the faulty code triggering that bug.
As a frequent practice, the developer chooses a few important keywords and attempts to locate the buggy code within the Boeing codebase.
But the study shows that 88% of the keywords chosen by the developer could be incorrect. That is, they do not return the buggy code.
So, the obvious next step is to reformulate the query through automated tool support, so that the buggy code can be located.
There are also tools that take a bug report and suggest appropriate search queries in the first place.
So, we are interested in this part of the process, and my PhD focuses on it.
As you can also see that, Google does not have any jurisdiction in this case.
So far, we deal with query reformulation for a single codebase.
But you know, besides this local codebase, developers also look for source code in Internet-scale cross domain codebases.
Now, that is a whole new game.
Study shows that developers might fail 88% of the times to retrieve the correct code segments.
So, we did two other studies, and we extensively used Stack Overflow.
Now, we are done with Background concepts, Part 1.
Now, we are going into Part 2 -- PhD Thesis
So what did we do? We did a systematic literature survey using 56 primary studies on query reformulation for code search.
During this study, we found 3 major issues in the literature.
Let us see an example.
This is a bug report; this is the title, and this is the description.
Now, developer JOE would use this bug report to localize the bug in the source code.
Now he chose some ad hoc queries.
Which one is the best, do you think? PAUSE!
Well, let's see. This one returns the correct result at this position. That means the developer needs to check 1300+ results before reaching the correct result if he tries this query.
… oh… this one is the best.
So, selecting appropriate keywords from the bug report is not that simple.
Similarly, we can see the phrases and dependencies among the terms in the bug report texts as well.
Our job is to identify the keywords from these texts, right?
So, what did we do?
We considered the co-occurrences among the terms; that is, how terms occur with other terms within a certain context.
We encode such co-occurrences as edges, and transform the texts into a graph like this.
Besides term co-occurrences, we consider another aspect called syntactic dependencies.
For this, we used Jespersen Rank Theory, a theory developed back in 1925.
According to this theory, the parts of speech of a sentence can be divided into three ranks:
nouns in the first rank, verbs and adjectives in the second rank, and the rest in the third rank.
According to Jespersen, verbs and adjectives modify nouns. That is, there are syntactic dependencies between "element" and "reported", and between "element" and "plain", that convey the overall meaning of the sentence.
Now, we capture such syntactic dependencies as well, and transform the report texts into a POS graph as well.
So, we have created two graphs, right?
Now, we have two graphs developed from the bug report based on two different dimensions
--Word co-occurrence and syntactic dependence.
Once we have graphs, we apply this famous algorithm called PageRank algorithm. This is the backbone of Google search.
Now, the algorithmic details are a bit complex, but I will try to provide an overview here.
Why do you think this guy is laughing? Because it is getting the maximum votes.
Similarly, in the graph, the node that is connected to most of the nodes is the winner.
That is, a term’s importance will be determined by its connectivity with other nodes.
More importantly, since this is a recursive algorithm, the importance depends on the weights of the connected nodes as well.
Once the computation is done, we get a reformulation candidate from each graph.
What is a reformulation candidate? A ranked list of keywords like this.
So, we collect two candidates from the two graphs, apply machine learning, and suggest the best one as our query from the bug report.
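The PageRank computation just described can be sketched as follows, assuming an undirected co-occurrence graph stored as adjacency lists; the graph and the damping factor d = 0.85 are illustrative, not the exact tool configuration.

```python
def pagerank(graph, d=0.85, iterations=50):
    """Iteratively score nodes with the TextRank/PageRank update:
    S(Vi) = (1 - d) + d * sum over incoming Vj of S(Vj) / |Out(Vj)|."""
    scores = {v: 1.0 for v in graph}
    incoming = {v: [u for u in graph if v in graph[u]] for v in graph}
    for _ in range(iterations):
        scores = {
            v: (1 - d) + d * sum(scores[u] / len(graph[u]) for u in incoming[v])
            for v in graph
        }
    return scores

# Invented co-occurrence graph from bug report text
# (undirected edges encoded in both directions).
graph = {
    "element":   ["iresource", "provider", "tree"],
    "iresource": ["element", "provider"],
    "provider":  ["element", "iresource", "level"],
    "level":     ["provider"],
    "tree":      ["element"],
}
scores = pagerank(graph)
ranked_keywords = sorted(scores, key=scores.get, reverse=True)
```

Well-connected terms such as "element" and "provider" end up with higher scores than peripheral terms such as "level" and "tree", which is exactly the "most votes wins" intuition.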
Now, for the experiments, we chose 8 subject systems from Apache and Eclipse.
We collect about 3,000 bug reports and try to map them to the version control history at GitHub.
Through such mapping we extract the ground truth for the bug reports.
This is a standard process followed by the existing literature.
Now let me explain the metrics a bit since we will be using these a lot.
Hit@K is the percentage of the queries for which at least one ground truth is found within the top K results.
MAP is standard precision that also accounts for result position; the details are more complex, and I can discuss them later.
MRR is the inverse of the rank of the first ground truth within the results.
QE, or query effectiveness, is the rank of the first correct result, essentially the inverse of MRR.
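These metrics can be sketched as follows; the result lists and ground truth sets below are hypothetical.

```python
def first_rank(results, ground_truth):
    """1-based rank of the first ground truth item in the results, else None."""
    for i, r in enumerate(results, start=1):
        if r in ground_truth:
            return i
    return None

def hit_at_k(all_results, all_truths, k):
    """Hit@K: fraction of queries with a ground truth item in the top K results."""
    hits = sum(1 for res, gt in zip(all_results, all_truths)
               if first_rank(res[:k], gt) is not None)
    return hits / len(all_results)

def mrr(all_results, all_truths):
    """MRR: mean reciprocal rank of the first correct result per query."""
    ranks = [first_rank(res, gt) for res, gt in zip(all_results, all_truths)]
    return sum(1.0 / r for r in ranks if r) / len(ranks)

# QE (query effectiveness) is first_rank itself: the lower, the better.
results = [["a", "b", "c"], ["x", "y", "z"]]
truths = [{"b"}, {"q"}]
```

Here the first query finds its ground truth at rank 2 (QE = 2) and the second never finds it, so Hit@3 is 0.5 and MRR is 0.25.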
We did an empirical study with 5K+ bug reports in our ICSE poster.
And we discovered that bug reports could be very different in terms of quality.
There could be different types of bug reports.
It could be noisy, with stack traces (16%).
It could be really poor, containing no structured entities (30%).
Or it could be rich, including source code, test cases, and other artifacts (54%).
So, clearly there are different quality levels for bug reports.
Now, what do the existing approaches do?
They do not look at the bug report or its quality. Rather, they apply the same treatment to all.
But one size does not fit all, as you know.
So what do we do? We propose a much more balanced way to deal with this.
We prefer equity over equality in our treatment of bug reports.
So, here comes our work!
So, first we take a bug report as input.
Then we apply regular expressions to identify the structured components.
We then classify whether it is:
a noisy report containing stack traces,
a poor bug report containing only regular texts, or
a rich bug report containing source code and texts.
Once the quality level is identified, what’s the next step?
Well, we do query reformulation unlike the earlier studies.
We separate signal from noise in the noisy report, and feed the poor bug report with appropriate keywords.
We mostly keep the rich bug report as is.
So, that is the equity approach.
Now, lets go a bit deeper with this.
From a noisy bug report, we first extract the stack traces.
It might contain hundreds of traces like this.
We choose any two consecutive traces. Let's call them i and j.
Now, each trace will contain three pieces of information: a package name, a class name, and a method name.
We know that, in a single line, the method and class are statically connected for sure.
However, classes and methods are also hierarchically dependent across trace lines due to caller-callee relationships.
We capture such static and hierarchical relationships from consecutive trace lines, and develop a trace graph like this.
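A minimal sketch of extracting (package, class, method) triples from Java-style trace lines and forming the static and hierarchical edges; the trace format and helper names are assumptions for illustration, not the exact implementation.

```python
import re

# Assumed Java trace line format: "at package.Class.method(File:line)".
TRACE = re.compile(r"at\s+([\w.]+)\.(\w+)\.(\w+)\(")

def parse_trace(lines):
    """Extract (package, class, method) from each stack trace entry."""
    entries = []
    for line in lines:
        m = TRACE.search(line)
        if m:
            entries.append(m.groups())
    return entries

def trace_edges(entries):
    """Static edge: class-method within a line; hierarchical edge:
    caller class to callee class across consecutive lines."""
    static = [(c, m) for _, c, m in entries]
    hierarchical = [(entries[i + 1][1], entries[i][1])
                    for i in range(len(entries) - 1)]
    return static, hierarchical
```

Feeding the resulting edges into a graph and ranking the nodes then yields the most important trace keywords.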
So, from a noisy report, we extract
The report title
The encountered exception
The most important keywords from the stack traces.
Then we do the search with this newly constructed query.
For example, the baseline noisy query returns the result at 53rd position.
Whereas our query returns the correct result at the topmost position.
In the case of poor bug report, we also apply a similar PageRank approach.
But we collect the keywords from the source code using pseudo-relevance feedback.
The details can be found in the paper.
So, the bug report texts are merged with the keywords from relevant source code.
While the bug report texts alone return the result at the 30th position, after feeding the poor report with appropriate keywords, the correct result is returned at the topmost position.
The rich bug report is inherently good for IR-based localization.
But we found that selecting appropriate keywords can make it better.
For example, if the bug report is reduced to these keywords, the correct result comes to the top of the list.
First, we construct a semantic hyperspace using Stack Overflow corpus.
What is hyperspace?
Now, if a space has more than 3 dimensions, we call it a hyperspace.
How do we do it?
First, we take the Stack Overflow data dump, which contains software-specific texts. Our corpus contains about 2.1 million questions and answers.
We do pre-processing and feed the contents to FastText. Now FastText generates a three-layer neural network model.
This model essentially represents the whole vocabulary like this in a hyperspace.
Now how does it help?
Here we see that burger is close to sandwich. Why? Because they are eaten together? I do that all the time.
Well, that is not the case.
They are mentioned in similar contexts by people across the whole corpus.
The model recognizes such occurrences and thus puts burger and sandwich close together.
Similarly, dumpling and ramen are close to each other.
Now, we propose this: this is the original query, and this is the reformulated query.
A good reformulated query will cluster together with the original query.
A bad reformulated query will NOT cluster with the original query.
So, clustering tendency within the hyperspace is our weapon here.
We calculate the Hopkins statistic and polygon area to measure clustering tendency.
Now, here is how we do it. We extract three reformulation candidates from the source code based on IR and data analytics.
Then we apply data analytics as a proxy for query quality, do data resampling and machine learning, and then identify the best reformulated query.
Now, this is a poor bug report; it does not contain any useful hints for bug localization.
Now this is the baseline initial query.
This is from the state-of-the-art
And this is our performance.
Now, I am not going to discuss those studies in details.
But here is the glimpse.
Developers generally look for relevant code on the web using natural language query.
Please note that we are not talking about simple web search, but rather about source code repositories such as GitHub.
Now, GitHub provides this result. Now, you see it tries to match the query keywords with comment and identifiers.
But we are dealing with source code, right? So, we need a source-code-friendly query for a better result.
So, we identify relevant API classes against this natural language query through extensive data mining and data analytics.
And once again, Stack Overflow is our friend in this grand challenge.
For the API suggestion, we collect natural language queries from four tutorial sites, such as KodeJava.
We collect 300+ queries, and also collect the ground truth API classes for them.
Then we determine whether our approach can suggest appropriate API classes for those queries by mining crowd knowledge from Stack Overflow.
For the query reformulation part, we collect 4K code examples from GitHub and combine them with our ground truth code segments from the tutorial sites.
Then we determine whether our reformulated query actually works or not.
Now, I care about the replication and reproducibility of my work, in order to grow as a research community.
So, all of my works can be replicated and potentially reproduced (hopefully).
I was busy during my PhD, and my works are uploaded to GitHub.
Many of them are open source.
I am a member of replication workshop and a PC member of replication track, ICPC.
Now, how far would I go in the next 5-10 years?
Now, if you remember this diagram.
So far, I focused on Bug Localization in my PhD mostly.
In the next 5 years, I would focus on Bug Understanding and fixing, the next logical steps of software debugging.
Right now, I am working on bug reproducibility.
That is, how to make a reported bug reproducible. Without that, a bug most probably cannot be fixed.
So, I compared TF-IDF and PageRank.
But so far, I did so only for query reformulation.
But there are 20+ software engineering tasks that use TF-IDF in one way or another.
Now, a cool thing would be to investigate how PageRank can influence those tasks, given that PageRank outperformed TF-IDF in query reformulation.
It would be really interesting to see.
Another plan I have is to develop a programmer bot.
It will accept a natural language query from developer JOE and return high-quality relevant code.
Now, in the background, this will happen.
First, the query will be translated into a search-engine-friendly query with relevant API classes, and thousands of examples will be collected.
Then we will use some sophisticated machine learning to determine the best quality code example.
Now, I have figured out the first part.
The rest is still pending, and I hope to get it done with some brilliant grad students.
Over the years, I worked with experts from my domain, and learned from the bests.
I worked with academia as well as industry.
For example, Vendasta is a leading company based in Saskatoon, and we got our NSERC Industry Engage grant in collaboration with them.
So my sincere thanks and gratitude to all the collaborators.
Now, these are some achievements of my work.
Got several competitive awards over the last few years.
Got funding from NSERC. Thanks U of S and my professor for all the support.
Also got accepted at flagship conferences/journals such as ICSE, FSE, ASE, and EMSE.
Thanks for your time and attention.
Now, I am ready to take your questions.
Now let's expand and generalize the problem a bit.
So far, we have discussed code search within a local codebase.
It could also be in the large-scale open source repository such as GitHub.
Now, based on these contexts, there are different challenges in query reformulation.
The local codebase is small, domain specific and organized.
On the contrary, GitHub is huge, cross-domain and very noisy.
So, yes, they need different strategies to suggest queries for them.
Now, once such items are extracted, we split them.
Now, as we see, these single terms share some kind of semantics to convey a broader meaning.
That is, they complement each other in this context.
Now, we capture such semantic dependencies in the source code, and develop a term graph like this.
That is, each of the three people (customer, past developer, and JOE) has their own vocabulary to describe a certain problem or concept.
In fact, the probability that any two people will describe the same problem with the same vocabulary is only 15%-20%.
So, naturally, developer JOE finds it a great challenge to make a connection between the bug report and the buggy code.
This costs development time, money and valuable efforts.
Now the question is, why is this so challenging?
The answer is the vocabulary mismatch problem. In fact, this is a common problem for any type of document search.
Here we see both guys are looking at the same object, but they are explaining it differently.
That is, they are both correct from their own perspectives, but wrong from the other's perspective.
This also actually happens with bug reports as well.
The probability that both the customer and the developer will explain the same problem using the same terminology is only 15%.
That is why selecting appropriate keywords from the bug report is very challenging.