Shedding Light on Software Engineering-specific Metaphors and Idioms
Drexel University, Virginia Commonwealth University
Mia Mohammad Imran, Preetha Chatterjee, Kostadin Damevski
Raining Cats and Dogs!
“Cause You Know Sometimes Words
Have Two Meanings”
Debt
Bug
Skeleton
Ticket
Are there new words emerging that are now
used differently?
Cloud Fog Edge
What about “Hallucinations”?
Figurative Language in SE
This has crept into unrelated bits of generator code
thanks to frankencoding!
However, there is a lot of unnecessary copy-paste
spaghetti code, uninformative variable names, etc.
which I didn't write myself
Oh wow, that’s even weirder than I thought lol. Quite
the heisenbug
Study Design and Goals
● Purpose: To investigate the prevalence, interpretation, and
impact of figurative language in Software Engineering
communications
● We designed 3 research questions (RQs)
Research Questions
● How well can LLMs interpret figurative language in a Software
Engineering context?
● Can Software Engineering-specific affective analysis
performance be improved by better insight into figurative
language?
● How does understanding figurative language impact
Software Engineering tasks like bug prioritization?
Data Collection and Annotation
Data Collection
● Sampled 2000 sentences (1000 with potential metaphors
and 1000 with potential idioms) from 9 popular GitHub
repositories
Data Annotation
● Verification of Figurative Expressions
● Rephrase sentences:
○ Equivalent Meaning Sentences (EMS): Sentences
reworded to remove figurative language while retaining
meaning
○ Different Meaning Sentences (DMS): Sentences
modified to change the meaning using similar words
Data Annotation: Example
Sentence: Otherwise this could give us a nasty bug
Equivalent Meaning Sentences (EMS): Otherwise this could
result in a dangerous error in code
Different Meaning Sentences (DMS): Otherwise, this neglected
garden could infest us with an unpleasant insect
Data Annotation
Annotators identified 1661 sentences with Figurative Language
● 752 sentences with Metaphors
● 909 sentences with Idioms
A total of 1741 unique Figurative Expressions marked
● 445 Software Engineering specific
● 1296 General
Research Questions
Research Question 1
● RQ: How well can LLMs interpret figurative language in a
Software Engineering context?
● Evaluated LLMs: BERT, RoBERTa, ALBERT, CodeBERT
○ BERT, RoBERTa, and ALBERT are general-domain LLMs
○ CodeBERT is an SE-specific LLM
RQ1: LLMs' Interpretation Capabilities
● Task: Assess models' abilities to
distinguish between EMS and DMS
● Calculated and compared cosine
similarity between (original
sentence, EMS) and (original
sentence, DMS) pairs
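The comparison described above can be sketched as follows. This is a minimal illustration, not the authors' code: the scoring rule (a triplet counts as correct when the original sentence is more similar to its EMS than to its DMS) is an assumption, the function names are hypothetical, and the toy vectors stand in for sentence embeddings that would come from a model such as BERT.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def interpretation_accuracy(triplets):
    """Fraction of (original, EMS, DMS) embedding triplets in which the
    original is closer to the EMS than to the DMS (assumed scoring rule)."""
    if not triplets:
        raise ValueError("need at least one triplet")
    correct = sum(
        1
        for original, ems, dms in triplets
        if cosine_similarity(original, ems) > cosine_similarity(original, dms)
    )
    return correct / len(triplets)


# Toy embeddings (hypothetical values, not model output).
toy_triplets = [
    ([1.0, 0.0, 1.0], [0.9, 0.1, 1.0], [0.0, 1.0, 0.0]),  # EMS is closer
    ([0.5, 0.5, 0.0], [0.0, 0.0, 1.0], [0.5, 0.4, 0.0]),  # DMS is closer
]
print(interpretation_accuracy(toy_triplets))
```

Under this assumed rule, a higher score means the model more often treats the literal paraphrase as closer in meaning than the look-alike sentence.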
RQ1: LLMs' Interpretation Capabilities
● BERT performed best
● CodeBERT suffered most, especially with SE-specific
figurative expressions
Model      SE-specific   General   Overall
BERT       84.51%        87.40%    86.57%
RoBERTa    83.70%        85.21%    84.95%
ALBERT     81.79%        85.80%    85.00%
CodeBERT   77.99%        79.63%    79.11%
Research Question 2
● RQ: Can SE-specific affective analysis performance be
improved by better insight into figurative language?
● Objective: Evaluate if improved figurative language
interpretation enhances LLMs' performance in SE affective
analysis
RQ2: Affective Analysis Enhancement
Analyzed Tasks:
● Emotion Detection
● Incivility Detection
Evaluated LLMs: BERT, RoBERTa, ALBERT, CodeBERT
● BERT, RoBERTa, and ALBERT are general-domain LLMs
● CodeBERT is an SE-specific LLM
RQ2: Affective Analysis Enhancement
● Applied contrastive learning
● LLMs presented with triplets (Original Sentence, EMS, DMS)
● Minimize the distance between anchor (original) and positive
(EMS) samples, maximize distance from negative (DMS)
● Process repeated until the LLMs learn a satisfactory
representation
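The objective in the bullets above can be illustrated with a triplet margin loss. This is a sketch under stated assumptions: the slides do not name the loss function, the distance metric, or the margin, so the Euclidean distance, the `margin` value, and the function names here are all hypothetical choices.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative (DMS) is at least `margin` farther from the
    anchor (original sentence) than the positive (EMS); positive otherwise."""
    d_pos = euclidean_distance(anchor, positive)
    d_neg = euclidean_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings (hypothetical values, not model output).
anchor = [0.0, 0.0]
ems = [0.0, 1.0]  # distance 1 from the anchor
dms = [3.0, 4.0]  # distance 5 from the anchor

print(triplet_margin_loss(anchor, ems, dms))  # well separated: loss is 0.0
print(triplet_margin_loss(anchor, dms, ems))  # roles swapped: loss is 5.0
```

Training would repeatedly update the encoder to reduce this loss over many triplets, which pulls each original sentence toward its EMS and pushes it away from its DMS.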
RQ2: Affective Analysis Enhancement
● Post Contrastive Learning: Task-specific fine-tuning applied
for emotion and incivility classification
○ This means each task involves two rounds of fine-tuning
● Performance Metric: F1-score used to evaluate
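For reference, an averaged F1-score over classes can be computed as below. The slides report an "Average F1-score" without saying how it is averaged, so the macro average (unweighted mean of per-class F1) used here is an assumption, and the labels and predictions are invented for illustration.

```python
def f1_for_class(y_true, y_pred, label):
    """F1-score for one class, treating `label` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label that appears."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, label) for label in labels) / len(labels)


# Invented labels for a two-class (civil / uncivil) example.
y_true = ["civil", "civil", "uncivil", "uncivil"]
y_pred = ["civil", "uncivil", "uncivil", "uncivil"]
print(round(macro_f1(y_true, y_pred), 3))
```

A macro average weights every class equally, which matters when classes are imbalanced; a weighted or micro average would give different numbers on the same predictions.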
RQ2: Affective Analysis Enhancement -
Emotion Classification
● 6 classes: Anger, Love, Fear, Joy, Sadness and Surprise
● Dataset [1]
Model         Average F1-score   Improvement
BERT          0.588
BERT-FL       0.627              +6.60%
RoBERTa       0.593
RoBERTa-FL    0.632              +6.66%
ALBERT        0.531
ALBERT-FL     0.550              +3.63%
CodeBERT      0.561
CodeBERT-FL   0.583              +3.90%
[1] M. M. Imran, Y. Jain, P. Chatterjee, and K. Damevski, “Data augmentation for improving emotion recognition in software engineering communication,” in 2022 IEEE/ACM 37th International Conference on Automated Software Engineering (ASE)
RQ2: Affective Analysis Enhancement -
Incivility Classification
● 2 classes: Civil and Uncivil
● Dataset [1]
Model         Average F1-score   Improvement
BERT          0.734
BERT-FL       0.783              +6.67%
RoBERTa       0.734
RoBERTa-FL    0.769              +4.76%
ALBERT        0.685
ALBERT-FL     0.713              +4.08%
CodeBERT      0.692
CodeBERT-FL   0.741              +7.07%
[1] I. Ferreira, B. Adams, and J. Cheng, “How heated is it? Understanding GitHub locked issues,” in 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR)
Research Question 3
● RQ: How does understanding figurative language impact SE
tasks like bug prioritization?
● Objective: Evaluate whether better figurative language
interpretation boosts SE automation tasks
RQ3: Enhancing SE Automation with
Figurative Language
Analyzed Tasks:
● Bug Priority Detection
Evaluated LLMs: BERT, RoBERTa, ALBERT, CodeBERT
● BERT, RoBERTa, and ALBERT are general-domain LLMs
● CodeBERT is an SE-specific LLM
RQ3: Enhancing SE Automation with
Figurative Language
● 5 classes: P1, P2, P3, P4, P5
● Dataset [1]
Model         Average F1-score   Improvement
BERT          0.716
BERT-FL       0.730              +1.96%
RoBERTa       0.707
RoBERTa-FL    0.724              +2.40%
ALBERT        0.683
ALBERT-FL     0.709              +3.71%
CodeBERT      0.714
CodeBERT-FL   0.726              +1.61%
[1] W.-Y. Wang, C.-H. Wu, and J. He, “Clebpi: Contrastive learning for bug priority inference,” Information and Software Technology, vol. 164, 2023
Implications and Future Work
Implications of Research Findings
● Educational Benefits:
○ Glossaries of project-specific figurative language can help
onboard new developers
○ Minimizing obscure jargon enhances understanding and
collaboration
● Cultural Considerations:
○ Consider cultural differences influencing figurative language
interpretation
Future Research Directions
● Integrate figurative language into SE tools/models
● Investigate the role of figurative language in specific scenarios
(toxicity, bug reports, documentation, etc.)
● Explore figurative language for data augmentation
● Broaden to other types (similes, hyperbole, personification)
● Extend to other SE platforms (Stack Overflow, Gitter, JIRA)
Summary of Contributions
● Annotated Data: 1661 annotated GitHub sentences with
metaphors and idioms
● Open Resources: Annotation guidelines, dataset, and code
publicly accessible
● Pioneering Research: Among the first to explore the impact
of figurative language in SE
● LLM Enhancement: LLMs refined for better figurative
language understanding in an SE context
Questions/Thoughts/Collaboration Ideas to: Mia Mohammad Imran, imranm3@vcu.edu
