Mapping the PubMed data under different sub-topics
Email: venkykasprov@gmail.com
Venkatasubramani Karthikeyan
PROBLEM STATEMENT
Analogy Implementation
PROBLEM SOLVING APPROACH
Traditional approach
Data cleaning
Bag of words
Classification and clustering
Pre-Trained Model approach
No data cleaning required
BERT, BART & DeBERTa
[Figure: ORIGINAL CATEGORIES | CATEGORIES CONSIDERED]
Traditional approach
• Bag of words
• After removing stop words and stemming
• Using count vectorizer
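The bag-of-words step above can be sketched in plain Python. This is a minimal illustration, not the deck's actual code: the stop-word list, the `crude_stem` helper and the two sample sentences are assumptions made for the example, and a real pipeline would more likely use a library count vectorizer and a proper stemmer.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "of", "and", "in", "a", "is", "are", "to", "for"}


def crude_stem(word):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase, split on non-letters, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(w) for w in words if w not in STOP_WORDS]


def bag_of_words(docs):
    """Return (vocabulary, count matrix) with one row per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix


# Hypothetical sample documents.
docs = [
    "Proteins are folding in the cell",
    "The cell cycle and cell division",
]
vocab, matrix = bag_of_words(docs)
print(vocab)
print(matrix)
```

Each row of `matrix` is one document and each column is the count of one vocabulary term, which is the representation the classifiers and clustering methods on the following slides would consume.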
Traditional approach
• Classification
• Logistic regression
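As a rough sketch of how logistic regression classifies count vectors, the following fits a binary model by batch gradient descent on the log loss. The toy feature matrix, the learning rate and the epoch count are assumptions chosen for illustration; the deck does not state how its model was trained.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, label in zip(X, y):
            p = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)
            error = p - label  # gradient of the log loss w.r.t. the score
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


def predict_proba(weights, bias, row):
    """Probability that `row` belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)


# Hypothetical term counts: column 0 could be "cell", column 1 "server".
X = [[2, 0], [3, 0], [0, 2], [0, 3]]
y = [1, 1, 0, 0]
weights, bias = train_logistic_regression(X, y)
print(predict_proba(weights, bias, [2, 0]), predict_proba(weights, bias, [0, 2]))
```

For more than two sub-topics, a one-vs-rest or softmax extension would be needed; that is outside this sketch.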
Traditional approach
• Classification (cont)
• Decision Tree
  Entropy
  Information Gain
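The two quantities named on the decision-tree slide can be computed directly. The sketch below uses the standard definitions (Shannon entropy in bits, and information gain as parent entropy minus the size-weighted entropy of the children); the label names and the two example splits are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: H = -sum(p * log2(p)) over class proportions."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical labels at a tree node, and two candidate splits of that node.
parent = ["bio", "bio", "it", "it"]
perfect_split = [["bio", "bio"], ["it", "it"]]
useless_split = [["bio", "it"], ["bio", "it"]]

print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

A decision tree picks, at each node, the split with the highest information gain, so the first split would be preferred over the second.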
Traditional approach
• Classification (cont)
• Random Forest
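The core idea behind a random forest, training many trees on bootstrap samples and taking a majority vote, can be sketched with one-split trees (stumps). This is a simplification under stated assumptions: a real random forest grows deeper trees and also samples a random subset of features at each split, and the data, `n_trees` and `seed` values here are illustrative only.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Find the single (feature, threshold) split with the fewest training errors."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[feature] <= threshold]
            right = [lab for row, lab in zip(X, y) if row[feature] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(lab != left_label for lab in left)
            errors += sum(lab != right_label for lab in right)
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(X, y, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical term counts and sub-topic labels.
X = [[2, 0], [3, 0], [0, 2], [0, 3]]
y = ["bio", "bio", "it", "it"]
forest = fit_forest(X, y)
print(predict_forest(forest, [4, 0]), predict_forest(forest, [0, 4]))
```

Individual stumps trained on an unlucky bootstrap sample may be wrong, but the vote across many of them is what makes the ensemble more stable than a single tree.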
Traditional approach
• Clustering
• Hierarchical clustering
• HDBSCAN
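To make the hierarchical-clustering idea concrete, here is a small agglomerative sketch that repeatedly merges the two closest clusters. It uses single linkage and Euclidean distance, both of which are assumptions for the example. It does not implement HDBSCAN, which is density-based and considerably more involved.

```python
import math


def single_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Cluster distance is the smallest distance between any two members
    (single linkage). Returns clusters as sorted lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical 2-D document embeddings: two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (0.2, 0.1)]
clusters = single_linkage(points, n_clusters=2)
print(clusters)
```

Stopping at a fixed `n_clusters` is one way to cut the merge hierarchy; recording every merge instead would give the full dendrogram.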
Pre-trained model approach
• Transformer
• Hugging Face Transformers
Pre-trained model approach
• BERT (Bidirectional Encoder Representations from Transformers)
• Developed by Google in 2018.
• Notable for its bidirectional training approach: each token's representation draws on both its left and right context.
• BERT is pre-trained on a large corpus of unlabeled text data.
     id   parent_title  level_3  labels                 scores
126  293  Big Data      0        Bio-IT                 0.645831
127  293  Big Data      1        Big Data               0.612736
128  293  Big Data      2        Healthcare Technology  0.602229
129  293  Big Data      3        Disease Processes      0.521784
• 🎉 40th Anniversary Special: IBM unveils the
eServer zSeries 890 (z890) mainframe, celebrating four
decades of their System/360 mainframe legacy.
• 💡 Breakthrough Tech: z890 introduces groundbreaking
tech aimed at simplifying IT environments, tailored especially
for medium-sized businesses.
• 💪 Powerhouse Performance: z890 offers almost double the
processing power of the preceding z800 series but starts 30%
smaller in capacity.
• 🔒 Enhanced Features: Elevated standards in
flexibility, virtualization, automation, security, and scalability.
• 🔄 Customized Capacity: Available as a single model with
28 capacity settings, letting businesses align server capacity
with specific needs.
• 📦 Advanced Storage: Introduction of
IBM TotalStorage Enterprise Storage Server 750, bringing
enterprise-grade storage capabilities to mid-sized businesses.
Pre-trained model approach
• BART (Bidirectional and Auto-Regressive Transformers)
• Developed by Facebook in 2019.
• BART is a denoising autoencoder for pretraining sequence-to-sequence models.
• It corrupts the input text (for example, by masking spans of tokens) and then learns to reconstruct the original.
• 🎉 40th Anniversary Special: IBM unveils the eServer zSeries
890 (z890) mainframe, celebrating four decades of their
System/360 mainframe legacy.
• 💡 Breakthrough Tech: z890 introduces groundbreaking tech
aimed at simplifying IT environments, tailored especially for
medium-sized businesses.
• 💪 Powerhouse Performance: z890 offers almost double the
processing power of the preceding z800 series but starts 30%
smaller in capacity.
• 🔒 Enhanced Features: Elevated standards in flexibility,
virtualization, automation, security, and scalability.
• 🔄 Customized Capacity: Available as a single model with 28
capacity settings, letting businesses align server capacity with
specific needs.
• 📦 Advanced Storage: Introduction of IBM TotalStorage
Enterprise Storage Server 750, bringing enterprise-grade
storage capabilities to mid-sized businesses.
     id   parent_title  level_3  labels             scores
126  293  Big Data      0        Big Data           0.677244
127  293  Big Data      1        Proteomics         0.636867
128  293  Big Data      2        Disease Processes  0.511485
129  293  Big Data      3        Bio-IT             0.480203
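The labels and scores columns in the tables above resemble the output of a zero-shot classification pipeline, so one plausible way to produce such rankings is sketched here. This is an assumption, not the deck's confirmed method: the slides do not say which pipeline or checkpoint was used. `facebook/bart-large-mnli` is a publicly available BART checkpoint commonly used for zero-shot classification, and `rank_labels` is a hypothetical helper written for this example.

```python
def load_zero_shot_classifier(model_name="facebook/bart-large-mnli"):
    """Build a Hugging Face zero-shot pipeline (downloads the model on first use)."""
    # Imported lazily so the helper below can be used without the library installed.
    from transformers import pipeline

    return pipeline("zero-shot-classification", model=model_name)


def rank_labels(classifier, text, candidate_labels, top_k=4):
    """Return (rank, label, score) rows, best score first.

    `classifier` is any callable that takes (text, candidate_labels) and
    returns a dict with "labels" and "scores" lists, as the zero-shot
    pipeline does.
    """
    result = classifier(text, candidate_labels)
    ranked = sorted(
        zip(result["labels"], result["scores"]),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [
        (rank, label, round(score, 6))
        for rank, (label, score) in enumerate(ranked[:top_k])
    ]


# Usage with the real model (not executed here; abstract_text is a placeholder):
#   classifier = load_zero_shot_classifier()
#   rows = rank_labels(classifier, abstract_text,
#                      ["Big Data", "Proteomics", "Disease Processes", "Bio-IT"])
```

Swapping `model_name` for a BERT- or DeBERTa-based NLI checkpoint would, under the same assumption, give the per-model comparison the three slides show.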
Pre-trained model approach
• DeBERTa (Decoding-enhanced BERT with
disentangled attention)
• Developed by Microsoft in 2020.
• Improves BERT by disentangling the content and position
information in the self-attention mechanism.
• 🎉 40th Anniversary Special: IBM unveils the
eServer zSeries 890 (z890) mainframe, celebrating four decades
of their System/360 mainframe legacy.
• 💡 Breakthrough Tech: z890 introduces groundbreaking
tech aimed at simplifying IT environments, tailored especially
for medium-sized businesses.
• 💪 Powerhouse Performance: z890 offers almost double the
processing power of the preceding z800 series but starts 30%
smaller in capacity.
• 🔒 Enhanced Features: Elevated standards in
flexibility, virtualization, automation, security, and scalability.
• 🔄 Customized Capacity: Available as a single model with
28 capacity settings, letting businesses align server capacity
with specific needs.
• 📦 Advanced Storage: Introduction of
IBM TotalStorage Enterprise Storage Server 750, bringing
enterprise-grade storage capabilities to mid-sized businesses.
     id   parent_title  level_3  labels           scores
126  293  Big Data      0        Big Data         0.808621
127  293  Big Data      1        Cell Biology     0.764249
128  293  Big Data      2        Food Bioscience  0.754545
129  293  Big Data      3        Green Biology    0.700146
if questions:
    Ask()
else:
    Thank_you()
