We present a cognitive-based semantic approach that uses rule-based Natural Language Processing (NLP) in conjunction with a world model and cognitive frames to semantically analyze, understand, and rank digital text in search engines. The goal is to improve the relevance, accuracy, and efficiency of information search. The world model represents things existing in the real world (e.g., subject-related ontologies or classifications essential for understanding the topics to be analyzed), whereas cognitive frames specify users’ possible interactions with the world, including things that people should know or do (e.g., tasks, methods, procedures, cognitive processes) in such interactions. Using a rule-based semantic approach in conjunction with a subject-related world model and task-relevant cognitive frames to understand and evaluate text is an innovative approach in search engine technology. It addresses three limitations of existing approaches: the inadequate measurement of the meaningful content in web pages; a poor understanding of users’ intentions and tasks in their search; and the irrelevance and inaccuracy of search results. This method has led to the successful implementation of a full-scale semantic search engine in medicine (available at Seenso.com). The method is applicable and adaptable to other disciplines and other types of computer applications.
A Cognitive-Based Semantic Approach to Deep Content Analysis in Search Engines
1. A Cognitive-based Semantic Approach to
Deep Content Analysis in Search Engines
Mei Chen & Michel Décary
Cogilex R&D Inc.,
Montreal, Canada
Paper presented at the IEEE International Conference for Semantic Computing
(ICSC 2018)
Laguna Hills, California, USA
2. The rich information on the Internet reflects not only our
knowledge, but also our experiences, opinions,
expectations, and emotions as humans.
4. When people search for information on the Internet,
it is always for a purpose
5. Current Internet information search
[Diagram: the user's input goes to the search engine, and the system outputs a list of related searches (Related search A, B, C, D, ... N) under "People also search for:".]
Relevance:
• Symbolic data match
• Popularity measures
6. Our cognitive-based semantic approach to search
[Diagram: input on a subject matter goes to the search engine, which returns Outputs A, B, C, ... N, each organized by Task 1 to Task 5, for an agent in Situation A with Goal A, Goal B, Goal C, ... Goal N.]
Redefine the relevance:
• Subject-matter
• Agent
• Situation
• Goal
• Tasks
7. Semantic computing: A promising
perspective
• Addressing topics important for understanding digital
content such as “meaning”, “context”, “intention”
• Offering more comprehensive and cohesive views to
address the multi-faceted nature of automatic analysis of
text;
• Integrated and coherent approaches have not yet been
developed (Sheu, 2015; Wang, 2010).
8. Our cognitive-based semantic approach
to search: Seenso.com
A cognitive-based semantic approach to deep content
analysis:
• Using rule-based natural language processing technology in
conjunction with a world model and cognitive frameworks to
semantically analyze, rank, select, retrieve, and extract web
content.
• Analyzing the meaning of web content in relation to
users’ intentions, tasks, context, and the functional use of
knowledge in a given subject-matter area.
9. Key components
(a) A world model:
The world model mimics the real world and things in it, and it serves as
the basis for understanding the topics to be analyzed.
(b) Cognitive frames (model of knowledge):
Cognitive frames reflect the interaction with the world and things that
people should know or do in such interactions.
(c) Semantic rules:
Semantic rules are possible linguistic expressions that describe the
meaningful aspects of entities, attributes, relations, actions, and
interactions.
10. A world model
The world model mimics the real world and things in it, and it serves as
the basis for understanding the topics to be analyzed.
[Diagram: Macro World (World Model of Everyday Things): People, Objects, Food, Plants, Animals, Places, Organizations, Events, Academic disciplines, Industries]
11. The micro world
The micro world represents
domain-specific entities or
object classes, particularly
entities important for
understanding the domain-
specific nature of users'
interactions.
• Disease entity > 25,000
• Symptom entity > 4,500
• Injury and accident entity > 1,500
• Medical procedure entity > 9,500
• Drug entity > 8,000
• Other health related object classes
[Diagram: Micro world in medicine: Diseases, Symptoms, Injuries, Drugs, Procedures, Specialists, Treatment modalities]
12. User interactions from a cognitive
perspective
Higher levels of interaction with the world:
• Sense making
• Performance
• Planning
• Decision making
• Risk management
• Diagnostic problem solving
• Experiment & test
• Design & creation
• Communication & socialization
Cognitive frames reflect
users’ interactions with
the world and things that
people should know or
do in such interactions.
13. Cognitive frame of diseases
• What it is
• Definition
• Clinical characteristics
• Symptoms
• Types & classification
• Development & stages
• Who are at risk
• Causes & risk factors
• Prevention & early detection
• Tests & examinations
• Making diagnosis
• Interpreting test results
• Making differential diagnosis
• Deciding the diagnosis
• Validating the diagnosis
• Treatments
• Drug therapy
• Alternative medicine
• Other medical intervention
• Potential complications & precautions
• Making informed treatment decisions
• Threshold for treatment
• Assessing treatment options
• Treatment effectiveness
• What is effective
• For what subtype of disease it is effective
• For whom it is effective
• When it is effective
• Treatment safety
• Adverse effects
• Treatment financial costs
• Opportunity cost
• How to choose a treatment?
• Shared medical decision making
• Getting a second opinion
• Self-care tips
14. Cognitive frame of medical procedures
• Understanding the procedure
Function, Effectiveness, Safety
• Assessing whether the procedure works for you
Medical condition to be treated, Preexisting disease and other health conditions to avoid
• Preparing for the procedure
Potential complications, Precautions
• Undergoing the procedure
Things to expect and do during the procedure
• Patients’ surgical safety checklist
Common medical errors and prevention
• Post-operation self-care guides
• Advice for caregivers
• Clinical guidelines
15. Cognitive frame of drugs
• What it is
• Classification
• Composition
• Mechanism of action
• Used for
• Disease
• Symptom
• Injury & accident
• People with other conditions
• Off-label use
• Drug administration
• Dose
• Route
• Used in combination
• Storage
• What to expect?
• Side effects
• Drug interaction
• Warning & precautions
• Drug accidents & overdose
• Signs of overdosing
• What to do if overdosing
• Clinical evidence
• Effectiveness
• Safety
16. Semantic rules
Linguistic descriptions of entities, attributes, relations,
processes, actions, and interaction
Strict semantic rules:
• DRUG (S) treat (V) DISEASE (O).
• DRUG (S) is (Be) effective (Adj) for treating (V) DISEASE (O).
• SPECIALISTS (S) use (V) DRUG (O) as first-line treatment (O) for DISEASE patients (O).
• SPECIALISTS (S) use (V) DRUG (O) to treat (V) patients (O) who (S) suffer from (V) DISEASE (O) (subordinate clause).
• Patients (S) need (V) to take (V) DRUG (O) if they (S) suffer from (V) DISEASE (O) (conditional clause).
Loose semantic rules:
• DRUG treat * DISEASE
• DRUG is effective for * DISEASE
• SPECIALISTS use DRUG + treatment + DISEASE
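As a rough illustration of how a loose rule such as "DRUG treat * DISEASE" might be applied, the sketch below first tags entity mentions with their class labels and then matches a pattern over the tagged sentence. The two small lexicons, the function names, and the reading of "*" as a gap of up to four words are all assumptions made for this example; they are not taken from the Seenso implementation.

```python
import re

# Hypothetical mini-lexicons standing in for the micro-world entity lists.
DRUGS = {"metformin", "ibuprofen"}
DISEASES = {"type 2 diabetes", "arthritis"}


def tag_entities(sentence: str) -> str:
    """Replace known entity mentions with their class label (DRUG, DISEASE)."""
    tagged = sentence.lower()
    # Longest names first, so multi-word names are tagged before shorter ones.
    for name in sorted(DISEASES, key=len, reverse=True):
        tagged = re.sub(rf"\b{re.escape(name)}\b", "DISEASE", tagged)
    for name in sorted(DRUGS, key=len, reverse=True):
        tagged = re.sub(rf"\b{re.escape(name)}\b", "DRUG", tagged)
    return tagged


# Loose rule "DRUG treat * DISEASE": '*' is read here as a gap of up to
# MAX_GAP arbitrary words (an assumption, not the authors' definition).
MAX_GAP = 4
LOOSE_TREAT = re.compile(
    rf"\bDRUG\s+treat(?:s|ed|ing)?\s+(?:\S+\s+){{0,{MAX_GAP}}}?DISEASE\b"
)


def matches_loose_treat(sentence: str) -> bool:
    """True if the tagged sentence satisfies the loose 'treat' rule."""
    return LOOSE_TREAT.search(tag_entities(sentence)) is not None
```

A strict rule would instead check the grammatical roles (subject, verb, object) of each element, which a plain regular expression like this one does not attempt.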
18. Analyzing the deep content of texts
[Diagram: input texts (Text 1, Text 2, Text 3, Text 4, Text 6, …, Text n) are processed with basic natural language processing, generic semantic rules, and domain-specific semantic rules, drawing on the macro world, the micro medical world, the generic cognitive frame, and domain-specific cognitive frames, to produce output semantic representations (SR 1, SR 2, SR 3, …, SR n).]
20. The assumptions about users’ situations
• The insights from cognitive studies of learning, problem solving, and
performance
• Being at risk
• Prevention and early detection
• Experiencing certain symptoms
• Undergoing a medical test
• Getting an abnormal test result
• Being diagnosed with a chronic disease
• Understanding treatment options
• Making informed treatment decisions
• Undergoing a medical treatment/procedure
• Pain & symptom relief
• Living with the disease
23. Current focus:
Health information for self-care
Self-care education is urgently needed to overcome the global
health care challenges (U.S. data)
• A high prevalence of chronic diseases
75% of national health care spending, 7 out of 10 deaths
• High associated healthcare costs
$3.3 to 3.6 trillion annual health expenditure
• Aging populations
47.8 million seniors
• Low health literacy
Cost due to low health literacy: up to $238 billion (in 2003)
• High rates of preventable medical errors
Third leading cause of medical related deaths
24. Self-care education opportunities
offered by the Internet
• The Internet can deliver a variety of content (webpages, audio,
and videos) to users almost anytime and anywhere, so it can
become a powerful tool for promoting public health education.
• A large amount of high-quality health information is available on
the Internet;
• Over 70% of people use the Internet to search for health-related
information;
• Searching for health information on the Internet provides a
window of opportunity to promote public health education and
public health.
25. Challenges in searching health
information on the Internet
• Difficulty getting the right information at the right time;
• Concerns about the reliability of health information on the Internet;
• Acting on fragmented and inadequate information can lead to
counterproductive feelings, decisions, actions, and serious
consequences;
• Better search methods are needed to improve the quality and
usefulness of health information for supporting self-care.
26. Future directions
• Automatically curate high-quality health information from the best
medical websites and create a practical self-care knowledge base
for supporting
• Patient education and self-care;
• The development of smart consumer digital applications.
27. Future directions
• Create intelligent interfaces to communicate such knowledge to
users in all forms of consumer digital health products and
services
Strategies to incorporate self-care knowledge base in EHR systems, mobile
self-monitoring devices & mobile apps:
• Smart AI conversation agents:
• Understanding health-related instruction and actions,
• Carrying on meaningful conversations with users via speech and text.
• Contextualized health information support;
• On-demand mini courses on major diseases and common surgical procedures.
- As humans, we learn about the world through our interaction with the world.
- Our interactions with people, with objects, and with all the things we encounter.
- The rich information that we recorded on the Internet reflects the knowledge that we gained through such interactions.
- The reason we want to retrieve information is to support further new interactions with the world through our meaningful activities.
- We do not retrieve information out of a vacuum just for the sake of it.
- Retrieval is always done within the context of a purposeful human activity. It’s to help with something, learn about a person, carry out a task, make a decision, solve a problem, find a job or even have fun.
- So when people search for information on the Internet, it’s always for a purpose.
- However, automated text analysis and search engine technology have so far paid little attention to human activities, or to the use of information for interacting with the world.
- Ontologies and key words normally do not reflect the interaction.
- If we want to improve the relevance and usefulness of search engine technology, we need to find a way to account for the interactions.
- Textual content exists for a purpose and it should also be retrieved with the knowledge that it serves a purpose.
- However, most indexing and retrieval methods view text essentially as "symbolic data" and the search and retrieval process as "symbol manipulation".
- For this reason, most of those methods work well across languages without knowing anything about the specifics of each language: they only care about matching queries with content at a symbolic level, with no need to understand why someone would make such a query.
- But if you start to care about why a certain query was made and why a certain page was created then you need to go deeper.
For instance, if someone searches for “back pain”, it is most probably because he or she is suffering from it, or wants to know its cause, remedy, or prognosis.
Then it becomes important to understand which of those aspects are represented in the content that is retrieved, even though they are not explicitly mentioned in the query.
Understanding the functional use of the information gives us an opportunity to retrieve more useful content.
- The emerging field of semantic computing seems to offer more comprehensive and cohesive views to address the multi-faceted nature of automatic analysis of text.
- Interesting frameworks and models are being defined that try to place the focus beyond symbolic data and more towards the cognitive and knowledge level
- But integrated and coherent approaches have not yet been fully developed.
At Cogilex, we have been working on a cognitive-based semantic approach to search, for which we have done a first large-scale implementation in a publicly available medical search engine called Seenso.
The core of this approach is to match every sentence of every page to a functional model of knowledge in a specific subject matter area.
In Seenso, all content is indexed in terms of its functional use by users.
The ranking of content is based entirely on the semantic content of the page in terms of cognitive tasks and functions.
There are three main components in our approach:
- A model of the world in terms of classes and hierarchy of concepts
- A model of the knowledge and its functional use
- And semantic rules that match text with those two models
The world model is composed of detailed ontologies that define the classes and hierarchy of objects to be found in any text.
It serves as the basis for understanding the topics to be analyzed.
Detailed micro-worlds are then defined for each subject matter.
In our implementation we have created a rich medical micro world that includes for instance 25,000 medical conditions and their relationships as well as symptoms, medical procedures, drugs, etc.
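To make the idea of a world model as "classes and hierarchy of concepts" concrete, here is a minimal sketch of a class hierarchy with a lookup that returns the chain of classes above a concept. The `Concept` class, its methods, and the handful of entries are illustrative assumptions; the actual Seenso ontologies are far larger and their internal representation is not described in the talk.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """One node in a class hierarchy of the world model (hypothetical structure)."""
    name: str
    children: list[Concept] = field(default_factory=list)

    def add(self, child: Concept) -> Concept:
        """Attach a sub-concept and return it, so hierarchies can be built step by step."""
        self.children.append(child)
        return child

    def path_to(self, target: str, trail: tuple = ()) -> tuple | None:
        """Return the chain of class names from this node down to `target`, or None."""
        trail = trail + (self.name,)
        if self.name == target:
            return trail
        for child in self.children:
            found = child.path_to(target, trail)
            if found:
                return found
        return None


# Illustrative fragment only; these entries are not taken from Seenso's ontology.
medicine = Concept("Medical micro-world")
diseases = medicine.add(Concept("Disease"))
cancer = diseases.add(Concept("Cancer"))
cancer.add(Concept("Colorectal cancer"))
medicine.add(Concept("Drug"))
```

Calling `medicine.path_to("Colorectal cancer")` walks the hierarchy and returns every class the concept belongs to, which is the kind of information an entity tagger would need in order to label a mention as a disease.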
Next we define a model of knowledge in terms of cognitive frames that reflect the interaction and things that people should know or do in their relations with the world.
Those frames represent the cognitive, social and emotional function of information.
When we see text, sentences, paragraphs, we need to know if it describes how to understand something, or to plan something, or to evaluate a risk or to diagnose something.
Text needs to be linked to a purposeful function
For instance, this is a cognitive frame we have created for diseases in our medical world.
When we analyse a text about diseases, we try to fit any utterance into one or more of those elements.
For instance, is a sentence talking about
- Who is at risk
- What are the causes of this disease
- How to prevent it
- How to treat it
Different topics will have different cognitive frames
Here is a subset of our frame for medical procedures
So, is our sentence talking about
Preparing for the procedure
Undergoing the procedure
Possible complications of the procedures
This is yet a different cognitive frame for information on drugs
What is it used for
How is it administered
What are the side effects
What to do in case of overdose
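One simple way to picture a cognitive frame is as a tree whose nodes are the questions or tasks listed on the slides. The sketch below encodes a small part of the drug frame as a nested dictionary and lists every node as a colon-separated path, in the style of the "Disease:Cause" label used later in the talk. The nesting shown is an assumed reading of the flattened slide, and the names `DRUG_FRAME` and `frame_elements` are hypothetical.

```python
# Partial drug frame; the nesting is an assumed reading of the flattened slide.
DRUG_FRAME = {
    "Used for": {"Disease": {}, "Symptom": {}},
    "Drug administration": {"Dose": {}, "Route": {}},
    "Side effects": {},
    "Drug accidents & overdose": {
        "Signs of overdosing": {},
        "What to do if overdosing": {},
    },
}


def frame_elements(frame: dict, prefix: str) -> list[str]:
    """List every node of a frame as a colon-separated path, parents before children."""
    paths = []
    for name, subframe in frame.items():
        path = f"{prefix}:{name}"
        paths.append(path)
        paths.extend(frame_elements(subframe, path))
    return paths
```

Each path returned by `frame_elements(DRUG_FRAME, "Drug")` is a candidate label that a sentence about a drug could be assigned to.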
Next, we define semantic rules that match any text, for instance web content and user queries, to the cognitive frames and micro-worlds.
Those rules represent possible linguistic expressions that describe the meaningful aspects of entities, attributes, relations, actions, and interactions.
The rules shown here are pattern matching rules but we also have used strict grammatical parsing with a dependency syntax approach.
We have found that it is important to use different levels of precision and coverage in order to more accurately capture and describe the nature of a paragraph or of a whole page.
Shallower parsing with some control over strictness, like the one described here, gives higher coverage, which helps determine the cognitive frame coverage of a whole page.
On the other hand, we also need to do more precise, fine-grained extraction on key content in order to create databases of medical knowledge from individual sentences.
So the best parsing mechanism for indexing and retrieving will not be the same as the best mechanism for extracting and inferencing.
Using a combination of methods gives us much flexibility.
So in summary, this approach of identifying the meaningful content of webpages in terms of micro-world objects linked to cognitive frames representing users’ goals and tasks enables the search engine to identify and provide information that is central to users’ functional needs.
So, the process that our system follows when analyzing a web page goes a bit like this:
- First, we apply a generic Natural Language Processing component to discover sentence and paragraph boundaries, assign parts of speech, normalize spelling, and identify noun phrase boundaries.
- Then, using our micro-worlds, we discover, tag, and store all entities on the page. We also determine the topic and subtopics of the page.
- Next, we apply all semantic rules at every position of every sentence and generate semantic representations associated with every sentence. Those representations are basically predicate structures where the predicate is a frame element like Disease:Cause and the arguments are objects in the micro-world, like “smoking” and “cancer”.
- Finally, we assign a ranking or score for the whole page for every node of every cognitive frame present on that page. For instance, a page about diabetes will receive a score for prevention, treatment, and causes and, of course, an overall score for the disease frame. Any drug or procedure mentioned on that page will also be ranked in its respective frame. I will not go into the details of this evaluation, but it is mostly based on the relative weight of each branch in the cognitive frame. Those weights are determined by the knowledge engineer according to our knowledge model.
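The last two steps can be pictured with a small sketch: a predicate structure holding a frame element and its micro-world arguments, and a page score computed from weighted frame branches. The talk does not give the scoring formula, so the saturation rule below (a branch counts as fully covered after three matching sentences) and the weight values are purely assumptions for illustration, as are all the names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticRepresentation:
    """Predicate structure: a frame element plus micro-world arguments."""
    frame_element: str          # e.g. "Disease:Cause"
    arguments: tuple[str, ...]  # e.g. ("smoking", "cancer")


# Hypothetical branch weights; in the talk these are set by a knowledge engineer.
WEIGHTS = {"Disease:Cause": 0.3, "Disease:Prevention": 0.3, "Disease:Treatment": 0.4}


def score_page(representations, weights=WEIGHTS):
    """Return per-element scores and an overall frame score for one page.

    Assumed scoring (not the authors' formula): each element's score is its
    weight times min(1, hits / 3), so coverage saturates after three sentences.
    """
    hits = Counter(r.frame_element for r in representations)
    per_element = {e: w * min(1.0, hits[e] / 3) for e, w in weights.items()}
    return per_element, sum(per_element.values())


# Example page: three sentences about causes, one about prevention.
page = [
    SemanticRepresentation("Disease:Cause", ("smoking", "cancer")),
    SemanticRepresentation("Disease:Cause", ("obesity", "cancer")),
    SemanticRepresentation("Disease:Cause", ("alcohol", "cancer")),
    SemanticRepresentation("Disease:Prevention", ("screening", "cancer")),
]
per_element, overall = score_page(page)
```

With this rule, a page that covers several branches of the frame scores higher than one that repeats a single branch many times, which matches the stated aim of ranking by frame coverage.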
After all this is done, we now have millions of pages that are classified and indexed by object classes and cognitive frame elements
We are now ready to answer users by applying a similar process on their queries
A fundamental aspect of our work is that we put the user at the very center of the search process.
Before any querying is done or any indexing is done, there is a user.
A user who may not possess the right terminology; who may lack effective search strategies
But a user who might experience some symptoms or have some concerns;
Then the search query given by this user might be just a symptom name or a disease name
But what does our user really need?
It could be to make sense of a situation, to know what caused it, assess whether it is serious or not, decide to consult a doctor or not.
Most search engines assume that if we just return the most popular URLs for the keywords entered, this will be good enough.
It is certainly good enough if we just need to satisfy a query at its face value as if it was a formal object but what about the intention and the person behind the query?
If we really want to support a human being who is asking for information for a real reason, we believe that a ranking mechanism like the one we propose will allow the search engine to retrieve content organized in terms of useful functional information that we can use to guide our user more purposefully and more adequately.
To build our situational model of users, we rely on the insights from cognitive studies of learning and problem solving. For our model of diseases, for instance, we first specified all possible situations that users may encounter. Then we identified the goals and tasks of users in these situations. Finally, we decided what kinds of information would help users perform these tasks successfully, based on instructional principles of learning.
For example, if somebody searches for a particular risk factor known for a disease, this model helps us determine what kind of information would be most useful: for instance, statistics about this risk factor, what may influence the odds, what kind of self-monitoring needs to be done, which screenings are recommended, etc.
We do not have time to do a live demo of the system, but you can try it online to your heart’s content at Seenso.com.
Here is a screenshot of a Seenso search for Alzheimer’s disease
First, as I have explained, the system ranks the search results based on the meaning of the text on the page and its relevance to the user’s goal and task.
For instance, the page from the National Institutes of Health is first because it is the one whose content best covers the cognitive tree for disease. Popularity and linking played no role here. However, we do take into account the quality of the sources, evaluated by both manual and automatic means.
We also index thousands of medical news sources daily using the same algorithms.
We also index a large number of videos from very good sources based on the text descriptions of those videos. This is very good content that is largely ignored by traditional search engines, as hospitals’ and doctors’ videos, for instance, are not heavily linked by other pages. But as we do our analysis based on content, this does not matter anymore, so some jewels come up to the surface.
The tree on the left is a subset of the cognitive frame for disease that serves as a knowledge map of key information to guide users’ search. This knowledge map also represents a model of expertise, reflecting important things we want people to know or do in their interactions with the world, in this case, dealing with a disease. This tree will change depending on the search query.
So with this guided exploration, users can choose different issues to explore, based on their goals, tasks, and preferences.
Using machine learning on big data, we also discover some hidden relations among different medical entities, which allows us to extract and expand self-care information for specific diseases. For example, it allows users to discover what are the most common symptoms, tests, treatment modalities, drugs, and even dietary plans related to a specific disease.
For instance, in this example, the system discovered that Resection, Colectomy, and Laparoscopy are common related procedures for Colorectal Cancer. The system knows this because it actually read it many times in multiple quality sources and within multiple sentence structures.
This is really powerful because users can now explore a wide range of important relations that are not mentioned in any single document, but that are important for them to understand their health conditions, make informed decisions concerning the tests they need and the treatments suitable for them, or even learn how a change of dietary plan can affect their condition.
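The idea of trusting a relation only when it has been read in multiple quality sources can be sketched with a simple count of distinct sources per relation. This stands in for the machine learning mentioned above and is not the authors' method; the function name, the `min_sources` threshold, and the source labels in the sample data are all hypothetical.

```python
from collections import defaultdict


def related_procedures(observations, min_sources=2):
    """Keep a disease-procedure pair only if enough distinct sources mention it.

    observations: iterable of (source, disease, procedure) triples, as might be
    extracted from individual sentences. The threshold is an assumption.
    """
    sources = defaultdict(set)
    for source, disease, procedure in observations:
        sources[(disease, procedure)].add(source)

    related = defaultdict(list)
    for (disease, procedure), seen_in in sources.items():
        if len(seen_in) >= min_sources:
            related[disease].append(procedure)
    return {disease: sorted(procs) for disease, procs in related.items()}


# Hypothetical extraction output; source names are placeholders.
observations = [
    ("site-a", "Colorectal Cancer", "Colectomy"),
    ("site-b", "Colorectal Cancer", "Colectomy"),
    ("site-a", "Colorectal Cancer", "Resection"),
    ("site-c", "Colorectal Cancer", "Resection"),
    ("site-a", "Colorectal Cancer", "Procedure X"),  # seen in one source only
]
```

Counting distinct sources rather than raw mentions keeps a single page that repeats a claim many times from being mistaken for broad agreement.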
As you can witness, we focused the application of our work on self-care related content. Why is that?
Countries all over the world are facing great challenges in their healthcare systems.
But as most chronic diseases are preventable, treatment outcomes can be significantly improved through better patient education and more effective self-care.
So, as a society, we must provide better health information for supporting self-care both for our personal well-being and for our countries to be able to afford health care for all.
The reality is that now, the Internet, rather than physicians, has become the primary source for people to obtain health-related information;
Also searching for health information on the Internet provides a great opportunity to promote public health education and we believe this is the most cost-effective way to educate the public about self-care.
So we need to embrace the use of the Internet as a source of medical knowledge for all.
But it needs to be good and there are challenges to overcome.
The biggest challenge of doing this is to provide the right information at the right time.
Information needs to be reliable, complete and trustworthy.
We think that the kind of semantic technology that we propose is suitable for analyzing procedure-oriented tasks involved in self-care and that it can help provide the right types of information at the time when users are seeking it.
We are pretty happy with our first implementation but many things need to be improved, besides of course lots of fine tuning to make the search results better.
We want to go much further in generating inferences from our data in order to build a useful database of curated medical information. This data is very rich and we are just scratching the surface of what can be done with it.
The ultimate goal is to use this technology to create a high-quality, up-to-date, and practical self-care knowledge base from the best medical websites on the Internet.
Then, this knowledge base can be used not only for supporting self-care, but also for the development of smart consumer digital health applications.
We also need to go much further in communicating knowledge to users.
Providing exploratory tools is good but is not enough and we need to directly engage users.
At the moment, we are working on a prototype of an AI conversation agent based on our data and our cognitive models.
We believe that this kind of work can make it easier for users to obtain, understand, and use health information in a variety of contexts such as in electronic health record systems, in smart mobile apps and mobile self-monitoring devices.
We have done this work in the field of medicine, but the exact same semantic approach can also be used to develop other vertical search engines to support learning about sports, arts, troubleshooting, and virtually anything we do. This can be done by simply defining the cognitive frames and the micro-world for each of those subject areas. All the rest stays the same.
We do believe that ultimately an approach like the one we propose can be used to transform massive, unstructured text into organized and functional knowledge for supporting human learning and performance in a variety of contexts including intelligent systems and services.
Most importantly, we believe that the successful integration of semantic search technology with cognitive models of user’s interactions will result in better search engines that will serve not only as information retrieval systems, but also as learning platforms, decision-making aids, and self-care supporting systems.