Studies show that online museum collections are among the least popular features of a museum website, which many museums attribute to a lack of interest. While it’s certainly possible that a large segment of the population is simply uninterested in viewing museum objects through a computer screen, it is also possible that a large number of people want to find and view museum objects digitally but have been discouraged from doing so due to the poor user experience (UX) of existing online-collection interfaces. This paper describes the creation and validation of a UX assessment rubric for online museum collections. Consisting of ten factors, the rubric was developed iteratively through in-depth examinations of several existing museum-collection interfaces. To validate the rubric and test its reliability and utility, an experiment was conducted in which two UX professionals and two museum professionals were asked to apply the rubric to three online museum collections and then provide their feedback on the rubric and its use as an assessment tool. This paper presents the results of this validation study, as well as museum-specific results derived from applying the rubric. The paper concludes with a discussion of how the rubric may be used to improve the UX of museum-collection interfaces and future research directions aimed at strengthening and refining the rubric for use by museum professionals.
Presented at the 2015 Museums and the Web conference in Chicago, IL.
Assessing the User Experience (UX) of Online Museum Collections: Perspectives from Design and Museum Professionals
1. Assessing the User Experience (UX) of Online Museum Collections: Perspectives from Design and Museum Professionals
Craig M. MacDonald, Ph.D.
Pratt Institute, School of Information and Library Science
Paper | Museums and the Web 2015 | April 9, 2015
2. The Online Collection
A common feature of museum websites is the online collection.
The idea: allow experts access to museum holdings without needing to be physically present.
Substantial time and effort have been invested in developing these online collections.
But collections are routinely among the least visited sections of the website.
3. Two Possible Explanations
1. Most people are completely uninterested in viewing museum objects through a computer screen.
2. People want to view digital museum objects but are deterred from doing so by the poor experiences offered by existing online collection interfaces.
4. Beyond Usability
Museums understand the importance of a usable website.
If a visitor can’t find information about visiting the museum, they probably won’t visit.
But usability alone is no longer sufficient.
Museums cannot simply provide access to their digital materials; they must also create positive experiences for their users.
5. UX of Online Museum Collections
Overarching research question: How can the experience of using online museum collections be improved?
Related questions:
1. What factors determine the UX of an online museum collection?
2. How can these factors be used to evaluate the UX of existing online museum collections?
6. Two Challenges
1. Evaluating interfaces is time-consuming and resource-intensive. Even lightweight usability testing methods can be challenging.
2. UX is a complex concept that is difficult to evaluate well. The relevant UX factors of a mobile banking app are likely not the same as those of an online museum collection.
What’s needed: an evaluation method that is easy to use, adaptable, and quick.
7. Assessment Rubrics
Defined as “criteria for assessing complicated things.”
Common in educational settings because they articulate gradations of quality for meaningful dimensions or criteria.
            | Scale Level 1 | Scale Level 2 | Scale Level 3
Dimension 1 | description   | description   | description
Dimension 2 | description   | description   | description
Dimension 3 | description   | description   | description
8. Benefits of Using Rubrics
Efficiency: streamline assessment by reducing the need to explain why specific scores were given.
Transparency: clearly define “quality” in objective and observable ways.
Reflectiveness: don’t directly prescribe specific fixes; instead, prompt reflection on why and how improvements can be made.
Ease of Use: as simple as completing a form, and a completed rubric is an effective tool for communicating results.
9. Rubric Creation Process
1. Identify purpose/goals
2. Choose rubric type
3. Identify the dimensions
4. Choose a rating scale
5. Write descriptions for each rating point
10. What is this rubric for?
This is the most important step, as it will drive all subsequent decisions.
Goal: to assess the UX quality of an online museum collection.
Step 1
11. What type of rubric?
Holistic Rubrics: look at a product or performance as a whole; contain just one dimension (e.g., “overall quality”).
Analytic Rubrics: split a product or performance into its component parts; allow for feedback on multiple dimensions.
Step 2
12. What dimensions matter?
Requires breaking down the product being evaluated into components that are observable, important, and precise.
There is no prescribed way to do this; it just needs to be a process that can be explained and justified.
Step 3
13. Finding a starting point
Began with a literature search to see if any UX criteria for online museums had already been established.
Starting point: Lin, Fernandez, and Gregor (2012) identified four design characteristics and five design principles associated with user enjoyment.
Characteristics: Novelty, Harmonization, No time constraint, Appropriate facilitation and association.
Principles: Multisensory learning experiences, Creating a storyline, Mood building, Fun in learning, Establishing social connection.
Step 3
14. Testing Lin et al.’s model
With a graduate assistant, reviewed 39 online museum collections with respect to these nine dimensions.
This allowed for a bottom-up approach, ensuring that the dimensions were reflective of what the museum community considers valuable.
Step 3
15. Finding Exemplars
The Rijksmuseum quickly emerged as an exemplar.
But discussing how it excelled uncovered limitations to Lin et al.’s framework: many dimensions were actually describing multiple concepts, making them difficult to assess independently.
Step 3
16. Refining the dimensions
In response, we developed a parallel set of dimensions that were more observable and explicit, and that more closely matched our interpretation of Lin et al.’s framework.
This allowed us to:
Improve the vocabulary to make it more accessible;
Tighten the concepts to make them more distinguishable; and
Evaluate the ability of each dimension to capture an important aspect of UX.
Step 3
17. Iterative testing
We iteratively tested the rubric with various museum collections to further refine and strengthen the dimensions, with the goal of making them less ambiguous and more observable.
• Ex: Harmonization and Mood building became Strength of Visual Content and Visual Aesthetics.
Finally, we split the dimensions into three categories inspired by Don Norman’s model of Emotional Design: Visceral, Behavioral, Reflective.
Step 3
18. Choosing a rating scale
Typical rubrics use between 2- and 5-point rating scales.
Four rating scale points were chosen, and neutral, non-judgmental language was selected:
Incomplete
Beginning
Developing
Emerged
Step 4
19. Gradations of quality
Final step: writing clear and well-defined gradations of quality for each rubric dimension.
A 4-point rating scale should describe quality ratings as:
No
No, but
Yes, but
Yes
Step 5
20. Final Assessment Rubric
Visceral (immediate impact)
1. Strength of visual content
2. Visual aesthetics
Behavioral (immediate usage)
3. System reliability & performance
4. Usefulness of metadata
5. Interface usability
6. Support for casual & expert users
Reflective (long-‐‑term usage)
7. Uniqueness of virtual experience
8. Openness
9. Integration of social features
10. Personalization of experiences
Rating scale: 1 = Incomplete, 2 = Beginning, 3 = Developing, 4 = Emerged
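As a minimal sketch, the ten dimensions and the 4-point scale above can be captured in a simple data structure. The dimension names and scale labels come from the rubric itself, but the `summarize` helper and its per-category averaging are our own illustration, not part of the published instrument:

```python
# Illustrative sketch: the rubric's ten dimensions grouped by Norman's
# three categories, with a hypothetical helper that averages one
# evaluator's ratings per category (not part of the published rubric).

SCALE = {1: "Incomplete", 2: "Beginning", 3: "Developing", 4: "Emerged"}

RUBRIC = {
    "Visceral": [
        "Strength of visual content",
        "Visual aesthetics",
    ],
    "Behavioral": [
        "System reliability & performance",
        "Usefulness of metadata",
        "Interface usability",
        "Support for casual & expert users",
    ],
    "Reflective": [
        "Uniqueness of virtual experience",
        "Openness",
        "Integration of social features",
        "Personalization of experiences",
    ],
}

def summarize(ratings):
    """Return the mean rating (1-4) per category for one evaluator.

    `ratings` maps each dimension name to an integer 1..4.
    """
    return {
        category: sum(ratings[d] for d in dims) / len(dims)
        for category, dims in RUBRIC.items()
    }
```

For example, an evaluator who rates every dimension “Developing” (3) would see a mean of 3.0 in each of the three categories.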
21. Ex: Strength of Visual Content
Incomplete [No]: Artwork is a peripheral component of the collection, with text the dominant visual element. Images, when present, are too small and low quality. Text is a major distraction from the visual content.
Beginning [No, but]: Artwork is not emphasized throughout the collection, and images are rarely the dominant visual element. Some images are too small and/or low quality. At times, text is too dense and distracts from the visual content.
Developing [Yes, but]: Artwork is featured throughout the collection, but images are not always the dominant visual element. Most images are large and high quality. Text is used purposefully, but some is superfluous.
Emerged [Yes]: Artwork is presented as the primary focus of the collection, with images as the dominant visual element. All images are large and high quality. Text is used purposefully but sparingly to enhance the visual content.
22. Next Step: Rubric Quality
Four experts – two museum professionals and two UX professionals – were asked to apply the rubric to three online museum collections.
Sessions took ~90 minutes to complete (approx. 20 minutes per museum).
Held one-on-one (3 face-to-face, 1 remote).
Completed in August/September 2014.
Three aspects of rubric quality: Reliability, Validity, Utility.
23. What is rubric reliability?
The extent to which using the rubric provides consistent ratings of quality.
i.e., do different raters provide the same (or similar) ratings when applying the rubric to the same interface?
This is known as inter-rater reliability.
Common measure: consensus agreement
24. UX Rubric Reliability
Participants rated three museum collections on ten different dimensions, giving 30 potential opportunities for agreement.
Two estimates of agreement:
Conservative: all raters provide the same rating
• Target: approximately 30% or higher
Liberal: all raters are within one rating point
• Target: approximately 80% or higher
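The two estimates can be made concrete with a short sketch. This is our illustration, not the study’s analysis code: the function names and example ratings are hypothetical, and ratings are assumed to be integers on the 4-point scale.

```python
# Illustrative sketch of the two consensus-agreement estimates.
# Each inner list holds one rating per evaluator (1..4) for a single
# museum/dimension pair; 3 museums x 10 dimensions = 30 such items.

def conservative_agreement(items):
    """Fraction of items on which every rater gave the identical rating."""
    return sum(len(set(r)) == 1 for r in items) / len(items)

def liberal_agreement(items):
    """Fraction of items on which all raters fall within one rating point."""
    return sum(max(r) - min(r) <= 1 for r in items) / len(items)

example = [
    [3, 3, 3, 3],  # identical ratings -> counts toward both estimates
    [2, 3, 3, 3],  # spread of 1 -> liberal agreement only
    [1, 2, 4, 4],  # spread of 3 -> counts toward neither
]
print(conservative_agreement(example))  # ~0.33
print(liberal_agreement(example))       # ~0.67
```

Note that every conservatively agreed item also counts as liberal agreement, which is why the liberal target (80%) sits well above the conservative one (30%).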
26. Reliability: Results [2]
Participant Type | Conservative    | Liberal
All (4)          | 4 / 30 (13.3%)  | 19 / 30 (63.3%)
Museum (2)       | 14 / 30 (46.7%) | 28 / 30 (93.3%)
UX (2)           | 9 / 30 (30.0%)  | 24 / 30 (80.0%)
27. Reliability: Discussion
Using the rubric was better than blind guessing, but there is room for improvement, especially when combining UX and museum experts.
Conclusion: Don’t mix evaluators – they should all share a disciplinary background and professional focus.
28. What is rubric validity?
The extent to which using the rubric provides accurate measures of quality.
There are many types of validity; for rubrics, two are common:
1) Content Validity
2) Construct Validity
29. UX Rubric Content Validity
Content validity refers to the extent to which the rubric measures things that actually matter.
i.e., do the dimensions of the rubric make sense?
Ideally, content validity is demonstrated by soliciting feedback from subject matter experts during rubric creation.
In this case, study participants were asked to rate the perceived relevance of each rubric dimension.
31. Content Validity: Discussion
None of the experts proposed any other concepts or elements that should have been included.
Conclusion: The rubric has content validity, but the Reflective dimensions may need more refinement.
Are social features or personalization options really the best way to engage online visitors?
Can the challenges of providing an “open” collection be mitigated?
• These are open research questions.
32. UX Rubric Construct Validity
Construct validity refers to whether the rubric actually measures the construct it is supposed to measure.
i.e., is the UX rubric actually assessing UX?
Ideally, construct validity is demonstrated by showing a correlation between rubric scores and another accepted measure of quality.
But there is no accepted measure of UX quality.
• Instead, study participants were asked to provide perceived levels of construct validity.
34. Construct Validity: Discussion
All participants felt the rubric was an effective measure of UX.
But museum-centric language was a perceived barrier for the UX experts.
Conclusion: The rubric has construct validity, but the language could be more accessible to non-museum experts.
35. What is rubric utility?
The actual impact of using the rubric as an assessment instrument.
i.e., does using the rubric make a difference?
Arguably the most complex and most important quality of a rubric.
But measuring actual impact is nearly impossible (too many confounding factors).
36. UX Rubric Utility
Instead, focus on perceived impact: evaluators need to think the rubric is valuable, otherwise they’ll be unlikely to use it.
Need to demonstrate the extent to which evaluators believe the rubric is:
Useful
Easy to use
Easy to learn
38. Utility: Discussion [1]
All participants affirmed the utility of the rubric as an assessment instrument.
The biggest benefit is to aid decision-making:
UX expert: the rubric seems like a great tool to “help museums figure out their digital budget.”
How? By providing a snapshot of the assessment results.
40. Summary
Study results show that the rubric is a reliable, valid, and useful assessment instrument.
Future work:
• Clarify museum-specific language.
• Examine the Reflective dimensions more closely.
• Study the practicality of the rubric through an applied case study with a museum partner.
Conclusion: The rubric can provide valuable guidance for museums interested in improving their users’ experience with online collections.
41. Thank you
Craig M. MacDonald, Ph.D.
cmacdona@pratt.edu
@CraigMMacDonald
www.craigmacdonald.com