The paper analyzes how music artists influence one another. This analysis is part of an evaluation of music metadata as a Semantic Web data source. The music dataset case study should reveal problems that must be solved before the data can be used for automatic inferencing by Web 3.0 user agents. The part of the research described here finds the authors and performers of the most covered works. The analysis is based on the MusicBrainz dataset, mostly on the relationship metadata stored in the l_entity_entity tables. Results are presented, and the main problems of the dataset and of the analysis approach are discussed.
2. Agenda
• Music (Artist) influence
• Cover - influence on artists
• The dataset and the Semantic Web context
• Authors vs. Performers
• ‘First Performer of a Work’ approach
• The approach’s results and problems
• Conclusions
Analyzing Music Metadata on Artist Influence
page 2 of 32
7. Measurable influence
using the work (recordings) of others
sampling, covering, mixing
"social networks of artists"
"Artists" + producers, engineers, management
Music Communities
8. Measurable influence cont.
Take online music (meta)data
Evaluate the possibility of making the data machine readable
The relationships of influence are not as straightforward as being an author or being a band member
There is no simple way to convert them to RDF triples
GOAL: make the relationships human readable first
Use case: find the most covered artists (authors and performers)
9. Measurable influence cont.
indexed facts from the music industry
Covering
– recording one’s own version of someone else’s song
Sampling
– using a part of another artist’s recording in one’s own recording
10. dataset MUSICBRAINZ
Semantic Web
Google
Knowledge Graph
Freebase
MusicBrainz
14. Influence from l_artist_artist
RDF triple
subject – predicate – object
Sting – member of – The Police
Lennon – co-author with – McCartney
May, Taylor – 2/3 members of Smile
& then May, Taylor – 2/4 members of Queen
also artist_credit: "Queen & David Bowie"
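The subject – predicate – object pattern above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the triples are plain tuples, "member of" is an ad hoc predicate label taken from the slide (not a term from any RDF vocabulary or from MusicBrainz), and the helper names `subjects` and `shared_subjects` are hypothetical.

```python
# Triples as plain (subject, predicate, object) tuples.
# "member of" is an illustrative label, not a real vocabulary term.
TRIPLES = [
    ("Sting", "member of", "The Police"),
    ("Brian May", "member of", "Smile"),
    ("Roger Taylor", "member of", "Smile"),
    ("Brian May", "member of", "Queen"),
    ("Roger Taylor", "member of", "Queen"),
]


def subjects(triples, predicate, obj):
    """Return the sorted subjects linked to `obj` by `predicate`."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


def shared_subjects(triples, predicate, obj_a, obj_b):
    """Return the sorted subjects linked by `predicate` to both objects."""
    in_a = set(subjects(triples, predicate, obj_a))
    in_b = set(subjects(triples, predicate, obj_b))
    return sorted(in_a & in_b)


print(subjects(TRIPLES, "member of", "The Police"))
print(shared_subjects(TRIPLES, "member of", "Smile", "Queen"))
```

Membership is easy to state this way; the continuity between Smile and Queen only appears once the two membership sets are intersected, which hints at why influence is harder to express than authorship or band membership.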
15. What is a cover?
The Beatles (ORIGINAL ARTIST)
Joe Cocker (COVERING ARTIST)
covering: „With a Little Help from My Friends"
17. l_recording_work
name: "performance"
description: "This is used to link works to their recordings."
link_phrase: "live medley:medley including a partial instrumental cover recording of"
the original recording of a work is also in this class (along with its covers)
18. Is it a cover? cont.
explicitly declare:
"if an artist makes the first recording of a work, then all of that artist’s later recordings should be excluded from the set of the work’s covers"
19. Is it a cover? cont.
„Yesterday” (work)
original recording artist - The Beatles
later recording „by Paul McCartney”
???
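The declared rule can be sketched in Python as follows. This is a minimal illustration, not the analysis code behind the slides: recordings are assumed to be (work, artist credit, ISO date) tuples, the function name `split_original_and_covers` is hypothetical, and every date and the "Some Cover Band" entry are made-up placeholders.

```python
from collections import defaultdict


def split_original_and_covers(recordings):
    """Map each work to (first artist credit, sorted covering credits).

    `recordings` is an iterable of (work, artist_credit, iso_date) tuples.
    Later recordings credited to the first artist are not counted as covers.
    """
    by_work = defaultdict(list)
    for work, credit, date in recordings:
        by_work[work].append((date, credit))

    result = {}
    for work, recs in by_work.items():
        recs.sort()  # ISO dates sort chronologically as strings
        first_credit = recs[0][1]
        covers = sorted({credit for _, credit in recs if credit != first_credit})
        result[work] = (first_credit, covers)
    return result


# Placeholder dates, chosen only to fix the order of the recordings.
RECORDINGS = [
    ("Yesterday", "The Beatles", "2000-01-01"),
    ("Yesterday", "The Beatles", "2001-01-01"),
    ("Yesterday", "Paul McCartney", "2002-01-01"),
    ("Yesterday", "Some Cover Band", "2003-01-01"),
]

print(split_original_and_covers(RECORDINGS))
```

Because the rule compares credit strings, the later recording credited to Paul McCartney is still counted as a cover of The Beatles, which is exactly the open question the slide raises.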
20. Why not use ?
1. "How to convert to RDF?"
relationship information is not explicit within the SQL schema
2. PostgreSQL - 17GB (textual only!)
<13 million recordings of ~0.5 million works by <0.8 million artists in <1.3 million releases
+ RDF serializations are hugely redundant
21. Top Influential Artists
Authors of the most covered works (l_artist_work with a role of 'composer', 'lyricist' or plainly 'writer')
The horizontal scale shows the number of distinct artist_credits that covered (released a performance of) a work
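-- Per author: count the distinct artist credits that recorded any of the author's works.
-- Note: as written, the query does not restrict l_artist_work rows by role.
SELECT a.name, COUNT(DISTINCT ac.name) AS c
FROM l_recording_work lrw
JOIN recording r ON r.id = lrw.entity0               -- entity0: the recording
JOIN l_artist_work law ON law.entity1 = lrw.entity1  -- entity1: the shared work
JOIN artist a ON a.id = law.entity0                  -- entity0: the author
JOIN artist_credit ac ON ac.id = r.artist_credit
JOIN link l ON l.id = lrw.link
JOIN link_type lt ON lt.id = l.link_type
WHERE lt.name = 'performance'
GROUP BY a.name
ORDER BY c DESC;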
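The query can be tried on a toy database with Python's standard sqlite3 module. The sketch below is an assumption-laden illustration, not the original analysis: the tables are reduced to the columns the query touches, all rows ('Author X', 'Publisher Y', 'Band 1', the 'publishing' link type, work id 100) are invented, and a second join on link and link_type is added to restrict l_artist_work to the roles named on the slide, which is a guess at the intended filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE artist_credit (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE recording (id INTEGER PRIMARY KEY, artist_credit INTEGER);
CREATE TABLE link_type (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE link (id INTEGER PRIMARY KEY, link_type INTEGER);
CREATE TABLE l_recording_work (entity0 INTEGER, entity1 INTEGER, link INTEGER);
CREATE TABLE l_artist_work (entity0 INTEGER, entity1 INTEGER, link INTEGER);

-- Invented toy rows: one work (id 100), one composer, one non-author link.
INSERT INTO artist VALUES (1, 'Author X'), (2, 'Publisher Y');
INSERT INTO artist_credit VALUES (1, 'Band 1'), (2, 'Singer 2'), (3, 'Singer 3');
INSERT INTO recording VALUES (1, 1), (2, 2), (3, 3);
INSERT INTO link_type VALUES (1, 'performance'), (2, 'composer'), (3, 'publishing');
INSERT INTO link VALUES (10, 1), (20, 2), (30, 3);
INSERT INTO l_recording_work VALUES (1, 100, 10), (2, 100, 10), (3, 100, 10);
INSERT INTO l_artist_work VALUES (1, 100, 20), (2, 100, 30);
""")

QUERY = """
SELECT a.name, COUNT(DISTINCT ac.name) AS c
FROM l_recording_work lrw
JOIN recording r ON r.id = lrw.entity0
JOIN artist_credit ac ON ac.id = r.artist_credit
JOIN link l ON l.id = lrw.link
JOIN link_type lt ON lt.id = l.link_type
JOIN l_artist_work law ON law.entity1 = lrw.entity1
JOIN link wl ON wl.id = law.link            -- assumed extra join: author role
JOIN link_type wlt ON wlt.id = wl.link_type
JOIN artist a ON a.id = law.entity0
WHERE lt.name = 'performance'
  AND wlt.name IN ('composer', 'lyricist', 'writer')
GROUP BY a.name
ORDER BY c DESC, a.name
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

With the role filter in place only 'Author X' is returned, with three distinct artist credits; without it, the invented 'Publisher Y' would be ranked as an author too.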
23. Authors vs. Performers
We get Lennon and McCartney, Jagger and Richards, Page and Plant,
i.e. The Beatles, The Rolling Stones, Led Zeppelin
But what about…
…"the King of Rock and Roll"?!?
24. First Performer of a Work cont.
"virtual ownership" of a work
said to be "her song"
l_artist_recording and l_recording_work with link_type = 'performer'
(since l_artist_work holds the authors)
But…
for 'Yesterday' we get
Paul McCartney - link_type = 'vocal'
Paul McCartney - link_type = 'instrument (guitars)'
No 'The Beatles' with link_type = 'performer'
26. First Performer of a Work cont.
recording
track
medium
release
release_country
artist_credit_name holds the performer (band) name
release_country holds the (earliest) dates
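The chain recording, track, medium, release, release_country can be followed with a single join to find the earliest dated release of each recording. The sketch below uses Python's sqlite3 with a toy schema; the column names (track.recording, track.medium, medium.release, release_country.date_year and so on) are assumptions modelled on the MusicBrainz schema, all ids and dates are invented, and rows with an incomplete date are simply skipped, which is one possible policy among several.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recording (id INTEGER PRIMARY KEY);
CREATE TABLE release (id INTEGER PRIMARY KEY);
CREATE TABLE medium (id INTEGER PRIMARY KEY, release INTEGER);
CREATE TABLE track (id INTEGER PRIMARY KEY, recording INTEGER, medium INTEGER);
CREATE TABLE release_country (release INTEGER, date_year INTEGER,
                              date_month INTEGER, date_day INTEGER);

-- Invented rows: each recording appears on two releases.
INSERT INTO recording VALUES (1), (2);
INSERT INTO release VALUES (10), (11), (12), (13);
INSERT INTO medium VALUES (100, 10), (101, 11), (102, 12), (103, 13);
INSERT INTO track VALUES (1000, 1, 100), (1001, 1, 101),
                         (1002, 2, 102), (1003, 2, 103);
INSERT INTO release_country VALUES (10, 1966, 1, 1), (11, 1965, 8, 6),
                                   (12, 1970, 5, 5), (13, NULL, NULL, NULL);
""")

EARLIEST = """
SELECT r.id,
       MIN(printf('%04d-%02d-%02d',
                  rc.date_year, rc.date_month, rc.date_day)) AS first_date
FROM recording r
JOIN track t ON t.recording = r.id
JOIN medium m ON m.id = t.medium
JOIN release rel ON rel.id = m.release
JOIN release_country rc ON rc.release = rel.id
WHERE rc.date_year IS NOT NULL       -- skip releases with incomplete dates
  AND rc.date_month IS NOT NULL
  AND rc.date_day IS NOT NULL
GROUP BY r.id
ORDER BY r.id
"""

earliest = conn.execute(EARLIEST).fetchall()
print(earliest)
```

Zero-padded ISO strings compare in chronological order, so MIN over the formatted text picks the earliest date. The undated release of recording 2 is ignored, which is the kind of gap (NULLs) that makes the "first performer" hard to pin down.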
27. First Performer of a Work cont.
work | 1st performer | earliest date of recording or releasing | nr of covers
Star of the County Down | BBC S. Orchestra | 1945-01-02 | 227
The Christmas Song | Nat King Cole | 1946-06-14 | 212
Over the Rainbow | Judy Garland | 1938-10-07 | 205
Yesterday | The Beatles | 1965-08-06 | 185
Orchestersuite Nr. 3 D-Dur, BWV 1068: II. Air | Pau Casals | 1916-05-05 | 181
Summertime | George Gershwin | 1935-10-14 | 173
Eleanor Rigby | The Beatles | 1966-08-05 | 167
Stardust | Hoagy Carmichael | 1927-10-31 | 157
Moon River | Henry Mancini | 1945-01-02 | 154
Night and Day | Django Reinhardt | 1938-01-31 | 142
Fly Me to the Moon (In Other Words) | Nat King Cole | 1961-12-22 | 133
28. Top Influential Performers
Original performers of the most covered works.
The vertical axis groups works (of different authors) by performers (previous table).
The horizontal log scale shows the number of distinct artists covering each work.
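The grouping behind this chart can be sketched in Python. The rows below are four entries copied from the "First Performer of a Work" table; the function name `group_by_performer` is hypothetical, and the sketch only regroups the data without reproducing the plot.

```python
from collections import defaultdict

# (work, first performer, number of covers), taken from the previous table.
ROWS = [
    ("The Christmas Song", "Nat King Cole", 212),
    ("Yesterday", "The Beatles", 185),
    ("Eleanor Rigby", "The Beatles", 167),
    ("Fly Me to the Moon (In Other Words)", "Nat King Cole", 133),
]


def group_by_performer(rows):
    """Group (work, covers) pairs under each first performer, most covered first."""
    grouped = defaultdict(list)
    for work, performer, covers in rows:
        grouped[performer].append((work, covers))
    for works in grouped.values():
        works.sort(key=lambda pair: pair[1], reverse=True)
    return dict(grouped)


grouped = group_by_performer(ROWS)
for performer, works in grouped.items():
    print(performer, works)
```

The per-work counts are kept separate rather than summed, since the same covering artist may appear under several works of one performer.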
29. Scope of MusicBrainz data
At one time, Guinness World Records cited "Yesterday" (1965)
with the most cover versions of any song ever written – 2,200.
However, "Summertime", an aria composed by George
Gershwin (1935) has been claimed to have well over 30,000
recorded performances.
31. Conclusions
analysis pitfalls and shortcomings come from the massiveness of the music social network
analysis revealed gaps (NULLs) and inconsistencies
problematic MB Editor Guidelines:
"Prefer Specific Relationship Types"
global analysis is hard
better for more specific analysis
"Do not cluster" - the opposite of the Linked Data principle
sparse network of relationships
hard to traverse