Exploring Datasets With SQLite
Context
The European Soccer Database (ESD) is used to study team dynamics and identify the factors that lead to players’ and teams’ success.
Objective
Run queries to inspect its structure through SQLite
Strategies
1. Import the European Soccer Database file into DB Browser (SQLite) and find the total number of tables in the database
2. Using the ‘Country’ table, run a SQL query to show the list of countries in descending order (Z-A) based on the country name
3. Display the specified columns from the ‘Team_Attributes’ table with filtered rows based on ‘buildUpPlaySpeed’
4. List all the players with the specified conditions in a table with the specified columns
Author: Anthony Mok
Date: 18 Nov 2023
Email: xxiaohao@yahoo.com
Unlocking New Insights Into the World of European Soccer Through the European Soccer Database (ESD)
1. Exploring Datasets With SQLite
Unlocking New Insights Into the World of European
Soccer Through the European Soccer Database (ESD)
2. European Soccer Database (ESD)
The European Soccer Database (ESD) is a comprehensive dataset that contains detailed information about:
• European soccer leagues
• Teams
• Players
• Matches
Database’s Characteristics
• Covers 11 European countries
• Data from the 2008-2016 seasons
Entities in the Database
Consists of 7 tables: Country, League, Match, Player, Player_Attributes, Team, and Team_Attributes
3. Project’s Context, Objective & Strategies
Context
The European Soccer Database (ESD) is used to study team dynamics and identify the factors that lead to players’ and teams’ success
Objective
Run queries to inspect its structure through SQLite
Strategies*
• Import the European Soccer Database file into DB Browser (SQLite) and find the total number of tables in the database
• Using the ‘Country’ table, run a SQL query to show the list of countries in descending order (Z-A) based on the country name
• Display the specified columns from the ‘Team_Attributes’ table with filtered rows based on ‘buildUpPlaySpeed’
• List all the players with the specified conditions in a table with the specified columns
* This is just a sample of the many queries run on this database
4. Importing File to SQLite; 8 Tables Found
European Soccer Database in SQLite
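The table count can also be checked with a query instead of reading it off the DB Browser panel. The sketch below is only an illustration: it uses Python's built-in sqlite3 module and a small in-memory stand-in with two made-up tables, not the real ESD file. It also shows one possible reason why a count can come out one higher than the number of data tables: SQLite lists its internal sqlite_sequence table (created when a table uses AUTOINCREMENT) in sqlite_master alongside the user tables. Whether this explains the 7 versus 8 tables in this deck is an assumption.

```python
import sqlite3

# In-memory stand-in for the ESD file (assumption); with the real database,
# pass the path of the downloaded file to sqlite3.connect instead.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Country (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE League (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         country_id INTEGER, name TEXT);
""")

# sqlite_master holds one row per schema object; keep only the tables.
all_tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# Internal bookkeeping tables start with the reserved prefix 'sqlite_'.
user_tables = [name for name in all_tables if not name.startswith("sqlite_")]

print(len(all_tables), all_tables)    # every table, internal ones included
print(len(user_tables), user_tables)  # data tables only
```

Filtering on the reserved `sqlite_` prefix is a simple way to separate data tables from SQLite's own bookkeeping when reporting a count.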
5. Listing of Countries
The following SQL query was used to list the countries in the ‘Country’ table in descending order (Z-A) by country name:
SELECT
name as [List of Countries]
FROM
Country
ORDER BY
name DESC;
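To try the same ordering without the ESD file, the query can be run from Python's built-in sqlite3 module against a small in-memory table. This is a minimal sketch: the four country rows are made-up sample data, and only the SELECT statement mirrors the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Country (id INTEGER PRIMARY KEY, name TEXT)")
# Sample rows for illustration only; the real table is loaded from the ESD file.
conn.executemany("INSERT INTO Country (name) VALUES (?)",
                 [("Belgium",), ("Spain",), ("England",), ("Italy",)])

cursor = conn.execute("""
    SELECT name AS [List of Countries]
    FROM Country
    ORDER BY name DESC
""")
header = cursor.description[0][0]        # column alias as shown in the result grid
countries = [row[0] for row in cursor]   # names sorted Z-A

print(header)
print(countries)
```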
6. Conditionally Display Selected Columns
The following SQL query was used to display, in this order, the fields ‘team_api_id’, ‘date’, ‘buildUpPlaySpeed’, ‘buildUpPlayDribblingClass’ and ‘buildUpPlaySpeedClass’. Only rows with a ‘buildUpPlaySpeed’ greater than 74 but less than 79 were displayed:
SELECT
team_api_id as [Team API ID],
strftime('%d-%m-%Y', date) as [Date], -- only the date was asked for, so the time is excluded
buildUpPlaySpeed as [Build Up Play Speed],
buildUpPlayDribblingClass as [Build Up Play Dribbling Class],
buildUpPlaySpeedClass as [Build Up Play Speed Class]
FROM
Team_Attributes
WHERE
buildUpPlaySpeed BETWEEN 75 AND 78; /* the required range is > 74 and < 79, which excludes
both of these numbers; BETWEEN is inclusive, so
75 and 78 are used as its bounds
*/
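The choice of bounds can be checked on a toy table. The sketch below, using Python's built-in sqlite3 module, assumes integer speed values and a date stored as 'YYYY-MM-DD HH:MM:SS' text; the four rows are made up and the table carries only the columns needed for the comparison. It runs the inclusive BETWEEN filter next to the strict inequalities it is meant to match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Team_Attributes (
                    team_api_id INTEGER, date TEXT, buildUpPlaySpeed INTEGER)""")
# Made-up rows sitting on and around the boundaries of the filter.
conn.executemany("INSERT INTO Team_Attributes VALUES (?, ?, ?)", [
    (1, "2010-02-22 00:00:00", 74),
    (2, "2011-02-22 00:00:00", 75),
    (3, "2012-02-22 00:00:00", 78),
    (4, "2013-09-20 00:00:00", 79),
])

select = """SELECT team_api_id, strftime('%d-%m-%Y', date), buildUpPlaySpeed
            FROM Team_Attributes WHERE {} ORDER BY team_api_id"""

# Inclusive range, as written on the slide.
between_rows = conn.execute(
    select.format("buildUpPlaySpeed BETWEEN 75 AND 78")).fetchall()
# Strict inequalities, as worded in the requirement.
strict_rows = conn.execute(
    select.format("buildUpPlaySpeed > 74 AND buildUpPlaySpeed < 79")).fetchall()

print(between_rows)
print(strict_rows)
```

The two filters agree only because the values are whole numbers; if the column held fractional values such as 74.5, the strict-inequality form would be the safer way to express the requirement.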
7. Conditionally Display Selected Players
The following SQL query (found in the next slide) was used to list all the players who meet the following conditions:
• height: more than 181 cm
• preferred_foot: right
• attacking_work_rate: high
The data is shown in a table with the fields in the following order: ‘player_api_id’, ‘player_name’, ‘height’, ‘attacking_work_rate’ and ‘preferred_foot’
8. Conditionally Display Selected Players
SELECT
p.player_api_id as "Player's API ID", -- aliases were used to improve the readability of this query
p.player_name as "Player's Name",
p.height as "Player's Height",
pa.attacking_work_rate as [Attacking Work Rate],
pa.preferred_foot as [Preferred Foot]
FROM
Player as p
INNER JOIN
Player_Attributes as pa /* since the data points come from two tables, an INNER JOIN was executed in this query
to avoid creating NULL returns
*/
ON
p.player_api_id = pa.player_api_id
WHERE
p.height > 181
AND pa.preferred_foot = 'right' /* while the syntax of AND in the WHERE clause was not taught in the modules,
research was conducted through the SQL documentation,
and this WHERE clause was constructed this way
*/
AND pa.attacking_work_rate = 'high'
GROUP BY
p.player_api_id; /* observed that there were many rows with the same player_api_id and the same data points,
so GROUP BY was used to narrow the returns of this SQL query from 12,309 rows to 1,032 rows
*/
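The effect of the GROUP BY can be reproduced on a toy dataset. The sketch below, using Python's built-in sqlite3 module, is an illustration with made-up players and attribute records, and the tables carry only the columns the query needs. It assumes, as the comment above suggests, that Player_Attributes holds several records per player, so the join returns one row per matching record until the rows are collapsed. SELECT DISTINCT is shown as an alternative that gives the same rows here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Player (player_api_id INTEGER, player_name TEXT, height REAL);
    CREATE TABLE Player_Attributes (player_api_id INTEGER, date TEXT,
                                    preferred_foot TEXT, attacking_work_rate TEXT);
    -- Made-up sample data: player 1 has two matching attribute records.
    INSERT INTO Player VALUES (1, 'Player A', 185.0), (2, 'Player B', 178.0),
                              (3, 'Player C', 190.0);
    INSERT INTO Player_Attributes VALUES
        (1, '2015-01-01', 'right', 'high'),
        (1, '2016-01-01', 'right', 'high'),
        (2, '2015-01-01', 'right', 'high'),
        (3, '2015-01-01', 'left',  'high');
""")

base = """FROM Player AS p
          INNER JOIN Player_Attributes AS pa
              ON p.player_api_id = pa.player_api_id
          WHERE p.height > 181
            AND pa.preferred_foot = 'right'
            AND pa.attacking_work_rate = 'high'"""
columns = ("p.player_api_id, p.player_name, p.height, "
           "pa.attacking_work_rate, pa.preferred_foot")

# One row per matching attribute record: duplicates per player.
joined = conn.execute(f"SELECT {columns} {base}").fetchall()
# One row per player, as on the slide.
grouped = conn.execute(
    f"SELECT {columns} {base} GROUP BY p.player_api_id").fetchall()
# Alternative: drop rows that are identical across all selected columns.
distinct = conn.execute(f"SELECT DISTINCT {columns} {base}").fetchall()

print(len(joined), len(grouped), len(distinct))
```

Both forms agree in this case because every selected column is either fixed per player or pinned by the WHERE clause; if a column that varies between records (such as the attribute date) were selected, the two would no longer be interchangeable.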