Spotify
Source
We use two Spotify datasets. The first is available for download at: www.kaggle.com/datasets/ksuqing/spotify-data-1986-2023
The second was obtained from: github.com/Juanfra21/data-science/blob/main/Final-Project/spotify_dataset.csv
Description of the Data
Spotify Data 1
The first Spotify dataset contains Spotify tracks with detailed audio features and a popularity score (0–100).
Here is information on the variables in the dataset:
Feature | Description |
---|---|
track_id | Unique identifier for the track |
track_name | Name of the track |
popularity | Popularity score (0–100) based on Spotify plays |
available_markets | Markets/countries where the track is available |
disc_number | Disc number (for albums with multiple discs) |
duration_ms | Track duration in milliseconds |
explicit | Whether the track contains explicit content (True/False) |
track_number | Position of the track within the album |
href | Spotify API endpoint URL for the track |
album_id | Unique identifier for the album |
album_name | Name of the album |
album_release_date | Release date of the album |
album_type | Album type (album, single, compilation) |
album_total_tracks | Total number of tracks in the album |
artists_names | Names of the artists on the track |
artists_ids | Unique identifiers of the artists |
principal_artist_id | ID of the principal/primary artist |
principal_artist_name | Name of the principal/primary artist |
artist_genres | Genres associated with the principal artist |
principal_artist_followers | Number of Spotify followers of the principal artist |
acousticness | Confidence measure of whether the track is acoustic (0–1) |
analysis_url | Spotify API URL for detailed track analysis |
danceability | How suitable a track is for dancing (0–1) |
energy | Intensity and activity measure of the track (0–1) |
instrumentalness | Predicts whether a track contains no vocals (0–1); values closer to 1 indicate a more likely instrumental track |
key | Estimated key of the track (integer, e.g. 0=C, 1=C#/Db) |
liveness | Presence of an audience in the recording (0–1) |
loudness | Overall loudness of the track in decibels (dB) |
mode | Modality of the track (1=major, 0=minor) |
speechiness | Presence of spoken words (0–1) |
tempo | Estimated tempo in beats per minute (BPM) |
time_signature | Estimated overall time signature |
valence | Musical positivity/happiness of the track (0–1) |
year | Year the track was released |
duration_min | Track duration in minutes |
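As a quick sanity check on the variables listed above, the sketch below converts duration_ms to minutes and flags audio features that fall outside the documented 0–1 range. It is only an illustration: the two sample rows are made up, the helper names `ms_to_minutes` and `out_of_range_features` are hypothetical, and it assumes duration_min is simply duration_ms divided by 60,000.

```python
import csv
import io

# Hypothetical two-row sample using column names from the table above;
# the values are invented for illustration and are not taken from the dataset.
SAMPLE_CSV = """track_id,track_name,popularity,duration_ms,danceability,energy,valence
t1,Example Song A,72,210000,0.65,0.80,0.54
t2,Example Song B,15,185500,0.42,1.30,0.33
"""

# A subset of the features documented as lying in the range 0-1.
UNIT_INTERVAL_FEATURES = ["danceability", "energy", "valence"]


def ms_to_minutes(duration_ms):
    """Convert a duration in milliseconds to minutes."""
    return duration_ms / 60000


def out_of_range_features(row, features=UNIT_INTERVAL_FEATURES):
    """Return the names of features in `row` whose value lies outside [0, 1]."""
    return [name for name in features if not 0.0 <= float(row[name]) <= 1.0]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    # Assumption: duration_min is duration_ms expressed in minutes.
    row["duration_min"] = ms_to_minutes(int(row["duration_ms"]))
    row["range_errors"] = out_of_range_features(row)
    print(row["track_id"], row["duration_min"], row["range_errors"])
```

To run the same checks on the real file, replace `io.StringIO(SAMPLE_CSV)` with an open handle to the downloaded CSV and extend `UNIT_INTERVAL_FEATURES` with the other 0–1 columns.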
Spotify Data 2
This dataset contains 114,000 Spotify tracks with their audio features and related metadata. Each row corresponds to a single track. The data was obtained through Spotify’s Web API. It includes both numerical and descriptive features.
More information about each column is available at: huggingface.co/datasets/maharshipandya/spotify-tracks-dataset
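One way to see which columns are numerical and which are descriptive is to test whether every value in a column parses as a number. The sketch below does this on a small invented sample. The column names are assumptions based on the linked dataset card, the helper names `is_number` and `split_columns` are hypothetical, and the check is a heuristic rather than a definitive type inference.

```python
import csv
import io

# Hypothetical sample; column names are assumed from the dataset card and
# the values are invented for illustration.
SAMPLE_CSV = """track_name,artists,track_genre,popularity,danceability,tempo
Example Song A,Artist One,acoustic,73,0.676,87.917
Example Song B,Artist Two,rock,55,0.420,120.050
"""


def is_number(text):
    """Return True if `text` can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def split_columns(rows):
    """Split column names into (numeric, descriptive) by inspecting every value."""
    numeric, descriptive = [], []
    for name in rows[0]:
        if all(is_number(row[name]) for row in rows):
            numeric.append(name)
        else:
            descriptive.append(name)
    return numeric, descriptive


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
numeric_cols, descriptive_cols = split_columns(rows)
print("numeric:", numeric_cols)
print("descriptive:", descriptive_cols)
```

Note that this heuristic treats text such as "nan" or "inf" as numeric, so on the full file it is worth spot-checking the result against the column documentation.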