Introduction
I am a music lover and, as with my other hobbies, I am really interested in applying data science methods to it. A few months ago I participated in the third week of the TidyTuesday project, where I made a map of Spotify songs based on audio features and a dimensionality reduction algorithm called UMAP. Since then I have been using Spotify’s Web API to collect data, and recently I decided to look at some of my favorite Iranian artists and their songs on Spotify.
1. Introduction
It wasn’t possible for many of us to watch every 2020 Democratic Primary debate, yet it was important for some of us to know what happened during them. In my case, I read about the debates in online newspapers or watched debate highlights on YouTube the next day. However, these sources give only a short summary of a debate or broadcast just the portion that includes a heated exchange of opinions between candidates.
I am not a US citizen, nor have I been to the United States, but that does not mean that I should not care about the result of the US presidential election. The outcome of the election plays an important role in my life and in the lives of almost everyone else around the world. So, I have been following US politics for a few years.
I consider everything and every issue around me as a data science problem and an opportunity to use data science.
Since the conquest of Persia (now Iran) by the Muslim forces in the 7th century, Arabic culture and language have had a huge influence on Iran and Iranians. Although Iran never fully adopted Arabic as its main language, the new Persian (Farsi) language is a mix of Arabic and the old Persian (Pahlavi), and the two languages use almost the same alphabet for writing. Also, in some parts of Iran, Arabic is the language of daily life.
In the 4th week of the Tidy Tuesday project, a very interesting and fun dataset was proposed to the data science community. The dataset contains information about thousands of songs on Spotify’s platform, along with their metadata and audio features. You can download the dataset using the following piece of code.
spotify_songs <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv')
head(spotify_songs)
## # A tibble: 6 x 23
##   track_id track_name track_artist track_popularity track_album_id
##   <chr>    <chr>      <chr>                   <dbl> <chr>
## 1 6f807x0~ I Don't C~ Ed Sheeran                 66 2oCs0DGTsRO98~
## 2 0r7CVbZ~ Memories ~ Maroon 5                   67 63rPSO264uRjW~
## 3 1z1Hg7V~ All the T~ Zara Larsson               70 1HoSmj2eLcsrR~
## 4 75Fpbth~ Call You ~ The Chainsm~               60 1nqYsOef1yKKu~
## 5 1e8PAfc~ Someone Y~ Lewis Capal~               69 7m7vv9wlQ4i0L~
## 6 7fvUMiy~ Beautiful~ Ed Sheeran                 67 2yiy9cd2QktrN~
## # .
For six years I used Python exclusively as the main tool for carrying out my data science tasks and running my experiments. Recently, I started using Tidyverse packages and tools in R for my data science activities. I am completely fascinated by how easy these tools make it to perform analyses and create nice visualizations. Since then I have tried to participate in the weekly Tidy Tuesday project.