I: Abstract

Have you ever finished listening to a playlist on Spotify and had completely different, random, and unrelated songs begin playing in succession? If not, have you ever finished listening to a playlist on Spotify and wonder what songs are similar to the playlist that you just finished? Oftentimes, we find ourselves in a certain mood after listening to a specific type of music. For example, after listening to a pop playlist, we may be in a good mood or feeling somewhat energetic. For that reason, we would not want to start listening to slower paced music such as classical, shortly after finishing the pop playlist. Our group exists to solve this problem by finding a way to order other playlists based on similarity to the first playlist so that we can be in one continuous state of mind.


II: Introduction

Once again, through this project, our group would like to create somewhat of a recommender system that orders a second playlist based off of similarity to a first playlist. We plan to go about doing this by first taking data from two playlists that one of our group members has on Spotify. Then, we will do some introductory data analysis to determine if the playlists have any similarities before ordering them. We will then begin the process of trying to order a playlist in terms of similarity to the first. One method that we will attempt is to average numerical variables from the songs in the first playlist as a way to categorize the first playlist as one “type” of music. Next, we will find the error between the numerical values of songs in the second playlist and the average values of the first. This will allow us to see which songs are similar to the first playlist in terms of those variables. One big deciding factor for us is to choose which variables to use as we do not want our program to be too broad, but at the same time, we want to include all the necessary variables in order that we may determine what constitutes as similar or not.


III: Data

The datasets that we will be using for this project are two Spotify playlists from one of our group members. We were able to obtain the data from a website known as Organizeyourmusic.com which is linked here. This website was created by Paul Lamere who builds music recommenders at Spotify. His twitter is linked here. This specifc website was created on August 6, 2016 during The Science of Music Hackathon in NYC. The website runs in conjunction with Spotify in order to give the user data on their music tastes and playlists. After signing into your Spotify account on the website, the user is able to get information on all their playlists. For this project, we are using two playlists that we titled playlist1 and playlist2 for simplicity. The first dataset has 55 observations while the second has 67. In addition, both datasets have 13 variables:

  • Title (title): the song name
  • Artist (artist): The music artist that the song belongs to, excluding features
  • Genre (genre): The genre of music that the song falls under
  • Beats Per Minute (bpm): The tempo of the song
  • Energy (enrgy): The energy of the song. The higher the value, the more energetic the song is
  • Danceability (dance): The higher the value, the more energetic the song is
  • Loudness (dB): The higher the value, the louder the song
  • Liveness (live): The higher the value, the more likely the song is a live recording
  • Valence (val): The higher the value, the more positive mood is for the song
  • Duration (dur): The length of the song
  • Acousticness (acous): The higher the value, the more acoustic the song is
  • Speechiness (spch): The higher the value, the more spoken word the song contains
  • Popularity (pop): The higher the value, the more popular the song is

A lot of these variables seem awfully subjective and we are not completely sure how these are all measured, but the creator is a credible source so we are using his playlist program. All the variables in this dataset will be useful for this project, especially the numerical ones where we can do most of our calculations to determine if a song is similar to the other playlist or not. These numerical variables include BPM, energy, danceability, loudness, liveness, valence, duration, acousticness, speechiness, and popularity.


IV: Exploratory Data Analysis

This chart displays the number of songs within each genre of the first playlist. The first thing that we noticed when quickly glancing over this chart is one of the genres is listed as “NA” which we didn’t originally notice when looking at the raw dataset. The only reason I can think of for this labeling is that the program that converts all your playlist into a dataframe could not identify the genre for this one particular song, which happens to be “All in Time.”

In addition, it is clear that the genre with the most songs in this playlist is under the category of k-pop.


The above chart shows the number of songs within each genre of the second playlist, the playlist that we are ordering based on the first playlist. In comparison to the first playlist, there is not much difference in terms of the number of songs in the playlist as the first one has 55 songs while the second has 67 songs. However, there is a wider variety of genres in this second playlist as this playlist has 30 genres while the first has 18 genres.

Another important piece of information that we can take away from these two charts is that k-pop was the most frequent genre in both the first and second playlist as there were 17 k-pop songs in playlist 1 and 10 and playlist 2. For this reason, we can assume that the two playlists are relatively similar as both of them have the most songs in the same genre: k-pop. Therefore when ordering the second playlist, we can make the prediction that the k-pop songs will be at the top of the list as they will have the most similarity to the songs in the first playlist.


The next thing that we decided to do in order to get a better view of the data was to organize the dataframe based on artist. We grouped both the playlists based on artist in order to determine how many songs per artist were in the playlists. We then ordered them by which artists had the most songs on the playlist and displayed the top 5.

Top 5 Artists in Playlist1
Artist Number of Songs
88rising 8
DPR LIVE 8
Bazzi 3
pH-1 3
Rich Brian 3
Top 5 Artists in Playlist2
Artist Number of Songs
BLACKPINK 7
21 Savage 4
Friday Night Plans 4
Ariana Grande 3
Justin Hurwitz 3

As shown above, there is not much overlap between the top artists of both playlists as we do not see any of the top 5 artists in playlist one as top artists in playlist two.



Above are two different graphs that we coded in python using seaborn and matplotlib. One is a distplot and the other is a histogram, showing all ten numerical variables in the dataset. These graphs give us a better idea of where the average or median will be in relation to each of these variables.


V: Analysis and Discussion

V.I: Ordering Playlist: Attempt 1

Moving onto writing the program that sorts the second playlist based on the first playlist, we first want to average out all the values for the numerical variables in the first playlist. We decided upon an average as we thought that this would be the best method for finding the values that best describe the playlist as a whole. While using a median is also a viable option, we felt as if there weren’t enough songs in this first playlist for a median to be truly effective. Once we took the average of all the variables in the first playlist, we added them to the second dataset as columns so that later we could mutate the dataset to find distances from the average.

Next, we took the numerical values from the second playlist and found the difference between those values and the average values from the first playlist. Taking the absolute value of that difference gives us the distance from the average of the first playlist.

Finally, mutating the dataset once more by adding up all ten distances from the averages of the first playlist gives us the total error/distance between each song in the second playlist and the average from the first playlist. Ordering the playlist in ascending order based on the error gives us the second playlist reordered based on similarity to the first. The chart below shows the top ten songs from the second playlist that are the most similar to the first based on the ten numerical variables in the dataset.

Reordered Playlist2 Using All Numerical Variables
Title Artist Genre Error From Avg
Better - SG Lewis x Clairo SG Lewis alternative r&b 83.74545
Don’t Know What To Do BLACKPINK k-pop 91.58182
This Could Be Us Rae Sremmurd hip hop 92.52727
New Rules Dua Lipa dance pop 96.58182
Rich Nigga Shit (feat. Young Thug) 21 Savage atl hip hop 100.90909
In My Feelings Kehlani pop 101.89091
I Didn’t Realize How Empty My Bed Was Until You Left Roderick Porter emo rap 102.65455
Heebiejeebies - Bonus Aminé hip hop 103.92727
all my friends 21 Savage atl hip hop 104.58182
PLAYING WITH FIRE BLACKPINK k-pop 108.70909

V.II: Testing For Accuracy

As for testing the accuracy of the results of our program, it is quite difficult as the similarity of songs can be a rather subjective topic. Simply stated, one might listen to our reordered playlist and think the first few songs sound nothing like playlist one, but someone else might think that they sound similar. For this reason, we decided to test our program using a baseline playlist that has less variability in it. Now that I’m thinking about it, I am not completely sure why we did not do this in the first place.

Our new playlist one is a playlist that contains purely EDM. We renamed this dataset “edmplaylist” and it contains 83 songs that are all either of genre “EDM” or a subgenre of EDM. For reference, EDM is an acronym for Electronic Dance Music and is known to be faster in terms of pace and contains less lyrics. For that reason, we expect a higher average BPM, dance, and energy as well as a lower average speech count. Since all the songs are similar, we would also expect the second playlist to be reordered with EDM and EDM-like songs at the top.

For our test playlist, we decided to use one of our group member’s November playlist which contains 18 songs, 6 of which are EDM and the rest being a mixture of rap and hip hop. Using a shorter playlist for the second playlist will allow us to test for accuracy better.

We started by graphing a histogram based on the number of songs in each genre for each playlist. Below, we can see that the first playlist is all EDM and subgenres of EDM such as dance pop, dubstep, and big room.

The histogram for the number of songs by genre for the November playlist is shown below. There are 2 EDM, 1 dance pop, one chillstep, 1 brostep and 1 big room song, summed up to a total of 6 EDM related songs while the rest are hip hop or rap related. Once again, by reordering this playlist based on the EDM playlist, we would expect to see the EDM related songs at the top of the list.

We then ran our program again, using the EDM playlist averages to get the November playlist reordered below. We decided to show all the songs this time as the playlist is rather short to begin with.

Reordered Nov. Playlist Using All Variables
Title Artist Genre Error From Avg
Fragile (feat. Melanie Fontana) ARMNHMR edm 70.09639
Waiting For You (feat. RUNN) Trivecta chillstep 115.67470
Love U Right - Yetep & Zephure Remix Tritonal big room 122.93976
Safe With Me (with Audrey Mika) Gryffin dance pop 125.43373
Better Off Lonely Nurko edm 130.44578
Don’t Need Friends (feat. Lil Baby) NAV canadian hip hop 169.54217
FEEL SOMETHING (feat. Marshmello) The Kid LAROI australian hip hop 180.03614
1400 / 999 Freestyle Trippie Redd melodic rap 181.85542
Your Fault Excision brostep 181.93976
Repercussions (with Young Thug) NAV canadian hip hop 191.25301
F*CK YOU, GOODBYE (feat. Machine Gun Kelly) The Kid LAROI australian hip hop 202.61446
Young Wheezy (with Gunna) NAV canadian hip hop 204.48193
WITHOUT YOU The Kid LAROI australian hip hop 207.72289
TRAGIC (feat. Youngboy Never Broke Again & Internet Money) The Kid LAROI australian hip hop 211.66265
PIKACHU The Kid LAROI australian hip hop 243.60241
ALWAYS DO The Kid LAROI australian hip hop 246.85542
Diamond Choker Lil Gnar atl hip hop 258.44578

As seen above, most of the EDM songs are towards the top of the playlist with all 6 of them being in the top 9 songs in the reordered playlist. While this is rather accurate in our opinion, we felt as if we could do a better job in terms of our program’s accuracy as there were still a couple of hip hop songs that were classified as more similar than some EDM songs.

Our next step in attempting to make our program more accurate was to limit the number of variables that we were using. Using more variables could throw the distance from the average off. For example, one song might be very similar to the average of the other playlist, but the fact that the song is 10 minutes long would make the margin of error extremely large, thus leaving it towards the bottom of the reordered playlist.

For this reason, we decided to remove the variables loudness, duration, and popularity due to the fact that we felt as if these variables did not do a good job of telling if a song was similar to the playlist or not.

Reordered Nov. Playlist Using Limited Variables
Title Artist Genre Error From Avg
Fragile (feat. Melanie Fontana) ARMNHMR edm 59.20482
Love U Right - Yetep & Zephure Remix Tritonal big room 65.90361
FEEL SOMETHING (feat. Marshmello) The Kid LAROI australian hip hop 66.92771
Waiting For You (feat. RUNN) Trivecta chillstep 74.87952
Better Off Lonely Nurko edm 83.55422
Your Fault Excision brostep 83.87952
Safe With Me (with Audrey Mika) Gryffin dance pop 89.22892
F*CK YOU, GOODBYE (feat. Machine Gun Kelly) The Kid LAROI australian hip hop 93.50602
Young Wheezy (with Gunna) NAV canadian hip hop 105.27711
Don’t Need Friends (feat. Lil Baby) NAV canadian hip hop 105.33735
1400 / 999 Freestyle Trippie Redd melodic rap 106.74699
Repercussions (with Young Thug) NAV canadian hip hop 109.04819
TRAGIC (feat. Youngboy Never Broke Again & Internet Money) The Kid LAROI australian hip hop 111.55422
WITHOUT YOU The Kid LAROI australian hip hop 113.61446
PIKACHU The Kid LAROI australian hip hop 127.49398
ALWAYS DO The Kid LAROI australian hip hop 141.74699
Diamond Choker Lil Gnar atl hip hop 164.33735

After taking out those variables, we reran the program on the data once more and our results are shown above. This time, all 6 of the EDM related songs were in the top 7 most similar songs to the EDM playlist. In addition, the only non-EDM song in the top 7 is a hip hop song produced by Marshmello who is technically an EDM producer so it would make sense that the song has some similarity to the EDM playlist.

This method seems as if it is a more accurate way of reordering playlists based on similarity to another one.


V.III: Reordering Playlist: Attempt 2

We will now go back to our original two playlists and run the limited variables program on them to reorder the second playlist at a higher level of accuracy this time.

Reordered Playlist2 Using Limited Variables
Title Artist Genre Error From Avg
WAKE UP Travis Scott rap 54.92727
20 Min Lil Uzi Vert melodic rap 55.92727
New Rules Dua Lipa dance pop 57.92727
Herman’s Habit Justin Hurwitz hollywood 58.40000
This Could Be Us Rae Sremmurd hip hop 62.87273
Heebiejeebies - Bonus Aminé hip hop 67.40000
I Didn’t Realize How Empty My Bed Was Until You Left Roderick Porter emo rap 68.61818
Can’t Feel My Face The Weeknd canadian contemporary r&b 68.92727
What It Is Quality Control hip hop 69.47273
Don’t Know What To Do BLACKPINK k-pop 69.92727

Above are the top 10 songs in playlist2 that are considered to be most similar to the average of playlist1 according to our program that uses the variables of BPM, energy, danceability, liveness, valence, acousticness, and speechiness. We can assume that this is pretty accurate based on our previous test of the program.


VI: Conclusion

One of the main problems that we had with this project is that it is rather difficult to test for accuracy in terms of how similar a song is to another playlist. Similarity in this case is rather subjective as one person might think they sound the same while another does not. Looking back on our approach to this project, our group probably should not have used two playlists that had such a wide variety of genres as our baseline playlists. Because of this, we had no idea how accurate the reordered playlist was in comparison to the first playlist. With that being said, this forced us to test our program by using other playlists, one of which containing songs of only one genre. Using a playlist of only one genre allowed us to test which variables would make songs of that same genre to have a lower margin of error, thus making them higher in the reordered playlist. In the end, we decided that using the variables BPM, energy, danceability, liveness, valence, acousticness, and speechiness made our program the most accurate in terms of ordering the playlist based on similarity to the other one. On a side note, this project also required us to trust the data that was pulled from the playlists. For example, how does one determine a numerical value for the danceability of a song? That sounds rather subjective as well. Lastly, our group has a few ideas of how we could possibly make our program more accurate. One thing we can do is use a median instead of an average as a way to get collective values for the entire baseline playlist. Another direction we could take is running more tests on various playlists and using different combinations of variables to see if we could make it more accurate. All in all, we managed to complete our goal of reordering a playlist based on similarity to another playlist. If you are ever really enjoying a playlist and want to see which of your own songs are similar to it, we can easily help.


VIII: Appendix

First half contains all the R code while everything in comments is Python code that our group wrote in another environment. All code is our own work.

library(tidyverse)
library(sf)
library(readr)
library(USAboundaries)
library(USAboundariesData)
library(rnaturalearth)
library(rnaturalearthdata)
library(scales)
playlist1 <- read_csv("~/github/dsclub-spotify-recommender/data/Spotify Playlist 1 - My Spotify Playlist-2.csv")

playlist2 <- read_csv("~/github/dsclub-spotify-recommender/data/Playlist 2 (make a queue for this playlist) - Sheet1.csv")

no_songs = playlist1 %>% 
  group_by(genre) %>% 
  summarize(Num_of_songs = n())

ggplot(data = no_songs, aes(x = genre, y= Num_of_songs), las=2) +
  geom_bar(stat="identity", color='#2e4057', fill='#66a182') +
  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5)) +
  labs(x = 'Genre',
       y = 'Number of Songs',
       title = 'Total Number of Songs by Genre in Playlist 1',
       caption = "Based on Playlist 1")
playlist2 = playlist2 %>% 
  rename(
    enrgy = nrgy,
    dance = dnce,
    genre = 'top genre'
  )

no_songs2 = playlist2 %>% 
  group_by(genre) %>% 
  summarize(Num_of_songs = n())

ggplot(data = no_songs2, aes(x = genre, y= Num_of_songs), las=2) +
  geom_bar(stat="identity", color='#2e4057', fill='#66a182') +
  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5)) +
  labs(x = 'Genre',
       y = 'Number of Songs',
       title = 'Total Number of Songs by Genre in Playlist 2',
       caption = "Based on Playlist 2")
topartists1 = playlist1 %>% 
  group_by(artist) %>% 
  summarise(Num_songs = n()) %>% 
  arrange(-Num_songs) %>% 
  head(5)

knitr::kable(topartists1, caption = "Top 5 Artists in Playlist1", col.names = c("Artist","Number of Songs"), "simple", format.args = list(big.mark = ",", 
  scientific = FALSE))
topartists2 = playlist2 %>% 
  group_by(artist) %>% 
  summarise(Num_songs = n()) %>% 
  arrange(-Num_songs) %>% 
  head(5)

knitr::kable(topartists2, caption = "Top 5 Artists in Playlist2", col.names = c("Artist","Number of Songs"), "simple", format.args = list(big.mark = ",", 
  scientific = FALSE))
avgbpm = mean(playlist1$bpm)
avgenrgy = mean(playlist1$enrgy)
avgdance = mean(playlist1$dance)
avgdB = mean(playlist1$dB)
avglive = mean(playlist1$live)
avgval = mean(playlist1$val)
avgdur = mean(playlist1$dur)
avgacous = mean(playlist1$acous)
avgspch = mean(playlist1$spch)
avgpop = mean(playlist1$pop)

playlist2a = playlist2

playlist2a$bpmavg = avgbpm
playlist2a$enrgyavg = avgenrgy
playlist2a$danceavg = avgdance
playlist2a$dBavg = avgdB
playlist2a$liveavg = avglive
playlist2a$valavg = avgval
playlist2a$duravg = avgdur
playlist2a$acousavg = avgacous
playlist2a$spchavg = avgspch
playlist2a$popavg = avgpop
playlist2a = playlist2a %>% 
  mutate(bpmdiff = abs(bpm - bpmavg)) %>% 
  mutate(enrgydiff = abs(enrgy - enrgyavg)) %>% 
  mutate(dancediff = abs(dance - danceavg)) %>% 
  mutate(dBdiff = abs(dB - dBavg)) %>% 
  mutate(livediff = abs(live - liveavg)) %>% 
  mutate(valdiff = abs(val - valavg)) %>% 
  mutate(durdiff = abs(dur - duravg)) %>% 
  mutate(acousdiff = abs(acous - acousavg)) %>% 
  mutate(spchdiff = abs(spch - spchavg)) %>% 
  mutate(popdiff = abs(pop - popavg))
playlist2a = playlist2a %>% 
  mutate(totaldiff = bpmdiff + enrgydiff + dancediff + dBdiff + livediff + valdiff + durdiff + acousdiff + 
           spchdiff + popdiff)

arrangedplaylist2 = playlist2a %>% 
  arrange(totaldiff) %>% 
  select(title, artist, genre, totaldiff) %>% 
  head(10)

knitr::kable(arrangedplaylist2, caption = "Reordered Playlist2 Using All Numerical Variables", col.names = c("Title","Artist","Genre","Error From Avg"), "simple", format.args = list(big.mark = ",", 
  scientific = FALSE))
edmplaylist <- read_csv("~/github/dsclub-spotify-recommender/data/edmplaylist - Sheet1.csv")

novplaylist <- read_csv("~/github/dsclub-spotify-recommender/data/novplaylist - Sheet1.csv")

no_songs_edm = edmplaylist %>% 
  group_by(genre) %>% 
  summarize(Num_of_songs = n())

ggplot(data = no_songs_edm, aes(x = genre, y= Num_of_songs), las=2) +
  geom_bar(stat="identity", color='#2e4057', fill='#66a182') +
  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5)) +
  labs(x = 'Genre',
       y = 'Number of Songs',
       title = 'Total Number of Songs by Genre in EDM Playlist',
       caption = "Based on EDM Playlist")
no_songs_nov = novplaylist %>% 
  group_by(genre) %>% 
  summarize(Num_of_songs = n())

ggplot(data = no_songs_nov, aes(x = genre, y= Num_of_songs), las=2) +
  geom_bar(stat="identity", color='#2e4057', fill='#66a182') +
  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.5)) +
  labs(x = 'Genre',
       y = 'Number of Songs',
       title = 'Total Number of Songs by Genre in November Playlist',
       caption = "Based on November Playlist")
avgedmbpm = mean(edmplaylist$bpm)
avgedmenrgy = mean(edmplaylist$enrgy)
avgedmdance = mean(edmplaylist$dance)
avgedmdB = mean(edmplaylist$dB)
avgedmlive = mean(edmplaylist$live)
avgedmval = mean(edmplaylist$val)
avgedmdur = mean(edmplaylist$dur)
avgedmacous = mean(edmplaylist$acous)
avgedmspch = mean(edmplaylist$spch)
avgedmpop = mean(edmplaylist$pop)

novplaylist1a = novplaylist

novplaylist1a$bpmavg = avgedmbpm
novplaylist1a$enrgyavg = avgedmenrgy
novplaylist1a$danceavg = avgedmdance
novplaylist1a$dBavg = avgedmdB
novplaylist1a$liveavg = avgedmlive
novplaylist1a$valavg = avgedmval
novplaylist1a$duravg = avgedmdur
novplaylist1a$acousavg = avgedmacous
novplaylist1a$spchavg = avgedmspch
novplaylist1a$popavg = avgedmpop

novplaylist1a = novplaylist1a %>% 
  mutate(bpmdiff = abs(bpm - bpmavg)) %>% 
  mutate(enrgydiff = abs(enrgy - enrgyavg)) %>% 
  mutate(dancediff = abs(dance - danceavg)) %>% 
  mutate(dBdiff = abs(dB - dBavg)) %>% 
  mutate(livediff = abs(live - liveavg)) %>% 
  mutate(valdiff = abs(val - valavg)) %>% 
  mutate(durdiff = abs(dur - duravg)) %>% 
  mutate(acousdiff = abs(acous - acousavg)) %>% 
  mutate(spchdiff = abs(spch - spchavg)) %>% 
  mutate(popdiff = abs(pop - popavg))

novplaylist1a = novplaylist1a %>% 
  mutate(totaldiff = bpmdiff + enrgydiff + dancediff + dBdiff + livediff + valdiff + durdiff + acousdiff + 
           spchdiff + popdiff)

novplaylist1 = novplaylist1a %>% 
  arrange(totaldiff) %>% 
  select(title, artist, genre, totaldiff) 

knitr::kable(novplaylist1, caption = "Reordered Nov. Playlist Using All Variables", col.names = c("Title","Artist","Genre","Error From Avg"), "simple", format.args = list(big.mark = ",", 
  scientific = FALSE))
novplaylist1a = novplaylist1a %>% 
  mutate(totaldiff1 = bpmdiff + enrgydiff + dancediff + livediff + valdiff + acousdiff + 
           spchdiff)

novplaylist2 = novplaylist1a %>% 
  arrange(totaldiff1) %>% 
  select(title, artist, genre, totaldiff1) 

knitr::kable(novplaylist2, caption = "Reordered Nov. Playlist Using Limited Variables", col.names = c("Title","Artist","Genre","Error From Avg"), "simple", format.args = list(big.mark = ",", 
  scientific = FALSE))
playlist2a = playlist2a %>% 
  mutate(totaldiff1 = bpmdiff + enrgydiff + dancediff + livediff + valdiff + acousdiff + 
           spchdiff)

arrangedplaylist3 = playlist2a %>% 
  arrange(totaldiff1) %>% 
  select(title, artist, genre, totaldiff1) %>% 
  head(10)

knitr::kable(arrangedplaylist3, caption = "Reordered Playlist2 Using Limited Variables", col.names = c("Title","Artist","Genre","Error From Avg"), "simple", format.args = list(big.mark = ",", 
  scientific = FALSE))

# import pandas as pd
# import numpy as np
# import matplotlib.pyplot as plt
# import seaborn as sns
# 
# f, axes = plt.subplots(2, 5, figsize=(12, 7), tight_layout = True)
# plt.suptitle('Distplot for Numerical Variables in Playlist1', fontsize = 20)
# sns.distplot(playlist1["bpm"] , color="skyblue", ax=axes[0, 0])
# sns.distplot(playlist1["energy"] , color="olive", ax=axes[0, 1])
# sns.distplot(playlist1["dance"] , color="gold", ax=axes[1, 0])
# sns.distplot(playlist1["dB"] , color="teal", ax=axes[1, 1])
# sns.distplot(playlist1["live"] , color="green", ax=axes[1, 2])
# sns.distplot(playlist1["val"] , color="orange", ax=axes[0, 2])
# sns.distplot(playlist1["dur"] , color="blue", ax=axes[0, 3])
# sns.distplot(playlist1["acous"] , color="red", ax=axes[0, 4])
# sns.distplot(playlist1["spch"] , color="purple", ax=axes[1, 3])
# sns.distplot(playlist1["pop"] , color="yellow", ax=axes[1, 4])
# plt.show()
# plt.tight_layout()
# 
# fig, axes = plt.subplots(2, 5, figsize=(12, 7), tight_layout=True)
# plt.suptitle('Histogram for Numerical Variables in Playlist1', fontsize = 20)
# playlist1.hist('bpm', bins=10, ax=axes[0,0])
# playlist1.hist('energy', bins=10, ax=axes[0,1])
# playlist1.hist('dance', bins=10, ax=axes[0,2])
# playlist1.hist('dB', bins=10, ax=axes[0,3])
# playlist1.hist('live', bins=10, ax=axes[0,4])
# playlist1.hist('val', bins=10, ax=axes[1,0])
# playlist1.hist('dur', bins=10, ax=axes[1,1])
# playlist1.hist('acous', bins=10, ax=axes[1,2])
# playlist1.hist('spch', bins=10, ax=axes[1,3])
# playlist1.hist('pop', bins=10, ax=axes[1,4])
# plt.show()
# plt.tight_layout()
library(icon)
fa("globe", size = 5, color="green")