In this blog post, we present association rule mining on the Last.fm data set. We give detailed instructions for reproducing the results and describe how the data was structured. We show how we converted the transaction data into a sparse matrix in R. Finally, we discuss the results and some sample rules.
Last.fm Association Data Analysis
The Last.fm data set we used has 289955 rows and 4 columns. Of the four columns, we keep only the user and artist columns. Next, we remove duplicate rows, since a repeated user-artist pair adds nothing to the model. We then transform the data into a sparse matrix in R using the split function. After that, we do some exploratory work by plotting the data with an item frequency plot. Finally, we build the association model using apriori from the arules library. As an example of the rules our analysis generates, people who listen to Nas are 14 times as likely to listen to Jay-Z, and Kylie Minogue listeners are 8 times as likely to listen to Madonna. Our detailed analysis, charts, and comments are given in the coding section below.
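As a quick orientation, the steps described above can be sketched on a toy data frame. This is a minimal sketch, not the analysis itself: the rows, the `sex` and `country` column names, and the support and confidence thresholds are all made up for illustration, and the real data set and parameters may differ.

```r
library(arules)  # provides apriori(), itemFrequencyPlot(), inspect()

# Toy stand-in for the Last.fm data (hypothetical rows and column names)
lastfm <- data.frame(
  user    = c(1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 5),
  artist  = c("nas", "jay-z", "jay-z", "nas", "jay-z",
              "madonna", "kylie minogue",
              "madonna", "kylie minogue", "nas", "madonna"),
  sex     = "m",
  country = "x",
  stringsAsFactors = FALSE
)

# Keep only the user and artist columns, then drop duplicate rows
lastfm <- unique(lastfm[, c("user", "artist")])

# One vector of artists per user, coerced to a sparse transactions object
playlists <- split(lastfm$artist, lastfm$user)
tx <- as(playlists, "transactions")

# Exploratory view: relative frequency of the most common artists
itemFrequencyPlot(tx, topN = 4)

# Mine association rules; the thresholds here are illustrative only
rules <- apriori(tx,
                 parameter = list(support = 0.2, confidence = 0.5, minlen = 2),
                 control = list(verbose = FALSE))

# Show the rules ordered by lift
inspect(sort(rules, by = "lift"))
```

The `lift` column printed by `inspect()` is the confidence of a rule divided by the support of its right-hand side. Assuming the "14 times" and "8 times" figures quoted above are lift values, this is the column to read them from.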