This post walks through association rule mining on the Last.fm data set. We explain how the data is structured and give detailed instructions for reproducing the results, including how we converted the transaction data into a sparse matrix in R. Finally, we discuss the results and look at some sample rules.
Last.fm Association Data Analysis
The Last.fm data set we used has 289,955 rows and 4 columns. Of the four columns, we kept only the user and artist columns and dropped the rest. Next, we removed duplicate rows, since they add nothing to our model. We then transformed the data into a sparse matrix in R using the split function, and did some exploratory work by plotting the data with the item frequency plot. Finally, we built the association model using apriori from the arules library. As an example of the rules our analysis generated, people who listen to Nas are 14 times as likely to listen to Jay-Z, and Kylie Minogue listeners are 8 times as likely to listen to Madonna. Follow our detailed analysis, charts, and comments in the coding section below.
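Before the full code, the steps above can be sketched roughly as follows. This is a minimal illustration, not our exact script: the toy data frame, the `sex` and `country` column names, and the support and confidence thresholds are all assumptions chosen so the example runs on its own.

```r
library(arules)

# Hypothetical toy data standing in for the real Last.fm file.
# The `sex` and `country` column names are assumed for illustration.
lastfm <- data.frame(
  user    = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5),
  artist  = c("nas", "jay-z", "jay-z", "nas", "jay-z",
              "madonna", "kylie minogue", "nas",
              "madonna", "kylie minogue", "madonna", "jay-z"),
  sex     = c("m", "m", "m", "f", "f", "f", "f", "f", "m", "m", "f", "f"),
  country = "US",
  stringsAsFactors = FALSE
)

# Keep only the user and artist columns, then drop duplicate rows.
plays <- unique(lastfm[, c("user", "artist")])

# split() groups the artists by user: one character vector per user.
playlists <- split(plays$artist, plays$user)

# Coerce the list into the sparse transactions format used by arules.
trans <- as(playlists, "transactions")

# Exploratory step: relative frequency of the most common artists.
itemFrequencyPlot(trans, topN = 4)

# Mine rules; these thresholds are assumed values for the toy data only.
rules <- apriori(
  trans,
  parameter = list(support = 0.3, confidence = 0.5, minlen = 2)
)

# Show the rules with the highest lift first.
inspect(sort(rules, by = "lift"))
```

The "times as likely" figures quoted above correspond to the lift column that `inspect()` prints. On the real data set the thresholds would need to be much lower than in this toy example, because any single artist appears in only a small share of users' playlists.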