The music is in the models.
There’s a Nirvana song that you may not have heard that, ironically, describes why you have heard another Nirvana song, “Smells Like Teen Spirit,” which dominated the airwaves in the early ’90s and still endures today.
It’s called “Verse Chorus Verse” and it follows the song structure it’s named for, which most pop songs, including “Teen Spirit” and recent smashes like “Old Town Road,” rely on. The only weird thing, though, is that the song is about frontman Kurt Cobain’s chronic stomach pain and the medications he illegally took. Not exactly chart-topping territory.
That title is a play on a common dig at pop songs—all of them sound the same. Now, two student researchers at the University of San Francisco have leveraged Spotify data to figure out if that’s really true. Using Spotify’s public Application Programming Interface (API), the scientists created four machine learning models to predict if a pop song would rise to become a hit or not.
“Our goal was to see whether hit songs shared similar features, and if so, whether those features could be used to predict which songs would be hits in the future,” Kai Middlebrook, one of the researchers, said in a press statement.
When he and fellow researcher Kian Sheik trained the models with Spotify analytics, they focused on certain features of the songs, including tempo, key, valence (how positive or negative a song sounds), energy, acousticness, danceability, and loudness. Ultimately, they wound up with four models:
1) Logistic Regression: When a song is fed to this model, it’s given a label of either one or zero. A one indicates that the song will be a hit. A zero corresponds to a flop. This means that the model assumes data can be linearly separated into just two categories: hits and non-hits.
Each song feature is assigned a weight and those are used to help predict if a song is a hit or not. Logistic regression models can be trained relatively quickly and make it simple to interpret the relationship between the independent variables (song features) and dependent variables (hit or non-hit).
2) Random Forest Architecture: RF models use decision trees to break down data through yes/no questions. The drawback is that individual decision trees are prone to overfitting, meaning they memorize the training data by fitting it too closely. An overfit model may not be learning an actual relationship between the song features and song popularity, because the data usually includes irrelevant noise.
To avoid overfitting, Middlebrook and Sheik built their model to combine hundreds of thousands of decision trees. Each tree is trained on a different subset of the training data and a different subset of the song features. Then, the model makes a prediction by averaging the predictions from all of the trees. RF models are more flexible than linear models, Middlebrook said, which is a key advantage.
3) Support Vector Machine: This model looks for the “hyperplane” that best separates the data into two categories (here, a hit or a non-hit).
4) Neural Network: This architecture uses one hidden layer with 10 filters to learn from song data.
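As a rough illustration of the first model in the list above, the sketch below trains a tiny logistic regression in plain Python. It is not the researchers' code: the two features, the six toy songs, the learning rate, and the epoch count are all made-up assumptions chosen so the example runs quickly.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Weighted sum of the song features, passed through the sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(rows, labels, lr=0.5, epochs=2000):
    """Fit one weight per feature with plain batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical toy data: (danceability, energy), each between 0 and 1.
# Label 1 means hit, label 0 means non-hit. These values are invented.
songs = [(0.9, 0.8), (0.8, 0.7), (0.85, 0.9),
         (0.2, 0.3), (0.3, 0.1), (0.1, 0.25)]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train(songs, labels)

# A probability of 0.5 or higher is labeled a hit (1), otherwise a non-hit (0).
predictions = [1 if predict_proba(weights, bias, s) >= 0.5 else 0 for s in songs]
print("weights:", weights, "bias:", bias)
print("predictions:", predictions)
```

After training, each weight shows how strongly its feature pushes a song toward the hit label, which is the kind of interpretability the article attributes to logistic regression.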
Regardless of the model, Middlebrook and Sheik tested the results against historical data from Billboard’s API to see whether each song had ever appeared on the Billboard Hot 100 chart. The researchers used a team of computers at the University of San Francisco to crunch the numbers, which took a few weeks.
The researchers found that the Support Vector Machine had the highest precision rate in predicting hits, coming in at 99.53 percent, while the random forest model had the best accuracy rate (88 percent) and recall rate (85.51 percent).
Middlebrook believes that record labels would find precision to be the most useful metric if they used these models to decide which songs to release. That’s because a model with high precision rarely labels a song a hit when it isn’t one, so acting on its predictions carries less risk and makes for a sounder business decision.
“Record labels have limited resources,” Middlebrook explained. “If they pour these resources into a song that the model predicts will be a hit and that song never becomes one, then the label may lose lots of money. So if a record label wants to take a little more risk with the possibility of releasing more hit records, they might choose to use our random forest model. On the other hand, if a record label wants to take on less risk while still releasing some hits, they should use our SVM model.”
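To make the three metrics concrete, here is a small sketch in plain Python. The labels and the two sets of predictions are invented for illustration and have nothing to do with the study's data. The point is only to show how a cautious model can score high on precision while a more generous one scores higher on recall.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives, where 1 = hit and 0 = non-hit."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def precision(actual, predicted):
    """Of the songs predicted to be hits, the fraction that really were."""
    tp, fp, _, _ = confusion_counts(actual, predicted)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(actual, predicted):
    """Of the songs that really were hits, the fraction the model caught."""
    tp, _, fn, _ = confusion_counts(actual, predicted)
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(actual, predicted):
    """Fraction of all songs labeled correctly."""
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical ground truth: 4 hits and 6 non-hits.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# A cautious model that rarely predicts "hit".
cautious = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# A generous model that predicts "hit" more often.
generous = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

for name, preds in (("cautious", cautious), ("generous", generous)):
    print(name, precision(actual, preds), recall(actual, preds), accuracy(actual, preds))
```

In this toy case both models have the same accuracy, but the cautious one never backs a flop while missing most hits, and the generous one finds more hits at the cost of some wrong bets. That is the trade-off Middlebrook describes between the SVM and random forest models.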
All in all, Middlebrook and Sheik determined it’s possible to predict if a song will be a hit based on its audio. In the future, the team wants to look into other factors that might contribute to the success of a song—like social media presence, artist experience, and label influence.
“We can imagine a world where record labels who are constantly seeking new talent are inundated with mix-tapes and demos from the ‘next hot artists,’” Sheik said. “People only have so much time to listen to music with human ears, so ‘artificial ears,’ such as our algorithms, can enable record labels to train a model for the type of sound they seek and greatly reduce the number of songs they themselves have to consider.”
However, the idea of artists catering to machines is definitely in the realm of Black Mirror, the dystopian Netflix series. Sheik suggests that if record labels do begin to use algorithms to make artistic decisions, it should be done in a way that won’t stymie the progress of art. The models Sheik and Middlebrook built aren’t quite there yet.
This article was written by Courtney Linder and was published by Popular Mechanics on 09/09/2019