Tikalon Blog by Dev Gualtieri

Scoring the Hits

January 4, 2012

Management's desire to have machines make all their decisions for them has roots beyond the modern computer database systems which have infected corporations worldwide. When I was working in radio in the late 1960s (see photograph), I read a story, perhaps apocryphal, about music industry executives who were duped into paying for the services of a machine designed to determine the hit potential of recordings.

The author during his short career as a "Top-40" disk jockey.

(This photograph was taken in 1968. Coloring added to the original black-and-white image.)

In demonstrations of this wonderful machine, a recording would be played, and the machine would energize lights to indicate whether or not the recording had sales potential. The predictions were fairly good.

It was fairly easy to convince these executives that such a machine was possible. A Univac computer, featured on television, had predicted that Eisenhower would win the 1952 presidential election. This hit-detecting machine, however, was a fake. A person inside the box did the actual evaluation. This was a modern-day ruse in the tradition of the Turk, not to be confused with Amazon's Mechanical Turk. The Turk was a chess-playing automaton of the late eighteenth and early nineteenth centuries that was operated by a man inside the mechanism.

A copper engraving of the Turk, a chess-playing automaton built in 1770 and operational until 1854. A chess master sat inside the box, which also displayed a gear mechanism when its doors were opened.

(Source: Karl Gottlieb von Windisch (1783), "Briefe über den Schachspieler des Hrn. von Kempelen...", via Wikimedia Commons.)

The advent of powerful computers has encouraged computer scientists to attempt this hit-prediction task. Several years ago, Brian Whitman and Tristan Jehan, computer scientists at MIT, developed software they called Echo Nest to forecast the hit potential of a recording. They paired listener reviews of recordings with a feature analysis of each recording derived from signal processing techniques.

Whitman claimed to have predicted the US Billboard magazine top 10 for several weeks and was quoted in the Guardian as saying,

"For record company executives, this raises the tantalizing possibility of knowing in advance whether their latest pop act will hit the charts at a strong position."[1]

Music Intelligence Solutions has introduced software it calls Hit Song Science, and is commercializing it on the Uplaya web site. Company CEO, David Meredith, explained that the software discovered that hit songs have similar lyrics, harmony, length, rhythm and chord progression. A study by the Harvard Business School showed that the Hit Song Science software was accurate 8 out of 10 times.[2]

Understandably, companies are unwilling to discuss the intricate details of how their software works, but academia is fueled by citations and publications. A new hit-prediction system has been developed by computer scientists at the University of Bristol.[3-4] The group is led by Tijl De Bie, Senior Lecturer (associate professor) in AI at the University of Bristol, Department of Engineering Mathematics.

As leader of the Pattern Analysis and Intelligent Systems Group of the Intelligent Systems Laboratory, De Bie and his students have been investigating pattern matching for many applications. One of the group's previous successes was in demonstrating how Twitter can be used for tracking flu outbreaks in the UK. They demonstrated that their analysis of 50 million geolocated tweets offered predictive capability of influenza severity within regions.[5]

In a paper scheduled for presentation on December 17, 2011, at MML 2011, the 4th International Workshop on Machine Learning and Music: Learning from Musical Structure (Sierra Nevada, Spain), De Bie, along with Yizhao Ni, Raul Santos-Rodriguez and Matt McVicar, describes the group's machine learning system for hit song prediction.[3]

Since the team is located in the UK, they analyzed the last fifty years of recordings on the UK top 40 singles chart. Their goal was to differentiate top-five material from the also-rans; that is, those recordings that peaked at just 30-40 on the charts.[3] They used regression analysis, a technique well known to all scientists, based on 23 variables, such as time signature, tempo, duration, loudness and harmonicity.[3]
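The selection of songs described above can be sketched in a few lines of Python. This is only an illustration of the labeling idea, not the Bristol team's code; the function name `label_song` and the example songs are hypothetical, and I'm assuming that songs peaking between positions 6 and 29 were simply left out of the comparison.

```python
from typing import Optional

def label_song(peak_position: int) -> Optional[int]:
    """Label a song by its peak position on a top-40 chart.

    Returns 1 for top-five material, 0 for a song that peaked at 30-40,
    and None for anything in between (assumed to be excluded).
    """
    if 1 <= peak_position <= 5:
        return 1
    if 30 <= peak_position <= 40:
        return 0
    return None

# Hypothetical (title, peak position) pairs, invented for illustration.
songs = [("Song A", 1), ("Song B", 17), ("Song C", 36), ("Song D", 5)]

# Keep only the songs that fall into one of the two classes.
labelled = [
    (title, label_song(peak))
    for title, peak in songs
    if label_song(peak) is not None
]
```

Discarding the middle of the chart makes the two classes as distinct as a top-40 list allows, which is presumably why the comparison was set up this way.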

Using machine learning techniques, the Bristol team was able to determine how important each of these 23 variables was to a song's hit potential. This gave a list of weights for a hit equation,

$$\mathrm{Score} = w_1 f_1 + w_2 f_2 + \cdots + w_{23} f_{23}$$

in which each factor f is multiplied by its weight w, and the products are summed to give a score. Their equation had a 60% accuracy in predicting whether a song was top-five material or relegated to the lowest ranks.[3] Chance, of course, is 50%, but in the team's defense, all these songs were already selected as being worthwhile, since they all appeared on the top-forty chart. Thus, the degree of difference between such songs is expected to be small.
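A weighted sum of this kind is easy to sketch in Python. The code below is a minimal illustration, not the Bristol implementation. The function names, the zero decision threshold, and the three-element example (standing in for the 23 real variables) are all my assumptions, and the numbers are invented.

```python
from typing import Sequence

def hit_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Multiply each feature by its weight and sum the products."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * f for w, f in zip(weights, features))

def predict_hit(features: Sequence[float], weights: Sequence[float],
                threshold: float = 0.0) -> bool:
    """Call a song a hit when its score exceeds an assumed threshold."""
    return hit_score(features, weights) > threshold

def accuracy(predictions: Sequence[bool], labels: Sequence[bool]) -> float:
    """Fraction of predictions that agree with the known outcomes."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical weights and feature values, invented for illustration.
weights = [0.5, -0.25, 1.0]
song = [2.0, 4.0, 1.5]
score = hit_score(song, weights)  # 1.0 - 1.0 + 1.5 = 1.5
```

With two equally sized classes, an `accuracy` of 0.5 is what a coin toss would give, so the reported 60% should be read against that baseline.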

The team did need to bin their data into certain time frames to make even these predictions. Tastes in music change with time, as even a cursory comparison of Frank Sinatra and Lady Gaga will show, so their analyses were done using weights that varied in time. They discovered a few interesting facts about popular music. Dance music became popular in the seventies (disco lives!), and danceability became an important factor in the eighties. Ballads with a slower tempo (70-89 beats/minute) were more likely to become hits in the eighties.[3]
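Time-varying weights can be pictured as a separate weight list for each time bin. The sketch below assumes decade-wide bins, which is my simplification; the article's sources don't say how the Bristol team chose their time frames. The two features (a danceability value and a slow-tempo flag) and all of the weights are hypothetical.

```python
from typing import Dict, List, Sequence

def decade_of(year: int) -> int:
    """Map a year to the first year of its decade, e.g. 1985 -> 1980."""
    return (year // 10) * 10

def score_in_era(features: Sequence[float], year: int,
                 weights_by_decade: Dict[int, List[float]]) -> float:
    """Score a song using the weights of the decade it was released in."""
    weights = weights_by_decade[decade_of(year)]
    return sum(w * f for w, f in zip(weights, features))

# Hypothetical weights for [danceability, slow_tempo_flag], invented
# only to show that the same song can score differently in each era.
weights_by_decade = {
    1970: [0.5, 0.0],
    1980: [1.0, 0.5],
}

ballad = [0.75, 1.0]
score_1975 = score_in_era(ballad, 1975, weights_by_decade)
score_1985 = score_in_era(ballad, 1985, weights_by_decade)
```

The same feature values yield a higher score under the 1980s weights than the 1970s weights, which is the sort of shift in taste that time-binned weights are meant to capture.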

The Bristol equation is more accurate in the current era, possibly because record companies are loath to change any style that sells. From the 1990s to today, 4/4 time seems to be a sine qua non for a hit song, a fact that's often derided by creative, independent musicians. The Bristol team also discovered that, on average, songs are becoming louder, and hits are relatively louder than songs at the lower end of the chart.[3]

One trouble with software like this is that songwriters might try to "optimize" their offerings to get a higher hit-score, and all music will start to sound the same.[6] Kym Tuvim, an independent singer-songwriter, was quoted on NPR as saying,

"From an artist's standpoint, a songwriter's standpoint, it's horrifying to me... You'll find a decreasing amount of any kind of surprises in music... This just becomes a tool to make that narrowing of the field more accessible."[2]

Then there's this other piece of research, published in Science in 2006, that says all this doesn't really matter. This paper presents evidence that the success of a song is only partially determined by its quality. Quality was reflected in the outliers; that is, the best songs rarely did poorly, and the worst rarely became hits. For songs in general, any other result was possible.[7-8]

How could I end an article that mentions both Bristol and hit music without a mention of the Bristol Stomp? The Bristol Stomp, a 1961 recording by the Dovells, rose to second place on the Billboard magazine Hot 100 singles chart and sold more than a million copies.

References:
