The Science of Programming

Brooklyndon · Aug 25, 2009

In the last three years technology has advanced to the point where the roles of program director and music director can be largely automated. The portable people meter and internet streams provide broadcasters with rich data on when listeners turn on; turn off; change stations; in some cases, which stations they change to; and, in conjunction with other datasets (cookies), listener demo and psychographic profiles. So my question is, when will statistics such as AQH increase/reduction, average times heard, and popularity by psychographic group be published? Furthermore, with this data, what is the point of hiring some subjective consultant to program a station to best suit the listeners' needs when the listeners in the target demo can do it themselves?

This technology will work with talk as well, as AQH spikes can be attached to a particular subject.

Furthermore, doesn't AQH seem like an anachronistic dataset in the era of minute by minute data updates?

davideduardo · Aug 25, 2009

Brooklyndon said:
In the last three years technology has advanced to the point where the roles of program director and music director can be largely automated. The portable people meter and internet streams provide broadcasters with rich data on when listeners turn on; turn off; change stations; in some cases, which stations they change to; and, in conjunction with other datasets (cookies), listener demo and psychographic profiles.

In regards to radio, stations have had source and destination data, turnover, and similar information for literally decades. There was an effort to provide psychographic data via PRISM groups, but the Arbitron ratings themselves do not and have never had "psychographic" information.

So my question is, when will statistics such as AQH increase/reduction, average times heard, and popularity by psychographic group be published?

The data, for radio, and with the exception of psychographics, exists and has existed. The PPM adds additional granularity, but mostly the PPM adds speed of delivery and reliability at the month, week and even day level.

Furthermore, with this data, what is the point of hiring some subjective consultant to program a station to best suit the listeners' needs when the listeners in the target demo can do it themselves?

The sample sizes (number of meters tuned in) to a major station in a major market are low, low single digit numbers. There is no way to use just the data on tune ins and tune outs for such a small sample to make programming decisions. While the trending over many weeks or months may give useful data on things ranging from morning show bits to individual songs, single moments in time are useless for evaluation of anything. For example, if during a song, a quarter of the meters "left" your station... is it because they don't like the song, or simply they got to work and turned off the car radio?

This technology will work with talk as well, as AQH spikes can be attached to a particular subject.

Only in the broadest terms... and when looked at over many instances over many days or weeks or months. Making a decision based on 9 meters or 15 meters or 11 meters is absurd.

Furthermore, doesn't AQH seem like an anachronistic dataset in the era of minute by minute data updates?

The unit of measurement of radio is the quarter hour. In fact, the PPM needs to detect multiple minutes in a quarter hour to get credit to a station; single minute measurement would pick up "oodles" (a technical term) of random detections and not benefit anyone. Advertisers are looking at sustainable listening levels, not spikes and valleys caused by the come and go of listeners in what is a very very small sample.

Here is an example... there are about 3000 meters in the LA market. The average rating for all radio (percent of universe using radio) 6 AM-Mid M-Sun is 10. So on average, there are 300 meters actually detecting stations being heard. To be "top 10" in LA, you need a 4 to 5 share range, so the biggest station would have 5% of the 300 in-use meters detecting it... that is 15 meters for the #1 station in LA.
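The arithmetic in the example above can be sketched in a few lines of Python. The inputs are the round figures quoted in the post (about 3000 meters, a 10 rating for all radio, a 5 share); the function name is only illustrative and is not part of any ratings software.

```python
def meters_on_station(panel_size, radio_rating_pct, station_share_pct):
    """Estimate how many panel meters detect one station at an average moment."""
    # Rating: percent of the whole panel (the "universe") using any radio.
    meters_in_use = panel_size * radio_rating_pct / 100
    # Share: percent of the in-use meters detecting this particular station.
    return meters_in_use * station_share_pct / 100


print(meters_on_station(3000, 10, 5))  # 15.0, the #1 station in the example
print(meters_on_station(3000, 10, 3))  # 9.0, a station with a 3 share
```

The second call shows how quickly the count drops into single digits for a station outside the very top of the market.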

Goat Rodeo Cowboy · Aug 25, 2009

I understand the thrust of your post: The Science of analyzing programming and reaction of audience just keeps getting... keeps getting.... well: more precise and scientific.

Forgive me, but the title you put in the subject line begs that this question be posed:

Is radio programming basically a "science fair project" or does radio programming also have booth space at "the art fair"? Would it be naive to pose the topic: "The Art of Programming"?

davideduardo · Aug 25, 2009

Goat Rodeo Cowboy said:
Is radio programming basically a "science fair project" or does radio programming also have booth space at "the art fair"? Would it be naive to pose the topic: "The Art of Programming"?

Those questions are each a topic in themselves!

The glib answer is that radio programming is a mix of art and science.

It's really more complex as I look at it. First, the function of a commercial radio station is to make money. In order to "serve the public interest, convenience and necessity" such a station has to make some dough. So, a station has to appeal to an audience group that advertisers want to direct commercials to. That means doing some kind of research to know how to get 'em and then buying ratings so you can show Mr. Advertiser that you indeed have them!

The part that has to do with the actual programming... the construction and content of each hour... is art. I like the analogy of club DJs. Some make $100 a night, and a few can command $10 thou for an evening of mixing. The low-paid DJ probably plays the same cuts as the star DJ, but it's all about how the music is blended, how it works with the mood in the club and on the dance floor, the perfection of the blend overall and in the transitions, the feel.

Any station can research its music and play "good" songs. But the way they are rotated and played, the editing of the log, the placement of elements in the clock and such can not be done... and work... except by a skilled programming artist. No matter how much real time information you have, there is no way to tell statistically how a set or sweep or segue will sound because it takes art to put it together.

While there are PDs who never edit a computer log, the good and the great spend a lot of time on it. I was told by a former staff member of KOST that Jhani Kaye had, well before we could do this with PM3's, the tips and tails of every song in the library on cassettes... if he was log editing and unsure of a segue, he listened to the segue! Then he could make the right decision on the log edits.

Brooklyndon · Aug 26, 2009

DavidEduardo said:
In regards to radio, stations have had source and destination data, turnover, and similar information for literally decades. There was an effort to provide psychographic data via PRISM groups, but the Arbitron ratings themselves do not and have never had "psychographic" information.

This comment is based more around the burgeoning internet stream technology. Reading cookies is relatively painless.

Here is an example...there are about 3000 meters in the LA market. The average rating for all radio (percent of universe using radio) 6 AM-Mid M-Sun is 10. So on average, there are 300 meters actually detecting stations being heard. To be "top 10" in LA, you need a 4 to 5 share range, so the biggest station would have 5% of the 300 in-use meters detecting it... that is 15 meters for the #1 station in LA. Only in the broadest terms... and when looked at over many instances over many days or weeks or months. Making a decision based on 9 meters or 15 meters or 11 meters is absurd.

Am I misreading you, or is that a broad-based indictment of the rating system in general?

The sample sizes (number of meters tuned in) to a major station in a major market are low, low single digit numbers. There is no way to use just the data on tune ins and tune outs for such a small sample to make programming decisions. While the trending over many weeks or months may give useful data on things ranging from morning show bits to individual songs, single moments in time are useless for evaluation of anything. For example, if during a song, a quarter of the meters "left" your station... is it because they don't like the song, or simply they got to work and turned off the car radio?

Time series decomposition can adjust for normal daily audience loss. In addition, instead of AQH, you can use share, then isolate for listeners lost to other stations. And perhaps only sample from dayparts where the sample size is above some threshold (say 20). Furthermore, factor in that many songs get played on different stations and the decision to play or drop songs becomes based on a larger sample than fifteen radios. Combine this with the fact that songs played on different stations may target different psychographics, and you have at least that data to supplement the share loss data and demo data that come with the subscription.

The unit of measurement of radio is the quarter hour. In fact, the PPM needs to detect multiple minutes in a quarter hour to get credit to a station; single minute measurement would pick up "oodles" (a technical term) of random detections and not benefit anyone. Advertisers are looking at sustainable listening levels, not spikes and valleys caused by the come and go of listeners in what is a very very small sample.

Correct me if my perception is flawed, but it seems to me that the standard "unit of measurement" may just be a holdover from the days of diaries.

I agree that the methodology of focusing on sustained listening is fundamentally sound. I just feel that the threshold should be closer to two minutes, not fifteen, and, ideally, I'd want second-by-second data to hand over to my quant to analyze him-or-herself. Furthermore, companies with streams already have extremely rich data on a second-by-second basis, including tune-in, tune-out, which station changed to, user location, user psychography, and volume control data. This richness is something that terrestrial will have to deal with and Arbitron will have to deliver to remain competitive.

You raise a good point with sample size, and perhaps decisions made based on a single station's audience from 12AM to 6AM Monday may be reckless, perhaps not. After all, many new music shows occupy that timeslot, so perhaps stations already base some of their programming decisions on that small sample. Furthermore, the central limit theorem says that even this smaller sample size will be unbiased, just have a wider confidence interval.

More broadly, the decision to label a track overplayed, and remove it, to remain relevant, can, in my opinion, be made on PPM data, when applied to the song's performance across the entire market. Regardless of sample size, this average audience loss can be easily presented as a time series chart, perhaps with the upper and lower CI presented as well. This sort of data can help programmers avoid overplaying songs and losing audience.
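To put a rough number on how wide the confidence interval gets, here is a minimal sketch using the Wilson score interval for a proportion. The choice of the Wilson interval and the counts (4 of 15 meters leaving during a song, then the same proportion on larger hypothetical samples) are assumptions for illustration, not figures from any ratings report.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Same observed tune-out rate (about 27%) at three hypothetical sample sizes.
for left, n in [(4, 15), (40, 150), (400, 1500)]:
    low, high = wilson_interval(left, n)
    print(f"{left} of {n} meters left: {low:.0%} to {high:.0%}")
```

With 15 meters the interval spans roughly 11% to 52%, so a single play can look like anything from a mild dip to a mass exodus. The interval only captures sampling error; it says nothing about why a meter stopped detecting the station.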

DavidEduardo said:
Goat Rodeo Cowboy said:

Is radio programming basically a "science fair project" or does radio programming also have booth space at "the art fair"? Would it be naive to pose the topic: "The Art of Programming"?

The glib answer is that radio programming is a mix of art and science.

Any station can research its music and play "good" songs. But the way they are rotated and played, the editing of the log, the placement of elements in the clock and such can not be done... and work... except by a skilled programming artist. No matter how much real time information you have, there is no way to tell statistically how a set or sweep or segue will sound because it takes art to put it together.

While there are PDs who never edit a computer log, the good and the great spend a lot of time on it. I was told by a former staff member of KOST that Jhani Kaye had, well before we could do this with PM3's, the tips and tails of every song in the library on cassettes... if he was log editing and unsure of a segue, he listened to the segue! Then he could make the right decision on the log edits.

I'd argue that uplaya.com has proven that songs can be decomposed into five or six dimensions, and "artful" programmers simply possess an ear trained well enough to match songs without aid from a computer. But, the computer can do it...cheap too.

davideduardo · Aug 28, 2009

Brooklyndon said:
This comment is based more around the burgeoning internet stream technology. Reading cookies is relatively painless.

The problem with that is that a cookie often does not tell who is using a computer, and cookies seldom create real demographic data. That method is also subject to a subset of users who have software that removes cookies with regularity... I remove all of mine at regular intervals, and those known to do anything more than set preferences for my use are nuked daily.

Am I misreading you, or is that a broad-based indictment of the rating system in general?

No, the sample is designed to work at the size it is... and that is all that the radio industry can afford... maybe more.

For the specific purpose of selling advertising and setting a metric for each station, the ratings are excellent. Using the PPM instead of a larger-sample music test or perceptual project, however, is insane.

Time series decomposition can adjust for normal daily audience loss.

You can run the PPM data every which way through SPSS but the fact is that you are tracking behaviour by incident, and the sample per incident is inadequate. Additionally, we have no knowledge of whether tune out is caused by a person's availability to listen or the actual program content.

In addition, instead of AQH, you can use share, then isolate for listeners lost to other stations.

Share, rating and AQH persons are three ways of expressing the same thing... so whether you use one or another is immaterial.

And perhaps only sample from dayparts where the sample size is above some threshold (say 20).

Only a couple of stations would get to that point. A station outside the top 5 would pretty much never have such a sample, and a portion of the sample in PPM (perhaps half the cume) consists of unintentional "hearers" who are exposed to a station but are not part of its core audience.

Furthermore, factor in that many songs get played on different stations and the decision to play or drop songs becomes based on a larger sample than fifteen radios.

The behaviour of each metered listener is often influenced by things other than the music, such as morning shows, etc. And a song that may be wrong for KIIS might be a biggie on KBIG or a hot item on KYSR... so play on different stations may reflect the actions of people who would never listen to your station.

Combine this with the fact that songs played on different stations may target different psychographics, and you have at least that data to supplement the share loss data and demo data that come with the subscription.

The definition of "psychographic" is very subjective. It's not part of the way most media is bought, other than broad things like a bait and tackle store may be happier with a country station than an Urban AC one.

Correct me if my perception is flawed, but it seems to me that the standard "unit of measurement" may just be a holdover from the days of diaries.

The reason for the quarter hour even today is to eliminate momentary listening, such as when you press scan on the radio. Also, the encoding for PPM is such that it can not be accurate even to the minute level, let alone the second. In certain cases, talk stations or certain other content may not be able to send an encoding tag for as much as several minutes. The quarter hour standard requires detection of five different minutes in any 15 minute period; any lesser standard is very error prone.

I just feel that the threshold should be closer to two minutes, not fifteen, and, ideally, I'd want second-by-second data to hand over to my quant to analyze him-or-herself.

The tag is 5 seconds long, so the best you can get, assuming every tag was detected by the meter and that all 12 were broadcast each minute, is 5 second intervals. But only certain content will mask the code, and program content will not mask the code at all moments, so the tag is not always broadcast, and most tags that are broadcast are not detected. For that reason the edit rules for calculating credit even "fill in" a missing minute between detections in the surrounding minutes. Of 180 possible tags per quarter hour, a station needs only 3 or 4 detections to get credit.

The tag does not actually get sent 180 times, and the meter does not detect them all. The quarter hour, for the moment, is the standard.
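The tag counts quoted above follow directly from the 5 second tag length. A quick sketch of that arithmetic, assuming back-to-back tags with no gaps (which, as the post notes, does not happen in practice); the function name is illustrative only:

```python
def max_tags(tag_seconds=5, minutes=15):
    """Upper bound on ID tags that fit in a span if sent back to back."""
    return (60 // tag_seconds) * minutes


print(max_tags(minutes=1))       # 12 possible tags per minute
print(max_tags())                # 180 possible tags per quarter hour
print(f"{4 / max_tags():.1%}")   # 2.2%: four detections out of the 180 possible
```

So credit can rest on detecting only about 2% of the theoretical maximum, which is why second-by-second resolution is not available from this encoding.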

Furthermore, companies with streams already have extremely rich data on a second-by-second basis, including tune-in, tune-out, which station changed to, user location, user psychography, and volume control data. This richness is something that terrestrial will have to deal with and Arbitron will have to deliver to remain competitive.

The problem is we really don't know who is listening at all times. Almost everyone I know registers at sites with a fake name and uses a dummy gmail or yahoo or hotmail account for identification. While radio ratings, or music research, participants are highly verified, online users are highly subject to fake data.

You raise a good point with sample size, and perhaps decisions made based on a single station's audience from 12AM to 6AM Monday may be reckless, perhaps not. After all, many new music shows occupy that timeslot,

It's 6 am to 12 mid, as 12 am to 6 am is overnights... where there are very few meters in use. And I don't know of any stations that have shows just of new music, as that is suicidal.

so perhaps stations already base some of their programming decisions on that small sample.

Decisions on overall programming are based on the monthly reports, and, for example, a Monday to Friday show would be based on 20 days worth of data, not just specific moments in time.

Furthermore, the central limit theorem says that even this smaller sample size will be unbiased, just have a wider confidence interval.

The sample will be so small it will not represent the overall demos of the station in proportion, and a large number of people will be "accidental" cumers and must be discarded... typically I would only look at about 900,000 persons of the cume of one station that has a total cume of over 2 million... the other 1.1 million are irrelevant. And still, one does not know in moment to moment data who is a valuable cumer and who isn't, as well as the changes based on availability.

More broadly, the decision to label a track overplayed, and remove it, to remain relevant, can, in my opinion, be made on PPM data, when applied to the song's performance across the entire market. Regardless of sample size, this average audience loss can be easily presented as a time series chart, perhaps with the upper and lower CI presented as well. This sort of data can help programmers avoid overplaying songs and losing audience.

What is a good song on one station may be a burnout on another. I know of plenty of songs that one station in LA used to play heavily, but which test highly negative now... yet another station in the same cluster does play them, and they test fantastically well against that station's audience.

I'd argue that uplaya.com has proven that songs can be decomposed into five or six dimensions, and "artful" programmers simply possess an ear trained well enough to match songs without aid from a computer. But, the computer can do it...cheap too.

Things like Pandora can play songs "like" other songs, but they can't do neat segues and sets based on factors that are not quantifiable. All the computer things do is the same thing Amazon does for me... based on books I buy, they suggest others. But they can't tell me in which order to read them and when I'd like a change of pace book of a different kind.

Brooklyndon · Sep 18, 2009

looks like the industry is doing it....
http://online.wsj.com/article/SB125314774171818133.html

no need for programming consultants any longer.

Goat Rodeo Cowboy · Sep 18, 2009

Oh, Brooklyndon. People in all industries must be careful to not get wrapped up in their new toys and lose track of the world as it flows past them.

In a flippant mood, I will include in my resume: "Forensic Data Miner".

Several industries have allowed me to peek and poke around their silos of collected data.

Never turn major business decisions over to GEEKS who are so naive as to believe that "the data never lies". So, like in the WSJ article.... you have maybe 13 listeners to a morning show on the radio, they pick up the phone to interview someone, and so maybe six people change stations. WOW! We know that person is audience death. I would want to know if the six people were male, female or a combination. I want to know if they are at work, in their vehicles, or at home. I want to know what they do for a living. Do all six of them happen to work for the same company, where because of business alliances it is not good for your career to be caught listening at work to someone who is funded by a business competitor? I would want to know if the political loyalties of the six who tuned out are different than the political loyalties of the 7 who remained tuned in. What about religion? Were the six who tuned out at odds with the person interviewed on some hot-button topic? Was it the fact that "too much, too long talk" caused the tune-away, or was it the subject of the "talk" regardless of its length?

There have to be some cynics included in "the decision making group" and I get the idea that radio stations today don't have the budget to hire any token cynics. And entertainment producers love cheerleaders.... and hate cynics.

davideduardo · Sep 18, 2009

Brooklyndon said:
looks like the industry is doing it....
http://online.wsj.com/article/SB125314774171818133.html

no need for programming consultants any longer.

To the contrary, it means they need more outside advice. We had about 40 years to get used to the diary, but in markets that switch to PPM, we have about 40 seconds.

JustPastBuffalo · Sep 18, 2009

Very compelling and informative thread, especially regarding the inner workings of PPM. David; question regarding the 5 minutes in any quarter hour; I was led to believe that it's five "whole or block" minutes as opposed to five "scattered and splintered" minutes within a 15 minute period.

DavidEduardo said:
The sample will be so small it will not represent the overall demos of the station in proportion, and a large number of people will be "accidental" cumers and must be discarded... typically I would only look at about 900,000 persons of the cume of one station that has a total cume of over 2 million... the other 1.1 million are irrelevant. And still, one does not know in moment to moment data who is a valuable cumer and who isn't, as well as the changes based on availability.

Also, curious as to why you'd dismiss 1.1 million listeners (55%) out of a station cume of 2 million and choose to bank on the 900,000 (45%) of the cume total. Are the 1.1 million "accidental cumers" in your estimation? Is this a personal preference or mathematical projection related to "real user" or "P1" customers? Thanks. Best regards, Jim Pastrick

davideduardo · Sep 18, 2009

JimPastrick said:
Very compelling and informative thread, especially regarding the inner workings of PPM. David; question regarding the 5 minutes in any quarter hour; I was led to believe that it's five "whole or block" minutes as opposed to five "scattered and splintered" minutes within a 15 minute period.

It's five minutes (really 5 detections in 5 different minutes) in a quarter hour. In theory, there are about 12 sends of the ID-encoding tag by a station every minute, but the tag is only sent if there is program material to mask the code. So some minutes may get fewer than 12 tag sends. The Arbitron edit rule (in this case, electronic edits by algorithm) says that if a person's meter detects a station one minute, misses another, and then gets a third minute, and there is no other station detected in the "hole in the donut" the station gets credit for the missing minute. So, in some cases, just detection in three minutes in a quarter hour gets quarter hour credit.
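The credit rule as described in the post above can be sketched as follows. This is a simplified reading of that description, not Arbitron's actual algorithm: it assumes minutes are numbered 0 to 14 within the quarter hour, that only single-minute holes between two genuine detections are filled, and that a hole is left unfilled if another station was detected in it. The function and parameter names are hypothetical.

```python
def credited_minutes(detected, other_station=frozenset()):
    """Minutes (0-14) credited to a station within one quarter hour.

    detected: set of minutes in which the meter picked up this station's code.
    other_station: set of minutes in which a different station was detected.
    """
    credited = set(detected)
    for minute in range(1, 14):
        is_hole = minute not in detected
        surrounded = (minute - 1) in detected and (minute + 1) in detected
        # Fill the "hole in the donut" only if no other station was heard there.
        if is_hole and surrounded and minute not in other_station:
            credited.add(minute)
    return credited


def gets_quarter_hour_credit(detected, other_station=frozenset(), required=5):
    """True if credited minutes reach the five-minute threshold."""
    return len(credited_minutes(detected, other_station)) >= required


print(gets_quarter_hour_credit({2, 4, 6}))                     # True: minutes 2-6 credited
print(gets_quarter_hour_credit({2, 4, 6}, other_station={3}))  # False: minute 3 stays empty
print(gets_quarter_hour_credit({1, 7, 13}))                    # False: no single-minute holes
```

The first call is the three-detection case from the post: detections in minutes 2, 4 and 6 fill in minutes 3 and 5, giving five credited minutes. Scattered detections with no fillable holes still need five distinct minutes.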

The sample will be so small it will not represent the overall demos of the station in proportion, and a large number of people will be "accidental" cumers and must be discarded... typically I would only look at about 900,000 persons of the cume of one station that has a total cume of over 2 million... the other 1.1 million are irrelevant. And still, one does not know in moment to moment data who is a valuable cumer and who isn't, as well as the changes based on availability.

Also, curious as to why you'd dismiss 1.1 million listeners (55%) out of a station cume of 2 million and choose to bank on the 900,000 (45%) of the cume total. Are the 1.1 million "accidental cumers" in your estimation? Is this a personal preference or mathematical projection related to "real user" or "P1" customers? Thanks. Best regards, Jim Pastrick

About 45% to 50% of a station's cume represents around 92% of the AQH listening. The others are so occasional that they don't contribute any real listening and are likely not even partisans of the station; half of them don't even know they listened to "that" station, as the detection was of accidental listening.

The study of listening, active listening and "affinity" is part of a new Arbitron project detailed at http://arbitron.mediaroom.com/index.php?s=43&item=625. Note that there is great advertiser interest in knowing about radio listening by persons who are "involved" with the station they listen to.
