Brooklyndon said:
This comment is based more around the burgeoning internet stream technology. Reading cookies is relatively painless.
The problem with that is that a cookie often does not tell you who is using a computer, and cookies seldom yield real demographic data. And that method is skewed by the subset of users who have software that removes cookies regularly... I remove all of mine at regular intervals, and those known to do anything more than set preferences for my use are nuked daily.
Am I misreading you, or is that a broad-based indictment of the rating system in general?
No, the sample is designed to work at the size it is... and that is all that the radio industry can afford... maybe more.
For the specific purpose of selling advertising and setting a metric for each station, the ratings are excellent. Using the PPM instead of a larger-sample music test or perceptual project, however, is insane.
Time series decomposition can adjust for normal daily audience loss.
You can run the PPM data every which way through SPSS but the fact is that you are tracking behaviour by incident, and the sample per incident is inadequate. Additionally, we have no knowledge of whether tune out is caused by a person's availability to listen or the actual program content.
In addition, instead of AQH, you can use share, then isolate for listeners lost to other stations.
Share, rating and AQH persons are three ways of expressing the same thing... so whether you use one or another is immaterial.
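To make that concrete, here is a minimal sketch using the usual definitions: rating is AQH persons as a percentage of the market population, and share is AQH persons as a percentage of everyone listening to radio in that daypart. The station names and every figure are made up for illustration. Within one market and daypart both denominators are constants, so all three measures rank the stations identically.

```python
# Hypothetical AQH persons for three stations in one market and daypart.
aqh_persons = {"Station A": 42_000, "Station B": 30_000, "Station C": 12_000}

MARKET_POPULATION = 10_000_000   # assumed survey-area population
PERSONS_USING_RADIO = 1_200_000  # assumed total AQH persons, all stations


def rating(aqh: int) -> float:
    """AQH persons as a percentage of the market population."""
    return 100 * aqh / MARKET_POPULATION


def share(aqh: int) -> float:
    """AQH persons as a percentage of everyone listening to radio."""
    return 100 * aqh / PERSONS_USING_RADIO


def ranked(metric) -> list[str]:
    """Station names ordered from largest to smallest by the given metric."""
    return sorted(aqh_persons, key=lambda s: metric(aqh_persons[s]), reverse=True)


by_persons = ranked(lambda aqh: aqh)
by_rating = ranked(rating)
by_share = ranked(share)

for name, aqh in aqh_persons.items():
    print(f"{name}: {aqh} persons, rating {rating(aqh):.2f}, share {share(aqh):.1f}")
```

The three rankings come out the same because rating and share are just AQH persons divided by different constants.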
And perhaps only sample from dayparts where the sample size is above some threshold (say 20).
Only a couple of stations would get to that point. A station outside the top 5 would pretty much never have such a sample, and a portion of the sample in PPM (perhaps half the cume) consists of unintentional "hearers" who are exposed to a station but are not part of its core audience.
Furthermore, factor in that many songs get played on different stations and the decision to play or drop songs becomes based on a larger sample than fifteen radios.
The behaviour of each metered listener is often influenced by things other than the music, such as morning shows, etc. And a song that may be wrong for KIIS might be a biggie on KBIG or a hot item on KYSR... so play on different stations may reflect the actions of people who would never listen to your station.
Take this fact in conjunction with the fact that songs played on different stations may target different psychographics, and you have at least that data to supplement the share loss data and demo data that come with the subscription.
The definition of "psychographic" is very subjective. It's not part of the way most media is bought, other than in broad ways: a bait and tackle store may be happier with a country station than an Urban AC one.
Correct me if my perception is flawed, but it seems to me that the standard "unit of measurement" may just be a holdover from the days of diaries.
The reason for the quarter hour even today is to eliminate momentary listening, such as when you press scan on the radio. Also, the encoding for PPM is such that it cannot be accurate even to the minute level, let alone the second. In certain cases, talk stations or certain other content may not be able to send an encoding tag for as much as several minutes. The quarter hour standard requires detection of five different minutes in any 15-minute period; any lesser standard is very error-prone.
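A rough sketch of that five-minute rule, assuming a fixed quarter hour with its minutes numbered 0 to 14. The function name and the representation of detections as a set of minute numbers are made up for illustration and are not the actual crediting software.

```python
def earns_quarter_hour_credit(detected_minutes: set[int], minimum: int = 5) -> bool:
    """Return True if at least `minimum` distinct minutes (0-14) had a detection."""
    in_quarter_hour = {m for m in detected_minutes if 0 <= m < 15}
    return len(in_quarter_hour) >= minimum


# Someone pressing scan lands on the station for a single minute: no credit.
print(earns_quarter_hour_credit({3}))

# Five different minutes, not necessarily consecutive: credit.
print(earns_quarter_hour_credit({1, 4, 7, 10, 13}))
```

The point of the threshold is visible in the first case: a momentary exposure never reaches five distinct minutes.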
I just feel that the threshold should be closer to two minutes, not fifteen, and, ideally, I'd want second-by-second data to hand over to my quant to analyze him-or-herself.
The tag is 5 seconds long, so even assuming all 12 tags were broadcast each minute and every one was detected by the meter, the best you can get is 5-second intervals. But only certain content will mask the code, so the edit rules for calculating credit even "fill in" a missing minute between detections in the surrounding minutes, because most tags are not detected and program content will not mask the code at all moments, in which case it is not broadcast. Of 180 possible tags per quarter hour, a station needs only 3 or 4 detections to get credit.
The tag does not actually get sent 180 times, and the meter does not detect them all. The quarter hour, for the moment, is the standard.
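The arithmetic behind those numbers, plus one possible reading of the "fill in" edit rule, can be sketched as follows. The fill rule coded here (credit a missing minute only when both neighbouring minutes had a detection) is an assumption for illustration, not the published edit rule, and the function name is hypothetical.

```python
TAG_SECONDS = 5
TAGS_PER_MINUTE = 60 // TAG_SECONDS            # 12 tags if every one is sent
TAGS_PER_QUARTER_HOUR = TAGS_PER_MINUTE * 15   # 180 possible tags


def fill_single_gaps(detected_minutes: set[int]) -> set[int]:
    """Assumed edit rule: credit a missing minute when the minutes on both
    sides of it were actually detected. Filled minutes do not cascade."""
    credited = set(detected_minutes)
    for minute in range(1, 14):
        if (
            minute not in detected_minutes
            and minute - 1 in detected_minutes
            and minute + 1 in detected_minutes
        ):
            credited.add(minute)
    return credited


print(TAGS_PER_MINUTE, TAGS_PER_QUARTER_HOUR)
print(sorted(fill_single_gaps({2, 4, 6})))
```

Under this assumed rule, three detections in alternating minutes are credited as five minutes, which shows how few of the 180 possible tags a meter would actually need to pick up.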
Furthermore, companies with streams already have extremely rich data on a second-by-second basis, including tune-in, tune-out, which station was changed to, user location, user psychography, and volume control data. This richness is something that terrestrial will have to deal with and Arbitron will have to deliver to remain competitive.
The problem is we really don't know who is listening at all times. Almost everyone I know registers at sites with a fake name and uses a dummy gmail or yahoo or hotmail account for identification. While radio ratings or music research participants are highly verified, online users are highly prone to supplying fake data.
You raise a good point with sample size, and perhaps decisions made based on a single station's audience from 12AM to 6AM Monday may be reckless, perhaps not. After all, many new music shows occupy that timeslot,
It's 6 am to 12 mid, as 12 am to 6 am is overnights... where there are very few meters in use. And I don't know of any stations that have shows just of new music, as that is suicidal.
so perhaps stations already base some of their programming decisions on that small sample.
Decisions on overall programming are based on the monthly reports, and, for example, a Monday to Friday show would be based on 20 days' worth of data, not just specific moments in time.
Furthermore, the central limit theorem says that even this smaller sample size will be unbiased; it will just have a wider confidence interval.
The sample will be so small it will not represent the overall demos of the station in proportion, and a large number of people will be "accidental" cumers and must be discarded... typically I would only look at about 900,000 persons of the cume of one station that has a total cume of over 2 million... the other 1.1 million are irrelevant. And still, one does not know in moment to moment data who is a valuable cumer and who isn't, as well as the changes based on availability.
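On the narrower point of how wide that confidence interval gets, here is a minimal sketch using the normal-approximation interval for a proportion. The 20% tune-out rate and the sample sizes are hypothetical, and the approximation itself is shaky at very small n. It says nothing about whether the sample is representative, which is the separate objection above.

```python
import math


def ci_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation 95% interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


TUNE_OUT_RATE = 0.20  # hypothetical share of meters that leave during a song

for n in (15, 20, 200, 2000):
    half_width = ci_half_width(TUNE_OUT_RATE, n)
    print(f"n={n:>4}: {TUNE_OUT_RATE:.0%} +/- {half_width:.1%}")
```

With fifteen meters the interval is roughly plus or minus 20 points around a 20% estimate, so it is about as wide as the number being estimated. The width only shrinks with the square root of the sample, so it takes 100 times the meters to narrow it tenfold.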
More broadly, the decision to label a track overplayed and remove it in order to remain relevant can, in my opinion, be made on PPM data when it is applied to the song's performance across the entire market. Regardless of sample size, this average audience loss can easily be presented as a time series chart, perhaps with the upper and lower CI presented as well. This sort of data can help programmers avoid overplaying songs and losing audience.
What is a good song on one station may be a burnout on another. I know of plenty of songs that one station in LA used to play heavily but which test highly negative now... yet which another station in the same cluster does play, and which test fantastically well against that station's audience.
I'd argue that uplaya.com has proven that songs can be decomposed into five or six dimensions, and "artful" programmers simply possess an ear trained well enough to match songs without aid from a computer. But, the computer can do it...cheap too.
Things like Pandora can play songs "like" other songs, but they can't do neat segues and sets based on factors that are not quantifiable. All the computer things do is the same thing Amazon does for me... based on books I buy, they suggest others. But they can't tell me in which order to read them and when I'd like a change of pace book of a different kind.