There are too many variables for this to be totally accurate and that is why stations test their music independently. As Huff said, there is not enough actionable information.
The point in a quarter hour at which a song is played may affect results... such as the moment just before the hour or half hour when people get out of their cars to go into the workplace. And you don't know if tune-out was due to an early or late start of a stopset, the fact that this was the second song in a row a listener did not like, etc., etc. and etc.
But the biggest reason is that there are not enough meters "hearing" any particular station in a given quarter hour to be an adequate sample. A top station might have 8 to 12 meters at a given moment in time. To test currents, somewhere between 60 and 80 participants is thought to be the minimum for accuracy across gender, age and, sometimes, ethnicity.
Here is the math for an imaginary larger market: There are 2000 meters. Persons using radio across the whole day averages 6 percent, meaning that out of the 2000 meters, only 120 are actually "hearing" any station at a given moment. And a station with a 6 share across the day will have an average of... note this... about 7 meters detecting it (6 percent of 120 is 7.2).
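For anyone who wants to plug in their own market, the arithmetic above can be sketched in a few lines of Python. This is only a rough illustration: the function name and parameters are made up for the example, it treats persons using radio and share as simple percentages, and it uses 60 (the low end of the range mentioned earlier) as the minimum test sample.

```python
def expected_meters(panel_size, pur_percent, share_percent):
    """Rough average count of meters in use and meters on one station.

    Hypothetical helper for illustration only: assumes persons using
    radio (PUR) and share are plain percentages applied to the panel.
    """
    in_use = panel_size * pur_percent / 100      # meters hearing any station
    on_station = in_use * share_percent / 100    # meters hearing this station
    return in_use, on_station


# Numbers from the imaginary larger market described above.
in_use, on_station = expected_meters(panel_size=2000, pur_percent=6, share_percent=6)

# Assumed minimum sample for a music test (low end of the 60 to 80 range).
MIN_TEST_SAMPLE = 60
shortfall_factor = MIN_TEST_SAMPLE / on_station

print(f"Meters in use: {in_use:.0f}")
print(f"Meters on a 6-share station: {on_station:.1f}")
print(f"Minimum test sample is about {shortfall_factor:.1f}x that")
```

Under those assumptions the station averages a little over 7 meters, which is several times short of even the low end of what a music test is thought to need.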