Tibbs2 said:
I have a bad feeling this is headed south...and not to Birmingham.
More like Hartselle... :-\
Hey David, got a legit question. When you play 8-10 second music clips, have you ever tried rotating the order of the songs and gotten back-to-back reverse reactions?
Absolutely.
The answer is more complex, though.
Hooks that run 10 seconds are probably too long; the longer the hook, the sooner the participant will fatigue. 8 seconds is beyond the time when essentially everyone has scored the song. Any longer and the test takes too long and has to be limited to fewer songs.
The first thing a research company does is prove that there is no position bias. "Position bias" is the effect of where each song is placed within each set of songs. Typically, songs are presented in batches of 50 to 70. Among, let's say, 500 songs, some songs are put at the beginning of a set and repeated in other positions, and songs that appear in the first 100 or so are repeated towards the end. The finding is that position is not a significant consideration.
Some companies do a 100 person test in two sessions of 50. The order is reversed or scrambled for each session so that fatigue from respondents getting tired towards the end does not always land on the same songs.
If testing is individualized, each respondent can have a differently scrambled or reordered set of songs. Callout typically uses 5 to 6 different start points for 25 to 30 total songs.
While we see via internal testing that position is not a bias factor of significance, we use technology to shuffle or reorder as much as we can.
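As a rough illustration of the start-point and scrambling ideas above, here is a minimal Python sketch. It is not any research company's actual software; the function names, the evenly spaced start points, and the song labels are all assumptions made up for the example.

```python
import random

def rotated_orders(songs, n_starts):
    """Return n_starts rotations of the list with evenly spaced start points.

    Assumes n_starts is no larger than the number of songs.
    """
    step = len(songs) // n_starts
    return [songs[i * step:] + songs[:i * step] for i in range(n_starts)]

def shuffled_order(songs, seed):
    """Return one respondent's scrambled order; the seed makes it repeatable."""
    rng = random.Random(seed)
    order = list(songs)  # copy so the master list is left untouched
    rng.shuffle(order)
    return order

# Hypothetical callout list of 30 hooks with 6 start points
songs = [f"song_{i:02d}" for i in range(1, 31)]
orders = rotated_orders(songs, 6)

# Individualized testing: one scrambled order per respondent ID
respondent_order = shuffled_order(songs, seed=1001)
```

Every respondent still hears every song; only the position changes, so any leftover position effect gets spread across the list instead of landing on the same few titles.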
To be more clear, like in a wine tasting, if you get "the order wrong" the next couple of wines will taste "wrong" or "worse".
I've never seen any evidence of this, despite including a control group of duplicates in most tests and comparing the same song across multiple markets and multiple tests. That may be because songs on a test are slated (a number is spoken ahead of each song for paper-based score sheets) or separated by a short silence for tests using scoring dials.
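A duplicate check of this kind can be sketched in a few lines of Python. This is only an assumed, simplified version: the scores are invented for illustration, the 1 to 5 scale is an assumption, and a real study would use a proper significance test rather than a bare difference of means.

```python
from statistics import mean

def position_shift(early_scores, late_scores):
    """Mean score at the late position minus mean score at the early position."""
    return mean(late_scores) - mean(early_scores)

# Invented scores (assumed 1-5 scale) for hooks played twice in one test:
# once near the start of the session and once near the end.
duplicates = {
    "song_A": ([4, 3, 5, 4], [4, 4, 4, 4]),
    "song_B": ([2, 3, 2, 3], [3, 2, 3, 2]),
}

shifts = {
    title: position_shift(early, late)
    for title, (early, late) in duplicates.items()
}
```

Shifts that hover around zero across the duplicated songs are what "no position bias" would look like in this simplified setup.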
I have to think it's possible in music rotation, as well. One of my best decisions was to go over the music and keep a general flow going vs. train wrecks.
While a very useful programming technique, it's not particularly useful in testing. In fact, an objective of list preparation is to randomly shuffle and get uniform same artist separation.
Same has to be true in testing. I have also always thought it was interesting to see and compare the results of these tests among two groups. One was tested at night after work. The other was tested on Saturday or Sunday. People are more relaxed, etc. They'll prefer different music under different conditions. How do you feel about the fact that external conditions may alter the survey, and how do you adjust for them? If this is unclear, let me know. Hope it makes sense.
Similarly, two groups on the same night, one at 5:30 and another at 8 PM, will have differences. This is because the two groups are different! The ideal situation would be to do 25 people at 2 PM, 25 at 6 PM, 25 at 8 PM, and 25 Saturday afternoon, but the cost would increase astronomically, and just doing two sets is usually enough. In any case, what is important is each song's score relative to all the other songs. The actual score is less important than the relative score.
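To make the relative-versus-actual point concrete, here is a small Python sketch with invented numbers. The group names, the mean scores, and the `rank_songs` helper are all hypothetical; ranking is just one simple way to express a relative score.

```python
def rank_songs(mean_scores):
    """Return song titles ordered from highest to lowest mean score."""
    return sorted(mean_scores, key=mean_scores.get, reverse=True)

# Invented mean scores: the later group scores everything a bit higher,
# but the order of the songs is the same in both sessions.
early_group = {"song_A": 3.9, "song_B": 3.1, "song_C": 2.4}
late_group = {"song_A": 4.3, "song_B": 3.6, "song_C": 2.9}

early_rank = rank_songs(early_group)
late_rank = rank_songs(late_group)
same_order = early_rank == late_rank
```

In this made-up case the two sessions disagree on every actual score yet agree completely on which song beats which.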
Further, when factor / cluster analysis is done on the results using packages like SPSS, lifestyle or taste subsets can be identified, just as age variations can be via collection of the age of each participant.