Prior to its format change, Star ran a perceptual study on three formats: Adult Hits, Soft AC and Rhythmic AC. The results showed that Rhythmic AC had the best chance of success.
That is a fairly normal procedure.
Then, once the general format is determined, people who specifically like songs and/or artists that are core to the sound are recruited to do a full library test... as many as 1,000 songs might be brought in.
Such a test would use the sample pod used in the format search along with variants. For example, the rhythmic pod might be presented in several versions that represent a spectrum: harder-core rhythmic, rhythmic with some up-tempo pop, and rhythmic with up-tempo pop and ballads by artists that are familiar to core sound listeners. A person who would "definitely listen" to at least two of the three pods, and who is in the right age range and of the right gender, gets invited to the test. There will be age quotas, gender quotas, ethnic quotas, etc. And, generally, the candidates have to be medium to heavy radio users.
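For anyone who thinks better in code, the screening rule above can be sketched roughly like this. It is only an illustration: the `Respondent` fields, the `qualifies` function, the answer labels and the 25-44 female target in the example are all made up for the sketch, and quota tracking is left out entirely.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Respondent:
    # All field names here are hypothetical, chosen for the sketch.
    age: int
    gender: str
    radio_usage: str        # "light", "medium" or "heavy"
    pod_answers: List[str]  # one answer per pod version played


def qualifies(r: Respondent, min_age: int, max_age: int,
              genders: Set[str], min_definite: int = 2) -> bool:
    """Return True if the respondent passes the basic screener."""
    # Count the pod versions the person would "definitely listen" to.
    definite = sum(1 for a in r.pod_answers if a == "definitely listen")
    return (
        definite >= min_definite
        and min_age <= r.age <= max_age
        and r.gender in genders
        and r.radio_usage in {"medium", "heavy"}
    )


# Example with an assumed target of women 25-44.
candidate = Respondent(
    age=31,
    gender="F",
    radio_usage="heavy",
    pod_answers=["definitely listen", "definitely listen", "might listen"],
)
invited = qualifies(candidate, 25, 44, {"F"})
```

In a real recruit, a respondent who passes this check would still have to fit whatever age, gender and ethnic quota cells remain open.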
(Roddy: I know you know this! I am just amplifying because we have lots of site users who are not familiar with the process).
And when a format is being searched for, usually more than three different formats are tested unless the station owner knows they have to fill a particular slice of the market for cluster balance.
In one test I worked on, we tested 17 different formats, including variations of existing leading stations. The idea was to see if leading stations were winning by default. We went from the equivalent of country to hard rock, from rhythmic CHR to soft AC and all kinds of blends. The top score was a format that our sales team believed would have "zero to a negative number" of acceptance at the agency level so we went with the second highest scoring format... which was, at the time, not duplicated by any other station.
That sort of test is sometimes called an "ATU" for Awareness-Trial-Usage. First, you want to know if there is such a station in the market. Then, you want to know if the interviewee would try out such a station, or switch from an existing station in that format to a new one, and then how likely they would be to use such a station regularly if it were otherwise well done. Naturally, questions are asked about current favorite stations, time spent with radio and other things that will help determine if a particular format would do well.