For Western New York, in a typical (or average) survey, how much weighting is actually done? For example, are there sample segments that are effectively doubled or tripled... or is it far more a matter of fine 'tweaks' in the tenths or hundredths?
Generally, weighting is in the single-digit percentages and is sort of "sanding the rough edges". But in some cases, we may find 15% to 20%, and during the pandemic, of course, there was a greater chance that weighting would be needed.
In PPM markets where we get "authorized" weekly reports, we see stations that are quite stable from month to month (with wobbles of no more than 10%) having individual weeks that vary considerably more.
Also, after umpteen cycles, I'd assume that disproportionate response rates can be generally predicted... so why not increase the solicitation for those historically non-responsive groups?
The problem in diary markets is that the sample is new every week. So, as returned diaries from two or three weeks back come in, adjustments in the placement for upcoming weeks are made. And, like a pendulum, the sample is always swinging back and forth because of that. A participant is contacted and recruited for a future week based on time to process, time to mail the diary(ies), and a margin of time for postal delays.
The cost of each participant is very high. While the incentive is small in the diary, the cost to recruit, follow up with multiple phone calls, and tabulate is great. Excess sample will not solve the issue but will vastly increase subscriber cost. Nielsen already knows what percentage of people in each subset will actually fill in a diary and return it, and they know which groups will accept a recruit effort, so their work is calculated with all those facts... even seasonal variations... in mind. In effect, they already "over-recruit" by sending out more diaries to groups that have a lower return rate.
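To put rough numbers on the "over-recruit" idea, here is a minimal sketch in Python. The cell names, targets and return rates are made up for illustration; they are not Nielsen figures, and the function name is just something I picked.

```python
import math

def diaries_to_mail(target_returns, return_rate):
    """Diaries to send so that the expected number returned meets the target."""
    if not 0 < return_rate <= 1:
        raise ValueError("return_rate must be greater than 0 and at most 1")
    return math.ceil(target_returns / return_rate)

# Hypothetical cells: (target usable diaries, historical return rate).
cells = {
    "Men 18-24": (40, 0.25),
    "Women 55-64": (40, 0.50),
}

# Lower return rate means more diaries mailed for the same target.
mailout = {name: diaries_to_mail(target, rate)
           for name, (target, rate) in cells.items()}

for name, count in mailout.items():
    print(f"{name}: mail {count} diaries")
```

With these invented rates, the low-return cell needs twice the mailing for the same number of usable diaries, which is where the cost goes.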
So a sample excess or shortfall is not known until the diaries for that week are all returned and tabulated, which takes at least 20 days. Then the recruiting for about two to three weeks in the future is adjusted, meaning that it takes 4 to 6 weeks to adjust. Every week brings new adjustments. And, finally, weighting takes place.
They are trying to get proportional samples for each age cell, gender, race/ethnicity/language preference (for Hispanics), income, education and county of residence. There is also an effort within counties to get balance based on sub-areas that can be classified, although that is not a "guaranteed and weighted" goal (it's called "Geozones" by Nielsen).
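For anyone wondering what a weight actually looks like when the sample is only slightly out of proportion, here is a simplified sketch in Python. It weights on a single made-up dimension with three cells; the shares and counts are invented, and this is not necessarily how Nielsen computes its weights, since real weighting has to balance several characteristics at once.

```python
def cell_weights(population_share, sample_counts):
    """Weight per cell = population share / share of the returned sample."""
    total = sum(sample_counts.values())
    weights = {}
    for cell, count in sample_counts.items():
        if count == 0:
            raise ValueError(f"no returned diaries in cell {cell!r}")
        sample_share = count / total
        weights[cell] = population_share[cell] / sample_share
    return weights

# Hypothetical cells: share of the market population vs. diaries returned.
population_share = {"A": 0.50, "B": 0.30, "C": 0.20}
sample_counts = {"A": 520, "B": 300, "C": 180}

weights = cell_weights(population_share, sample_counts)

for cell, w in weights.items():
    print(f"cell {cell}: weight {w:.3f}")
```

In this invented case cell A is a bit over-represented and gets weighted down by about 4%, cell C is under-represented and gets weighted up by about 11%, and cell B is left alone. That is the "sanding the rough edges" kind of adjustment, not doubling or tripling anybody.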
And, by the way, it may seem that users are complacent or totally satisfied with the ratings, but we are constantly making comments to Nielsen about all kinds of issues (I'm talking about "bitching" here). Many broadcast groups have in-house experts... some former Arbitron / Nielsen high-level staffers... who are looking at the data constantly. Remember, subscribers get a huge array of possible reports, and there are add-on services that provide even deeper data.