The old list (2015):
- Berlin Marathon
- St. George Marathon
- Twin Cities Marathon
- Portland Marathon
- Chicago Marathon
- Steamtown Marathon
- Columbus Marathon
- Baystate Marathon
- Toronto Waterfront Marathon
- Marine Corps Marathon
- New York City Marathon
- Indianapolis Monumental Marathon
- Richmond Marathon
- Philadelphia Marathon
- California International Marathon
- Houston Marathon
- Boston Marathon
- Bayshore Marathon
- Ottawa Marathon
- Mountains 2 Beach Marathon
- Grandma's Marathon
- Santa Rosa Marathon
- Big Cottonwood Marathon
- Erie Marathon
- Lehigh Valley Marathon
The new list (2016):
- Boston Marathon
- Chicago Marathon
- New York City Marathon
- Philadelphia Marathon
- California International Marathon
- St. George Marathon
- Grandma's Marathon
- Erie Marathon
- Twin Cities Marathon
- Houston Marathon
- Ottawa Marathon
- Baystate Marathon
- Berlin Marathon
- Columbus Marathon
- Indianapolis Monumental Marathon
- Toronto Waterfront Marathon
- Mountains 2 Beach Marathon
- Richmond Marathon
- Steamtown Marathon
- Mohawk Hudson River Marathon
- Marine Corps Marathon
- Big Cottonwood Marathon
- Santa Rosa Marathon
- London Marathon
- Wineglass Marathon
Portland, Bayshore, and Lehigh Valley have fallen off the list and have been replaced by Mohawk-Hudson River, London, and Wineglass.
I had Mohawk-Hudson River on my list as a bonus analysis race because it has a high qualification rate and over 250 qualifiers, which makes it similar to some of the other smaller top feeders. Wineglass is another race that I've run, and I had wondered why it wasn't on the list of top feeders, since the course and weather are (usually) very favorable.
London is an interesting one. I wonder if it will be like Berlin, where the impact was low. In fact, when last year's blogger included Berlin in the cutoff analysis, it yielded a misleadingly low cutoff prediction; when she excluded the Berlin results, she got a prediction that was fairly close to the end result. The published results for the 2015 Berlin Marathon are also difficult for me to include because they don't fully publish the ages of the finishers. Many results show, for example, "MH", so I can't get an age for that result. If it were just a few, that would be one thing, but there are many results with this issue. It's not some elite designation, either: MH shows up on the first page of results next to sub-3-hour finishers and all the way out to 5+ hour finish times.
Looking at London, we have a similar problem. They publish the under-40 group as 18-39 with no indication of the finisher's actual age. That range spans two Boston qualifying age groups whose standards differ by 5 minutes. Given that, I can't include this data in the analysis, especially since the 18-39 age group makes up more than 40% of the qualifiers in the current dataset. That is too big a group to leave improperly categorized.
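To make the 18-39 problem concrete, here is a small Python sketch of what happens to a single finisher's margin under the standard when the age is unknown. The two standards in the table are assumed illustrative men's values (the post only relies on them being 5 minutes apart), and `possible_margins` is a made-up helper name, not part of any actual analysis code.

```python
from datetime import timedelta

# Assumed, illustrative men's standards for the two Boston age groups
# that a single 18-39 bucket lumps together (5 minutes apart).
STANDARDS = {
    "18-34": timedelta(hours=3, minutes=5),
    "35-39": timedelta(hours=3, minutes=10),
}

def possible_margins(finish):
    """Return the margin under each candidate standard for one finish time."""
    return {group: standard - finish for group, standard in STANDARDS.items()}

# A 3:02:30 finisher reported only as "18-39".
finish = timedelta(hours=3, minutes=2, seconds=30)
for group, margin in possible_margins(finish).items():
    print(group, margin)
# 18-34 0:02:30
# 35-39 0:07:30
```

The same finish time is either 2:30 or 7:30 under the standard depending on an age we can't see, and a cutoff prediction is built entirely out of those margins.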
I am thinking I will pull in the results of Mohawk-Hudson River and Wineglass. I will do result totals across all the races (with Berlin and London left out, this yields 26 total races). Additionally, I will calculate the cutoff with the old feeder list (minus Berlin) and the new feeder list (minus London and Berlin). So we'll get 3 cutoff prediction values on which to pontificate.
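As a sanity check on the race counts, the three scenarios can be sketched with Python sets. This is only bookkeeping on the race names from the two lists, not the cutoff calculation itself. It assumes "New York Marathon" and "New York City Marathon" are the same race, likewise "St George" and "St. George", and the variable names are just for illustration.

```python
# Short race names from the 2015 list (the "Marathon" suffix is dropped).
OLD_2015 = {
    "Berlin", "St. George", "Twin Cities", "Portland", "Chicago",
    "Steamtown", "Columbus", "Baystate", "Toronto Waterfront",
    "Marine Corps", "New York City", "Indianapolis Monumental", "Richmond",
    "Philadelphia", "California International", "Houston", "Boston",
    "Bayshore", "Ottawa", "Mountains 2 Beach", "Grandma's", "Santa Rosa",
    "Big Cottonwood", "Erie", "Lehigh Valley",
}

# The 2016 list drops three races and adds three.
DROPPED = {"Portland", "Bayshore", "Lehigh Valley"}
ADDED = {"Mohawk Hudson River", "London", "Wineglass"}
NEW_2016 = (OLD_2015 - DROPPED) | ADDED

# Races whose published results don't give usable ages.
UNUSABLE = {"Berlin", "London"}

scenarios = {
    "all races": (OLD_2015 | NEW_2016) - UNUSABLE,
    "old list": OLD_2015 - UNUSABLE,
    "new list": NEW_2016 - UNUSABLE,
}

for label, races in scenarios.items():
    print(f"{label}: {len(races)} races")
# all races: 26 races
# old list: 24 races
# new list: 23 races
```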