Monday, January 25, 2016

Big NYCM Update and Cutoff Prediction

I finally got the New York City Marathon data downloaded and parsed. Finally. Granted, part of the reason it took me so long was just other day-to-day and work commitments, but the downloading part was a bit tricky. I didn't want to get myself flagged for issuing too many requests in rapid succession, so I put in a lot of "sleep" time between HTTP requests. It didn't help that the endpoint only allowed chunks of 100, so I had to download more than 1000 files to get both years' data.
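
For anyone curious what that kind of throttled, chunked download looks like, here is a minimal sketch. It is not my actual script: `fetch_chunk` is a hypothetical stand-in for whatever makes one HTTP request and returns up to 100 rows, and the 5-second pause is just an assumed value.

```python
import time
from typing import Callable, Iterator, List

CHUNK_SIZE = 100  # the endpoint only handed back 100 results per request


def fetch_all(
    fetch_chunk: Callable[[int, int], List[dict]],
    pause_seconds: float = 5.0,  # assumed pause length, not a documented limit
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield every result row, sleeping between consecutive requests.

    fetch_chunk(offset, limit) is hypothetical: it should perform one
    request and return at most `limit` rows, or an empty list at the end.
    """
    offset = 0
    while True:
        rows = fetch_chunk(offset, CHUNK_SIZE)
        if not rows:
            break
        yield from rows
        if len(rows) < CHUNK_SIZE:
            break  # a short chunk means there is nothing left to fetch
        offset += CHUNK_SIZE
        sleep(pause_seconds)  # be polite: wait before the next request
```

Passing `sleep` in as a parameter makes it easy to dry-run the loop against saved data without actually waiting.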

This race is on my bucket list (I went to college in The City, so it has a special place in my heart). I see they have made the qualification standards less ridiculously hard, but I still feel I have a rat's chance of making them. So I need to remember to enter the lottery every year and hope I get in. At some point, after however many years of lottery attempts, I will probably throw in the towel and fundraise. By then, my toddler will be at least in elementary school and will hopefully allow me a little more free time to do something challenging, like raise a ton of money for charity.

I decided that since it's been a bazillion years since I posted something, I would make this a race results analysis and cutoff prediction combo post.

And, folks? The data doesn't look pretty. I sort of did a double take when I ran my queries and thought to myself: "Well, this can't be right..."

But it is.

AG Group  2014 Qualifiers  2014 AG Total  Percentage  2015 Qualifiers  2015 AG Total  Percentage
F18-34    436              7775           5.61%       405              8049           5.03%
F35-39    210              3504           5.99%       201              3366           5.97%
F40-44    269              3661           7.35%       257              3499           7.34%
F45-49    298              2748           10.84%      306              2730           11.21%
F50-54    221              1905           11.60%      223              1974           11.30%
F55-59    130              931            13.96%      124              929            13.35%
F60-64    45               414            10.87%      61               417            14.63%
F65-69    12               122            9.84%       11               125            8.80%
F70-74    2                43             4.65%       5                45             11.11%
F75-79    1                9              11.11%      0                8              0.00%
F80+      1                3              33.33%      1                5              20.00%
M18-34    395              7284           5.42%       452              7265           6.22%
M35-39    255              4982           5.12%       271              4657           5.82%
M40-44    337              5992           5.62%       357              5291           6.75%
M45-49    401              4700           8.53%       390              4540           8.59%
M50-54    330              3858           8.55%       319              3752           8.50%
M55-59    176              1949           9.03%       196              2126           9.22%
M60-64    120              1071           11.20%      154              1060           14.53%
M65-69    54               430            12.56%      52               430            12.09%
M70-74    15               156            9.62%       16               148            10.81%
M75-79    3                38             7.89%       3                53             5.66%
M80+      0                10             0.00%       0                10             0.00%
Totals    3711             51585          7.19%       3804             50479          7.54%

As you can see, the qualifiers are up: 3,804 out of 50,479 finishers in 2015 versus 3,711 out of 51,585 in 2014. With fewer finishers, we have more qualifiers. I guess the weather in 2014 with the crazy wind must have been worse than the warmer conditions of 2015.

Margin         2014  Percentage  2015  Percentage
<1 minute      229   6.17%       212   5.57%
1-2 minutes    195   5.25%       191   5.02%
2-3 minutes    194   5.23%       173   4.55%
3-4 minutes    188   5.07%       159   4.18%
4-5 minutes    168   4.53%       145   3.81%
5-10 minutes   749   20.18%      779   20.48%
10-20 minutes  1025  27.62%      1126  29.60%
>20 minutes    963   25.95%      1019  26.79%
Totals         3711              3804

The Squeaker Pack (qualifiers with less than 5 minutes of margin) looks a lot different too.

2014: 26.25%
2015: 23.13%

Qualifiers skewed more to 5+ minute margins.
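
If you want to check the Squeaker Pack numbers yourself, they are just the five sub-5-minute buckets added up and divided by the total number of qualifiers. A quick sketch using the NYCM counts from the margin table (the function name and bucket labels are my own shorthand, not from any library):

```python
# Buckets that make up the "Squeaker Pack": under 5 minutes of margin.
SQUEAKER_BUCKETS = ("<1", "1-2", "2-3", "3-4", "4-5")

# NYCM qualifier counts per margin bucket (minutes), from the table above.
nycm_2014 = {"<1": 229, "1-2": 195, "2-3": 194, "3-4": 188, "4-5": 168,
             "5-10": 749, "10-20": 1025, ">20": 963}
nycm_2015 = {"<1": 212, "1-2": 191, "2-3": 173, "3-4": 159, "4-5": 145,
             "5-10": 779, "10-20": 1126, ">20": 1019}


def squeaker_share(buckets: dict) -> float:
    """Percentage of all qualifiers with less than 5 minutes of margin."""
    total = sum(buckets.values())
    squeakers = sum(buckets[label] for label in SQUEAKER_BUCKETS)
    return 100.0 * squeakers / total


print(f"2014: {squeaker_share(nycm_2014):.2f}%")
print(f"2015: {squeaker_share(nycm_2015):.2f}%")
```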

Because this is a table/data heavy post as it is, I'm going to skip the comprehensive breakdown of age group percentages. If you really want this data, let me know; I can email it to you.

The totals across all the races analyzed so far:

AG        2014 Qualifiers  2014 AG Total  % Qualifiers  2015 Qualifiers  2015 AG Total  % Qualifiers
F18-34    2025             27420          7.39%         1874             26440          7.09%
F35-39    909              10617          8.56%         883              10582          8.34%
F40-44    908              10082          9.01%         880              9813           8.97%
F45-49    918              6990           13.13%        877              7251           12.09%
F50-54    546              4598           11.87%        597              4874           12.25%
F55-59    314              2307           13.61%        282              2287           12.33%
F60-64    131              956            13.70%        141              1009           13.97%
F65-69    40               307            13.03%        44               331            13.29%
F70-74    8                94             8.51%         8                101            7.92%
F75-79    1                22             4.55%         1                17             5.88%
F80+      1                3              33.33%        1                6              16.67%
M18-34    1779             24926          7.14%         1654             23482          7.04%
M35-39    897              13314          6.74%         912              12726          7.17%
M40-44    1053             14336          7.35%         1017             13427          7.57%
M45-49    1225             11322          10.82%        1157             11272          10.26%
M50-54    991              8876           11.16%        901              8846           10.19%
M55-59    590              5113           11.54%        590              5296           11.14%
M60-64    373              2660           14.02%        398              2684           14.83%
M65-69    168              1088           15.44%        152              1127           13.49%
M70-74    50               380            13.16%        45               385            11.69%
M75-79    8                97             8.25%         11               108            10.19%
M80+      0                19             0.00%         2                27             7.41%
Totals    12935            145527         8.89%         12427            142091         8.75%

We still have a lower number of qualifiers and a lower qualification percentage, though the qualification rate is only about 1.6% lower year over year in relative terms (8.89% vs. 8.75%).
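
For the arithmetic-minded, here is a small sketch of the year-over-year comparison using the totals row. It assumes the 1.6% figure is the relative change in the overall qualification rate, which is the reading that matches the numbers; the raw qualifier count drops by a larger percentage because there were also fewer finishers.

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage."""
    return 100.0 * (new - old) / old


# Totals row from the combined table: (qualifiers, finishers).
qualifiers_2014, finishers_2014 = 12935, 145527
qualifiers_2015, finishers_2015 = 12427, 142091

rate_2014 = qualifiers_2014 / finishers_2014
rate_2015 = qualifiers_2015 / finishers_2015

# Change in the share of finishers who qualified.
rate_change = pct_change(rate_2014, rate_2015)
# Change in the raw number of qualifiers.
count_change = pct_change(qualifiers_2014, qualifiers_2015)

print(f"qualification rate: {rate_change:+.1f}%")
print(f"qualifier count:    {count_change:+.1f}%")
```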

Margin         2014   Percentage  2015   Percentage
<1 minute      810    6.26%       705    5.67%
1-2 minutes    767    5.93%       715    5.75%
2-3 minutes    718    5.55%       628    5.05%
3-4 minutes    654    5.06%       618    4.97%
4-5 minutes    613    4.74%       508    4.09%
5-10 minutes   2677   20.70%      2584   20.79%
10-20 minutes  3456   26.72%      3452   27.78%
>20 minutes    3240   25.05%      3217   25.89%
Totals         12935              12427

Squeaker pack:

2014: 27.54%
2015: 25.54%

And now, the information you probably care about most... what the calculation predicts.

Over the 10 races we've analyzed so far, the number of finishers achieving 147 seconds of margin in 2014 was 11,014.

Taking the same 10 races in 2015 and sorting by margin descending, the 11,014th finisher has a margin of...

119 seconds or 1:59
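
The calculation described above can be sketched roughly like this. It is a simplified illustration, not my actual queries: the function names are made up, and the margins in the usage example are toy numbers, not real race data.

```python
from typing import Sequence


def count_at_or_above(margins: Sequence[int], cutoff_seconds: int) -> int:
    """How many finishers beat their standard by at least cutoff_seconds."""
    return sum(1 for m in margins if m >= cutoff_seconds)


def projected_cutoff(prev_margins: Sequence[int], prev_cutoff: int,
                     curr_margins: Sequence[int]) -> int:
    """Margin of the Nth-best finisher this year, where N is the number
    of finishers who met last year's cutoff in the same races."""
    slots = count_at_or_above(prev_margins, prev_cutoff)
    ranked = sorted(curr_margins, reverse=True)  # biggest margin first
    if slots == 0 or slots > len(ranked):
        raise ValueError("not enough finishers to fill the same number of slots")
    return ranked[slots - 1]


def format_margin(seconds: int) -> str:
    """Render a margin in seconds as m:ss."""
    return f"{seconds // 60}:{seconds % 60:02d}"


# Toy margins in seconds (made up for illustration only).
margins_prev_year = [300, 200, 150, 147, 100, 30]
margins_curr_year = [90, 400, 119, 250, 180]

cutoff = projected_cutoff(margins_prev_year, 147, margins_curr_year)
print(cutoff, format_margin(cutoff))
```

With the real data, the first step gives 11,014 and the second step returns the margin of the 11,014th finisher in the 2015 list.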


All I can say is "wow." We're still not at 2:28 but, damn, this is a lot closer than I thought we'd be seeing given the reports of the warmer conditions at NYCM this past year. But I guess strong winds are worse than upper 60s/low 70s.

Hopefully the next races I have on tap don't give me the same pain in download and parse. Worst case I can go back to this one aggregator but I'd rather not if I don't have to. I don't want to get my IP blocked and have zero good options.


7 comments:

  1. How is the analysis for the LA marathon coming along?

  2. Hi there! So LA is not a top feeder and therefore won't be analyzed. I have the data for monumental done I just have to write up the post!

  3. According to marathon guide, over 800 people qualified for Boston thru LA marathon. Link:

    http://www.marathonguide.com/races/BostonMarathonQualifyingRaces.cfm

    According to this website, there were 594 qualifiers. Link:

    http://boston-qualifier-stats.blogspot.com/2014/11/indianapolis-monumental-marathon.html?m=1

    Just curious :)

  4. great work..very curious how it will all pan out especially after Houston and the bunch of Feb races...

  5. Sorry, I meant to ask why LA isn't considered a top feeder because of its high amount of Boston qualifiers.

    Replies
    1. It's not about how many qualify but which one people use to submit their qualification. BAA has the list of top feeders on their website if you're curious!

  6. Ah, I see. You are only processing the top 25 feeders, got it! When should we expect the update with Monumental Marathon?
