Applied Bracketeering: Does our model also work for the NCAA Women’s tournament?

By: Richard W. Sharp

The question

This March, we tested a method for producing pool-winning brackets for the NCAA men’s basketball tournament. The approach showed promise, but how would it perform in the women’s championship?

The women’s tournament has an element of predictability that the men’s tournament lacks: UConn always wins (4 of the last 5 championships and 111 of their last 112 games). The most significant result in college basketball since 2014 was Mississippi State’s last-second victory over the Huskies in 2017’s Final Four.

So, how does this change our just-add-randomization strategy? Given the importance of selecting the winner, worth 1/3 of the total points by the popular bracket scoring system we’ve been using, some strategic changes are probably called for if 95%+ of knowledgeable bracketeers pick the same team to win it all.
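Where does that 1/3 come from? Here is a quick back-of-the-envelope check, assuming a 64-team bracket with six rounds and the 1-point-then-double scoring we’ve been using. The variable names are just for illustration, and this is one reading of the figure: a correct champion pick also collects points in every earlier round that team wins.

```python
# Assumed scoring: 64-team bracket, six rounds, 1 point per correct
# first-round pick, doubling each round (1, 2, 4, 8, 16, 32).
ROUND_POINTS = [2 ** r for r in range(6)]
GAMES_PER_ROUND = [32 // (2 ** r) for r in range(6)]  # 32, 16, 8, 4, 2, 1

# Maximum possible score: every game in every round picked correctly.
total_points = sum(p * g for p, g in zip(ROUND_POINTS, GAMES_PER_ROUND))

# A correct champion pick wins one game in each of the six rounds.
champion_path_points = sum(ROUND_POINTS)

champion_share = champion_path_points / total_points
print(total_points, champion_path_points, round(champion_share, 3))  # 192 63 0.328
```

The title game alone is worth 32 of 192 points, or 1/6. It is the champion’s whole path through the bracket that adds up to roughly a third.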


Data, data anywhere?

Before we begin, a statement on the state of the data: not so great. The relative dearth of data available for women’s sports in general has slowed us down. It was relatively simple to find easy-to-process historical results for the men’s tournament. Developing a comparable dataset for the women’s tournament has proven much more difficult, and has so far precluded any analysis more thorough than a discussion of general strategies based on past tournament results. Right now, the idea of producing predictive models before the women’s tournament based on things like strength of schedule is still a pipe dream.


Chalk talk

Nevertheless, would our approach to the men’s bracket fare well in the women’s tournament? Anybody who tells you that their women’s bracket didn’t have UConn in the champion’s slot is either lying, damn lying, or playing insanely randomized brackets. Obama, winner of our men’s pool, certainly didn’t go out on a limb here. Not only is UConn his overall winner, but his Final Four consists of three number 1 seeds and Washington, a 3 seed (Husky fever? It’s certainly the right mascot to place at the top of a mascots ranking if UConn is in the mix). More generally, not only is the champion an easy choice (in theory), but the number of upsets overall in the tournament is also pretty low. In the women’s tournament, chalk is generally the way to go.

Let’s score Obama’s bracket, along with a straight-seeds bracket, using our standard scheme: 1 point for each correct pick in the first round, with points doubling in each round thereafter. In this scenario straight seeds beat Obama 94 to 85, primarily because of a not-so-hot third round for the former president, who inexplicably picked Notre Dame to win twice (apparently he believed them to be such a dominant force that he had them beating their actual opponent, Ohio St., then as a special bonus also beating Baylor and Louisville). Worse for him, however, he went 4-4 in the round while straight seeds went 6-2, accounting for 8 of the 9 points separating the two brackets.
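To see why that one round matters so much, here is a minimal sketch of the scoring rule applied to the third-round records quoted above. The function names are made up for this example; only the 6-2 and 4-4 records come from the brackets.

```python
def round_points(round_number: int) -> int:
    """Points per correct pick: 1 in round 1, doubling each round after."""
    return 2 ** (round_number - 1)


def round_score(correct_picks: int, round_number: int) -> int:
    """Total points earned in one round for a given number of correct picks."""
    return correct_picks * round_points(round_number)


# Third-round records from the text: straight seeds went 6-2, Obama went 4-4.
chalk_r3 = round_score(6, 3)
obama_r3 = round_score(4, 3)
gap = chalk_r3 - obama_r3

print(chalk_r3, obama_r3, gap)  # 24 16 8
```

Two extra correct picks at 4 points apiece is the 8-point swing.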

UConn
What do we say to the gods of chalk? Not today.


The bottom line

So what did you have to do this year to beat chalk?

First, pick Mississippi State over UConn, but you didn’t do that. Nobody did. So which other teams could have given you the boost you need to win?

  • Oregon (10)
    • upsets
      • 1st round: defeated Temple (7) +1
      • 2nd round: defeated Duke (2) +2
      • 3rd round: defeated Maryland (3) +4
    • total points over chalk: +7
  • Florida St. (3)
    • upsets
      • 3rd round: defeated Oregon St. (2) +4
    • total points over chalk: +4
  • Quinnipiac (12)
    • upsets
      • 1st round: defeated Marquette (5) +1 (+16 for spelling both team names correctly on your bracket)
      • 2nd round: defeated Miami FL (4) +2
    • total points over chalk: +3 (+19 with the spelling bonus, and potentially valuable for Quinnipiac bracketeers who saw it coming)

Purdue, Ohio St., and Cal also pulled off minor upsets.
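The upset bonuses listed above can be tallied with a few lines of code. This sketch follows the same simple accounting as the list: each correctly picked upset earns that round’s points over a chalk bracket, and nothing else is adjusted. The dictionary layout and function name are just for illustration, and the spelling bonus is (sadly) not implemented.

```python
# Rounds in which each team won as the lower-seeded side, per the list above.
UPSET_ROUNDS = {
    "Oregon (10)": [1, 2, 3],
    "Florida St. (3)": [3],
    "Quinnipiac (12)": [1, 2],
}


def points_over_chalk(rounds):
    """Sum the per-round points (1, 2, 4, ...) for each upset round."""
    return sum(2 ** (r - 1) for r in rounds)


bonus = {team: points_over_chalk(rounds) for team, rounds in UPSET_ROUNDS.items()}
for team, points in bonus.items():
    print(f"{team}: +{points}")
```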


Conclusion

The women’s bracket is not nearly as volatile as the men’s. Therefore, it’s hard to see multiple plausible paths to victory in a pool as we did on the other side, where Obama’s pick-the-winner approach was neck-and-neck with nailing the Final Four.

If the endgame is the same on all brackets (victory!!!!), then clearly we need to switch up the strategy a bit: pick a couple of upsets early, then quickly revert to chalk. At the very least this means using a smaller randomization factor in our approach.
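As a rough illustration of "a couple of upsets early, then chalk", here is a hypothetical sketch. It is not the randomization from our men’s bracket method; the function, the per-round probabilities, and the random seed are all assumptions picked for the example. The only real-world input is the standard first-round seed pairing within a region.

```python
import random


def pick_winner(seed_a, seed_b, upset_prob, rng):
    """Pick the favorite (lower seed number) unless an upset is drawn."""
    favorite, underdog = min(seed_a, seed_b), max(seed_a, seed_b)
    return underdog if rng.random() < upset_prob else favorite


# Hypothetical upset probabilities: a little noise early, pure chalk after.
UPSET_PROB_BY_ROUND = {1: 0.15, 2: 0.10, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0}

# Standard first-round seed pairings within one region.
FIRST_ROUND = [(1, 16), (8, 9), (5, 12), (4, 13), (6, 11), (3, 14), (7, 10), (2, 15)]

rng = random.Random(2017)  # fixed seed so the example is repeatable
picks = [pick_winner(a, b, UPSET_PROB_BY_ROUND[1], rng) for a, b in FIRST_ROUND]
print(picks)
```

Setting a round’s probability to zero makes that round straight chalk, so shrinking the early-round values is one way to dial the randomization down.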

Alternatively, what it may really mean is that you should focus your psychic powers (or at least your in-depth research) on picking up the predictable. Do your homework. Read the injury reports and pick up on trends that others haven’t seen. The high accuracy of the seeding process may indicate that upsets are anything but: the gap between seeds is real, and a fundamental change undetectable at seeding time (undisclosed injury? tactical matchup?) is the true driver behind a so-called upset.

About The Author

Richard is a Seattle-area data scientist who builds predictive models and the services that deliver them. He earned a PhD in Applied and Computational Math from Princeton University, and left academia for the dark side of science (industry) in 2010, following his wife to the land of flannel. Fan of coffee, beer, backpacking and puns. Enjoys a day on the lake fishing, and, better, cooking up the catch for a crowd.
