Applied Bracketeering, 2018 Edition: Do streaks matter?

By: Richard W. Sharp

Normally in winter, the world uses waste heat from cryptocurrency mining to heat homes, but soon it will be time to refocus the world’s computing power on something much more productive: producing pool-busting brackets.

It’s time for March Madness, baby! Last year we had a little fun while following the tournament by taking a serious, principled approach based on randomness seeded by school mascots. This year, we’re going to take a new approach by looking for something that the smart money hasn’t taken into account: streaks.   

The Question

How to win the pool?

We’re not trying to objectively determine which team is best; we’re trying to win cold hard cash. That means we focus more on what our opponents, the other bettors, will be doing than on the teams playing on the court: they will generally start from chalk, pick a couple of upsets, and express some irrational overconfidence in their father’s brother’s nephew’s cousin’s former roommate’s alma mater.

Last year’s approach: assume upsets are random

Our number one priority is to produce winning brackets. Playing straight seeds is a great way to look smart while losing. In order to place high enough to earn a year’s worth of lording it over your officemates, you need to make some picks that the smart money won’t: you need upsets.

Last year we took this at face value by assuming an upset is just that: something unpredictable, something random. By adding the right amount of randomness to our picks, we hoped to find some unusual winners that would translate to victory. We seeded the randomness by ranking mascots, a random characteristic of each school, from mighty forces of nature, through a middling zoo of animals, on down to a lowly set of colors. In the end, the approach showed promise: it was nice, not thrilling, but nice. Playing chalk or RPI did as expected (poorly), and we concluded that “if you picked one of the high performing randomized brackets, then you should be in striking distance within your pool.” But we’re not here to talk about the past.

The new wrinkle

This year, it’s time to do more than just throw in some randomization. Is there something other than random noise that might predict an upset? One candidate is a team on a hot streak. An end-of-season streak may be an indicator that something fundamental has changed for a team, and at just the right time to march to March glory. Perhaps a young team has finally gained enough time on the court together to function as a cohesive unit or maybe a star player has returned from injury. In any case, many of the official ranking systems do not take this information into account. To the simple algorithmic approaches, such as perennial underperformer RPI, a win is a win whether it comes in November or on the closing day of a conference tournament.   

You’ve heard all these nice, heartwarming storylines. So have we. Let’s test them. First, here is how the ranking systems we use as a baseline treat the timing of a win:

  • RPI: Based on a simple combination of winning percentage and the winning percentage of opponents and opponents’ opponents. Percentages don’t take the timing of a win into consideration.
  • LRMC: A Bayesian model out of Georgia Tech that primarily uses point difference in head-to-head competition. It is indifferent to the sequence in which teams play against each other.
  • Coaches poll and AP Top 25: These are subjective polls of individual experts. The final poll before the tournament begins may take streaks into account if the individuals involved deem them important.
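
To make the "a win is a win" point concrete, here is a minimal sketch of an RPI-style calculation. It assumes the commonly cited 25/50/25 weighting and ignores the home/away adjustments the official formula applies. The season record and the opponent percentages are made up for illustration.

```python
import random


def win_pct(results):
    """Fraction of games won; `results` is a list of booleans in date order."""
    return sum(results) / len(results)


def rpi(wp, owp, oowp):
    """Simplified RPI: 25% own winning percentage, 50% opponents',
    25% opponents' opponents'. No home/away weighting (an assumption)."""
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp


# Hypothetical season: a slow start, then an eight-game closing streak.
season = [False] * 6 + [True] * 4 + [False] * 2 + [True] * 8

# The same 20 results in a scrambled order, which breaks up the streak.
shuffled = season[:]
random.Random(0).shuffle(shuffled)

# Opponent percentages are invented constants for this example.
rpi_ordered = rpi(win_pct(season), 0.55, 0.52)
rpi_shuffled = rpi(win_pct(shuffled), 0.55, 0.52)

print(rpi_ordered, rpi_shuffled)
```

Both numbers come out the same because a percentage only counts wins and never looks at when they happened. That is exactly the blind spot a streak feature is meant to cover.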

Of course, many of the teams in the tournament will have end-of-season winning streaks by definition: the conference tournament winners get berths in the tournament. We’ll have to work a bit harder than simply adding a binary “on streak/not on streak” variable to the model. We’ll try to determine the quality of the streak by considering factors such as the length of a streak (including whether it needs to be continuous) or the strength of the opponents.1
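
One way to sketch a "streak quality" measure along these lines: find the closing run of games, optionally forgive a loss or two so the streak need not be continuous, and weight each win by the strength of the opponent. Everything here is a hypothetical illustration, not the feature the model ends up using: the `Game` record, the `opponent_rating` scale, and the `allowed_losses` knob are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Game:
    won: bool
    opponent_rating: float  # hypothetical strength score between 0 and 1


def closing_streak(games, allowed_losses=0):
    """Trailing run of games (in date order) with at most `allowed_losses`
    losses, trimmed so that the run starts with a win."""
    run, losses = [], 0
    for game in reversed(games):
        if not game.won:
            if losses == allowed_losses:
                break
            losses += 1
        run.append(game)
    run.reverse()
    # A forgiven loss at the very start does not belong to the streak.
    while run and not run[0].won:
        run.pop(0)
    return run


def streak_score(games, allowed_losses=0):
    """Sum of opponent ratings over the wins in the closing streak."""
    streak = closing_streak(games, allowed_losses)
    return sum(g.opponent_rating for g in streak if g.won)


# Made-up end of season, oldest game first.
games = [
    Game(True, 0.3), Game(False, 0.8), Game(True, 0.5), Game(True, 0.6),
    Game(False, 0.9), Game(True, 0.7), Game(True, 0.4),
]

print(len(closing_streak(games)), streak_score(games))
print(len(closing_streak(games, 1)), streak_score(games, allowed_losses=1))
```

With `allowed_losses=0` only the last two wins count. Forgiving one loss stretches the streak back to five games. A team whose last game was a loss gets an empty streak under the strict setting, which is how most at-large teams that dropped out of their conference tournament would look.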


So here we go: pick up from where we left off last year by using a range of individual model rankings to establish a baseline, add a streak factor to try to detect a fundamental shift in a team that the polls (simple algorithmic ones at least) have missed, and finally add a dash of randomization to pick some upsets that are just flukes. We’ll create a wide range of brackets under a handful of different scenarios and see what performs best in the tournament. We’ll also compare these against some prominent brackets produced by friends and strangers alike.2 No strategy can win the pool every year, but hopefully the models can be improved to the point where they produce brackets that place well consistently.
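
The recipe above (baseline, plus streak factor, plus a dash of randomness) can be sketched as a tiny bracket filler. This is a toy under stated assumptions, not our actual pipeline: the additive scoring, the Gaussian noise, the parameter names `streak_weight` and `noise_scale`, and the four-team field are all invented for illustration.

```python
import random


def adjusted_strength(baseline, streak, streak_weight, noise_scale, rng):
    """Baseline rating, nudged by the streak factor, plus random noise."""
    return baseline + streak_weight * streak + rng.gauss(0.0, noise_scale)


def fill_bracket(teams, streak_weight=0.1, noise_scale=0.05, seed=None):
    """Pick winners round by round.

    `teams` is a list of (name, baseline, streak) tuples in bracket order;
    its length must be a power of two. Returns one list of winning team
    names per round, ending with the champion.
    """
    rng = random.Random(seed)
    field = list(teams)
    rounds = []
    while len(field) > 1:
        winners = []
        # Adjacent entries in the field play each other.
        for a, b in zip(field[0::2], field[1::2]):
            score_a = adjusted_strength(a[1], a[2], streak_weight, noise_scale, rng)
            score_b = adjusted_strength(b[1], b[2], streak_weight, noise_scale, rng)
            winners.append(a if score_a >= score_b else b)
        rounds.append([team[0] for team in winners])
        field = winners
    return rounds


# Hypothetical four-team field: (name, baseline rating, streak score).
teams = [("A", 0.9, 0.0), ("B", 0.2, 1.0), ("C", 0.6, 0.0), ("D", 0.5, 2.0)]

chalk = fill_bracket(teams, streak_weight=0.0, noise_scale=0.0)
streaky = fill_bracket(teams, streak_weight=0.5, noise_scale=0.0)
noisy = [fill_bracket(teams, seed=s) for s in range(5)]

print(chalk)
print(streaky)
```

Setting both knobs to zero reproduces chalk. Turning up `streak_weight` lets a hot lower-rated team through. Varying `seed` with a nonzero `noise_scale` produces the wide range of candidate brackets we want to compare.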

Let’s get ready to rumble.

1 If possible, we will also add player injuries to the mix, downgrading teams that have lost a starter shortly before the tournament. Good data for this is available, but we will focus first on adding a streak feature to the model and add injuries as time allows (or doesn’t).
2 It seems unlikely that there will be a Commander-in-Chief bracket this year, unless somebody tells the current office holder that his predecessor did it better than he could.

About The Author

Richard is a Seattle area data scientist who builds predictive models and the services that deliver them. He earned a PhD in Applied and Computational Math from Princeton University, and left academia for the dark side of science (industry) in 2010, following his wife to the land of flannel. Fan of coffee, beer, backpacking and puns. Enjoys a day on the lake fishing, and, better, cooking up the catch for a crowd.
