Wednesday, November 12, 2008

Handicapping: Betting Without Validation

In the 1840s, Hungarian physician Ignaz Semmelweis, working in Vienna, did a study he thought would change the way that babies were delivered. The mortality rate among new mothers was very high at the hospital he studied, but the good doctor found that if nurses and doctors simply washed their hands, this rate could be sharply reduced. He brought his findings to the hospitals. They asked, “Why does your data show this?” Semmelweis could not say why; he could only tell them that it did. The hospitals would not change their washing policy. In fact, they fiercely resisted. If he could not tell them why his data showed this, they wanted nothing to do with it.

At the time no one knew it, but this was one of the very first studies into germs and the harm that they can do. If they had implemented his policies, lives could have been saved.

This use of statistics and data (including the above story) was explored in the New York Times bestseller “Super Crunchers”. The author argues that when we use data in the right way, we can tell more, much more, than we would using simple human deduction; and we do not have to know why something happens, we just have to know that it does happen.

I believe this to be true. At the racetrack we often hear that intuition is key, or ‘I bet that horse because I knew he was going to win.’ We often fall prey to the law of small numbers: when we see something happen a few times, we take it to be true and build it into our handicapping.
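To put a number on how little a handful of races can tell us, here is a rough Python sketch of a Wilson score interval for a win rate. The function name and the sample figures are made up for illustration; nothing here comes from a real database, and the 95% level (z = 1.96) is just a common choice.

```python
import math

def wilson_interval(wins, starts, z=1.96):
    """Approximate Wilson score interval for a win rate (z=1.96 is about 95%)."""
    if starts <= 0:
        raise ValueError("starts must be positive")
    p = wins / starts
    denom = 1 + z * z / starts
    center = (p + z * z / (2 * starts)) / denom
    half = z * math.sqrt(p * (1 - p) / starts + z * z / (4 * starts * starts)) / denom
    return center - half, center + half

# Hypothetical samples: the same 30% win rate seen over 10 starts and over 1000.
small = wilson_interval(3, 10)
large = wilson_interval(300, 1000)
print(small)
print(large)
```

With 3 wins in 10 starts the plausible range runs from roughly 11% to 60%; with 300 wins in 1000 starts it tightens to roughly 27% to 33%. Seeing something "two or three times" tells us next to nothing.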

Such things are commonplace in harness handicapping. “Open to blind is a good bet”, “don’t bet a breaking trotter next time as the driver will not try”, “bet come-from-behinders in the slop”, “don’t bet a claimer off a qualifier”, and on and on.

How do we know these are true without data to back them up? The answer is that we do not. Without checking our preconceived notions we are making a grave error, perhaps even a fatal one in terms of our handicapping bankrolls. The problem, of course, is that in standardbred racing, good luck trying to get a database working to prove or disprove these theories. It is why I have never bought a harness handicapping book and expected to learn much: without validating numbers, I have no idea if what the author is telling me is true.

William Quirin wrote a fine handicapping book in the 1970s called “Winning at the Races”. It is considered one of the best handicapping books of all time. In it, he validated his data through studying race results. He was the first handicapper to do that.

Recently I read “The Power of Early Speed”. The writer, Klein, wrote that he had an idea that early speed was an underbet factor, but he never knew for sure. Then one day he learned of the Daily Racing Form database. He had over 200,000 races with 1.6 million horses to data crunch. When he did, he found out that he was on to something: early speed horses, the leaders at the first call, had an ROI of $3.12 per $2 bet throughout racing history. It made the book.

I have always tried to validate my preconceived notions in harness racing, and I have found out many times that my notions were wrong, and costly. As I mentioned before, in a study on driver changes I found no impact values or ROI boosts that showed anything major at all. I ran numbers at Harrington looking for something to exploit with the drivers Tim Tetrick and Tony Morgan; I found nothing other than terrible ROIs, on driver changes or otherwise. Detention barn data was always worth looking at. I fell into the law of small numbers mistake with a few trainers in detention. I thought they were terrible, and so did the handicapping crowd, when in fact they were not bad at all. There was money to be made on that one; you just had to check your ego at the door and admit you didn’t know squat until you saw the numbers. Unfortunately (and I think this is one of the reasons harness handicapping has never caught on like the runners), the problem with doing impact values in harness racing is that you have to do them by hand. Who has the time to do that in order to bet into $1000 win pools?

With thoroughbreds it is absolutely fun (and sometimes profitable) to data mine and see if your handicapping angles are smart, or dumb. For example, horses dropping in class from maiden special weights to maiden claimers have always been a favorite angle in the handicapping books. Running that in my 2008 database shows this:

Starts: 1503
Wins: 203
W%: 13%
ROI: 0.81

Not something I think we should spend too much time on, even though the IV is over 1. The ROI isn’t even close to worth pursuing.
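For anyone who wants to run the same kind of check on their own records, here is a minimal Python sketch of how the win percentage and ROI figures can be tabulated. It assumes ROI means dollars returned per dollar wagered (so 1.00 is break-even), which is how the numbers in this post read. The function name and the five-bet sample are hypothetical.

```python
def summarize(payoffs, stake=2.0):
    """Tabulate flat win bets.

    payoffs: one entry per start, the win payoff on a `stake` bet,
             or 0.0 if the horse lost.
    Returns (starts, wins, win_rate, roi), with roi = returned / wagered.
    """
    starts = len(payoffs)
    if starts == 0:
        raise ValueError("no starts to summarize")
    wins = sum(1 for payoff in payoffs if payoff > 0)
    returned = sum(payoffs)
    wagered = stake * starts
    return starts, wins, wins / starts, returned / wagered

# Hypothetical sample: five $2 win bets, one winner paying $8.10.
sample = [0.0, 8.10, 0.0, 0.0, 0.0]
starts, wins, win_rate, roi = summarize(sample)
print(starts, wins, round(win_rate, 2), round(roi, 2))  # 5 1 0.2 0.81
```

The same arithmetic checks the table above: 203 wins from 1503 starts is a win rate of about 13.5%.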

As mentioned before on the blog, there is an angle we hear handicappers trumpet from time to time: blinkers on. Let’s check it:

Wins: 709
Starts: 6875
W%: 10%
ROI: 0.7149
Impact Value: 0.8348

It’s a one-way ticket to the poorhouse.
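Since impact value comes up a lot here, a quick sketch of how it is usually figured: the share of winners that had the angle, divided by the share of starters that had the angle. An IV of 1.00 means the angle wins exactly its fair share. The blinkers-on wins and starts below are from the table above, but the database-wide totals are hypothetical (the post does not give them), so the result only roughly matches the 0.8348 shown.

```python
def impact_value(group_wins, group_starts, all_wins, all_starts):
    """Share of winners with the angle divided by share of starters with it."""
    if min(group_starts, all_wins, all_starts) <= 0:
        raise ValueError("starts and total wins must be positive")
    share_of_winners = group_wins / all_wins
    share_of_starters = group_starts / all_starts
    return share_of_winners / share_of_starters

# Blinkers-on figures from the post; the totals are made up for illustration
# (10,000 races with an average of 8 starters each).
iv = impact_value(709, 6875, all_wins=10_000, all_starts=80_000)
print(round(iv, 2))  # 0.83
```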

We sometimes hear “ignore the odds board with first time starters, because people overbet the unknown.” Let’s check:

First time starters bet below 8-5:
Wins: 31
Starts: 82
W%: 38%
ROI: 0.92

With a rebate you almost break even. Not a bad angle at all. The odds board signals winners and these winners are underbet.
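As a rough worked example of the rebate arithmetic, assuming the rebate is paid back on every dollar wagered, win or lose (rebate terms vary, and the 5% figure below is hypothetical):

```python
def breakeven_rebate(roi):
    """Rebate, as a fraction of handle, needed to lift this ROI to break-even."""
    return max(0.0, 1.0 - roi)

def roi_with_rebate(roi, rebate):
    """Net return per dollar wagered once the rebate is added back."""
    return roi + rebate

needed = breakeven_rebate(0.92)
net = roi_with_rebate(0.92, 0.05)  # hypothetical 5% rebate
print(round(needed, 2))  # 0.08
print(round(net, 2))     # 0.97
```

At an ROI of 0.92 it takes an 8% rebate to get back to even, so anything a little under that leaves you almost there.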

It is difficult for us to think in this broad, data-driven way. The racetrack is filled with stories and angles and such. I still get caught by it from time to time, although I watch myself constantly. I was chatting with a thoroughbred handicapping friend who plays Woodbine, and I told him I had heard people in the grandstand saying that WEG regular Emile Ramsammy was terrible on speed horses (my own ‘validation’ was that I had seen it once or twice). After watching him strangle a speed horse, he agreed wholeheartedly: never bet this guy with a speed horse, we both said. Of course I had to check, expecting to see my bias validated:

Ramsammy, E
Speed horses ridden: 252
Wins: 42
W%: 17%
ROI: 1.06

If I bet $100 to win each time Emile rode a speed horse, with no other handicapping, I would have made $1700. Instead I looked to bet against him because two or three times I saw him go to the back and strangle a speed horse.
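The flat-bet arithmetic behind a figure like that is simple enough to sketch. Note the ROI in the table is only given to two decimals, so plugging in 1.06 lands near $1,500 rather than exactly on the $1700 in the text. The function name is mine.

```python
def flat_bet_profit(starts, roi, stake):
    """Profit from betting the same stake to win on every start."""
    wagered = starts * stake
    returned = wagered * roi
    return returned - wagered

# 252 speed horses at $100 each, using the rounded ROI from the table.
profit = flat_bet_profit(252, 1.06, 100)
print(round(profit))  # 1512
```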

This game is almost impossible without lower takeouts; there is no reason to make it even tougher by making bad bets based on anecdotal evidence. Now if we can somehow get harness racing to offer APIs and other data features for horseplayers, instead of locking up data like it is the last Big Mac in a famine, we might be able to up the bet, learn something and validate our handicapping.

I am trying to validate several things in harness racing, but have not had a chance yet. If anyone wants to help (i.e. do a little tabulation work), let me know by email. It will be boring work (we would have to crunch numbers by hand) but we can't help that. Maybe it would be interesting, who knows?

3 comments:

Cangamble said...

I used Klein's system to calculate daily speed biases when I do my own track variants.
But Klein's book is now sort of outdated thanks to the influx of artificial surfaces, most of which are not very kind to speed.
Still, I guess there are enough dirt tracks around.
Have you ever read Speed To Spare? Now there is a book for the ages that gets little attention.

Winston...not really said...

Get in touch with thoroughmetrics.blogspot.com.

That's what he does.

Pull the Pocket said...

thanks Not Really, that looks very interesting.

CG,

I read that one awhile back. I can't really remember it, so I better look at it again.