A YouTube video on Excel modeling during COVID. That's where this started. Five years and seven sports later, the models speak for themselves. Here's the full story.
"A YouTube video. Free time. A rabbit hole with no bottom."
Spring 2020. Like a lot of people, I had more time than I knew what to do with. A YouTube video on building a sports betting model in Excel came across my feed. I had no formal background in data science or statistics. Just curiosity and free time. I watched it, built the model, got it wrong, and built it again.
That was it. I was hooked. What started as an afternoon project turned into months of reading about probability theory, historical data, and predictive modeling. I learned what standard error actually means. I found out how to convert odds to implied probabilities and what an overround is. I discovered Closing Line Value and figured out why win rate alone tells you almost nothing about whether a bettor actually has edge.
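The odds-conversion idea is simple enough to show in a few lines. This is a generic sketch, not code from my models: it converts American odds to break-even probabilities and measures the overround on a standard -110/-110 market.

```python
def implied_prob(american_odds: int) -> float:
    """Convert American odds to the implied break-even probability."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)

# A typical -110 / -110 spread market:
p_home = implied_prob(-110)        # ~0.524 per side
p_away = implied_prob(-110)
overround = p_home + p_away - 1.0  # ~0.048: the book's built-in margin
```

The two sides sum to more than 1.0. That excess is the vig, and it is why a 50% win rate loses money at -110.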
The more I dug in, the more I realized there is a significant gap between how professional bettors think about this problem and how most people approach it. The frameworks that govern long-run profitability are not complicated once you understand them. They are just not widely known. That asymmetry was exactly where the opportunity lived.
The Excel model was the starting point. R was the first real programming language, and it forced me to think more rigorously about data structures and statistical computation. Then Python, which became the backbone of everything that followed. I picked up enough C++ and SQL along the way to understand the tooling. Now it is mainly Python and R, depending on what I am building. The language matters less than the thinking behind it.
I spent months on Bayesian inference. Understanding how to update a probability estimate as new information arrives changes how you think about every game, every line, every injury report. It is not about having a hot take. It is about having a prior belief grounded in data and updating it correctly when new evidence comes in.
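The textbook version of that update fits in a few lines. A minimal Beta-binomial sketch with made-up numbers, purely to illustrate the prior-plus-evidence mechanic:

```python
# Beta-binomial update: a prior belief about a win probability,
# revised as new results arrive. All numbers are illustrative.
alpha, beta = 20.0, 20.0   # prior: centered at 0.50, moderately confident
wins, losses = 7, 3        # new evidence: a 7-3 stretch

alpha_post = alpha + wins
beta_post = beta + losses
posterior_mean = alpha_post / (alpha_post + beta_post)  # 27/50 = 0.54
```

A 7-3 run moves the estimate from 0.50 to 0.54, not to 0.70. The prior keeps ten games of evidence from being treated like a thousand.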
Shrinkage estimators, Monte Carlo simulation, variance decomposition. Each framework I picked up changed how I thought about the problem. The question stopped being who wins and became something more useful: what is the true probability distribution of outcomes, and where does the market price disagree with that distribution?
Early on, I built a backtest that showed a win rate too good to be real. Spent days trying to understand it before finding the problem: I had accidentally used closing line data to generate features that were supposed to represent pre-game information. The model had essentially been trained on the answer. In backtesting it looked like genius. In live use it would have been worthless. That is data leakage, one of the most common and most destructive mistakes in predictive modeling, and it is completely invisible unless you understand where every piece of data comes from and when it actually becomes available.
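The mechanics of that mistake are easy to reproduce. A toy example with pandas, using a hypothetical points series, showing how a rolling-average feature leaks if you forget to shift it:

```python
import pandas as pd

scores = pd.Series([24, 31, 17, 28, 21], name="points")

# Leaky: the rolling mean at game i includes game i's own score,
# so the "pre-game" feature already contains the outcome.
leaky = scores.rolling(3, min_periods=1).mean()

# Clean: shift first, so game i only sees games that finished before it.
clean = scores.shift(1).rolling(3, min_periods=1).mean()
```

One `shift(1)` is the entire difference between a feature that exists at bet time and one that quietly encodes the result. Nothing errors out either way, which is exactly why it is so dangerous.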
There is no shortcut to learning that. You learn it by building something that breaks and having to figure out why. That kind of failure teaches you in a way that reading about it never does, and it is the foundation everything else was built on.
"Seven sports. A dedicated system for each. Rebuilt every season."
The Excel model became a Python model. After that, one sport at a time, each needing its own architecture built for that market's specific dynamics. There is no single model running across everything. Seven sports means seven separate systems, each with different inputs, different data sources, and different market structures.
Every sport required building from scratch, and every season means evaluating what worked and rebuilding what didn't. The NCAAB totals model was rebuilt when tempo-based features stopped producing CLV. The NHL model was a 2-way moneyline system for four seasons before switching to 3-way regulation pricing when the draw kept eating into returns. The golf model went through a complete architectural overhaul when the original approach was found to have calibration bias. Building a model once and leaving it static is not how this works.
Each model uses exponential decay weighting so recent performance matters more than data from two years ago, without pretending a single hot week tells you anything meaningful. Bayesian shrinkage keeps sample-size uncertainty from inflating predictions on teams or players with thin historical records.
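Both ideas reduce to short functions. This is a generic sketch with illustrative parameters, not the production weighting:

```python
import numpy as np

def decay_weights(n_games: int, half_life: float = 20.0) -> np.ndarray:
    """Exponential decay: a game's weight halves every `half_life` games back.
    The last element is the most recent game and gets weight 1.0."""
    ages = np.arange(n_games)[::-1]  # oldest game has the largest age
    return 0.5 ** (ages / half_life)

def shrink(sample_mean: float, n_eff: float,
           prior_mean: float, prior_strength: float = 25.0) -> float:
    """Pull a thin-sample estimate toward the prior; deep samples barely move."""
    w = n_eff / (n_eff + prior_strength)
    return w * sample_mean + (1 - w) * prior_mean
```

With `prior_strength=25`, a team at 0.60 over 25 effective games gets shrunk to 0.55: halfway back to a 0.50 prior. The same 0.60 over 250 games barely moves. That is the sample-size discipline doing its job.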
Monte Carlo simulation handles the variance side. For any given game, I'm not producing a point estimate. I'm producing a probability distribution across 100,000 simulated outcomes, then comparing that distribution against the market price to find where the implied probability is wrong.
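In skeletal form, that comparison looks like this. The distribution shape and every number here are hypothetical; the point is the structure: simulate, count, compare against the break-even price.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical model output for one game's total: mean and spread.
mu, sigma, market_line = 47.2, 9.5, 44.5

sims = rng.normal(mu, sigma, size=100_000)  # 100k simulated totals
p_over = (sims > market_line).mean()        # model's probability of the over

break_even = 110 / 210                      # implied probability at -110
edge = p_over - break_even                  # positive means the over is mispriced
```

If `p_over` comes out near 0.61 against a 0.524 break-even, the market's implied probability is wrong by roughly nine points. If it comes out at 0.53, there is no bet. Most games are the second case.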
Every prediction has a traceable path back to its inputs. I can explain any output: here is what drove the number, here is how each component was weighted, here is why this model produced this result and not another. That transparency is not a marketing feature. It is a requirement for knowing when a model is wrong and why.
All of this was built before AI made it trivially easy to generate code. When I started, if I didn't understand a concept well enough to implement it myself, the model couldn't use it. There was no tool you could describe a framework to and get a working pipeline back in seconds. Every piece had to reflect a real understanding of what it was doing and why, because there was no other way to get it running. That constraint turned out to be the whole point.
That era is over. Today anyone can describe a model to an AI and have functional-looking code in minutes. Which sounds like progress until you understand what "functional-looking" actually means. A model that runs is not a model that works. The two have nothing to do with each other.
No AI tool will flag data leakage. It will write clean, structured code using whatever features you describe, produce a beautiful backtest, and show you performance numbers that look legitimate. What it cannot do is know whether the features you fed it contain information that would not actually be available at the time of the bet. That one mistake, data that bleeds future information into the training set, makes a model look like a money printer in backtesting and a guaranteed loss in live deployment. The backtest is not lying to you. The model is lying to the backtest, and it does not know any better because nobody told it what the mistake looks like from the inside.
There are more failure modes layered on top of that. Overfitting: a model that memorized historical patterns rather than learning predictive ones, which looks excellent on in-sample data and falls apart immediately on anything new. Walk-forward test contamination: a validation set that saw information from the training set, so the out-of-sample performance is not actually out-of-sample. Feature selection that introduces look-ahead bias. Calibration that made the backtest numbers neat but degraded live probability estimates. None of these show up as errors. They show up as live results that do not match what the model promised, after the money is already in the market.
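The walk-forward discipline mentioned above is simple to state in code. A minimal generic splitter, not any library's API: train strictly on the past, test on the next slice, roll forward, never shuffle.

```python
# Walk-forward split over chronologically ordered rows: train on the past,
# test on the next slice, then roll the window forward.
def walk_forward(n_rows: int, train_size: int, test_size: int):
    start = 0
    while start + train_size + test_size <= n_rows:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size

for train, test in walk_forward(n_rows=10, train_size=4, test_size=2):
    assert max(train) < min(test)  # test rows are always strictly later
```

The assertion in that loop is the whole contract: no test row may predate a training row. Break it anywhere in the pipeline, including inside feature construction, and the out-of-sample numbers stop meaning anything.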
Understanding these failure modes is not a bonus feature. It is the entire job. I know them because I built models that had them, found them, and rebuilt. That process is what protects the subscriber, not the code. Not the framework. Not the tool that generated it. The years of understanding what can go wrong and making sure it doesn't.
"Five years. Every season on the board. Good ones and bad ones."
The models have been tracked since 2021. Every pick logged with the line I received, every closing line recorded for CLV measurement. The results are documented year by year across seven sports, and that includes the losing seasons.
The 2022-23 NCAAB spreads model was down 27.6 units. The 2025-26 NHL model is currently down 12.1 units. Those numbers are on the results page the same way the good ones are. That is the only way to evaluate a betting process honestly. Cherry-picked results are not results. They are a narrative.
If lines consistently move in your direction after a pick is released, you are finding value before the market corrects to it. That is the mathematical definition of edge. A 58% CLV rate on NFL means that on 58 out of 100 NFL picks, the line moved toward my side after the pick went public. Not won. Moved. The result of any individual bet is noise. The direction lines move after you bet is signal.
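Measuring it is mechanical. A toy example with an invented three-pick log, showing the beat-the-close logic for totals:

```python
# Hypothetical pick log for totals: the line taken vs. where it closed.
picks = [
    {"side": "over",  "line_taken": 44.5, "closing_line": 46.0},
    {"side": "over",  "line_taken": 51.0, "closing_line": 50.5},
    {"side": "under", "line_taken": 47.5, "closing_line": 46.5},
]

def beat_close(pick: dict) -> bool:
    """An over has CLV when the total closes higher than the number taken;
    an under has CLV when it closes lower."""
    if pick["side"] == "over":
        return pick["closing_line"] > pick["line_taken"]
    return pick["closing_line"] < pick["line_taken"]

clv_rate = sum(beat_close(p) for p in picks) / len(picks)  # 2 of 3 here
```

Note that nothing in that calculation asks whether the bets won. That is the point: it measures whether you were ahead of the market, which is the thing that persists.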
Closing Line Value is the most honest metric available for evaluating a betting process. Win rate fluctuates with variance. CLV, tracked over hundreds of bets, converges toward your actual edge. Our CBB Totals model has the strongest CLV we track: 67% of lines have moved in our direction after release across an 806-unit sample.
A model with data leakage in its backtest produces spectacular historical numbers. It also converges toward random in live deployment, usually within a few hundred bets, because the edge was never real. The live record being in the same range as what walk-forward testing predicted across five years is the most meaningful validation this process has. It means the backtest was built honestly. No leaked features, no contaminated test sets, no optimistic calibration. What the model said it could do in testing is roughly what it has done live, including the losing seasons. That match does not happen with a model nobody stress-tested.
"The honest version. No dancing around it."
Most services give a vague answer to this question. Here is mine.
What you will not get is a guarantee. Sports betting is probabilistic by design. You will have losing weeks. You will have losing months. Anyone who tells you otherwise is either lying or doesn't understand variance. The edge accumulates in the long run, not over the next weekend.
If you are looking for a lock-of-the-week service that promises guaranteed profit, this is the wrong place. If you understand probability, track your bets, and want picks backed by five years of documented statistical work, you are in exactly the right place.
The barrier to generating a model that looks credible has never been lower. Anyone can prompt their way to a backtest, post the results, and start selling picks. What you cannot prompt your way to is understanding whether that backtest is valid, whether the features are clean, whether the live performance will match what the historical numbers showed, or what to actually fix when it doesn't.
Most of what is entering the market right now is built by people who have never watched a model fail in live deployment and had to diagnose why. They haven't seen data leakage destroy a backtest that looked real. They haven't rebuilt a calibration from scratch after realizing their variance estimates were wrong. They haven't watched three months of live results confirm or contradict five years of historical testing. The model runs. That is all they know.
The protection is not the technology. It is the understanding behind it, built the hard way, before the shortcuts existed, by someone who has already made the expensive mistakes in their own time so you don't make them with your money.
This started with a YouTube video and free time during COVID. It became something built on real statistical depth, documented results, and a process that holds up to scrutiny. If that is what you are looking for, the door is open.