Brian Papiernik • April 2025
This project introduces a Smart Decision-Making Model designed to evaluate offensive shot selection in the NBA by quantifying the opportunity cost of each shot attempt. The goal is to understand whether the player who took the shot was actually the best option on the floor, or if there was a better opportunity available that went unused.
The core of the model is an expected points (EP) estimator trained on over 100,000 historical NBA shots using XGBoost. The model was trained using three key spatial features:

- Shot location (x/y coordinates on the court)
- Distance from the basket
- Distance to the closest defender
These inputs allow the model to estimate the expected value of a shot attempt based on a snapshot of court spacing at the moment the shot is taken.
To evaluate shot selection, the model captures that same snapshot for all five offensive players on the floor. For the shooter, this represents the value of the shot they took. For the remaining four teammates, we estimate what their expected points would have been if the ball had been passed to them, incorporating adjustments for pass time and defensive closeout pressure. These adjustments allow us to simulate realistic alternatives and assess decision-making beyond just the shot outcome.
These values form the foundation of a Heliocentric Leaderboard, which tracks the opportunity cost and creation value of shot decisions across thousands of possessions. Specifically, we created three teammate-based metrics to contextualize the shooter's shot selection decision:

- Max Teammate EP: the single best alternative on the floor
- Top-2 Teammate EP: the average of the two best alternatives
- Average Teammate EP: the average across all four teammates
The model looks at how much value might have been left on the table by comparing the shooter's expected points (EP) to what their teammates could have produced if they got the ball instead. This helps us spot missed passing opportunities and shows where better decisions could have led to more efficient offense.
Importantly, the model also recognizes creation value — situations where the shooter's gravity drew extra defenders and opened high-value opportunities for teammates, even if the ball didn't move. This provides a fuller picture of heliocentric players who generate team value by manipulating defenses, regardless of whether it results in an assist or shot.
While the model is built on NBA data and dimensions, it can be adapted to college basketball too. As long as we have the same spatial information — like shot location, distance from the basket, and how close defenders are, or full player tracking at the time of the shot — the model can still work. With a few adjustments for court size and spacing, this model can help coaches, analysts, and front offices across all levels better understand the balance between shot creation, decision-making, and offensive efficiency.
In today's NBA, many teams rely on heliocentric offenses. These systems are built around one primary creator who dictates most of the team's shot decisions. These players often carry huge offensive loads, not only taking a large share of the team's shots but also creating opportunities for everyone else. While traditional stats like usage rate and assist percentage give us some sense of how involved these players are, they don't fully capture the quality of the decisions being made.
Not every shot is a good shot, and not every pass actually leads to something better. What's missing in most metrics is a way to measure the opportunity cost — how many points the team may have left on the table because the star took a tough shot instead of finding an open teammate in a better spot.
That question gets even more interesting when you think about it through the lens of the Weak Link Theory. Basketball is generally considered a “strong link” sport — meaning it's often your best players, not your worst, who determine the outcome. But even the best players can hurt your offense if they consistently force shots instead of trusting teammates. If a heliocentric player is ignoring high-quality options, they might be dragging down the overall efficiency of the team — even if their individual numbers look fine.
This project was built to address that gap. The goal is to evaluate shot decisions in context — not just in terms of whether the shot went in, but whether it was the right decision given the state of the court. Using player tracking data, we estimate expected points (EP) for every offensive player at the moment a shot is taken, and compare the shooter's value to the best available alternative. This allows us to quantify missed opportunities, suboptimal decision-making, and situations where the shooter actually made the best possible choice.
At the same time, the model also recognizes off-ball creation — when a player bends the defense, draws help, or shifts defenders in a way that opens up scoring chances for others. These subtle forms of gravity are a big part of what makes heliocentric stars valuable, even if they don't always result in assists or direct involvement in the play.
By combining spatial data with a shot value model, this framework gives us a better way to understand decision-making — not just who's taking the shots, but whether they're making the most of each possession. It's a tool that can help coaches, analysts, and front offices get a clearer picture of how players are running the offense — and how much better it could be.
While heliocentric players drive much of today's NBA offense, there hasn't been a reliable way to evaluate whether their decisions — especially shot attempts — are truly optimal. Traditional metrics don't tell us how often a player chooses a tough shot over a better alternative, or how much team value is lost when high-usage players overlook open teammates. Without a framework to assess the opportunity cost of these choices, teams risk overvaluing inefficient creation. This project aims to fill that gap by quantifying the difference between the shot a player took and the best available shot on the floor — offering a clearer picture of decision-making, creation value, and missed opportunities.
This model evaluates heliocentric shot selection using spatial, contextual, and modeled variables from NBA tracking data — helping us understand not just what happened, but what could have happened.
The goal of this project is to better understand shot selection and decision-making in heliocentric offensive systems — where one player dominates the ball and dictates much of a team's shot creation. While traditional stats like usage rate and assist percentage tell us how involved a player is, they don't capture the value or opportunity cost of the decisions they're making in real time.
This model looks to answer a simple but important question: was the shot the player took actually the best option on the floor? By using tracking data, we can evaluate the decision in real time — comparing the value of the shot taken to what could've happened if the ball had gone to someone else. It also ties into the idea behind the Weak Link Theory. Even in a sport like basketball where your best players usually matter most, over-relying on one guy to make every decision can backfire if they're missing better options. By highlighting those missed chances, this project gives us a clearer way to see how shot selection and trust in teammates impact overall offensive flow and efficiency.
To address this problem, I built an Expected Points (EP) model using XGBoost, trained on over 100,000 NBA shots. The model estimates the value of a shot based on three spatial features: shot location, distance from the basket, and proximity to the closest defender.
To capture different ways defenders impact shot quality, I actually built two versions of the model. One treats defender distance as a continuous variable, while the other breaks it into categorical ranges (e.g., 0–2 feet, 2–4 feet, etc.) to better account for behavioral thresholds in defensive pressure. This gives us flexibility in how we interpret spacing and contest levels across different types of offensive possessions.
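To make that concrete, here's a minimal sketch of how the two variants could be trained with the xgboost R package. The column names (`loc_x`, `loc_y`, `shot_dist`, `def_dist`, `points`), the bin breakpoints, and the hyperparameters are illustrative assumptions, not the exact training configuration:

```r
library(xgboost)

# Continuous variant: closest-defender distance enters as a raw numeric feature.
X_cont <- as.matrix(shots[, c("loc_x", "loc_y", "shot_dist", "def_dist")])

# Categorical variant: bin defender distance into contest ranges first,
# then one-hot encode the bins.
shots$def_bin <- cut(shots$def_dist,
                     breaks = c(0, 2, 4, 6, Inf),
                     labels = c("0-2ft", "2-4ft", "4-6ft", "6ft+"),
                     include.lowest = TRUE)
X_cat <- model.matrix(~ loc_x + loc_y + shot_dist + def_bin, data = shots)[, -1]

# Regressing on points scored (0, 2, or 3) makes the prediction itself
# an expected-points estimate.
ep_cont <- xgboost(data = X_cont, label = shots$points, nrounds = 500,
                   objective = "reg:squarederror", eta = 0.05,
                   max_depth = 6, verbose = 0)
ep_cat  <- xgboost(data = X_cat, label = shots$points, nrounds = 500,
                   objective = "reg:squarederror", eta = 0.05,
                   max_depth = 6, verbose = 0)
```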
At the moment of a shot, the model applies these learned relationships not only to the shooter, but also to each of the four teammates on the floor. For every offensive player, we extract the necessary features from the court snapshot — including their position, their defender's proximity, and their distance from the hoop — and use the model to estimate each player's expected points if they were to take the shot instead.
We then go a step further by adjusting teammate EP values based on estimated pass time and defensive closeout windows, allowing us to more realistically estimate what would have happened had the shooter passed the ball.
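As a rough illustration of that adjustment, the sketch below re-scores a teammate's look as of the moment a pass would arrive, with the nearest defender closing out during the flight of the ball. The pass and closeout speeds are assumed constants, and the function reuses the hypothetical continuous model and feature names from the training sketch above:

```r
# Assumed constants (not from the write-up): pass and closeout speed in ft/s.
pass_speed_fps     <- 30
closeout_speed_fps <- 15

adjust_teammate_ep <- function(ep_model, tm_x, tm_y, shooter_x, shooter_y,
                               shot_dist, def_dist) {
  # Time the ball spends in the air if the shooter passes to this teammate.
  pass_time <- sqrt((tm_x - shooter_x)^2 + (tm_y - shooter_y)^2) / pass_speed_fps
  # The nearest defender closes the gap during the pass (floored at zero).
  def_dist_at_catch <- pmax(def_dist - closeout_speed_fps * pass_time, 0)
  newdata <- matrix(c(tm_x, tm_y, shot_dist, def_dist_at_catch), nrow = 1,
                    dimnames = list(NULL, c("loc_x", "loc_y",
                                            "shot_dist", "def_dist")))
  predict(ep_model, newdata)
}
```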
From there, we calculate several decision-making metrics — including opportunity cost (shooter EP vs. max teammate EP), average teammate EP, and the average of the top two options — to give a more nuanced picture of shot quality in context. We also track creation value: rewarding players who shift the defense and create high-EP opportunities for others, even if they don't get an assist.
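A minimal dplyr sketch of those metrics, assuming one row per offensive player per shot event with an `ep` column (teammate values already pass/closeout adjusted) and an `is_shooter` flag; all names here are hypothetical:

```r
library(dplyr)

decision_metrics <- possessions %>%
  group_by(shot_id) %>%
  summarise(
    shooter_ep   = ep[is_shooter],
    max_mate_ep  = max(ep[!is_shooter]),
    avg_mate_ep  = mean(ep[!is_shooter]),
    top2_mate_ep = mean(sort(ep[!is_shooter], decreasing = TRUE)[1:2]),
    opp_cost     = shooter_ep - max_mate_ep,  # shooter EP vs. best alternative
    .groups = "drop"
  )
```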
The result is a spatially grounded, decision-aware framework that helps identify which high-usage players elevate their team — and which ones might be leaving points on the table.
This project relies on two main datasets: a cleaned set of ~100,000 historical NBA shots used to train an Expected Points (EP) model, and a frame-level tracking dataset aligned with play-by-play data to apply that model in real shot contexts. Together, they allow us to evaluate whether a player's shot was the most efficient option available — and to calculate the opportunity cost of that decision. The training dataset provides the foundation for learning how shot value changes based on spatial and contextual variables. The application dataset builds on that by reconstructing every live shot event from tracking data, giving us a full view of all 10 players on the floor when each decision was made.
All data processing, feature engineering, and modeling were completed in R, with custom logic used to extract player positioning, defensive proximity, and outcome context at the moment of each shot.
This dataset includes over 100,000 shots from official NBA game logs, including spatial and contextual features used to train the EP model. The three variables used for model training were:

- Shot location (x/y coordinates on the court)
- Shot distance from the basket
- Closest defender distance
Two XGBoost models were trained:

- A continuous model, which uses the exact closest-defender distance as a numeric feature
- A categorical model, which bins defender distance into ranges (e.g., 0–2 feet, 2–4 feet) to reflect behavioral thresholds in defensive pressure
This dataset combines NBA player tracking data with play-by-play (PBP) logs to identify the positioning of all 10 players on the court at the exact moment of a shot attempt. For each made or missed shot, the model was applied to:

- The shooter, capturing the value of the attempt they took
- Each of the four offensive teammates, estimating what their expected points would have been had they received the ball
Using the x and y coordinates of each player and their matchup proximity, the model generates expected points (EP) values for every offensive player involved in the possession. Closest defender distance is derived for each player, allowing for both continuous and categorical EP model predictions.
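Deriving closest-defender distance is a simple nearest-neighbor computation over the frame's coordinates. A sketch, assuming a `frame` data frame with `x`, `y`, and a `side` column marking offense and defense (all hypothetical names):

```r
# Minimum Euclidean distance from each offensive player to any defender.
closest_defender <- function(off_xy, def_xy) {
  apply(off_xy, 1, function(p) {
    min(sqrt((def_xy[, 1] - p[1])^2 + (def_xy[, 2] - p[2])^2))
  })
}

off_xy <- as.matrix(frame[frame$side == "offense", c("x", "y")])
def_xy <- as.matrix(frame[frame$side == "defense", c("x", "y")])
frame$def_dist[frame$side == "offense"] <- closest_defender(off_xy, def_xy)
```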
Including defenders not only provides accurate EP estimates but also allows for future extensions of the model to analyze team-level defensive coverage, help defense gravity, and contest recovery windows in greater depth.
From the predictions made by the EP model, the following decision-based metrics are created:

- Opportunity cost: shooter EP minus the maximum teammate EP
- Average teammate EP: the mean EP across the four teammates
- Top-2 teammate EP: the average of the two best alternatives
- Creation value: credit for possessions where the shooter's gravity opened high-EP looks for others
These metrics help evaluate whether the shooter made the optimal decision, and whether their presence improved the offense — even without a pass or assist.
To train and evaluate the model fairly, the dataset was split into a training set (80%) and a test set (20%) using stratified sampling based on the “points” variable. This approach maintains the distribution of scoring across both sets, ensuring the model sees the full range of shot outcomes during training and can be evaluated fairly on unseen data.
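For example, with caret's createDataPartition, one reasonable way to do a stratified split in R (treating points as a factor stratifies on the 0/2/3 outcome):

```r
library(caret)

set.seed(42)
# Stratify on the points outcome so 0/2/3 proportions match across splits.
train_idx <- createDataPartition(factor(shots$points), p = 0.8, list = FALSE)
train_set <- shots[train_idx, ]
test_set  <- shots[-train_idx, ]
```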
To model shot quality and decision-making in heliocentric offenses, I used XGBoost — a machine learning technique that builds decision trees one after another, each one learning from the mistakes of the previous. It's a great fit for this kind of project because it handles messy, complex data really well and highlights which variables matter most.
I trained two versions of the expected points (EP) model using over 100,000 historical NBA shots. One version treated defender distance as a continuous variable, while the other binned it into ranges like 0–2 feet, 2–4 feet, and so on, to better reflect how defenders actually pressure shooters. Both models relied on a few key features: shot distance, x/y location on the court, and the defender's proximity.
Once the models were trained, I applied them to tracking data linked with play-by-play logs. This let me evaluate not just the shooter, but every offensive player on the court at the moment of the shot. The model gives an EP value for each player based on their location and defender distance, allowing me to compare what actually happened with what could've happened if the shooter passed instead.
XGBoost worked well here because it's flexible, performs well on big datasets, and gives a clear picture of which variables are doing the heavy lifting — all of which helped me better understand how shot decisions shape offensive efficiency in heliocentric systems.
The Heliocentric Decision-Making model offered a sharper lens into how NBA players navigate offensive decisions — and just how often they pass up better ones. By comparing the expected points (EP) of the shooter to the expected outcomes of their teammates at the time of the shot, we were able to turn subjective film reads into objective, possession-level insights.
Across thousands of NBA possessions, we found:

- In roughly two-thirds of shot attempts, at least one teammate had a higher expected points (EP) value than the shooter at the moment of the shot.
In other words, roughly two-thirds of the time, the player who shot could have passed to someone with a higher expected return. To make sense of this, we categorized decision-making into four tiers, adjusting the thresholds based on the context of comparison:
| Benchmark | Great Decision | Good Decision | Bad Decision | Terrible Decision |
|---|---|---|---|---|
| Max Teammate EP (m_EP) | EP ≥ m_EP + 0.20 | EP ≥ m_EP + 0.00 | EP ≤ m_EP − 0.25 | EP ≤ m_EP − 0.35 |
| Top-2 Teammates EP (2_EP) | EP ≥ 2_EP + 0.15 | EP ≥ 2_EP + 0.05 | EP ≤ 2_EP − 0.20 | EP ≤ 2_EP − 0.30 |
| Average Teammate EP (A_EP) | EP ≥ A_EP + 0.15 | EP ≥ A_EP + 0.00 | EP ≤ A_EP − 0.20 | EP ≤ A_EP − 0.30 |
These weren't arbitrary cutoffs — they were calibrated using the distribution of real EP values, ensuring each decision tier reflects a meaningful difference in shot quality. A “terrible” decision might mean a player forced a contested fadeaway while a teammate stood open in the corner. A “great” decision might reflect a smart relocation or self-created look that outperformed all available alternatives.
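Applying the max-teammate row of the table is a straightforward threshold check. A sketch; the "Neutral" label for the gap between the Good and Bad cutoffs is my own placeholder:

```r
# Tier a shot decision against the max-teammate benchmark (m_EP row above).
classify_decision <- function(shooter_ep, m_ep) {
  d <- shooter_ep - m_ep
  dplyr::case_when(
    d >=  0.20 ~ "Great Decision",
    d >=  0.00 ~ "Good Decision",
    d <= -0.35 ~ "Terrible Decision",
    d <= -0.25 ~ "Bad Decision",
    TRUE       ~ "Neutral"  # falls between the Good and Bad cutoffs
  )
}

classify_decision(shooter_ep = 1.10, m_ep = 1.40)  # "Bad Decision"
```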
We also introduced a Heliocentric Value Score, measuring the net difference between a player's shot and a given teammate benchmark (average, top-2, or max). This lets us rank players based on their decision-making efficiency, not just their raw output or usage rate.
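In code terms, the score is just the mean net EP difference per player across their attempts. A sketch building on the hypothetical `decision_metrics` table above, with a `player` column assumed to identify the shooter:

```r
library(dplyr)

heliocentric_leaderboard <- decision_metrics %>%
  group_by(player) %>%
  summarise(
    attempts = n(),
    hvs_max  = mean(shooter_ep - max_mate_ep),   # vs. best option
    hvs_top2 = mean(shooter_ep - top2_mate_ep),  # vs. top-two average
    hvs_avg  = mean(shooter_ep - avg_mate_ep),   # vs. all four teammates
    .groups  = "drop"
  ) %>%
  arrange(desc(hvs_max))
```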
Depending on your lens (average teammate EP, top-2, or max), different players rise or fall in the rankings.
To make the data even more actionable, we've created interactive Heliocentric Leaderboards that let you sort players by context, usage, and decision quality across these benchmarks.
What we found was a wide range of tendencies. Some players consistently add value by taking high-EP shots, leveraging spacing, or manipulating defenders to create their own look. Others routinely fire over better options, missing chances to optimize the possession.
Whether you're a coach developing smarter habits, a front office evaluating fit and decision-making under pressure, or a scout looking for unselfish creators, this model gives you a new way to track not just who shoots — but how well they process the floor in real time.
The beauty of this model is how much you can get out of so little. With just three inputs — shot distance, court zone (based on x/y location), and closest defender distance — we can break down any offensive possession and estimate expected points (EP) for every offensive player on the floor, not just the one who took the shot.
As shown in the graphic above, each offensive player gets two EP values. The top number comes from the categorical model, which groups defender distance into ranges like “tight” or “open” to reflect how players feel a contest. The bottom number comes from the continuous model, which uses the exact defender distance to produce more detailed predictions. This two-model setup gives us flexibility. The categorical model is great for coaching and labeling behaviors — like tagging when a player takes a contested jumper. The continuous model is better suited for regression analysis or tracking small trends over time.
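Scoring a frame with both models might look like the following, reusing the hypothetical models and bin definitions from the training sketch (in practice the categorical design matrix must carry the same factor levels as training):

```r
# Continuous model: exact defender distance.
X_new <- as.matrix(players[, c("loc_x", "loc_y", "shot_dist", "def_dist")])
players$ep_continuous <- predict(ep_cont, X_new)

# Categorical model: same contest bins used in training.
players$def_bin <- cut(players$def_dist, breaks = c(0, 2, 4, 6, Inf),
                       labels = c("0-2ft", "2-4ft", "4-6ft", "6ft+"),
                       include.lowest = TRUE)
X_new_cat <- model.matrix(~ loc_x + loc_y + shot_dist + def_bin,
                          data = players)[, -1]
players$ep_categorical <- predict(ep_cat, X_new_cat)
```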
With just player and defender locations, we can spot poor shot decisions by comparing the shooter's EP to their teammates', flag possessions where better options were missed, or even diagnose situations where nobody had a good look — which tells us it might be a play design issue, not just a player mistake. It also helps us measure gravity — how a player's movement or presence can create higher-value chances for others, even if they never touch the ball.
One of the most useful parts? We don't need to know whether the shot actually went in. That means we can use this tool live or retroactively, and it scales easily — whether you're looking at NBA tracking data, Synergy clips from college, or even international leagues with basic player mapping.
This model can help players make smarter decisions, especially primary ball handlers. You can show them clips where they took tough shots while a teammate had a better look — backed by numbers, not just coach's intuition. It's also a great tool to reinforce ball movement by highlighting when high-EP teammates were open but ignored. On the flip side, it can show off-ball players the value they bring through cutting, spacing, or relocating into high-EP areas — giving them clearer development goals. And by comparing actual shot outcomes to predicted EP, coaches can identify players who are getting unlucky versus those who may be forcing low-value looks.
For coaches, this model becomes a playbook audit tool. You can use it postgame to pinpoint inefficient possessions — like when someone passed up an open three or forced a tough midrange. You can also evaluate how well your offensive sets create multiple good options, and where the breakdowns happen. On the defensive side, you can identify matchups where opponents are consistently getting good looks — even if the box score doesn't show it yet. This kind of insight adds depth to scouting reports and game prep.
From a team-building perspective, the model adds context to high-usage players. Are they actually creating value, or just taking a lot of shots? It also helps spot undervalued players — guys who regularly get into high-EP areas but don't see the ball, which could point to untapped upside. It's a useful tool for lineup evaluation too — showing how different combinations change shot quality and spacing. Over time, you can even build player archetypes based on EP profiles — like “gravity shooters,” “connectors,” or “ball-stoppers” — which can help shape your roster strategy.
For scouting, you can build EP-based shot maps to see who forces tough looks and who consistently passes up good ones. It also reveals defensive tendencies — like whether a team over-helps and leaves shooters open, or plays conservative and forces isolations. If you're preparing for a playoff series or evaluating a potential free agent, this gives you a deeper layer beyond highlights and counting stats.
What makes this so scalable is its simplicity. You don't need proprietary tracking data or years of historical stats. All you need is player and defender location, shot distance, and court zone. That opens the door to use it in a ton of settings — NBA teams with optical tracking, college programs with Synergy tagging, overseas teams combining video and player charts, even grassroots programs using AI video tools. Whether you're trying to build smarter scouting reports, train players to make better decisions, or evaluate how guys fit into your system, this model gives you a flexible, customizable lens into the decision-making process behind every possession.
Now that we've built a solid foundation for evaluating offensive decisions, there's a lot of potential to expand this framework into other areas — especially on the defensive end, in scouting prep, and in how we think about playmaking.
One natural next step is creating a Defensive Shot Quality Evaluation, or what you might call a “Luck Index.” The idea is simple: compare the expected points (EP) for each shot — based on location and contest — to what actually happened. If a team keeps allowing high-EP shots that are just missing, they're probably not defending as well as it looks. On the flip side, if opponents are hitting tough, low-EP shots, that could be bad luck or just elite shot-making. Either way, it gives us a better way to evaluate defensive performance without relying solely on whether the ball went in. It's especially useful for spotting teams that might be due for regression — or those that are quietly elite but getting unlucky.
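A sketch of what that aggregation could look like, assuming an `opponent_shots` table with the defending team, the model's EP, and actual points for each shot allowed (all hypothetical names):

```r
library(dplyr)

luck_index <- opponent_shots %>%
  group_by(def_team) %>%
  summarise(
    n_shots     = n(),
    exp_allowed = sum(ep),      # what the looks conceded were "worth"
    act_allowed = sum(points),  # what opponents actually scored
    luck_per100 = 100 * (exp_allowed - act_allowed) / n_shots,
    .groups = "drop"
  ) %>%
  arrange(desc(luck_per100))  # high values: results better than looks allowed
```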
Another area to explore is Over-Help and Rotation Tendencies. With this model, we can track when defenders leave their man to provide help and what kind of looks that creates elsewhere. Are they giving up high-EP opportunities on the weak side? Are certain players drawing multiple defenders without even shooting? We can dig into that. It's a great way to spot which offensive actions trigger breakdowns — whether it's isolation drives, post touches, or ball screens — and which defenders or team schemes are vulnerable. You could build heatmaps showing where help is coming from, or use expected closeout times to show when rotations are too slow to matter. It's the kind of stuff that doesn't show up in a box score but completely changes a possession.
We can also take the model one step further by building an Assist Opportunity Model. This shifts the focus from just “did the player pass?” to “should they have passed?” On every possession, we can identify teammates who were open in high-EP spots, even if the ball never got to them. That lets us track missed assists — not just in terms of stats left on the table, but in terms of decision-making. Are certain players drawing attention but failing to find open teammates? Are others consistently making the right read, even if the shot doesn't fall? This kind of insight helps reframe how we define playmaking — not just based on assists, but on the quality of decisions. And it's perfect for film breakdowns and development — showing players when they had a high-value pass available and what they missed.
Together, these ideas unlock a much broader view of the game. You could build a full analytics suite around them — defensive dashboards, playmaker recognition tools, rotation breakdown visualizations — whatever fits your team's needs. And if you want to take it further, they can all be built into radar charts, interactive visuals, or scouting dashboards to help coaches, players, and execs make smarter decisions across the board.
Building on the foundation of expected points modeling, we've introduced a new feature called the Shot Maker Difficulty Value. This metric captures the added value a player provides on a shot by comparing the actual outcome to the expected points (EP) based on shot quality. Specifically, it multiplies the result of the shot (1 for make, 0 for miss) by the point value of the shot (2 or 3), then subtracts the predicted EP. The result quantifies how much a shooter added (or left on the table) relative to expectations — essentially isolating the shot-making talent above what the model predicted. A positive value means the player hit a tough shot or added value through pure skill; a negative one reflects a missed opportunity or poor execution relative to a makeable look. This feature gives us a more refined way to evaluate players' finishing ability and can be a powerful lens for identifying high-level scorers, analyzing performance under pressure, or evaluating consistency across different shot types and game contexts.
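In code, the metric is a one-liner per shot that aggregates naturally per player. A sketch, assuming `made` (1 for make, 0 for miss), `shot_value` (2 or 3), and the model's `ep` column:

```r
library(dplyr)

shots <- shots %>%
  mutate(smdv = made * shot_value - ep)  # actual points minus expected points

shot_makers <- shots %>%
  group_by(player) %>%
  summarise(attempts = n(), smdv_per_shot = mean(smdv), .groups = "drop") %>%
  arrange(desc(smdv_per_shot))
```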