If you’ve ever listened to basketball analysts, you’ve likely heard the phrase, “defense wins games.” But how true is that? At Data Explorers, LLC, we put this claim to the test using real NBA data from games played between 2019 and 2022. Here's what we discovered.
Scoring the Game: Feature Engineering
We used feature engineering to create an offensive score and a defensive score for each team in each game. The scoring algorithm compared each of a game’s individual stats (like rebounds or steals) to that stat’s average across the dataset. For example, if a team’s rebounds in Game X exceeded the dataset average, that stat earned +1; a stat that fell below the average earned -1. The scores across all relevant stats were then summed to give the team’s overall offensive or defensive score for that game.
For instance, if a team's defensive rebounds, steals, and blocks earned scores of +1, +1, and -1 respectively, the defensive score for that game would be +1 (1 + 1 - 1 = 1).
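The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the stat names, the sample numbers, and the choice to score an exact tie with the average as 0 are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical per-game defensive stats; names and values are made up for illustration.
games = [
    {"def_rebounds": 38, "steals": 9, "blocks": 3},
    {"def_rebounds": 30, "steals": 6, "blocks": 6},
    {"def_rebounds": 34, "steals": 7, "blocks": 5},
]
DEFENSIVE_STATS = ["def_rebounds", "steals", "blocks"]


def stat_score(value, average):
    """Return +1 above the dataset average and -1 below it.

    Scoring an exact tie as 0 is an assumption; the article does not say.
    """
    if value > average:
        return 1
    if value < average:
        return -1
    return 0


def effort_score(game, averages, stats):
    """Sum the per-stat scores into one effort score for the game."""
    return sum(stat_score(game[s], averages[s]) for s in stats)


# Dataset-wide average for each stat.
averages = {s: mean(g[s] for g in games) for s in DEFENSIVE_STATS}

defensive_scores = [effort_score(g, averages, DEFENSIVE_STATS) for g in games]
print(defensive_scores)
```

The first hypothetical game mirrors the worked example: rebounds and steals are above average, blocks are below, so its defensive score is 1 + 1 - 1 = 1.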
Insights from the Data
Once scores were calculated, we analyzed how they related to game outcomes. Not every "good" score led to a win, and not every "bad" score led to a loss, but the data showed a positive linear relationship between win percentage and both offensive and defensive effort.
Interestingly, offensive effort correlated more strongly with wins than defensive effort.
- Offensive score vs. outcome: Pearson correlation coefficient = 0.39 (moderate positive correlation).
- Defensive score vs. outcome: Pearson correlation coefficient = 0.29 (weak positive correlation).
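For readers who want to see how a coefficient like these is computed, here is a small sketch of the Pearson formula in plain Python. The effort scores and outcomes below are invented for illustration, and coding the outcome as 1 for a win and 0 for a loss is an assumption about how the outcome column was represented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical data: one offensive score and one outcome per game (1 = win, 0 = loss).
offensive_scores = [2, -1, 3, 0, -2, 1, -3, 2]
wins = [1, 0, 1, 1, 0, 0, 0, 1]

r = pearson(offensive_scores, wins)
print(f"Pearson r = {r:.2f}")
```

A value near +1 means higher scores almost always go with wins, a value near 0 means little linear relationship, and a value near -1 means higher scores go with losses.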
Hypothesis Testing: Impact on Points
To further quantify the impact of effort, we conducted a hypothesis test comparing point differentials (team score vs. opponent score) under different effort levels. The results showed:
- Teams with good offensive effort outscored opponents by an average of 6.5 points.
- Teams with good defensive effort outscored opponents by an average of 4.1 points.
- Conversely, teams with poor offensive effort were outscored by an average of 7.4 points, while teams with poor defensive effort were outscored by an average of 4.0 points.
In both the good-effort and poor-effort comparisons, offensive effort was associated with the larger point differential.
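A comparison like this can be illustrated with a permutation test on the difference in mean point differential between two groups. To be clear, this is a stand-in: we are not claiming it is the specific test behind the numbers above, and the point differentials, function name, and parameters below are hypothetical.

```python
import random


def average(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference (a minus b) and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = average(group_a) - average(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels and recompute the difference.
        rng.shuffle(pooled)
        diff = average(pooled[:n_a]) - average(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical point differentials (team score minus opponent score).
good_offense = [8, 5, 12, -2, 9, 4]
poor_offense = [-6, -10, 3, -8, -4, -9]

observed, p_value = permutation_test(good_offense, poor_offense)
print(f"difference in means = {observed:.1f}, p = {p_value:.4f}")
```

A small p-value says that a gap this large would rarely show up if effort level and point differential were unrelated.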
Predicting Game Outcomes with Machine Learning
Given the significance of offensive effort, we trained a machine learning model to predict game outcomes based on offensive stats. The model achieved a prediction accuracy of 71.7%. Curious about how it works? Try the app below to see it in action!
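If you would like a feel for what a model like this does under the hood, here is a toy sketch. It uses logistic regression trained by gradient descent, which is an assumption chosen for illustration and not necessarily the model behind the app. The two features (imagine standardized shooting and assist numbers), the data, and the hyperparameters are all made up.

```python
from math import exp


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def win_probability(features, weights, bias):
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_logistic(X, y, learning_rate=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in zip(X, y):
            error = win_probability(features, weights, bias) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(X, weights, bias):
    """Predict a win (1) when the estimated win probability is at least 0.5."""
    return [1 if win_probability(f, weights, bias) >= 0.5 else 0 for f in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical standardized offensive features per game; 1 = win, 0 = loss.
X_train = [[1.2, 0.5], [0.8, 1.0], [0.3, 0.9], [1.5, -0.2],
           [-1.0, -0.4], [-0.6, -1.1], [-1.3, 0.2], [-0.2, -0.9]]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]
X_test = [[1.0, 0.8], [-0.9, -0.7], [0.6, 0.4], [-1.1, -0.3]]
y_test = [1, 0, 1, 0]

weights, bias = train_logistic(X_train, y_train)
test_accuracy = accuracy(y_test, predict(X_test, weights, bias))
print(f"held-out accuracy: {test_accuracy:.0%}")
```

The toy data is far cleaner than real games, so the accuracy it prints says nothing about real-world performance. The key habit it demonstrates is measuring accuracy on games the model did not see during training.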