Much better. Although redAssists and blueAssists are still fairly strongly correlated with blueDeaths and blueKills, we keep these features in the data frame because the correlation coefficients are not too high and the effect of assists on the match outcome remains important to our overall analysis.


For the modeling process, we iterated through a series of GridSearches to fine-tune the hyperparameters of each model while addressing any under- or overfitting that occurred. Let’s now look at the results of our logistic regression model:

Not bad for logistic regression! We have a macro recall score of 0.7215 on the training data and 0.7275 on the test data, which means that our model correctly predicts 72.75% of the actual wins and losses. The training and test scores are close, so there is no sign of under- or overfitting.
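To make the score concrete: assuming the metric reported here is macro-averaged recall (the unweighted mean of the recall for wins and the recall for losses), it can be sketched in plain Python as below. The labels in `y_true` and `y_pred` are made-up toy values, not the article's data, and `macro_recall` is a hypothetical helper name.

```python
def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (1 = win, 0 = loss)."""
    recalls = []
    for cls in sorted(set(y_true)):
        # How many matches truly belong to this class
        total = sum(1 for t in y_true if t == cls)
        # How many of those the model labeled correctly
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Toy labels for illustration only (not taken from the match data)
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
toy_score = macro_recall(y_true, y_pred)

# Scores quoted in the text; a small gap suggests no serious over- or underfitting
train_score, test_score = 0.7215, 0.7275
gap = train_score - test_score
print(f"toy macro recall: {toy_score:.3f}, train-test gap: {gap:+.4f}")
```

A large positive gap (training well above test) would point to overfitting, while low scores on both sets would point to underfitting.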

Let’s compare these results with the results of our GridSearched XGBoost Random Forest model:

Although we used several GridSearches to improve our recall scores, we can see that the scores are lower and that the test data still scores slightly below the training data. Nevertheless, since the score is similar to that of our logistic regression, we can now investigate how the feature importances of our XGBoost model compare with the feature coefficients of our logistic regression model.

Because the values shown are odds, we can interpret a one-standard-deviation increase in each of the features above as a corresponding percentage increase or decrease in the odds of winning.
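As a rough sketch of that interpretation: if the features were standardized before fitting (an assumption here), a logistic regression coefficient converts to a percentage change in odds through the exponential function. The coefficients below are hypothetical placeholders chosen for illustration, not the fitted values from this analysis.

```python
import math


def odds_percent_change(coef_per_sd):
    """Percent change in win odds for a +1 standard deviation increase."""
    return (math.exp(coef_per_sd) - 1.0) * 100.0


def coef_for_percent_change(percent):
    """Inverse: the coefficient that yields a given percent change in odds."""
    return math.log(1.0 + percent / 100.0)


# Hypothetical standardized coefficients (NOT the article's fitted values)
example_coefs = {"kills": 0.75, "dragons": 0.18, "deaths": -0.70}

for name, coef in example_coefs.items():
    print(f"{name}: {odds_percent_change(coef):+.1f}% odds per +1 SD")
```

Under this reading, an increase in odds of more than 100% corresponds to a coefficient above ln 2 (about 0.69), and negative coefficients shrink the odds.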

Interpretation of results

From the bar charts above, we can see that in both of our models, Kills, Deaths, and Assists ranked as the most important features for the outcome of the match. When we plot the number of kills against the win rate, we can see a clear correlation between the number of kills made by the 10-minute mark and the resulting win percentage. The standard deviation of kills is 3, so we can interpret an increase of 3 kills as raising the odds of winning by more than 100%.

Turning to objectives, we see that both models rate dragons as significantly more important than Heralds. Since the standard deviation of dragons is 0.48, we can interpret this roughly as one dragon kill resulting in a 20% increase in the odds of winning.

From this bar chart, it is easy to see that there is a difference of about 22% in win percentage depending on whether or not the team secures a dragon during the first 10 minutes.
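A comparison like this comes down to grouping matches by whether a dragon was taken and averaging the win flag within each group. Here is a minimal sketch in plain Python; the rows are made-up toy values, and the `(dragons, win)` layout is an assumption about how the match data might be arranged.

```python
from collections import defaultdict


def win_rate_by_group(rows):
    """Map each group key to the share of matches won in that group."""
    wins = defaultdict(int)
    games = defaultdict(int)
    for group, won in rows:
        games[group] += 1
        wins[group] += won
    return {group: wins[group] / games[group] for group in games}


# Toy rows of (dragons taken by 10 minutes, 1 if the team won else 0)
toy_rows = [(1, 1), (1, 1), (1, 0), (0, 1), (0, 0), (0, 0)]
rates = win_rate_by_group(toy_rows)

# Difference in win rate between teams with and without a dragon
difference = rates[1] - rates[0]
print(f"with dragon: {rates[1]:.2f}, without: {rates[0]:.2f}, diff: {difference:.2f}")
```

The same helper works for any grouping key, for example the number of kills at 10 minutes.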

And last but not least, let’s look at the average CS for wins and losses:

As any casual or competitive League of Legends player should know without doing a thorough data analysis like ours, CS is an important part of the game. Our feature importances and odds show that minion kills definitely have a big impact on the outcome of the match, but our data also show that wins and losses differ by only about 10 minions killed at 10 minutes. Split across the three lanes, that is just a little over 3 minions per lane in 10 minutes. Given that a total of 107 minions per lane spawn during the first 10 minutes of the match, 3 minions per lane does not appear to be a large difference. This is further reinforced by the fact that one standard deviation in minions killed was close to 22 minions, meaning that a 22-minion increase in CS would raise the odds of winning by about 25%. Therefore, while our models rate CS as very important in predicting the result, this feature has a lot of variation, so it is hard to tell exactly how much the odds increase when we add a certain number of minion kills.
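The arithmetic behind this paragraph can be laid out explicitly. The sketch below uses only the figures quoted above; the final step assumes the logistic regression's effect is linear in log-odds, so that a fraction of a standard deviation scales the odds multiplier as a power. That last number is an illustration under this assumption, not a result reported by the models.

```python
# Figures quoted in the text
cs_gap_total = 10        # average CS difference between wins and losses
lanes = 3
minions_per_lane = 107   # minions spawned per lane in the first 10 minutes
cs_std = 22              # one standard deviation of minions killed
odds_gain_per_sd = 0.25  # about +25% odds per +1 SD of CS

# Gap per lane and its share of the minions available in that lane
gap_per_lane = cs_gap_total / lanes
share_of_lane = gap_per_lane / minions_per_lane

# The observed gap expressed in standard deviations
gap_in_sd = cs_gap_total / cs_std

# Assumption: log-odds are linear in CS, so odds scale as a power of the SD fraction
odds_multiplier = (1 + odds_gain_per_sd) ** gap_in_sd
odds_gain_for_gap = (odds_multiplier - 1) * 100

print(f"gap per lane: {gap_per_lane:.2f} minions ({share_of_lane:.1%} of the lane)")
print(f"gap in SDs: {gap_in_sd:.2f}, implied odds gain: {odds_gain_for_gap:.1f}%")
```

Seen this way, the typical 10-minion gap is less than half a standard deviation, which is why the per-minion effect is hard to pin down.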

It’s important to note that neither model reaches more than about 73 percent accuracy, which is still far from a perfect predictor of the outcome of a match. This means that we cannot conclude with certainty that the findings above hold.

Our results indicate that a similar analysis could be applied to data collected from professional competitions, and it is likely that additional information about specific teams and their players would lead to a better-performing predictive classifier.

Nonetheless, our logistic regression and XGBoost Random Forest models provided practical insights for all League of Legends players looking to climb the ranks. Using the coefficient values of our logistic regression model, we can see approximately how much the odds of winning change as we increase each feature.

While the reason I’m stuck in silver may be solely due to my poor mechanics, I believe that my improved understanding of the game’s elements will at least help me better understand the outcomes of matches.

