Statistical Analysis of Premier League Match Statistics Using a Regression Analysis in R
Location
Memorial Ballroom, Hall Campus Center
Access Type
Open Access
Entry Number
19
Start Date
4-7-2021 12:00 PM
End Date
4-7-2021 1:15 PM
Department
Statistics
Abstract
This thesis analyzes the correlation between a team’s match statistics and the success of its performances, and develops a predictive model that can be used to forecast that team’s final season results. Data from the 2017-2018 Premier League season are gathered and broken down in R to highlight which factors and variables contribute most to the success or downfall of a team. A multiple linear regression model is then used to remove any factors that are believed to have little or no significance in match results.
The model’s predictions of the 2017-2018 season results proved satisfactory. The model achieved near-perfect accuracy and allowed for a correct prediction of the table standings. This made it possible to carry out the next step in the experiment: analyzing and comparing the findings with recent seasons affected by Covid-19. Comparing a season that was not affected with a season that was fully affected offers an opportunity to see what has changed in the game we see today.
Faculty Mentor(s)
Dr. Leslie Hatfield, Dr. Mark Ledbetter
Rights Statement
The right to download or print any portion of this material is granted by the copyright owner only for personal or educational use. The author/creator retains all proprietary rights, including copyright ownership. Any editing, other reproduction or other use of this material by any means requires the express written permission of the copyright owner. Except as provided above, or for any other use that is allowed by fair use (Title 17, §107 U.S.C.), you may not reproduce, republish, post, transmit or distribute any material from this web site in any physical or digital form without the permission of the copyright owner of the material.