All-Time-Premier-League-Player-Statistics (Q6036645)

From MaRDI portal
OpenML dataset with id 43548
Language: English
Label: All-Time-Premier-League-Player-Statistics
Description: OpenML dataset with id 43548

    Statements

    Context
I am a huge football fan, and the Premier League is one of my favourite football (or soccer, whichever you prefer to call it) leagues. So, for my very first dataset, I thought it would be a great opportunity to compile player statistics from all Premier League seasons.
The Premier League, often referred to as the English Premier League or the EPL outside England, is the top level of the English football league system. Contested by 20 clubs, it operates on a system of promotion and relegation with the English Football League (EFL).
Home to some of the most famous clubs, players, managers and stadiums in world football, the Premier League is the most-watched league on the planet, with one billion homes watching the action in 188 countries. The league takes place between August and May and involves the teams playing each other home and away across the season, a total of 380 matches.
Three points are awarded for a win, one point for a draw and none for a defeat, with the team with the most points at the end of the season winning the Premier League title. The teams that finish in the bottom three of the league table at the end of the campaign are relegated to the Championship, the second tier of English football. Those teams are replaced by three clubs promoted from the Championship: the sides that finish in first and second place, and a third decided by the end-of-season playoffs.
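The fixture count and points rule described above can be checked with a short calculation. This is only a minimal sketch: the function names and the example win/draw/loss record are illustrative assumptions and do not come from the dataset.

```python
def matches_per_season(num_clubs: int) -> int:
    """Each club hosts every other club once, so each pairing is played home and away."""
    return num_clubs * (num_clubs - 1)


def league_points(wins: int, draws: int, losses: int) -> int:
    """Three points for a win, one for a draw, none for a defeat."""
    return 3 * wins + 1 * draws + 0 * losses


# 20 clubs, each hosting the other 19
print(matches_per_season(20))    # 380

# Hypothetical 38-match record: 25 wins, 8 draws, 5 defeats
print(league_points(25, 8, 5))   # 83
```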
Details about the dataset

Players in certain positions may lack certain statistics. For example, a goalkeeper may not have a value for "Shot Accuracy".
The filename format is: dataset - yyyy-mm-dd
(The date is the day on which the file was last updated.)

Content
The data was acquired from:
https://www.premierleague.com/
I wrote a BeautifulSoup4 web scraper in Python 3 that automatically outputs a CSV file of all the player statistics. A run takes about 20 minutes, though this varies with the bandwidth of the Internet connection. I wrote the program so that the dataset could be updated weekly: the statistics change after each match a player plays, so keeping the results up to date requires such a program. Planning the project took 2 days, writing the program in Python 3 took 7 days, and testing and bug fixing took another 5 days, so the project was completed in a span of 2 weeks.
Acknowledgements
Source credits: https://www.premierleague.com/
Image credits: https://rb.gy/wuiwth
Inspiration
How do variables like age, nationality and club affect player performance?
Known issues in the dataset

Goals per match shows an abnormally high value for a few players, because the HTML displays an incorrect value during the first few milliseconds of page loading. I am trying to fix this analytically rather than by scraping the value directly from the website.
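One possible reading of fixing the value "analytically" is to recompute goals per match from the raw goal and appearance counts instead of trusting the scraped ratio. The sketch below illustrates that idea and also treats blank statistics (such as a goalkeeper's shot accuracy) as missing. The column names and sample rows are hypothetical; the real CSV headers may differ, and this is not necessarily the author's method.

```python
import csv
import io

# Hypothetical column names and made-up rows; the real CSV may differ.
SAMPLE_CSV = """Name,Position,Goals,Appearances,Goals per match,Shot accuracy
Player A,Forward,30,60,50.0,45%
Player B,Goalkeeper,0,100,,
Player C,Midfielder,5,0,,
"""


def to_int(value):
    """Return an int, or None when the statistic is blank."""
    value = (value or "").strip()
    return int(value) if value else None


def goals_per_match(goals, appearances):
    """Derive the ratio from raw counts; None when it cannot be computed."""
    if goals is None or not appearances:
        return None
    return round(goals / appearances, 2)


def load_players(text):
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        goals = to_int(row["Goals"])
        appearances = to_int(row["Appearances"])
        # Overwrite the scraped ratio with one derived from the counts.
        row["Goals per match"] = goals_per_match(goals, appearances)
        # Keep blank position-specific statistics as None rather than "".
        row["Shot accuracy"] = row["Shot accuracy"] or None
        rows.append(row)
    return rows


players = load_players(SAMPLE_CSV)
for player in players:
    print(player["Name"], player["Goals per match"], player["Shot accuracy"])
```

In this sample, the implausible scraped value of 50.0 for Player A is replaced by 30 / 60 = 0.5, and a player with zero appearances gets no ratio at all rather than a division error.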
    24-09-2020
    23 March 2022
    e9d581d396966abb04d3ab7614999968
    0
    59
    571
    10,224
    52