Story
scrape_crashes.py: web-scraping game logs from EtherCrash
This script is a Python web scraper that extracts data from the EtherCrash game logs. It uses the nodriver library to automate fetching game pages from the EtherCrash website. For each game it records the crash value, the timestamp, the player count, and the sum of all players' bets in Ethos.
To avoid detection, the script tries to make its traffic look more like that of a human visitor. It randomizes the header, waits a random amount of time between requests, occasionally fetches other pages, and shuffles the order of game IDs in fixed-size chunks.
The script takes two arguments: the start ID and the end ID. It works through the game logs in decreasing order of IDs, starting at the start ID and stopping before the end ID, so the end ID itself is not scraped. An optional chunk size argument sets how many consecutive game IDs are shuffled together, so that requests within a chunk are not issued in strict sequence.
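The chunked shuffling described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name shuffled_ids and its parameters are hypothetical, and it assumes the end ID is excluded, as stated above.

```python
import random


def shuffled_ids(start_id, end_id, chunk_size, rng=None):
    """Yield game IDs from start_id down to end_id + 1.

    IDs are split into consecutive chunks of chunk_size and shuffled
    within each chunk, so the overall direction stays decreasing while
    the local order is randomized. (Hypothetical helper, not taken from
    scrape_crashes.py.)
    """
    rng = rng or random.Random()
    ids = range(start_id, end_id, -1)  # decreasing; end_id is excluded
    for offset in range(0, len(ids), chunk_size):
        chunk = list(ids[offset:offset + chunk_size])
        rng.shuffle(chunk)  # randomize order inside this chunk only
        yield from chunk


if __name__ == "__main__":
    # Example: IDs 10 down to 1 in chunks of 4.
    print(list(shuffled_ids(10, 0, 4)))
```

A smaller chunk size keeps the request order close to sequential, while a larger one spreads requests across a wider range of IDs.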
The script writes the scraped data to standard output in CSV format. A header argument adds a header row to the output, and a verbose argument prints progress information during scraping.
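The command-line interface and CSV output might look roughly like the sketch below. The flag names (--chunk-size, --header, --verbose), the default chunk size, and the column names are assumptions made for illustration; the real script may name them differently.

```python
import argparse
import csv
import sys

# Hypothetical column names, following the fields listed in the description.
FIELDS = ["game_id", "crash", "timestamp", "players", "total_bet"]


def build_parser():
    """Build an argument parser for the options described above (sketch)."""
    parser = argparse.ArgumentParser(description="Scrape game logs to CSV.")
    parser.add_argument("start_id", type=int, help="first game ID to scrape")
    parser.add_argument("end_id", type=int, help="stop before this game ID")
    parser.add_argument("--chunk-size", type=int, default=10,
                        help="number of IDs shuffled together")
    parser.add_argument("--header", action="store_true",
                        help="write a CSV header row first")
    parser.add_argument("--verbose", action="store_true",
                        help="show progress while scraping")
    return parser


def write_rows(rows, out=None, header=False):
    """Write rows as CSV to out (standard output by default)."""
    out = out if out is not None else sys.stdout
    writer = csv.writer(out, lineterminator="\n")
    if header:
        writer.writerow(FIELDS)
    writer.writerows(rows)


if __name__ == "__main__":
    args = build_parser().parse_args()
    # No scraping here: an empty row list only demonstrates the output path.
    write_rows([], header=args.header)
```

Because the data goes to standard output, it can be redirected into a file and passed on to the analysis script.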
Overall, this script is aimed at anyone who wants to analyze EtherCrash game logs. It automates data extraction and includes measures to avoid detection, so large ranges of game IDs can be scraped without manual work.
infinite_money.py: compute an optimal game strategy from the CSV game logs
This script is a data analysis tool that looks for the optimal cashout value based on the available EtherCrash game logs. It uses a binary search algorithm to make that search efficient, and it visualizes the data with the matplotlib library.
The script reads game logs from a CSV file and creates a sequence of Game objects, each representing a single game. The script then creates a sequence of Cashout objects, each representing a possible cashout value. The script calculates the actual win probability and the difference between the actual win probability and the required win probability for each Cashout object. The script also calculates the gain for each Cashout object, which is the difference multiplied by the value of the cashout.
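The per-cashout calculation can be sketched as below. Several points are assumptions rather than facts about infinite_money.py: a bet is taken to win when the crash value is at least the cashout value, the required win probability is taken to be the break-even value 1 / cashout, and binary search (via bisect) is used here only to count winning games in a sorted list, which is one plausible use of it. The function names are hypothetical.

```python
from bisect import bisect_left


def win_probability(sorted_crashes, cashout):
    """Fraction of games whose crash value reached the cashout.

    sorted_crashes must be sorted in ascending order; binary search
    finds the first game that would have paid out.
    """
    first_win = bisect_left(sorted_crashes, cashout)
    return (len(sorted_crashes) - first_win) / len(sorted_crashes)


def evaluate(crashes, cashout):
    """Return (actual, required, diff, gain) for one cashout value."""
    sorted_crashes = sorted(crashes)
    actual = win_probability(sorted_crashes, cashout)
    required = 1.0 / cashout  # assumption: break-even win probability
    diff = actual - required
    gain = diff * cashout  # the difference multiplied by the cashout value
    return actual, required, diff, gain


if __name__ == "__main__":
    sample = [1.0, 1.5, 2.0, 3.0]  # made-up crash values for illustration
    for cashout in (1.5, 2.0):
        print(cashout, evaluate(sample, cashout))
```

Under these assumptions the gain equals actual * cashout - 1, which is the expected net return per unit bet, so a positive gain would mean the logged games favored that cashout value.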
The script uses several visualization functions to display the data. The plot_diff function plots the difference between the actual win probability and the required win probability for each Cashout object. The plot_gain function plots the gain for each Cashout object. The scatter_by function creates a scatter plot of the data, with the x-axis representing a specified attribute of the Game objects and the y-axis representing another specified attribute.
The script also includes several other functions for creating leaderboards, calculating the median and mean of a sequence of numbers, and plotting box plots and bar charts.
Overall, this script is aimed at anyone who wants to analyze EtherCrash game logs and find the optimal cashout value. The binary search keeps the search for that value efficient, and the plots help reveal patterns in the data.