This is the read-me for the POPRES analysis. Everything is written in Python - it relies on packages such as numpy, scipy and matplotlib. Make sure to install all of them (for instance via PIP)

There are several classes/files dividing tasks among them:

Main: This is the file where everything comes together; and it provides the text-interface for the user. The user inputs integers to navigate trough the menues.

Load-Data: This is the file which loads the relevant files and does some preliminary tasks; like calculating the distance Matrices.

Analysis: This is a class which actually does most of the inference tasks; or where the MLE-scheme classes are called from. It also contains methods for statistical analysis of the MLE-results, like the bootstrap.

This class also includes the formulas for the fit in the mle model. For inference it creates the mle_estim_error object and passes it the formula and the according starting values.

mle_estim_error: This is the important class which actually does the MLE-estimation. As the name says; it includes the errors, and the error-formulas are written out there. The class inherits from "general_likelihood_model", where the likelihood per observation and underlying model is overwritten. Thus one has a frame-work do to standard-mle tasks. This class also contains the parameters for the mle-analysis, for example the underlying binning. It also has the very important formulas for the error model in it.

var_plots: Stand-alone class for producing various 'nice' plots.

###############################

There are several files needed for the program to run:

ibd-blocklens.csv File containing all block-sharing data: The ids of both individual plus the length of the shared block as inferred by beagle

ibd-pop-info.csv File where every individual is attributed to a country.

country_centroids.csv File with the GPS-position of every included country. Here the weighted population center of populations was used where known; otherwise the coordinates of the biggest city.

The location of these files must be specified in main.py in order for the scheme to run correctly.

##################################

User-Manual:

Make sure that all the above described files are in place and right form; and the main.py knows where to find them. Then the following things should be done in order:

1) Extract data: This extracts the blocks and loads them; this can take a while

2) You can Save/Load the extracted data. This is done to avoid overhead computation every time. Pickle is used for the saving.

3) After extracting the date resp. loading it from pickle, one can go to the "Analyze data" menu.

4) In the menue, one can do several tasks. Most important is the MLE-estimation scheme. For this one can go to the MLE-error submenu:

There one first has to choose the scenario to fit via "Choose MLE-model". Then a single run can be done via "Run Fit". This will yield the MLE-parameters and some statistics. A summary graphical fit can then be done via "Log-Likelihood surface"; or the residual plot done vie "Analyze Residuals"

One can do bootstrap over different units; and summary statistics are printed out then. To visualize the bootstrap results; one can plot the Log-Likelihood surface.