Set up
======

To calculate fit libraries (using the Assembline sub-pipeline called efitter), you need a parameter file written in Python that specifies:

    * input structures to fit
    * EM map (or EM maps)
    * fitting parameters
    * optionally, settings for execution on a computer cluster

Two template parameter files are provided in the ``Assembline/doc/templates`` directory of the `Assembline <https://git.embl.de/kosinski/Assembline/-/tree/master/doc/templates>`__ git repository. The fitting parameters are explained below and in the two template files:
   
    * ``fit_params_template_cluster.py`` - for a computer cluster (Slurm-based cluster)

    * ``fit_params_template_multicore.py`` - for a workstation, preferably multicore


Example parameter file with explanations:

.. code-block:: python

    # Required imports
    from efitter import Map
    from string import Template


    method='chimera' # Fitting method, currently only 'chimera' supported
    dry_run = False # Dry run would only print commands it is going to run
    run_crashed_only = False # Only run the jobs that have not delivered output
    master_outdir = 'path_to_output_directory' # relative path to the output directory; it will be created in the running directory

    # For each experimental map, specify the path to the map, the density threshold at which the fitting should be performed, and the resolution of the map in Angstrom
    MAPS = [
        Map('/path_to_your_maps/map1.mrc', threshold=0.01, resolution=5),
        Map('/path_to_your_maps/map2.mrc', threshold=0.02, resolution=5),

    ]

    models_dir = '/path_to_your_structure(s)' # directory with the structure files

    # The structure files located in the above directory
    PDB_FILES = [
        'pdb_file_name1.pdb',
        'pdb_file_name2.pdb',
        'pdb_file_name3.pdb'
    ]

    CA_only = False # Calculate the fitting scores using Calpha atoms only?
    backbone_only = False # Calculate the fitting scores using backbone atoms only?
    move_to_center = True # Move the PDB structure to the center of the map prior to fitting?

    # Each element of fitmap_args is a dictionary specifying parameters for a run
    # If multiple dictionaries are specified,
    # the script will run a separate run for each dictionary for each map-structure combination.
    # E.g. if two maps, three structures, and two parameters dictionaries are specified,
    # the script will run 2 x 3 x 2 = 12 runs.
    fitmap_args = [
        # We suggest running a small test run with search 100 first
        {
            'template': Template("""
                map $map
                map_threshold $threshold
                fitmap_args resolution $resolution metric cam envelope true search 100 placement sr clusterAngle 3 clusterShift 3.0 radius 200 inside .30
                saveFiles False
                """),
            'config_prefix': 'test' # some name informative of the fitmap_args parameters
            },
        # Parameters for main runs (https://www.cgl.ucsf.edu/chimera/docs/UsersGuide/midas/fitmap.html)
        {
            'template': Template("""
                map $map
                map_threshold $threshold
                fitmap_args resolution $resolution metric cam envelope true search 100000 placement sr clusterAngle 3 clusterShift 3.0 radius 200 inside .30
                saveFiles False
                """),
            'config_prefix': 'search100000_metric_cam_inside0.3' # some name informative of the fitmap_args parameters
            },
        # Run with alternative "inside" parameter
        {
            'template': Template("""
                map $map
                map_threshold $threshold
                fitmap_args resolution $resolution metric cam envelope true search 100000 placement sr clusterAngle 3 clusterShift 3.0 radius 200 inside .60
                saveFiles False
                """),
            'config_prefix': 'search100000_metric_cam_inside0.6' # some name informative of the fitmap_args parameters
            }
    ]

    # If necessary, edit the template below, which defines the script that executes each individual fitting run

    run_script_templ = Template("""#!/bin/bash
    #
    echo $job_name
    $cmd &>$pdb_outdir/log&
    """)


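The run count described in the ``fitmap_args`` comments above follows from combining every map with every structure and every parameter dictionary. The sketch below only illustrates that arithmetic using the placeholder names from the example file. It is not efitter's own code, and the variable names ``maps`` and ``config_prefixes`` are assumptions made for illustration.

.. code-block:: python

    from itertools import product

    # Placeholder names copied from the example parameter file above
    maps = ['map1.mrc', 'map2.mrc']
    pdb_files = ['pdb_file_name1.pdb', 'pdb_file_name2.pdb', 'pdb_file_name3.pdb']
    config_prefixes = [
        'test',
        'search100000_metric_cam_inside0.3',
        'search100000_metric_cam_inside0.6',
    ]

    # One run per (map, structure, parameter set) combination
    runs = list(product(maps, pdb_files, config_prefixes))

    print(len(runs))  # 2 maps x 3 structures x 3 parameter sets = 18

Listing the combinations this way before launching can help estimate how many jobs a parameter file will generate, especially when ``search`` is set to a large value.
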
The provided template for a cluster includes an example configuration for the Slurm queuing engine.
If your cluster uses a different queuing engine, modify the respective parameters in ``fit_params_template_cluster.py``:

    .. code-block:: python
    
        cluster_submission_command = 'sbatch'
        run_script_templ = Template("""#!/bin/bash
        #
        #SBATCH --job-name=$job_name
        #SBATCH --time=1-00:00:00
        #SBATCH --error $pdb_outdir/log_err.txt
        #SBATCH --output $pdb_outdir/log_out.txt
        #SBATCH --mem=1000

        $cmd
        """)

.. note:: It is important that your edited template includes ``$job_name``, ``$pdb_outdir``, and ``$cmd``. They will be auto-filled by the program.
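
Before launching jobs with an edited template, it can be useful to confirm that the three placeholders are still present. The following is a minimal sketch that uses only the standard-library ``string.Template`` class. The helper ``placeholders``, the ``REQUIRED`` set, and the substituted values are hypothetical and serve only as illustration; how efitter itself fills the template is not shown here.

.. code-block:: python

    from string import Template

    # An edited run script template, shortened from the Slurm example above
    run_script_templ = Template("""#!/bin/bash
    #
    #SBATCH --job-name=$job_name
    #SBATCH --error $pdb_outdir/log_err.txt
    #SBATCH --output $pdb_outdir/log_out.txt

    $cmd
    """)

    REQUIRED = {'job_name', 'pdb_outdir', 'cmd'}


    def placeholders(templ):
        """Return the set of $name / ${name} identifiers used in a Template."""
        found = set()
        for match in templ.pattern.finditer(templ.template):
            name = match.group('named') or match.group('braced')
            if name:
                found.add(name)
        return found


    missing = REQUIRED - placeholders(run_script_templ)
    if missing:
        raise ValueError('Template lacks placeholders: %s' % sorted(missing))

    # Hypothetical values, only to preview what a filled-in script looks like
    script = run_script_templ.substitute(
        job_name='example_job',
        pdb_outdir='example_outdir',
        cmd='echo "fitting command goes here"',
    )
    print(script)
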