Set up

To calculate fit libraries with efitter (an Assembline sub-pipeline), you need a parameter file in Python format that specifies:

  • input structures to fit

  • EM map (or EM maps)

  • fitting parameters

  • optionally, options for execution on a computer cluster

Two template parameter files are provided in the doc/templates directory of the Assembline git repository, where they can be inspected. Explanations of the different fitting parameters are provided below and in the two templates:

  • fit_params_template_cluster.py - for a Slurm-based computer cluster

  • fit_params_template_multicore.py - for a workstation, preferably multicore

Example parameter file with explanations:

#Some import lines required
from efitter import Map
from string import Template


method='chimera' # Fitting method; currently only 'chimera' is supported
dry_run = False # If True, only print the commands that would be executed
run_crashed_only = False # Only run jobs that have not yet delivered output
master_outdir = 'path_to_output_directory' # Relative path to the output directory; it will be created in the running directory

# For experimental maps specify: the path to each map, the density threshold
# at which the fitting to the experimental map should be performed, and the
# resolution of the experimental map in Angstrom
MAPS = [
    Map('/path_to_your_maps/map1.mrc', threshold=0.01, resolution=5),
    Map('/path_to_your_maps/map2.mrc', threshold=0.02, resolution=5),

]

models_dir = '/path_to_your_structure(s)' # directory with the structure files

#the actual structure files in the above directory
PDB_FILES = [
    'pdb_file_name1.pdb',
    'pdb_file_name2.pdb',
    'pdb_file_name3.pdb'
]

CA_only = False # Calculate the fitting scores using Calpha atoms only?
backbone_only = False # Calculate the fitting scores using backbone atoms only?
move_to_center = True # Move the PDB structure to the center of the map prior to fitting?

# Each element of fitmap_args is a dictionary specifying the parameters for a run.
# If multiple dictionaries are specified, the script performs a separate run
# for each dictionary and each map-structure combination.
# E.g. with two maps, three structures, and two parameter dictionaries,
# the script performs 2 x 3 x 2 = 12 runs.
fitmap_args = [
    # We suggest running a small test with 'search 100' first
    {
        'template': Template("""
            map $map
            map_threshold $threshold
            fitmap_args resolution $resolution metric cam envelope true search 100 placement sr clusterAngle 3 clusterShift 3.0 radius 200 inside .30
            saveFiles False
            """),
        'config_prefix': 'test' # some name informative of the fitmap_args parameters
        },
    # Parameters for the main runs (https://www.cgl.ucsf.edu/chimera/docs/UsersGuide/midas/fitmap.html)
    {
        'template': Template("""
            map $map
            map_threshold $threshold
            fitmap_args resolution $resolution metric cam envelope true search 100000 placement sr clusterAngle 3 clusterShift 3.0 radius 200 inside .30
            saveFiles False
            """),
        'config_prefix': 'search100000_metric_cam_inside0.3' # some name informative of the fitmap_args parameters
        },
    # Run with alternative "inside" parameter
    {
        'template': Template("""
            map $map
            map_threshold $threshold
            fitmap_args resolution $resolution metric cam envelope true search 100000 placement sr clusterAngle 3 clusterShift 3.0 radius 200 inside .60
            saveFiles False
            """),
        'config_prefix': 'search100000_metric_cam_inside0.6' # some name informative of the fitmap_args parameters
        }
]

# If necessary, edit the template below for the script that will execute individual fitting runs

run_script_templ = Template("""#!/bin/bash
#
echo $job_name
$cmd &>$pdb_outdir/log&
""")

The provided cluster template includes an example configuration for the Slurm queuing engine. If your cluster uses a different queuing engine, modify the corresponding parameters in fit_params_template_cluster.py:

cluster_submission_command = 'sbatch'
run_script_templ = Template("""#!/bin/bash
#
#SBATCH --job-name=$job_name
#SBATCH --time=1-00:00:00
#SBATCH --error $pdb_outdir/log_err.txt
#SBATCH --output $pdb_outdir/log_out.txt
#SBATCH --mem=1000

$cmd
""")

Note

It is important that your edited template includes $job_name, $pdb_outdir, and $cmd. They will be auto-filled by the program.
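A quick way to verify this before launching jobs is to scan your edited template for its placeholders. The helper below is hypothetical (not part of Assembline); it reuses the regex that string.Template itself uses to recognize $name and ${name} placeholders:

```python
from string import Template

# Hypothetical helper: report which of the placeholders efitter auto-fills
# are missing from an edited run-script template.
REQUIRED = {'job_name', 'pdb_outdir', 'cmd'}

def missing_placeholders(templ: Template) -> set:
    # Template.pattern is the compiled regex Template uses internally;
    # its 'named' and 'braced' groups capture $name and ${name}.
    found = {m.group('named') or m.group('braced')
             for m in templ.pattern.finditer(templ.template)}
    found.discard(None)  # drop matches that were escapes ($$) etc.
    return REQUIRED - found

templ = Template("""#!/bin/bash
echo $job_name
$cmd &>$pdb_outdir/log&
""")
print(missing_placeholders(templ))  # empty set when all are present
```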