Set up
To calculate the fit libraries (using the Assembline sub-pipeline called efitter), you need a parameter file written in Python that specifies:
input structures to fit
EM map (or EM maps)
fitting parameters
optionally, options for execution on a computer cluster
Two template parameter files are provided in the Assembline/doc/templates directory of the Assembline git repository. The fitting parameters are explained below and in the two template parameter files:
fit_params_template_cluster.py - for a computer cluster (Slurm-based cluster)
fit_params_template_multicore.py - for a workstation, preferably multicore
Example parameter file with explanations:
#Some import lines required
from efitter import Map
from string import Template
method='chimera' # Fitting method, currently only 'chimera' supported
dry_run = False # Dry run would only print commands it is going to run
run_crashed_only = False # Only run the jobs that have not delivered output
master_outdir = 'path_to_output_directory' # relative path to the output, it will be created in the running directory
# For experimental maps specify: the paths to the maps, the density threshold at which the fitting to the experimental map should be performed, the resolution of the experimental map in Angstrom
MAPS = [
Map('/path_to_your_maps/map1.mrc', threshold=0.01, resolution=5),
Map('/path_to_your_maps/map2.mrc', threshold=0.02, resolution=5),
]
models_dir = '/path_to_your_structure(s)' # directory with the structure files
#the actual structure files in the above directory
PDB_FILES = [
'pdb_file_name1.pdb',
'pdb_file_name2.pdb',
'pdb_file_name3.pdb'
]
CA_only = False # Calculate the fitting scores using Calpha atoms only?
backbone_only = False # Calculate the fitting scores using backbone atoms only?
move_to_center = True # Move the PDB structure to the center of the map prior to fitting?
# Each element of fitmap_args is a dictionary specifying parameters for a run
# If multiple dictionaries are specified,
# the script will run a separate run for each dictionary for each map-structure combination.
# E.g. if two maps, three structures, and two parameters dictionaries are specified,
# the script will run 2 x 3 x 2 = 12 runs.
fitmap_args = [
# We suggest running a small test with search 100 first
{
'template': Template("""
map $map
map_threshold $threshold
fitmap_args resolution $resolution metric cam envelope true search 100 placement sr clusterAngle 3 clusterShift 3.0 radius 200 inside .30
saveFiles False
"""),
'config_prefix': 'test' # some name informative of the fitmap_args parameters
},
# Parameters for main runs (https://www.cgl.ucsf.edu/chimera/docs/UsersGuide/midas/fitmap.html)
{
'template': Template("""
map $map
map_threshold $threshold
fitmap_args resolution $resolution metric cam envelope true search 100000 placement sr clusterAngle 3 clusterShift 3.0 radius 200 inside .30
saveFiles False
"""),
'config_prefix': 'search100000_metric_cam_inside0.3' # some name informative of the fitmap_args parameters
},
# Run with alternative "inside" parameter
{
'template': Template("""
map $map
map_threshold $threshold
fitmap_args resolution $resolution metric cam envelope true search 100000 placement sr clusterAngle 3 clusterShift 3.0 radius 200 inside .60
saveFiles False
"""),
'config_prefix': 'search100000_metric_cam_inside0.6' # some name informative of the fitmap_args parameters
}
]
# If necessary, edit the below template of the script that will run individual fitting runs
run_script_templ = Template("""#!/bin/bash
#
echo $job_name
$cmd &>$pdb_outdir/log&
""")
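The run-count rule described in the comments on fitmap_args (maps x structures x parameter dictionaries) and the way the $map, $threshold and $resolution placeholders get filled can be illustrated with a minimal sketch. This is not efitter's actual implementation: the variable names maps, pdb_files and templates are hypothetical stand-ins for MAPS, PDB_FILES and fitmap_args, and the shortened templates are for illustration only.

```python
from itertools import product
from string import Template

# Hypothetical stand-ins for MAPS, PDB_FILES and fitmap_args (illustration only)
maps = [("map1.mrc", 0.01, 5), ("map2.mrc", 0.02, 5)]  # (path, threshold, resolution)
pdb_files = ["pdb_file_name1.pdb", "pdb_file_name2.pdb", "pdb_file_name3.pdb"]
templates = {
    "test": Template("map $map\nmap_threshold $threshold\n"
                     "fitmap_args resolution $resolution search 100\n"),
    "main": Template("map $map\nmap_threshold $threshold\n"
                     "fitmap_args resolution $resolution search 100000\n"),
}

runs = []
# One run per (map, structure, parameter set) combination
for (map_path, threshold, resolution), pdb, (prefix, templ) in product(
        maps, pdb_files, templates.items()):
    # Fill the placeholders of the template with the values for this map
    config = templ.substitute(map=map_path, threshold=threshold,
                              resolution=resolution)
    runs.append((prefix, map_path, pdb, config))

print(len(runs))  # 2 maps x 3 structures x 2 parameter sets = 12
```

Starting with only the small test dictionary keeps the number of runs low while you check that the paths and thresholds are correct.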
The provided template for a cluster includes an example configuration for the Slurm queuing engine.
If your cluster uses a different queuing engine, modify the respective parameters in fit_params_template_cluster.py:
cluster_submission_command = 'sbatch'
run_script_templ = Template("""#!/bin/bash
#
#SBATCH --job-name=$job_name
#SBATCH --time=1-00:00:00
#SBATCH --error $pdb_outdir/log_err.txt
#SBATCH --output $pdb_outdir/log_out.txt
#SBATCH --mem=1000
$cmd
""")
Note
It is important that your edited template includes $job_name, $pdb_outdir, and $cmd. They will be auto-filled by the program.
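Before launching jobs, it can help to check that an edited template still contains the three required placeholders. The following is a minimal sketch using only Python's standard string.Template; the helper missing_placeholders and the example values passed to substitute are hypothetical and not part of efitter.

```python
from string import Template

REQUIRED = {"job_name", "pdb_outdir", "cmd"}

def missing_placeholders(templ):
    """Return the required placeholder names that are absent from templ."""
    found = set()
    # Template.pattern is the compiled regex that string.Template uses
    # to locate $name and ${name} placeholders
    for match in templ.pattern.finditer(templ.template):
        name = match.group("named") or match.group("braced")
        if name:
            found.add(name)
    return REQUIRED - found

slurm_templ = Template("""#!/bin/bash
#
#SBATCH --job-name=$job_name
#SBATCH --error $pdb_outdir/log_err.txt
#SBATCH --output $pdb_outdir/log_out.txt
$cmd
""")

broken_templ = Template("echo $job_name\n$cmd\n")  # $pdb_outdir was dropped

print(sorted(missing_placeholders(slurm_templ)))   # []
print(sorted(missing_placeholders(broken_templ)))  # ['pdb_outdir']

# Example values are made up; the program supplies the real ones
script = slurm_templ.substitute(job_name="fit_map1_pdb1",
                                pdb_outdir="out/map1/pdb1",
                                cmd="echo fitting")
```

Note that string.Template treats every $ as the start of a placeholder, so a literal dollar sign in a custom run script has to be written as $$.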