JSON configuration file

Creating JSON file

Create XlinkAnalyzer project file for your complex
Note

XlinkAnalyzer is used here as a graphical interface for input preparation in Assembline.

It does not matter if you do not have crosslinks; we will use XlinkAnalyzer to prepare the input file for modeling.
1. Add all subunits using the XlinkAnalyzer graphical interface.
2. Assign a unique chain ID and color to every subunit.
3. Optionally, define domains within subunits. You can then refer to these domains in the Selectors.
4. Add sequences using the Setup panel in XlinkAnalyzer, and map the sequences to the names of subunits using the Map button.
5. Add crosslinks if available, and map the crosslinked protein names to the names of subunits using the Map button.
Make a copy of the XlinkAnalyzer project file. This copy will then be edited manually to add modeling directives. For example, copy XlinkAnalyzer_project.json as X_config.json.

Open X_config.json in a text editor

Note

The project file is in the so-called JSON format.

While it may look difficult to edit at first, it is actually quite manageable with a proper editor (and a bit of practice ;-)

We recommend using a good editor such as:

SublimeText

Atom

At this point, the JSON has the following format:
{
    "data": [
        {
            "some xlink definition 1"
        },
        {
            "some xlink definition 2"
        },
        {
            "sequence file definition"
        }
    ],
    "subunits": [
            "subunit definitions"
    ],
    "xlinkanalyzerVersion": "1.1.1"
}
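
Because the file is edited by hand from here on, a quick syntax check after each change can catch a missing comma or bracket early. The following is a minimal sketch using only Python's standard json module; the helper name check_json_text is hypothetical and is not part of Assembline or XlinkAnalyzer.

```python
import json


def check_json_text(text):
    """Return (True, sorted top-level keys) if text is a JSON object,
    otherwise (False, a short error message)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    if not isinstance(config, dict):
        return False, "top-level value is not a JSON object"
    return True, sorted(config)


# A skeleton with the same top-level blocks as the project file above.
example = """
{
    "data": [],
    "subunits": [],
    "xlinkanalyzerVersion": "1.1.1"
}
"""
ok, keys = check_json_text(example)
print(ok, keys)

# A trailing comma is a common hand-editing mistake and is not valid JSON.
broken = '{"data": [], "subunits": [],}'
ok_broken, message = check_json_text(broken)
print(ok_broken, message)
```

To check a real file, read its contents first (for example with open("X_config.json").read()) and pass the text to the helper.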

[Optionally] Define series

A series is a group of related subunit copies, e.g. related by symmetry.

In Assembline configuration, series are used to:

create multiple copies of the same subunit

define symmetry acting on the subunits within the series

define symmetry between two different series, if any

Series can be specified in the top-level series block, at the same level as data, subunits etc.:
{
    "series": [

        "series specifications go here"

    ],

    "data": [
        {
            "some xlink definition 1"
        },
        {
            "some xlink definition 2"
        },
        {
            "sequence file definition"
        }
    ],
    "subunits": [
            "subunit definitions"
    ],
    "xlinkanalyzerVersion": "1.1.1"
}
Specification of a single series:
{
    "name": "Name of the series",
    "subunit": "Subunit instance for this series",
    "mode": "[optional] 'auto' or 'input', default: 'input'
             a parameter used in imp_utils1 to define whether PMI copies ('input') or clones ('auto') should be created",
    "cell_count": "[optional] How many elements in the series, default: 1",
    "tr3d": "[optional] Name of the transformation relating subunits in this series, as defined in the symmetry config below",
    "inipos": "[optional] 'input' or name of a symmetrical transformation in config",
    "ref_serie": "[optional] Reference series. If inipos is a transformation, imp_utils1 will transform
                  each copy in this series relative to the ref_serie by that transformation",
    "states": "[optional] List of state indices this series should act on. Not sure if it's working."
}
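
The specification above can be turned into a small sanity check for hand-written series entries. This is only a sketch based on the keys and defaults listed above: the function name check_series_spec is hypothetical, it treats name and subunit as required because they are the only keys not marked optional, and it assumes cell_count must be a positive integer. Assembline itself may accept or reject other things.

```python
ALLOWED_MODES = {"auto", "input"}


def check_series_spec(spec):
    """Return a list of problems found in one series specification
    (an empty list means nothing suspicious was found)."""
    problems = []
    # "name" and "subunit" are the only keys not marked [optional].
    for key in ("name", "subunit"):
        if key not in spec:
            problems.append(f"missing key: {key}")
    # "mode" defaults to "input" when omitted.
    mode = spec.get("mode", "input")
    if mode not in ALLOWED_MODES:
        problems.append(f"unknown mode: {mode}")
    # "cell_count" defaults to 1; assumed here to be a positive integer.
    cell_count = spec.get("cell_count", 1)
    if (isinstance(cell_count, bool)
            or not isinstance(cell_count, int)
            or cell_count < 1):
        problems.append(f"cell_count must be a positive integer, got {cell_count!r}")
    return problems


good = {"name": "2fold", "subunit": "Elp1", "mode": "input", "cell_count": 2}
bad = {"name": "2fold", "mode": "copy", "cell_count": 0}
print(check_series_spec(good))
print(check_series_spec(bad))
```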
Example:
{
    "series": [
        {
            "name": "2fold",
            "subunit": "Elp1",
            "mode": "input",
            "cell_count": 2,
            "tr3d": "2fold",
            "inipos": "input"
        },
        {
            "name": "2fold",
            "subunit": "Elp2",
            "mode": "auto",
            "cell_count": 2,
            "tr3d": "2fold",
            "inipos": "input"
        },
        {
            "name": "2fold",
            "subunit": "Elp3",
            "mode": "auto",
            "cell_count": 2,
            "tr3d": "2fold",
            "inipos": "input"
        }
    ]
}
defines three series, one for each of the three subunits Elp1, Elp2, Elp3, two copies per subunit as defined by cell_count (six molecules total).

"mode": "input" and "mode": "auto" define how molecules are created behind the scenes. auto is typically used if you have a structure for one subunit copy and want the system to auto-generate the remaining copies as clones of the first. If unsure, keep input.

Users familiar with PMI: auto would build clones, input would build copies.

Here, we keep input for the Elp1 subunit, because one of the input PDB structures is a dimer comprising fragments of both copies of Elp1. We keep auto for Elp2 and Elp3 because we have structures of only one copy of each, and we let the system generate clones of the first copy.
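
The molecule count and the mode chosen for each subunit in this example can be tabulated with a few lines of Python. This is an illustrative sketch only: the summarize helper is hypothetical, the tr3d and inipos keys are omitted for brevity, and the defaults (cell_count 1, mode input) are taken from the specification above.

```python
# The three series from the example, reduced to the keys used here.
series = [
    {"name": "2fold", "subunit": "Elp1", "mode": "input", "cell_count": 2},
    {"name": "2fold", "subunit": "Elp2", "mode": "auto", "cell_count": 2},
    {"name": "2fold", "subunit": "Elp3", "mode": "auto", "cell_count": 2},
]


def summarize(series_list):
    """Map each subunit to (number of molecules, mode), applying the
    documented defaults when a key is omitted."""
    summary = {}
    for spec in series_list:
        summary[spec["subunit"]] = (
            spec.get("cell_count", 1),
            spec.get("mode", "input"),
        )
    return summary


summary = summarize(series)
total_molecules = sum(count for count, _ in summary.values())
for subunit, (count, mode) in summary.items():
    print(subunit, count, mode)
print("total molecules:", total_molecules)
```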

tr3d points to the name we gave to the transformation matrix for the 2-fold symmetry, which we define below.

"inipos": "input" defines the initial position. If set to input, the positions will be as in the input PDB files. inipos can also be set to the name of another series (see the example below for why).

Examples:

Note

In some examples provided in the tutorials or in our published work, you may see “series” defined in a different place, under the “symmetry” block. That is the old way; both are supported for backward compatibility.

[If applicable] Add symmetry information

Symmetry can be specified in the top-level symmetry block, at the same level as series, data, subunits etc.:
{

    "symmetry": {

        "sym_tr3ds": "<specification of transformation matrices>",

        "apply_symmetry": "<specification of subunits or regions of subunits on which the symmetry is acting>"

    },

    "series": [

        "series specifications go here"

    ],

    "data": [
        {
            "some xlink definition 1"
        },
        {
            "some xlink definition 2"
        },
        {
            "sequence file definition"
        }
    ],
    "subunits": [
            "subunit definitions"
    ],
    "xlinkanalyzerVersion": "1.1.1"
}
Within the symmetry block you specify:

sym_tr3ds - transformation matrices for each symmetry axis within the complex, each given a unique name. You can specify multiple symmetries if applicable.

Read Defining symmetry for instructions.

apply_symmetry - subunits or regions of subunits on which the symmetry is acting. This means you may have one symmetry specification for one domain of a subunit, and another symmetry or no symmetry at all for another domain.

Read Applying symmetry for instructions.

Define input PDB files and rigid bodies

as specified here: Input structures

and add them to the data block as below:

{

    "symmetry": {

        "sym_tr3ds": "<specification of transformation matrices>",

        "apply_symmetry": "<specification of subunits or regions of subunits on which the symmetry is acting>"

    },

    "series": [

        "series specifications go here"

    ],

    "data": [
        {
            "type": "pdb_files",
            "name": "pdb_files",
            "data": [

                    "PDB file specifications"
            ]
        },

        {
            "some xlink definition 1"
        },
        {
            "some xlink definition 2"
        },
        {
            "sequence file definition"
        }
    ],
    "subunits": [
            "subunit definitions"
    ],
    "xlinkanalyzerVersion": "1.1.1"
}
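
If you prefer to script this edit rather than type it, the pdb_files block can be added to an already loaded configuration as sketched below. The block layout (type, name, data) is copied from the listing above; the helper name ensure_pdb_files_block is hypothetical, and placing the block first in the data list simply mirrors the listing and is not known to be required.

```python
def ensure_pdb_files_block(config):
    """Return the pdb_files block inside config["data"], creating an
    empty one at the front of the list if none exists yet."""
    data = config.setdefault("data", [])
    for block in data:
        if isinstance(block, dict) and block.get("type") == "pdb_files":
            return block
    block = {"type": "pdb_files", "name": "pdb_files", "data": []}
    data.insert(0, block)
    return block


# A stand-in for a loaded project file; the existing entry is a placeholder.
config = {
    "data": [{"type": "placeholder for an existing definition"}],
    "subunits": [],
}
block = ensure_pdb_files_block(config)
again = ensure_pdb_files_block(config)  # second call must not add a duplicate
print(len(config["data"]), block is again)
```

The PDB file specifications themselves still have to be written as described in Input structures and appended to block["data"].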

[Optionally] Define custom rigid bodies.

By default, rigid bodies are created based on blocks in the pdb_files specification above.

If you want to define rigid bodies differently (there are some special situations in which you may want to do so), you can define custom rigid bodies

as specified here: Rigid bodies

and add them to the top-level rigid_bodies block as below:

{

    "rigid_bodies": {
        "Rigid bodies file specifications"
    },

    "symmetry": {

        "sym_tr3ds": "<specification of transformation matrices>",

        "apply_symmetry": "<specification of subunits or regions of subunits on which the symmetry is acting>"

    },

    "series": [

        "series specifications go here"

    ],

    "data": [
        {
            "type": "pdb_files",
            "name": "pdb_files",
            "data": [

                    "PDB file specifications"
            ]
        },

        {
            "some xlink definition 1"
        },
        {
            "some xlink definition 2"
        },
        {
            "sequence file definition"
        }
    ],
    "subunits": [
            "subunit definitions"
    ],
    "xlinkanalyzerVersion": "1.1.1"
}

For the “1. Global optimization” step, compute the fit libraries and add them to the JSON file as specified here:

About fit libraries

Add other restraints:

Restraints are specified as blocks in the data block:

{

    "rigid_bodies": {
        "Rigid bodies file specifications"
    },

    "symmetry": {

        "sym_tr3ds": "<specification of transformation matrices>",

        "apply_symmetry": "<specification of subunits or regions of subunits on which the symmetry is acting>"

    },

    "series": [

        "series specifications go here"

    ],

    "data": [
        {
            "type": "pdb_files",
            "name": "pdb_files",
            "data": [

                    "PDB file specifications"
            ]
        },

        {
            "some xlink definition 1"
        },
        {
            "some xlink definition 2"
        },
        {
            "sequence file definition"
        },

        {
            "a restraint"
        },

        {
            "another restraint"
        },

        {
            "and so on"
        }
    ],
    "subunits": [
            "subunit definitions"
    ],
    "xlinkanalyzerVersion": "1.1.1"
}

You can define the following restraints:

EM restraints

Excluded volume (steric) restraints

Connectivity restraints

Symmetry restraints and constraints

Interaction restraints

Distance restraints

Similar orientation restraints

Elastic network restraints

Parsimonious states restraints

Custom restraints (e.g. original IMP restraints and your own implementations)

Series and copies

As shown above, a series is a group of related subunit copies, e.g. related by symmetry.

In Assembline configuration, series are used to:

create multiple copies of the same subunit
define symmetry acting on the subunits within the series
define symmetry between two different series, if any

Series have names and can be referred to by this name within selectors.

Each series is a group of subunit copies, numbered 0, 1, 2, and so on. For example, you may have a subunit of the nuclear pore complex in 16 copies arranged in two rings of 8 copies each. You could then define two series, one for each ring, ring1 and ring2, each with copies 0, 1, 2, 3, 4, 5, 6, 7. The first copy of ring1 would then be selected by ring1 and copy index 0. The first copy of ring2 would be selected by ring2 and, again, copy index 0.
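
The numbering scheme in the two-ring example can be made concrete with a short sketch that lists every (series name, copy index) pair. The subunit name NupX and the helper enumerate_copies are hypothetical and used only for illustration; this is not how Assembline builds its selectors internally.

```python
# Two series for the same subunit, one per ring, eight copies each.
series = [
    {"name": "ring1", "subunit": "NupX", "cell_count": 8},
    {"name": "ring2", "subunit": "NupX", "cell_count": 8},
]


def enumerate_copies(series_list):
    """List (series name, copy index) pairs; copy indices start at 0
    and restart for every series."""
    return [
        (spec["name"], index)
        for spec in series_list
        for index in range(spec.get("cell_count", 1))
    ]


copies = enumerate_copies(series)
print(len(copies))
print(copies[0], copies[8])
```

The pair for the first copy of each ring differs only in the series name, which is why a series name and a copy index together identify one molecule.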

Selectors

Selectors are collections of keywords used to “select” parts of the system and act on them. For example, you can select a part of the system and define it as a rigid body, or add it to an EM restraint or other restraints.

See here how to define Selectors

Paths

You can use either absolute or relative paths in your JSON project file. Both have advantages and disadvantages. Relative paths allow you to move your project between computers but cause some trouble when working on the output (not a big deal, but you need to pass an extra option to the scripts to point to the original directory with the data). Absolute paths do not need the extra option and just work, but you cannot move the project between computers without modification.
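
The trade-off can be illustrated with a small sketch of how a relative path depends on a base directory while an absolute path does not. It assumes, purely for illustration, that relative paths are interpreted against the directory holding the JSON project file; the text above does not state which directory Assembline actually uses. The helper name and all file paths are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_against_project(project_file, path_in_config):
    """Return path_in_config unchanged if it is absolute; otherwise join
    it to the directory that contains project_file (an assumption)."""
    path = PurePosixPath(path_in_config)
    if path.is_absolute():
        return path
    return PurePosixPath(project_file).parent / path


project = "/home/user/elongator/X_config.json"
relative = resolve_against_project(project, "in_pdbs/Elp1.pdb")
absolute = resolve_against_project(project, "/data/shared/Elp1.pdb")
print(relative)
print(absolute)

# Moving the project changes where the relative path points,
# while the absolute path stays the same.
moved = resolve_against_project("/scratch/run1/X_config.json", "in_pdbs/Elp1.pdb")
print(moved)
```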