JSON configuration file
=======================

Creating JSON file
------------------

#. Create an `XlinkAnalyzer <https://www.embl-hamburg.de/XlinkAnalyzer/XlinkAnalyzer.html>`_ project file for your complex

    .. note:: `XlinkAnalyzer <https://www.embl-hamburg.de/XlinkAnalyzer/XlinkAnalyzer.html>`_ is used here as a graphical interface for input preparation in Assembline.

        It does not matter if you have no crosslinks: XlinkAnalyzer is used here only to prepare the input file for modeling.

    #. Add all subunits using the XlinkAnalyzer graphical interface

    #. Assign a unique chain ID and color to every subunit

    #. Optionally, define domains within subunits. You can then refer to these domains in the :doc:`selectors`

    #. Add sequences using the Setup panel in XlinkAnalyzer, then map the sequences to the names of subunits using the Map button

    #. Add crosslinks if available, then map the crosslinked protein names to the names of subunits using the Map button

#. Make a copy of the XlinkAnalyzer project file. This copy will then be edited manually to add modeling directives. For example, copy ``XlinkAnalyzer_project.json`` as ``X_config.json``
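
    The copy can also be made from Python. This is a minimal sketch that assumes the file names from the example above; ``copy_project`` is a hypothetical helper, not part of Assembline or XlinkAnalyzer:

    .. code-block:: python

        import shutil
        from pathlib import Path


        def copy_project(src="XlinkAnalyzer_project.json", dst="X_config.json"):
            """Copy the project file, refusing to overwrite an existing copy."""
            if Path(dst).exists():
                # Protect a config that may already contain manual edits.
                raise FileExistsError(f"{dst} already exists; not overwriting it")
            shutil.copyfile(src, dst)
            return Path(dst)

    Refusing to overwrite is a design choice of this sketch: once ``X_config.json`` has been edited by hand, re-running the copy should not silently discard those edits.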

#. Open ``X_config.json`` in a text editor
   
    .. note:: The project file is in the so-called `JSON format <https://en.wikipedia.org/wiki/JSON>`_

        While it may look difficult to edit at first, it is quite manageable with a proper editor (and a bit of practice ;-)

        We recommend using a good editor such as:

            * `SublimeText <https://www.sublimetext.com/>`_
        
            * `Atom <https://atom.io/>`_

    At this point, the JSON has the following format:

    .. code-block:: json

        {
            "data": [
                {
                    "some xlink definition 1"
                },
                {
                    "some xlink definition 2"
                },
                {
                    "sequence file definition"
                }
            ],
            "subunits": [
                    "subunit definitions"
            ],
            "xlinkanalyzerVersion": "1.1.1"
        }
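
    Before editing further, it can help to confirm that the copy is still valid JSON. The sketch below is only an illustration: ``load_config`` is a hypothetical helper, and the list of expected keys is taken from the skeleton above rather than from any formal schema.

    .. code-block:: python

        import json

        # Top-level keys seen in the skeleton above (an assumption, not a formal schema).
        EXPECTED_KEYS = ("data", "subunits", "xlinkanalyzerVersion")


        def load_config(path):
            """Parse the JSON file and list the expected top-level keys that are missing."""
            with open(path) as handle:
                # json.load raises json.JSONDecodeError if the file has a syntax error.
                config = json.load(handle)
            missing = [key for key in EXPECTED_KEYS if key not in config]
            return config, missing

    Re-running such a check after each manual edit catches stray or missing commas early, since the parser reports the position of the first syntax error.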

#. [Optionally] Define ``series``

    A series is a group of related subunit copies, e.g. related by symmetry.

    In Assembline configuration, series are used to:

    * create multiple copies of the same subunit

    * define symmetry acting on the subunits within the series
    
    * define symmetry between two different series, if any

    Series can be specified in the top-level ``series`` block, at the same level as ``data``, ``subunits`` etc.:

    .. code-block:: json

        {
            "series": [

                "series specifications go here"
            
            ],
            
            "data": [
                {
                    "some xlink definition 1"
                },
                {
                    "some xlink definition 2"
                },
                {
                    "sequence file definition"
                }
            ],
            "subunits": [
                    "subunit definitions"
            ],
            "xlinkanalyzerVersion": "1.1.1"
        }

    Specification of a single series:

    .. code-block:: json

        {
            "name": "Name of the serie",
            "subunit": "Subunit instance for this Serie",
            "mode": "[optional] 'auto' or 'input', default: 'input'
                     a parameter used in imp_utils1 to define whether PMI copies ('input') or clones ('auto') should be created",
            "cell_count": "[optional] How many elements in the Series, default: 1",
            "tr3d": "[optional] Name of the transformation relating Subunits in this Serie, as defined in the symmetry config below",
            "inipos": "[optional] 'input' or name of a symmetrical transformation in config",
            "ref_serie": "[optional] Reference Serie. If inipos is a transformation, imp_utils1 will transform
                          each copy in this serie relative to the ref_serie by that transformation",
            "states": "[optional] List of state indices this serie should act on. Experimental: may not work as expected."
        }
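
    The defaults listed above can be made explicit with a small helper. This is a sketch only: ``with_defaults`` is a hypothetical name, it covers just the two defaults stated in the specification (``mode`` and ``cell_count``), and it treats ``name`` and ``subunit`` as required because they are not marked optional.

    .. code-block:: python

        def with_defaults(serie):
            """Return a copy of a serie specification with the documented defaults filled in."""
            for required in ("name", "subunit"):
                if required not in serie:
                    raise KeyError(f"serie is missing required key: {required}")
            # Documented defaults: mode 'input', cell_count 1.
            filled = {"mode": "input", "cell_count": 1}
            filled.update(serie)
            if filled["mode"] not in ("input", "auto"):
                raise ValueError(f"mode must be 'input' or 'auto', got {filled['mode']!r}")
            return filled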


    Example:


    .. code-block:: json

        {
            "series": [
                {
                    "name": "2fold",
                    "subunit": "Elp1",
                    "mode": "input",
                    "cell_count": 2,
                    "tr3d": "2fold",
                    "inipos": "input"
                },
                {
                    "name": "2fold",
                    "subunit": "Elp2",
                    "mode": "auto",
                    "cell_count": 2,
                    "tr3d": "2fold",
                    "inipos": "input"
                },
                {
                    "name": "2fold",
                    "subunit": "Elp3",
                    "mode": "auto",
                    "cell_count": 2,
                    "tr3d": "2fold",
                    "inipos": "input"
                }
            ]
        }

    The example above defines three series, one for each of the three subunits Elp1, Elp2 and Elp3, with two copies per subunit as defined by ``cell_count`` (six molecules in total).
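
    The molecule count can be checked by summing ``cell_count`` over the series. A sketch, with ``count_molecules`` as a hypothetical helper and ``cell_count`` defaulting to 1 as stated in the specification:

    .. code-block:: python

        def count_molecules(series):
            """Sum cell_count per subunit over a list of serie specifications."""
            counts = {}
            for serie in series:
                subunit = serie["subunit"]
                counts[subunit] = counts.get(subunit, 0) + serie.get("cell_count", 1)
            return counts


        # The three series from the example above, reduced to the relevant keys.
        series = [
            {"name": "2fold", "subunit": "Elp1", "cell_count": 2},
            {"name": "2fold", "subunit": "Elp2", "cell_count": 2},
            {"name": "2fold", "subunit": "Elp3", "cell_count": 2},
        ]
        counts = count_molecules(series)
        total = sum(counts.values())  # 6 for this example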

    ``"mode": "input"`` and ``"mode": "auto"`` define how molecules are created behind the scenes. ``auto`` is typically used if you have a
    structure for one subunit copy only and want the system to auto-generate the remaining copies as ``clones`` of the first. If unsure, keep ``input``.

    For users familiar with PMI: ``auto`` builds clones, ``input`` builds copies.

    Here, we keep ``input`` for the Elp1 subunit, as one of the input PDB structures is a dimer comprising fragments of both copies of Elp1. We keep ``auto`` for Elp2 and Elp3 because
    we have structures of one copy only for them, and we let the system generate clones of the first copy.

    ``tr3d`` points to the name we gave to the transformation matrix for the 2-fold symmetry, which we define below.

    ``"inipos": "input"`` defines the initial position. If set to ``input``, the positions will be as in the input PDB files. ``inipos`` can also be set to the name of another ``series`` (see the example below for why)

    * Examples:


    .. note:: In some examples provided in the tutorials or in our published work, you may see "series" defined in a different place, under the "symmetry" block. That is the old way; both are supported for backward compatibility.
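
    When inspecting a configuration with your own scripts, both locations can be handled as sketched below. ``find_series`` is a hypothetical helper, and the sketch assumes that the old layout uses the same ``series`` key inside ``symmetry``; it does not reproduce how Assembline itself reads the file.

    .. code-block:: python

        def find_series(config):
            """Return the series list from the top level, else from the symmetry block."""
            if "series" in config:
                return config["series"]
            # Assumption: the old layout stores the list as config["symmetry"]["series"].
            return config.get("symmetry", {}).get("series", [])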
    
#. [If applicable] Add symmetry information

    Symmetry can be specified in the top-level ``symmetry`` block, at the same level as ``series``, ``data``, ``subunits`` etc.:

    .. code-block:: json

        {
            
            "symmetry": {

                "sym_tr3ds": "<specification of transformation matrices>",

                "apply_symmetry": "<specification of subunits or regions of subunits on which the symmetry is acting>"

            },

            "series": [

                "series specifications go here"
            
            ],
            
            "data": [
                {
                    "some xlink definition 1"
                },
                {
                    "some xlink definition 2"
                },
                {
                    "sequence file definition"
                }
            ],
            "subunits": [
                    "subunit definitions"
            ],
            "xlinkanalyzerVersion": "1.1.1"
        }


    Within the symmetry block you specify:

        * ``sym_tr3ds`` - transformation matrices for each symmetry axis within the complex, each given a unique name. You can specify multiple symmetries if applicable.
            
            Read :doc:`symmetry_tr3ds` for instructions.

        * ``apply_symmetry`` - subunits or regions of subunits on which the symmetry is acting. This means you may have one symmetry specification for one domain of a subunit, and another symmetry or no symmetry at all for another domain.
          
            Read :doc:`applying_symmetry` for instructions.
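
    As background on what a transformation for a symmetry axis contains, the sketch below computes a rotation matrix and translation vector for a rotation about an arbitrary axis. It shows only the generic geometry: ``rotation_about_axis`` is a hypothetical helper, and the exact keys and layout that ``sym_tr3ds`` expects are described in :doc:`symmetry_tr3ds`, not here.

    .. code-block:: python

        import math


        def rotation_about_axis(axis, angle_deg, point=(0.0, 0.0, 0.0)):
            """Return (rot, trans) such that x' = rot * x + trans.

            The rotation is about the line through `point` along `axis`;
            `rot` is a 3x3 nested list and `trans` a list of three floats.
            """
            norm = math.sqrt(sum(a * a for a in axis))
            ux, uy, uz = (a / norm for a in axis)
            c = math.cos(math.radians(angle_deg))
            s = math.sin(math.radians(angle_deg))
            t = 1.0 - c
            # Rodrigues' rotation formula written out as a matrix.
            rot = [
                [c + ux * ux * t, ux * uy * t - uz * s, ux * uz * t + uy * s],
                [uy * ux * t + uz * s, c + uy * uy * t, uy * uz * t - ux * s],
                [uz * ux * t - uy * s, uz * uy * t + ux * s, c + uz * uz * t],
            ]
            # Rotation about a line through p: x' = R (x - p) + p = R x + (p - R p).
            rotated = [sum(rot[i][j] * point[j] for j in range(3)) for i in range(3)]
            trans = [point[i] - rotated[i] for i in range(3)]
            return rot, trans

    For a 2-fold axis, ``angle_deg`` would be 180; for an 8-fold ring, 45.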
    
    
#. Define input PDB files and rigid bodies

    as specified here: :doc:`input_structures`

    and add them to the ``data`` block as below:

    .. code-block:: json

        {
            
            "symmetry": {

                "sym_tr3ds": "<specification of transformation matrices>",

                "apply_symmetry": "<specification of subunits or regions of subunits on which the symmetry is acting>"

            },

            "series": [

                "series specifications go here"
            
            ],
            
            "data": [
                {
                    "type": "pdb_files",
                    "name": "pdb_files",
                    "data": [

                            "PDB file specifications"
                    ]
                },

                {
                    "some xlink definition 1"
                },
                {
                    "some xlink definition 2"
                },
                {
                    "sequence file definition"
                }
            ],
            "subunits": [
                    "subunit definitions"
            ],
            "xlinkanalyzerVersion": "1.1.1"
        }
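
    If you prefer to script this edit, the block can be added as sketched below. ``add_pdb_files_block`` is a hypothetical helper; the content of each PDB file specification is left to the caller and must follow :doc:`input_structures`.

    .. code-block:: python

        def add_pdb_files_block(config, pdb_specs):
            """Insert a pdb_files block at the start of config["data"]."""
            block = {
                "type": "pdb_files",
                "name": "pdb_files",
                # The individual specifications are passed through unchanged.
                "data": list(pdb_specs),
            }
            config.setdefault("data", []).insert(0, block)
            return config

    Placing the block first mirrors the skeleton above; the sketch makes no claim that the position within ``data`` matters.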

#. [Optionally] Define custom rigid bodies.
    
    By default, rigid bodies are created based on blocks in the ``pdb_files`` specification above.

    If you want to define rigid bodies differently (there are some special situations in which you may want to do so), you can define custom rigid bodies

    as specified here: :doc:`rigid_bodies`

    and add them as a top-level ``rigid_bodies`` block, as below:

    .. code-block:: json

        {
            
            "rigid_bodies": {
                "Rigid bodies file specifications"
            },

            "symmetry": {

                "sym_tr3ds": "<specification of transformation matrices>",

                "apply_symmetry": "<specification of subunits or regions of subunits on which the symmetry is acting>"

            },

            "series": [

                "series specifications go here"
            
            ],
            
            "data": [
                {
                    "type": "pdb_files",
                    "name": "pdb_files",
                    "data": [

                            "PDB file specifications"
                    ]
                },

                {
                    "some xlink definition 1"
                },
                {
                    "some xlink definition 2"
                },
                {
                    "sequence file definition"
                }
            ],
            "subunits": [
                    "subunit definitions"
            ],
            "xlinkanalyzerVersion": "1.1.1"
        }

#. For the :doc:`combinations` step, compute fitting libraries and add them to the JSON file as specified here:

    :doc:`fit_libraries_intro`

#. Add other restraints:
   
    Restraints are specified as blocks in the ``data`` block:

    .. code-block:: json

        {
            
            "rigid_bodies": {
                "Rigid bodies file specifications"
            },

            "symmetry": {

                "sym_tr3ds": "<specification of transformation matrices>",

                "apply_symmetry": "<specification of subunits or regions of subunits on which the symmetry is acting>"

            },

            "series": [

                "series specifications go here"
            
            ],
            
            "data": [
                {
                    "type": "pdb_files",
                    "name": "pdb_files",
                    "data": [

                            "PDB file specifications"
                    ]
                },

                {
                    "some xlink definition 1"
                },
                {
                    "some xlink definition 2"
                },
                {
                    "sequence file definition"
                },

                {
                    "a restraint"
                },

                {
                    "another restraint"
                },

                {
                    "and so on"
                }
            ],
            "subunits": [
                    "subunit definitions"
            ],
            "xlinkanalyzerVersion": "1.1.1"
        }

    You can define the following restraints:

    :doc:`em_restraints`

    :doc:`excluded_volume`

    :doc:`connectivity_restraints`

    :doc:`symmetry_restraints`

    :doc:`interaction_restraints`

    :doc:`distance_restraints`

    :doc:`similar_orientation_restraints`

    :doc:`elastic_network_restraints`

    :doc:`parsimonious_states_restraints`

    :doc:`custom_restraints` (e.g. original IMP restraints and your own implementations)
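
    As the ``data`` block grows, a quick overview of its content helps to spot a missing or duplicated entry. The sketch below counts blocks by their ``type`` value. ``summarize_data_blocks`` is a hypothetical helper, and it assumes that blocks carry a ``type`` key as the ``pdb_files`` block above does; blocks without one are counted under a placeholder label.

    .. code-block:: python

        from collections import Counter


        def summarize_data_blocks(config):
            """Count the blocks in config["data"] by their "type" value."""
            return Counter(
                block.get("type", "<no type>") for block in config.get("data", [])
            )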


Series and copies
-----------------

As shown above, a **series** is a group of related subunit copies, e.g. related by symmetry.

In Assembline configuration, series are used to:

* create multiple copies of the same subunit

* define symmetry acting on the subunits within the series

* define symmetry between two different series, if any
  
Series have names and can be referred to by this name within selectors. 

Each series is a group of **copies** of a subunit, numbered 0, 1, 2, and so on. For example, you may have a subunit
of the nuclear pore complex in 16 copies arranged in two rings of 8 copies each. You could then define two series, one for each ring,
ring1 and ring2, each with copies 0, 1, 2, 3, 4, 5, 6, 7. The first copy in ring1 would then be selected by ring1 and copy index 0.
The first copy in ring2 would be selected by ring2 and, again, copy index 0.
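
The numbering in this example can be sketched as follows. ``copy_keys`` is a hypothetical helper and ``NupX`` a placeholder subunit name; the actual selector syntax is described in :doc:`selectors`.

.. code-block:: python

    def copy_keys(series):
        """List (series name, copy index) pairs, with copy indices starting at 0."""
        return [
            (serie["name"], index)
            for serie in series
            for index in range(serie.get("cell_count", 1))
        ]


    # Two rings of eight copies each, as in the example above.
    rings = [
        {"name": "ring1", "subunit": "NupX", "cell_count": 8},
        {"name": "ring2", "subunit": "NupX", "cell_count": 8},
    ]
    keys = copy_keys(rings)

The series name and the copy index together identify one molecule: ``keys`` starts with ``("ring1", 0)`` and the ninth entry is ``("ring2", 0)``.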

Selectors
---------

Selectors are collections of keywords used to "select" parts of the system and act on them. For example,
you can select a part of the system and define it as a rigid body, or add it to an EM restraint or other restraints.

See :doc:`selectors` for how to define them.

Paths
-----

You can use either absolute or relative paths in your JSON project file. Both have advantages and disadvantages.
Relative paths allow you to move your project between different computers but cause some trouble when working on the output
(not a big deal, but you need to pass an extra option to the scripts to point to the original directory with the data).
Absolute paths do not need the extra option and just work, but you cannot move the project
between computers without modifying the paths.
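
The difference can be illustrated with a short sketch. It assumes that relative paths are meant relative to the directory containing the JSON file, which is an assumption of this example and should be checked against the options of the scripts you run; ``resolve_path`` is a hypothetical helper.

.. code-block:: python

    from pathlib import Path


    def resolve_path(config_path, entry):
        """Resolve a path taken from the JSON file.

        Absolute paths are returned unchanged; relative paths are interpreted
        relative to the directory containing the JSON file (an assumption).
        """
        entry = Path(entry)
        if entry.is_absolute():
            return entry
        return (Path(config_path).parent / entry).resolve()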