Organize data

  1. Create a new directory for your project

    Note

    Assembline can read files from arbitrary locations on your disk, but it is more convenient to keep them in a single directory and use relative paths in the JSON configuration file.

  2. Collect sequences of your subunits

    Prepare a single FASTA-formatted file containing the sequences of all your subunits. A subunit is a protein in your complex.

    For example:

    >subunit1
    MVEHDKSGSKRQELRSNMRNLITLNKGKFKPTASTAEGDEDDLSFTLLDSVFDTLSDSI
    ISWRGDCDYFAVSSVEEVPDEDDETKSIKRRAFRVFSREGQLDSASEPVTGMEHQLSWK
    EMKKGKHPSIVCEFPKSEFTSEVDSLRQVAFINDSIVGVLLDTDNLSRIALLDIQDITQ
    RYKEAFIVCRTHRINLDILHDYAPELFIENLEVFINQIGRVDYLNLFISCLSEDDVTKT
    HGLALYRYDSEKQNVIYNIYAKHLSSNQMYTDAAVAYEMLGKLKEAMGAYQSAKRWREA
    RLIERLNQTKPDAVRVVEGLCRR
    >subunit2
    MVECITPEAIFIGANKQTQVSDIHKVKKIVAFGAGKTIALWDPIEPNNKGVYATLKGHE
    LLSNKQYKFQIDDELRVGINFEALIMGHDDWISSLQWHESRLQLLAATADTSLMVWEPD
    GEDDANEDDEEEEGGNKETPDITDPLSLLECPPMEDQLQRHLLWPEVEKLYGHGFEITC
    RLRWSHLKRNGKLFLGVGSSDLSTRIYSLAYE
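
Before moving on, it can help to confirm that the FASTA file parses cleanly and that every subunit name is unique. The following is a minimal sketch in plain Python, not part of Assembline; the function name `read_fasta` is hypothetical, and it assumes the subunit name is the first whitespace-separated token after `>`.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dictionary."""
    sequences = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("Header line without a subunit name")
            # Assumption: the name is the first token after '>'
            name = fields[0]
            if name in sequences:
                raise ValueError(f"Duplicate subunit name: {name}")
            sequences[name] = []
        elif name is None:
            raise ValueError("Sequence data found before the first '>' header")
        else:
            # Sequences may span several lines; collect and join them later
            sequences[name].append(line)
    return {n: "".join(parts) for n, parts in sequences.items()}


example = """>subunit1
MVEHDK
SGSKRQ
>subunit2
MVECIT
"""
for subunit, sequence in read_fasta(example).items():
    print(subunit, len(sequence))
```

To check a real file, pass its contents instead of the example string, for instance `read_fasta(open("X_sequences.fasta").read())`.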
    
  3. Collect structures of your subunits

    Create a directory with structures of your subunits in PDB format.

    The subunit chains can be organized in the PDB files in any way: a PDB file can contain multiple subunits or extra proteins not used in modeling, and the domains of a subunit can be kept separate or together. The JSON configuration file ensures that only what is needed is read.

    Warning

    Make sure that the protein sequence and residue numbering in the PDB files correspond to the sequences in the FASTA file.
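
The correspondence between a PDB chain and its FASTA sequence can be checked with a short script. The sketch below is one possible way to do this, not an Assembline tool: the function names are hypothetical, it reads only `ATOM` records using the fixed PDB column positions, and it assumes that residue number N in the PDB file should match position N (counting from 1) in the FASTA sequence.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def pdb_residues(pdb_lines, chain_id):
    """Return {residue_number: one_letter_code} for one chain, from ATOM records."""
    residues = {}
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        if line[21] != chain_id:  # column 22: chain identifier
            continue
        res_name = line[17:20].strip()  # columns 18-20: residue name
        res_num = int(line[22:26])      # columns 23-26: residue number
        # Non-standard residue names are reported as "X"
        residues[res_num] = THREE_TO_ONE.get(res_name, "X")
    return residues


def find_mismatches(pdb_lines, chain_id, sequence):
    """List (residue_number, pdb_letter, fasta_letter) where PDB and FASTA disagree."""
    mismatches = []
    for res_num, pdb_letter in sorted(pdb_residues(pdb_lines, chain_id).items()):
        if 1 <= res_num <= len(sequence):
            fasta_letter = sequence[res_num - 1]
        else:
            fasta_letter = None  # numbering falls outside the FASTA sequence
        if pdb_letter != fasta_letter:
            mismatches.append((res_num, pdb_letter, fasta_letter))
    return mismatches
```

An empty list from `find_mismatches(open(path).read().splitlines(), "A", sequence)` means that every residue found in chain A agrees with the FASTA sequence under the numbering assumption above. Residues missing from the structure are not reported, since gaps in a PDB file are expected.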

  4. (Optional) To simplify the definition of rigid bodies later, prepare your PDB files so that each file corresponds to one anticipated rigid body (i.e. there is a one-to-one mapping between the PDB files and your anticipated rigid bodies).

    Read more about rigid bodies and alternative ways of preparing them in Rigid bodies.

  5. If you want to use EM restraints, collect EM maps and put them into a subdirectory (read more in EM restraints).

  6. If you want to use crosslink restraints, collect CSV files with crosslinks in xQuest or XlinkAnalyzer format (read more in Crosslink restraints).

  7. At this point, you may have the following directory structure:

    complexX/
        X_sequences.fasta
        in_pdbs/
            pdb1.pdb
            pdb2.pdb
            ...
        xlinks/
            some_name.csv
            ...
        EM/
            map1.mrc
            map2.mrc
            ...
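
The layout above can also be created and inspected with a few lines of Python. This is a minimal sketch using only the standard library; the function names are hypothetical, and the subdirectory names (`in_pdbs`, `xlinks`, `EM`) and file extensions are taken from the example tree rather than required by Assembline.

```python
from pathlib import Path


def create_project_layout(root):
    """Create the project directory and the subdirectories from the example tree."""
    root = Path(root)
    for sub in ("in_pdbs", "xlinks", "EM"):
        # exist_ok=True makes the call safe to repeat on an existing project
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


def summarize_layout(root, fasta_name):
    """Report which input files are present in the project directory."""
    root = Path(root)
    return {
        "fasta_found": (root / fasta_name).is_file(),
        "pdb_files": sorted(p.name for p in (root / "in_pdbs").glob("*.pdb")),
        "crosslink_files": sorted(p.name for p in (root / "xlinks").glob("*.csv")),
        "em_maps": sorted(p.name for p in (root / "EM").glob("*.mrc")),
    }
```

For example, `create_project_layout("complexX")` followed by `summarize_layout("complexX", "X_sequences.fasta")` gives a quick overview of what has been collected so far.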