Python 3 (tested in version 3.6.6)
Python 3 libraries: numpy (1.15.0), nose (1.3.7), biopython (1.71)
GNU CC compiler (gcc, tested on version 6.4.0)
FreeSASA (tested on version 2.0.3)
MUSCLE (only needed if you want WHISCY to compute the multiple sequence alignment for you automatically; tested on version 3.8.31)

Software versions are indicative, except for Python, which must be from the 3.6.x series, and FreeSASA, which must be from the 2.x series or above.
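
A quick way to check these prerequisites is sketched below. The helper names (python_ok, missing_tools) are hypothetical, and the executable names gcc, freesasa and muscle are assumptions: your MUSCLE binary in particular may carry a versioned file name.

```python
import shutil
import sys


def python_ok(version_info=sys.version_info):
    """Return True if the interpreter belongs to the 3.6.x series."""
    return tuple(version_info[:2]) == (3, 6)


def missing_tools(tools=("gcc", "freesasa", "muscle")):
    """Return the executables from `tools` that are not found on PATH.

    The default names are assumptions; adjust them to your installation.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    print("Python 3.6.x:", python_ok())
    print("Not found on PATH:", missing_tools())
```

An empty list from missing_tools() only means the executables were found; it does not check their versions.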

1.1. Installation in macOS

1.1.1. Python3, GCC and libraries

Using Macports, you can install Python 3 and the necessary libraries, GCC and git:

sudo port install gcc6 git python36 py36-biopython py36-numpy py36-nose

A similar installation should be possible using Homebrew instead of Macports.

If you already have python3 and pip3 installed, it is perfectly fine to install the biopython, numpy and nose libraries using pip3.

1.1.2. MUSCLE

To install MUSCLE, go to the official download site, download the Mac OS X version suitable for your architecture (32 or 64bit) and follow the instructions provided by the authors.

1.1.3. FreeSASA

Go to the FreeSASA main site and follow the Quick-start guide. The Python bindings are not needed, because the freesasa binary is called from the command line.

Make sure the freesasa binary is in your path:

$ freesasa --version
FreeSASA 2.0.3
License: MIT <http://opensource.org/licenses/MIT>
If you use this program for research, please cite:
  Simon Mitternacht (2016) FreeSASA: An open source C
  library for solvent accessible surface area calculations.
  F1000Research 5:189.

Report bugs to <https://github.com/mittinatten/freesasa/issues>
Home page: <http://freesasa.github.io>

1.1.4. WHISCY

The next step is to clone this repository:

git clone https://github.com/haddocking/whiscy.git
cd whiscy
pwd

The pwd command prints the directory where you have cloned whiscy. Copy that path, because you will need to specify it in your .bashrc or .bash_profile file.

Edit your .bashrc or .bash_profile and add the following lines:

# Whiscy
export WHISCY_PATH=/PATH/TO/WHISCY
export PYTHONPATH=$PYTHONPATH:${WHISCY_PATH}
export WHISCY_BIN=${WHISCY_PATH}/whiscy.py
export PATH=$PATH:${WHISCY_PATH}

Replace /PATH/TO/WHISCY with the directory printed by the pwd command.
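
After opening a new terminal, the exported variables can be checked with a small script such as the following sketch. The function name whiscy_environment is hypothetical; it only reads the variables defined above and does not modify anything.

```python
import os


def whiscy_environment(env=None):
    """Summarize the WHISCY-related settings found in an environment mapping."""
    env = os.environ if env is None else env
    whiscy_path = env.get("WHISCY_PATH", "")
    if not whiscy_path:
        raise RuntimeError("WHISCY_PATH is not set; check .bashrc or .bash_profile")
    return {
        "whiscy_path": whiscy_path,
        "whiscy_bin": env.get("WHISCY_BIN", ""),
        # Both variables are lists of directories joined by os.pathsep (':').
        "in_pythonpath": whiscy_path in env.get("PYTHONPATH", "").split(os.pathsep),
        "in_path": whiscy_path in env.get("PATH", "").split(os.pathsep),
    }


if __name__ == "__main__":
    try:
        print(whiscy_environment())
    except RuntimeError as error:
        print(error)
```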

Now, we compile the protdist software:

cd $WHISCY_PATH
cd bin/protdist
./compile.sh
./protdist

If we see an output like this:

Too few arguments for this modified version of PROTDIST
Usage: protdist <infile> <outfile>

everything is ready.

There is one final step: telling WHISCY where to find the MUSCLE binary. Edit $WHISCY_PATH/etc/local.json:

{
  "ALIGN": {
    "MUSCLE_BIN": "/path/to/bin/muscle/muscle3.8.31_i86darwin64"
  },
  "CUTOFF": {
    "sa_pred_cutoff": 15.0,
    "sa_act_cutoff": 40.0,
    "air_cutoff": 0.18,
    "air_dist_cutoff": 6.5
  },
  "AIR": {
    "air_pro_percentage": 10.0,
    "air_wm_pro_or": 98.52,
    "air_wm_whis_or": 0.370515,
    "air_wm_pro_and": 55.42,
    "air_wm_whis_and": 0.106667
  }
}

Change the MUSCLE_BIN variable to the correct path of your MUSCLE binary.
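
If you prefer not to edit the file by hand, the same change can be scripted. This is a minimal sketch that assumes the JSON layout shown above; set_muscle_bin is a hypothetical helper, not part of WHISCY, and it rewrites the file with two-space indentation.

```python
import json


def set_muscle_bin(config_path, muscle_bin):
    """Set ALIGN.MUSCLE_BIN in a local.json-style file, keeping other keys."""
    with open(config_path) as handle:
        config = json.load(handle)
    # Create the ALIGN section if it is missing, then update the binary path.
    config.setdefault("ALIGN", {})["MUSCLE_BIN"] = muscle_bin
    with open(config_path, "w") as handle:
        json.dump(config, handle, indent=2)
        handle.write("\n")
    return config
```

For example, set_muscle_bin(os.path.join(os.environ["WHISCY_PATH"], "etc", "local.json"), "/usr/local/bin/muscle") would point WHISCY to a binary at /usr/local/bin/muscle (an illustrative path; os must be imported).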

1.2. Installation in GNU/Linux

In Debian/Ubuntu flavours use apt:

sudo apt-get install python3 python3-numpy python3-nose2 python3-biopython gcc-4.6 git-all

For the next steps, see the macOS instructions which also apply: 1.1.2, 1.1.3 and 1.1.4.

2. WHISCY setup

WHISCY needs some initial data in order to make a prediction. For that purpose, a script called whiscy_setup.py is provided:

$ whiscy_setup.py 
usage: whiscy_setup [-h] pdb_file_name chain_id
whiscy_setup: error: the following arguments are required: pdb_file_name, chain_id

The parameters of this script are pdb_file_name and chain_id. pdb_file_name can be either a PDB file (for example 1ppe.pdb) or a PDB code (1ppe); chain_id is a single character (upper or lower case) indicating the chain of the molecule to use for the prediction.

For example, if we are interested in predicting the interface of chain E of the 1PPE complex:

$ whiscy_setup.py 1ppe e
Downloading PDB structure '1ppe'...
whiscy_setup [INFO] PDB structure with chain E saved to 1ppe_E.pdb
whiscy_setup [INFO] Atom accessibility calculated to 1ppe_E.rsa
whiscy_setup [INFO] Surface and buried residues calculated
whiscy_setup [INFO] Downloading HSSP alignment...
whiscy_setup [INFO] HSSP alignment stored to 1ppe.hssp
whiscy_setup [INFO] HSSP file converted to PHYLIP format
whiscy_setup [INFO] Protdist calculated
whiscy_setup [INFO] Conversion table file generated
whiscy_setup [INFO] Whiscy setup finished

whiscy_setup.py first checks whether the PDB file or PDB structure contains the indicated chain, then tries to download the multiple sequence alignment (MSA) for that PDB entry from the HSSP database. If this step fails, the script falls back to a different approach: 1) an NCBI BLASTP search, followed by 2) a multiple sequence alignment of the BLASTP results using MUSCLE.

Note that whiscy_setup.py requires internet access in order to gather the relevant files.
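
To illustrate the first check, the sketch below lists the chains present in a PDB file and extracts one of them. It is not the code used by whiscy_setup.py: the function names are hypothetical, only the fixed-column chain identifier (column 22 of ATOM/HETATM records) is read, and the chain_id is upper-cased as in the example above, which is an assumption that fails for files with lower-case chain identifiers.

```python
def chains_in_pdb(pdb_lines):
    """Return the chain identifiers found in ATOM/HETATM records, in order."""
    chains = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain = line[21]  # column 22 of the PDB format holds the chain ID
            if chain not in chains:
                chains.append(chain)
    return chains


def select_chain(pdb_lines, chain_id):
    """Return only the ATOM/HETATM records that belong to `chain_id`."""
    chain_id = chain_id.upper()  # assumption: 'e' and 'E' both mean chain E
    if chain_id not in chains_in_pdb(pdb_lines):
        raise ValueError("Chain {} not found in structure".format(chain_id))
    return [line for line in pdb_lines
            if line.startswith(("ATOM", "HETATM")) and line[21] == chain_id]
```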

whiscy_setup.py output

whiscy_setup.py generates a set of files needed for the prediction step with whiscy.py. Here is a list of the files generated in our 1ppe complex example:

File name	Explanation
1ppe.hssp	Multiple sequence alignment downloaded from the HSSP database
1ppe.hssp.bz2	Compressed HSSP MSA file
1ppe.pdb	PDB file downloaded from the Protein Data Bank
1ppe_E.pdb	1ppe.pdb parsed to select only the given `chain_id`
1ppe_E.rsa	SASA output of `freesasa` in `NACCESS` format for the `1ppe_E.pdb` file
1ppe_E.fasta	Sequence of 1ppe_E.pdb. Alternative residues have been removed
1ppe_E.phylseq	MSA file translated from HSSP to PHYLIP format
1ppe_E.conv	Mapping from PDB residue numbering to FASTA sequence numbering
1ppe_E.out	Output of the `protdist` software on 1ppe_E.pdb
1ppe_E.sur	List of surface residues (>15 % accessible) according to the `sa_pred_cutoff` cutoff
1ppe_E.suract	List of surface residues (>40 % accessible) according to the `sa_act_cutoff` cutoff
1ppe_E.lac	List of residues with 0-15 % accessibility
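
Before moving on to the prediction step, it can be useful to verify that the per-chain files listed above exist. The following sketch uses hypothetical helper names and only covers the files named PDBCODE_CHAIN.*; the HSSP and original PDB files are left out because they may be absent when the BLASTP/MUSCLE fallback or a local PDB file is used (an assumption).

```python
import os

# Extensions of the per-chain files from the table above.
SETUP_EXTENSIONS = (".pdb", ".rsa", ".fasta", ".phylseq", ".conv",
                    ".out", ".sur", ".suract", ".lac")


def expected_setup_files(pdb_code, chain_id):
    """Return the per-chain file names, e.g. 1ppe_E.sur for ('1ppe', 'e')."""
    prefix = "{}_{}".format(pdb_code, chain_id.upper())
    return [prefix + extension for extension in SETUP_EXTENSIONS]


def missing_setup_files(directory, pdb_code, chain_id):
    """Return the expected files that are not present in `directory`."""
    return [name for name in expected_setup_files(pdb_code, chain_id)
            if not os.path.isfile(os.path.join(directory, name))]
```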

3. WHISCY prediction

Running the main whiscy.py script without parameters shows the input files that WHISCY requires in order to predict the interface residues of your protein:

$ whiscy.py
usage: whiscy [-h] [-o output_file]
              surface_list conversion_table alignment_file distance_file
whiscy: error: the following arguments are required: surface_list, conversion_table, alignment_file, distance_file

WHISCY needs four input files:

surface_list, a list of surface residues. It typically comes with the .sur extension, for example 1ppe_E.sur.
conversion_table, the file mapping the PDB file residue numbering onto the FASTA sequence numbering, typically with the .conv extension, for example 1ppe_E.conv.
alignment_file, the MSA file in PHYLIP format, with the .phylseq extension, for example 1ppe_E.phylseq.
distance_file, the output of the protdist software, in our example with the .out extension: 1ppe_E.out.
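
Since the four files share the same prefix, the command line can be assembled programmatically. This is a small sketch with a hypothetical helper name; it assumes the extensions produced by whiscy_setup.py and that whiscy.py is reachable through PATH.

```python
def whiscy_arguments(prefix, output_file=None):
    """Build the whiscy.py argument list for files named <prefix>.<ext>."""
    arguments = [
        "whiscy.py",
        prefix + ".sur",      # surface_list
        prefix + ".conv",     # conversion_table
        prefix + ".phylseq",  # alignment_file
        prefix + ".out",      # distance_file
    ]
    if output_file is not None:
        arguments += ["-o", output_file]
    return arguments
```

The resulting list can be passed to subprocess.run(arguments, check=True) to launch the prediction from Python.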

Using the files generated by whiscy_setup.py in our example (protein 1PPE, chain E):

$ whiscy.py 1ppe_E.sur 1ppe_E.conv 1ppe_E.phylseq 1ppe_E.out 
whiscy [INFO] Parsing surface list...
whiscy [INFO] Loading conversion table...
whiscy [INFO] Converting...
whiscy [INFO] Initializing score calculation...
whiscy [INFO] Calculating scores...
whiscy [INFO] Subtracting average value ...
whiscy [INFO] Sorting scores...
whiscy [INFO] Writing scores...
0.95155   G19 
0.95070  G219 
0.91582  G133 
0.90942   H57 
0.87666   P28 
0.84307   G18 
0.82484  K107 
0.80778  L123 
0.75653  P173 
...
-1.24180  K222 
-1.31572  G187 
-1.41471  G203 
-1.43181  G193 
-1.54083  L185 
-1.64713  Y184 
-2.41928  C232 
-2.49188  C191 
-2.67467  W237 
-2.77188  W215 


"My God, so much I like to drink Scotch that sometimes I think my name is Igor Stra-whiskey."
  -  Igor Stravinsky

The WHISCY scores can also be written to a file with the -o flag:

$ whiscy.py 1ppe_E.sur 1ppe_E.conv 1ppe_E.phylseq 1ppe_E.out -o 1ppe_E.cons
whiscy [INFO] Parsing surface list...
whiscy [INFO] Loading conversion table...
whiscy [INFO] Converting...
whiscy [INFO] Initializing score calculation...
whiscy [INFO] Calculating scores...
whiscy [INFO] Subtracting average value ...
whiscy [INFO] Sorting scores...
whiscy [INFO] Writing scores...
whiscy [INFO] Prediction written to 1ppe_E.cons

"My God, so much I like to drink Scotch that sometimes I think my name is Igor Stra-whiskey."
  -  Igor Stravinsky

The prediction in this case will be saved to the 1ppe_E.cons file.

3.1. WHISCY server-like prediction

To mimic the WHISCY server behavior using interface propensities and surface smoothing, there is a BASH script in the WHISCY home directory. You can run it as in this example:

whiscy_protocol.sh 1ppe_E

After a few seconds, there will be a new .pscons file with the predicted residues in the interface sorted by their WHISCY score:

$ head 1ppe_E.pscons 
 0.61467   I73
 0.58322   G78
 0.56459   V75
 0.55906   F82
 0.49307   Q81
 0.48432   L114
 0.47943   E80
 0.46539   Y39
 0.45669   V76
 0.43158   N72

As stated in the original WHISCY publication, residues are predicted to be in the interface if their WHISCY score is higher than 0.180, which corresponds to a sensitivity of 29.4%.
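
The cutoff can be applied to a score file with a few lines of Python. The sketch below assumes the two-column "score residue" layout shown above; parse_scores and predicted_interface are hypothetical names, and lines that do not match the layout are skipped.

```python
def parse_scores(lines):
    """Parse 'score residue' lines into a list of (residue, score) tuples."""
    scores = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank lines and anything that is not a score line
        try:
            score = float(fields[0])
        except ValueError:
            continue
        scores.append((fields[1], score))
    return scores


def predicted_interface(scores, cutoff=0.18):
    """Return the residues whose score is strictly higher than the cutoff."""
    return [residue for residue, score in scores if score > cutoff]


if __name__ == "__main__":
    example = [" 0.61467   I73", " 0.58322   G78", "-1.24180  K222 "]
    print(predicted_interface(parse_scores(example)))
```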

3.2. Show WHISCY predictions

There is a Python3 script in the bin directory called whiscy2bfactor.py that maps the WHISCY interface residue predictions onto the B-factor column of the PDB file.

A usage example:

$ cd ${WHISCY_PATH}/example
$ ../bin/whiscy2bfactor.py 1ppe_E.pdb 1ppe_E_whiscy.pdb 1ppe_E.pscons
1ppe_E_whiscy.pdb PDB file with WHISCY scores in B-factor column has been created
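
The idea behind this mapping can be sketched as follows. This is not the implementation of whiscy2bfactor.py: the function name is hypothetical, residues are matched by residue number only (insertion codes and chains are ignored), and residues without a score receive a default B-factor of 0.0, which is an assumption.

```python
def map_scores_to_bfactor(pdb_lines, scores, default=0.0):
    """Write per-residue scores into the B-factor field of ATOM/HETATM records.

    `scores` is a list of (residue_label, score) tuples such as ("I73", 0.61467),
    where the label is a one-letter residue code followed by the residue number.
    """
    by_number = {int(label[1:]): score for label, score in scores}
    output = []
    for line in pdb_lines:
        line = line.rstrip("\n")
        if line.startswith(("ATOM", "HETATM")):
            record = line.ljust(66)          # make sure the B-factor field exists
            residue_number = int(record[22:26])  # columns 23-26: residue number
            value = by_number.get(residue_number, default)
            # Columns 61-66 hold the B-factor, formatted as a 6-character float.
            line = record[:60] + "{:6.2f}".format(value) + record[66:]
        output.append(line)
    return output
```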

We can use any molecular visualization software to color the surface of our molecule by the B-factor column (for example, UCSF Chimera with a Cyan-Maroon scale for predicted/not-predicted residues).
