{ "cells": [ { "cell_type": "markdown", "id": "a3a1c6e1-2ae5-4340-9369-e3fbf75f2114", "metadata": {}, "source": [ "## Clustering of ligand cavity shape\n", "\n", "In this tutorial, we will perform clustering of ligand cavity. This tutorial can be used as a giude **to perform protein channel shape clustering** to analyse different shapes of channel during the simulations. The protein considered in this tutorial has ligand cavity burried deep inside the protein and cavity is almost linear.\n", "\n", "### Final result\n", "\n", "\n", "\n", "### Instructions\n", "* **Tutorial files**: The tutorial files can be downloaded from [here](https://figshare.com/ndownloader/files/55894172).\n", "* **Extract the files**: `tar -zxvf protein-pca.tar.gz`\n", "* **Go to directory**: `cd protein-pca`\n", "* **Copy the Jupyter Notebook**: This notebook is available in the GitHub repo. [Download and copy it from the github](https://github.com/rjdkmr/gmx_clusterByFeatures/tree/master/docs/tutorials).\n", "\n", "\n", "### Required Tools\n", "* [HOLE2 - A tool to calculate channel/cavity radius](https://www.holeprogram.org/)\n", "* GROMACS\n", "* gmx_clusterByFeatures\n", "\n", "### Steps\n", "1. Preparation of reference structure\n", "2. Generate a new trajectory aligned on reference structure\n", "3. Plotting radius and residues distribution\n", "4. Calculation of cavity/channel radius\n", "5. Create feature-file\n", "6. Clustering using radius as features\n", "7. Analysis\n", "\n", "### 1. Preparation of reference structure\n", "\n", "A reference structure need to be prepared where cavity/channel axis align with one of the principal axis. For protein channel, it might be already done as channel-axis is mostly aligned along z-axis. In this case, I have already aligned the ligand-cavity axis along the Y-axis and it is provided as `inputs/ref_cavity.pdb`.\n", "\n", "\n", "\n", "### 2. Generate a new trajectory superimposed on reference structure\n", "\n", "`gmx_clusterByFeatures hole` has a functionality to superimpose the conformation for consistent calculation of cavity/channel radius along the given vector However, during clustering, the output central structures and cluster trajectory will be not superimposed and therefore, it is good to superimpose whole trajectory on reference structure so that cluster's central structures orientations will be same as of reference structure.\n", "\n", "We will use `gmx trjconv` to create a new superimposed trajectory with reference structre as follows.\n" ] }, { "cell_type": "code", "execution_count": 3, "id": "6da47265-89de-4570-b6a6-7ab7c0f2a8ed", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ " :-) GROMACS - gmx trjconv, 2025.2 (-:\n", "\n", "Executable: /opt/gromacs-2025/bin/gmx\n", "Data prefix: /opt/gromacs-2025\n", "Working dir: /home/raj/workspace/gmx_clusterByFeatrues/tutorials/protein-pca\n", "Command line:\n", " gmx trjconv -s inputs/ref_cavity.pdb -f inputs/onlyProtein.xtc -n inputs/onlyProtein.ndx -fit rot+trans -o traj_superimposed.xtc\n", "\n", "Will write xtc: Compressed trajectory (portable xdr format): xtc\n", "WARNING: all CONECT records are ignored\n", "\n", "WARNING: If there are molecules in the input trajectory file\n", " that are broken across periodic boundaries, they\n", " cannot be made whole (or treated as whole) without\n", " you providing a run input file.\n", "\n", "Group 0 ( System) has 8327 elements\n", "Group 1 ( Protein) has 8327 elements\n", "Group 2 ( Protein-H) has 4217 elements\n", "Group 3 ( C-alpha) has 542 elements\n", "Group 4 ( Backbone) has 1626 elements\n", "Group 5 ( MainChain) has 2169 elements\n", "Group 6 ( MainChain+Cb) has 2660 elements\n", "Group 7 ( MainChain+H) has 2668 elements\n", "Group 8 ( SideChain) has 5659 elements\n", "Group 9 ( SideChain-H) has 2048 elements\n", "Group 10 (r_5-535_&_MainChain) has 2124 elements\n", "Group 11 (r_5-535_&_MainChain_&_!r_72-85_&_!r_257-268_&_!r_338-352_&_!r_435-444) has 1920 elements\n", "Group 12 ( r_79_&_CA) has 1 elements\n", "Group 13 ( r_341_&_CA) has 1 elements\n", "Group 14 ( r_293_&_CA) has 1 elements\n", "Group 15 (r_10-535_&_MainChain) has 2104 elements\n", "Group 16 ( r_203) has 11 elements\n", "Group 17 ( r_447) has 18 elements\n", "Group 18 ( oxyanion_hole) has 52 elements\n", "Group 19 ( acyl_pocket) has 84 elements\n", "Group 20 ( choline_site) has 45 elements\n", "Group 21 (peripheral_site) has 106 elements\n", "Select a group: Group 0 ( System) has 8327 elements\n", "Group 1 ( Protein) has 8327 elements\n", "Group 2 ( Protein-H) has 4217 elements\n", "Group 3 ( C-alpha) has 542 elements\n", "Group 4 ( Backbone) has 1626 elements\n", "Group 5 ( MainChain) has 2169 elements\n", "Group 6 ( MainChain+Cb) has 2660 elements\n", "Group 7 ( MainChain+H) has 2668 elements\n", "Group 8 ( SideChain) has 5659 elements\n", "Group 9 ( SideChain-H) has 2048 elements\n", "Group 10 (r_5-535_&_MainChain) has 2124 elements\n", "Group 11 (r_5-535_&_MainChain_&_!r_72-85_&_!r_257-268_&_!r_338-352_&_!r_435-444) has 1920 elements\n", "Group 12 ( r_79_&_CA) has 1 elements\n", "Group 13 ( r_341_&_CA) has 1 elements\n", "Group 14 ( r_293_&_CA) has 1 elements\n", "Group 15 (r_10-535_&_MainChain) has 2104 elements\n", "Group 16 ( r_203) has 11 elements\n", "Group 17 ( r_447) has 18 elements\n", "Group 18 ( oxyanion_hole) has 52 elements\n", "Group 19 ( acyl_pocket) has 84 elements\n", "Group 20 ( choline_site) has 45 elements\n", "Group 21 (peripheral_site) has 106 elements\n", "Reading frame 0 time 0.000 \n", "Precision of inputs/onlyProtein.xtc is 0.001 (nm)\n", "Using output precision of 0.001 (nm)\n", "Reading frame 22000 time 2200000.000 -> frame 21999 time 2199900.000 \n", "\n", "Last written: frame 22500 time 2250000.000\n", "\n", "\n", "GROMACS reminds you: \"We can make it into a friend class. But I don't like having friends.\" (Joe Jordan)\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Note that major changes are planned in future for trjconv, to improve usability and utility.\n", "\n", "WARNING: Masses and atomic (Van der Waals) radii will be guessed\n", " based on residue and atom names, since they could not be\n", " definitively assigned from the information in your input\n", " files. These guessed numbers might deviate from the mass\n", " and radius of the atom type. Please check the output\n", " files if necessary. Note, that this functionality may\n", " be removed in a future GROMACS version. Please, consider\n", " using another file format for your input.\n", "\n", "Select group for least squares fit\n", "Selected 11: 'r_5-535_&_MainChain_&_!r_72-85_&_!r_257-268_&_!r_338-352_&_!r_435-444'\n", "Select group for output\n", "Selected 0: 'System'\n" ] } ], "source": [ "%%bash\n", "\n", "rm traj_superimposed.xtc\n", "\n", "echo 11 0 | gmx trjconv -s inputs/ref_cavity.pdb -f inputs/onlyProtein.xtc -n inputs/onlyProtein.ndx \\\n", " -fit rot+trans -o traj_superimposed.xtc" ] }, { "cell_type": "markdown", "id": "a193257b-c2cb-4aa8-8f19-dfb91abff921", "metadata": {}, "source": [ "### 3. Calculation of cavity/channel radius\n", "\n", "We will use [HOLE2](https://www.holeprogram.org/) tool to calculate ligand cavity. It should be installed and `hole` should be in PATH. HOLE accept only one pdb file as input. `gmx_clusterByFeatures hole` use external `hole` and calculate radius from each frame automating the entire calculation for the trajectory.\n", "\n", "Following options are used here:\n", "* `-catmid 1903` - seed atom from where radius calculation will start in both the direction along the axis\n", "* `-rad bondi` - radius of atoms\n", "* `-cvect 0 1 0` - vector along which radius will be calculated. Here, it is Y-axis.\n", "* `-endrad 12` - stop radius calculation when this threshold radius is reached. It means, now it is in solvent phase.\n", "* `-sample 0.5` - slab size along the cavity/channel axis. Radius is calculated in each slab\n", "* `-mcstep 2000` - Monte-Carlo step size. See details in HOLE manual\n", "\n", "This command will dump radius data as a function of time in `radius.dat` file" ] }, { "cell_type": "code", "execution_count": 1, "id": "4c601695-a0f1-4144-9a76-de21c5cc8a84", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ " :-) GROMACS - gmx_clusterByFeatures hole, 2025.0-dev-20250210-6949615-local (-:\n", "\n", "Executable: gmx_clusterByFeatures hole\n", "Data prefix: /project/external/gmx_installed\n", "Working dir: /home/raj/workspace/gmx_clusterByFeatrues/tutorials/protein-pca\n", "Command line:\n", " 'gmx_clusterByFeatures hole' -f traj_superimposed.xtc -s inputs/ref_cavity.pdb -n inputs/onlyProtein.ndx -rad bondi -catmid 1903 -cvect 0 1 0 -endrad 12 -sample 0.5 -mcstep 2000\n", "\n", "\n", " :-) gmx_clusterByFeatures hole (-:\n", " \n", " Author: Rajendra Kumar\n", " \n", " Copyright (C) 2018-2025 Rajendra Kumar\n", " \n", " \n", " gmx_clusterByFeatures is a free software: you can redistribute it and/or modify\n", " it under the terms of the GNU General Public License as published by\n", " the Free Software Foundation, either version 3 of the License, or\n", " (at your option) any later version.\n", " \n", " gmx_clusterByFeatures is distributed in the hope that it will be useful,\n", " but WITHOUT ANY WARRANTY; without even the implied warranty of\n", " MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n", " GNU General Public License for more details.\n", " \n", " You should have received a copy of the GNU General Public License\n", " along with gmx_clusterByFeatures. If not, see .\n", " \n", " THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n", " \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n", " LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n", " A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT\n", " OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n", " SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED\n", " TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR\n", " PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF\n", " LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING\n", " NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS\n", " SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n", " \n", "WARNING: all CONECT records are ignored\n", "Group 0 ( System) has 8327 elements\n", "Group 1 ( Protein) has 8327 elements\n", "Group 2 ( Protein-H) has 4217 elements\n", "Group 3 ( C-alpha) has 542 elements\n", "Group 4 ( Backbone) has 1626 elements\n", "Group 5 ( MainChain) has 2169 elements\n", "Group 6 ( MainChain+Cb) has 2660 elements\n", "Group 7 ( MainChain+H) has 2668 elements\n", "Group 8 ( SideChain) has 5659 elements\n", "Group 9 ( SideChain-H) has 2048 elements\n", "Group 10 (r_5-535_&_MainChain) has 2124 elements\n", "Group 11 (r_5-535_&_MainChain_&_!r_72-85_&_!r_257-268_&_!r_338-352_&_!r_435-444) has 1920 elements\n", "Group 12 ( r_79_&_CA) has 1 elements\n", "Group 13 ( r_341_&_CA) has 1 elements\n", "Group 14 ( r_293_&_CA) has 1 elements\n", "Group 15 (r_10-535_&_MainChain) has 2104 elements\n", "Group 16 ( r_203) has 11 elements\n", "Group 17 ( r_447) has 18 elements\n", "Group 18 ( oxyanion_hole) has 52 elements\n", "Group 19 ( acyl_pocket) has 84 elements\n", "Group 20 ( choline_site) has 45 elements\n", "Group 21 (peripheral_site) has 106 elements\n", "Select a group: Group 0 ( System) has 8327 elements\n", "Group 1 ( Protein) has 8327 elements\n", "Group 2 ( Protein-H) has 4217 elements\n", "Group 3 ( C-alpha) has 542 elements\n", "Group 4 ( Backbone) has 1626 elements\n", "Group 5 ( MainChain) has 2169 elements\n", "Group 6 ( MainChain+Cb) has 2660 elements\n", "Group 7 ( MainChain+H) has 2668 elements\n", "Group 8 ( SideChain) has 5659 elements\n", "Group 9 ( SideChain-H) has 2048 elements\n", "Group 10 (r_5-535_&_MainChain) has 2124 elements\n", "Group 11 (r_5-535_&_MainChain_&_!r_72-85_&_!r_257-268_&_!r_338-352_&_!r_435-444) has 1920 elements\n", "Group 12 ( r_79_&_CA) has 1 elements\n", "Group 13 ( r_341_&_CA) has 1 elements\n", "Group 14 ( r_293_&_CA) has 1 elements\n", "Group 15 (r_10-535_&_MainChain) has 2104 elements\n", "Group 16 ( r_203) has 11 elements\n", "Group 17 ( r_447) has 18 elements\n", "Group 18 ( oxyanion_hole) has 52 elements\n", "Group 19 ( acyl_pocket) has 84 elements\n", "Group 20 ( choline_site) has 45 elements\n", "Group 21 (peripheral_site) has 106 elements\n", "Reading frame 22000 time 2200000.000 \n", "-------- -------- ------------------- -------- ----------\n", "O.S. Smart, J.M. Goodfellow and B.A. Wallace (1993)\n", "The Pore Dimensions of Gramicidin A\n", "Biophysical Journal 65:2455-2460.\n", "-------- -------- ------------------- -------- ----------\n", "\n", "GROMACS reminds you: \"Never replicate a successful experiment.\" (Fett's law.)\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Choose a group for the least squares fit\n", "Selected 11: 'r_5-535_&_MainChain_&_!r_72-85_&_!r_257-268_&_!r_338-352_&_!r_435-444'\n", "\n", "Choose a group for hole calculation...\n", "Selected 1: 'Protein'\n", "\n", "\n", "Selected \"H\" atom of \"GLY-122\" as seed for channel/cavity\n", "\n", "Thanks for using gmx_hole!!!\n", "\n", "++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++\n" ] } ], "source": [ "%%bash\n", "\n", "rm radius.dat\n", "\n", "echo 11 1 | gmx_clusterByFeatures hole -f traj_superimposed.xtc -s inputs/ref_cavity.pdb -n inputs/onlyProtein.ndx \\\n", " -rad bondi -catmid 1903 -cvect 0 1 0 -endrad 12 -sample 0.5 -mcstep 2000" ] }, { "cell_type": "markdown", "id": "74b90af4-b5dc-4765-8650-27a6f321641c", "metadata": {}, "source": [ "### 4. Plotting radius and residues distribution\n", "\n", "To plot radius value distributions and residues outlining cavity/channel positions distribution over the simulations, \n", "`gmx_clusterByFeatures holeplot` sub-command can be used. It reads radius file generated in the above command, extract\n", "radius values for the given segment of the cavity/channel, calculate average/distributions and plot it as a function\n", "of cavity/channel axis.\n", "\n", "**Options**\n", "* `-resplot` - enable plotting of residues positions distributions\n", "* `-vplot` - distributions are plotted as violin-plot\n", "* `-xmin` - minimum value of cavity/channel axis position to consider\n", "* `-xmax` - maximum value of cavity/channel axis position to consider\n", "* `-endrad` - ignore if radius is greater than this value (mostly at the opening end)\n", "* `-gap` - slab size along the cavity/channel\n", "* `-ax` - X, Y or Z axis along which cavity/channel axis is aligned\n", "* `-rlsize` - font-size of residues in residue-distribution plots." ] }, { "cell_type": "code", "execution_count": 1, "id": "f43d1f96-c22f-4ff5-8c68-49bb0d1bc795", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Reading file : radius.dat\n", "Reading frame 0 at time 0.000\n", "Maximum axis range is: (0.000 to 17.000)\n", "Reading frame 20000 at time 2000000.000\n", "Finishid reading.... Total number of frame read = 22501\n", "After removing axis points with any missing data:\n", "------------------------------------------------------------\n", "# Axis Mean radius Std. deviation\n", "------------------------------------------------------------\n", " 0.000 0.864 0.286\n", " 0.500 0.867 0.336\n", " 1.000 0.889 0.394\n", " 1.500 0.904 0.442\n", " 2.000 0.920 0.460\n", " 2.500 0.958 0.449\n", " 3.000 0.993 0.455\n", " 3.500 1.043 0.466\n", " 4.000 1.144 0.448\n", " 4.500 1.288 0.414\n", " 5.000 1.449 0.386\n", " 5.500 1.614 0.387\n", " 6.000 1.750 0.451\n", " 6.500 1.907 0.534\n", " 7.000 2.081 0.594\n", " 7.500 2.146 0.631\n", " 8.000 2.142 0.637\n", " 8.500 2.127 0.610\n", " 9.000 2.111 0.581\n", " 9.500 2.085 0.569\n", " 10.000 2.036 0.571\n", " 10.500 1.984 0.567\n", " 11.000 1.948 0.559\n", " 11.500 1.925 0.561\n", " 12.000 1.915 0.571\n", " 12.500 1.916 0.574\n", " 13.000 1.925 0.572\n", " 13.500 1.938 0.595\n", " 14.000 1.963 0.646\n", " 14.500 2.001 0.715\n", " 15.000 2.049 0.796\n", " 15.500 2.088 0.884\n", " 16.000 2.113 0.965\n", " 16.500 2.178 1.044\n", "------------------------------------------------------------\n" ] } ], "source": [ "%%bash\n", "\n", "gmx_clusterByFeatures holeplot -i radius.dat -resplot -vplot -o average_radius_distribution.png \\\n", " -xmin 0 -xmax 17 -endrad 6 -gap 0.5 -ax Y -rlsize 7" ] }, { "cell_type": "markdown", "id": "eb84c06c-9299-433f-8b16-450eceb5bcb2", "metadata": {}, "source": [ " " ] }, { "cell_type": "markdown", "id": "450fd568-1c28-4981-8b1a-064bcaac10d7", "metadata": {}, "source": [ "**Observations**\n", "\n", "As can be seen aboive, radius of cavity flucuated by at-least 2Å and much more along the cavity.\n", "Therefore, it does not provide a true picture of different shapes sampled during the simulations.\n", "Clustering could unravel all those shapes sampled during the simulations.\n", "\n", "### 5. Create feature file\n", "\n", "In the next step, we will create a file containing cavity radius as features for clustering. `gmx_clusterByFeatures holefeatures` can be used to create this file using the input `radius.dat` file, which is obtained in the previous step.\n", "\n", "**Options**\n", "* `-xmin` Minimum axis position value above which radius will be considered. Here, we take it as 5.5 because GLY122 is present at the bottom most region of the cavity and residue-distribution shows that GLY122 postions start from 5.5Å along the axis.\n", "* `-xmax` Maximum axis position value below which radius will be considered\n", "* `-endrad` Maximum radius to be considered for opening end of the cavity\n", "* `-gap` Radius calculation slab size\n", "* `-ax` Cavity/Channel axis" ] }, { "cell_type": "code", "execution_count": 2, "id": "58512d4d-c6c5-4380-a668-038425488e72", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Reading file : radius.dat\n", "Reading frame 0 at time 0.000\n", "Maximum axis range is: (5.500 to 17.000)\n", "Reading frame 20000 at time 2000000.000\n", "Finishid reading.... Total number of frame read = 22501\n", "After removing axis points with any missing data:\n", "------------------------------------------------------------\n", "# Axis Mean radius Std. deviation\n", "------------------------------------------------------------\n", " 5.500 1.629 0.384\n", " 6.000 1.750 0.451\n", " 6.500 1.907 0.534\n", " 7.000 2.081 0.594\n", " 7.500 2.146 0.631\n", " 8.000 2.142 0.637\n", " 8.500 2.127 0.610\n", " 9.000 2.111 0.581\n", " 9.500 2.085 0.569\n", " 10.000 2.036 0.571\n", " 10.500 1.984 0.567\n", " 11.000 1.948 0.559\n", " 11.500 1.925 0.561\n", " 12.000 1.915 0.571\n", " 12.500 1.916 0.574\n", " 13.000 1.925 0.572\n", " 13.500 1.938 0.595\n", " 14.000 1.963 0.646\n", " 14.500 2.001 0.715\n", " 15.000 2.049 0.796\n", " 15.500 2.088 0.884\n", " 16.000 2.113 0.965\n", " 16.500 2.178 1.044\n", "------------------------------------------------------------\n", "WARNING: radius value for the axis point was not calculated by hole. Zero will be written as feature value. However, large number of missing values will lead to wrong clustering. Therefore, please try to minimize or eliminate the missing values by changing axis-point range using xmin and xmax option and/or dataOccupancy option.\n", "(23, 22501)\n" ] } ], "source": [ "%%bash\n", "\n", "gmx_clusterByFeatures holefeatures -i radius.dat -o cavity_features.xvg \\\n", " -xmin 5.5 -xmax 17 -endrad 6 -gap 0.5 -ax Y" ] }, { "cell_type": "markdown", "id": "9bba5720-1604-4db7-a30a-91fc8ab83ac9", "metadata": {}, "source": [ "### 6. Clustering using radius as features\n", "\n", "Now, we use radius as the features and perform clustering. We will use [K-Means clustering](https://en.wikipedia.org/wiki/K-means_clustering). However, there is one **drawback** that the number of clusters should be known beforehand. `gmx_clusterByFeatures` implements several [cluster metrics](https://gmx-clusterbyfeatures.readthedocs.io/en/latest/commands/cluster.html#cmetric-prior) and also automatated way to decide number of clusters. We will use Elbow method to automatically determine the number of clusters.\n", "\n", "Following command will perform the clustering of conformations on the basis of cavity-radius profile. Explanation of options are as follows:\n", "* `-method kmeans`: Use K-Means clustering algorithm\n", "* `-ncluster 15`: K-Means clustering will be performed for 1 upto 15 clusters times each time. Finally, based on `-ssrchange` option, final number of clusters will be automatically selected.\n", "* `-cmetric ssr-sst`: Use Elbow method to decide final number of clusters.\n", "* `-nfeature 23`: Take 23 features from the feature file. Here it is 23 slabs for which cavity radius is dumped in feature file `cavity_features.xvg`.\n", "* `-sort features`: Sort the output clustered trajectory based on the distance in feature space from its central structure.\n", "* `-ssrchange 1.5`: Threshold (percentage) of change in SSR/SST ratio in Elbow method to automatically decide the number of clusters.\n", "* `-g cluster.log`: Log file containing clustering information and cluster-mterics.\n", "\n", "**index group order**\n", "\n", "1. **First index** group - Output group of atoms in the central structures\n", "\n", "2. **Second index** group - Group of atoms to calculate RMSD between central conformations of clusters as RMSD matrix, which is dumped in the **log file** with `-g` option. Here, it is Ligand without hydrogen atoms.\n", " \n", "3. **Third index** group - Used for Superimposition by least-square fitting. ONLY used in separate clustered trajectories to superimpose conformations on the central structure. \n", "\n", "\n", "**Content of the output `-g clustering-radius.log` file**\n", "\n", "It contains the command summary, and for each input cluster-numbers, number of frames in each clusters. At the end it dumps the **Cluster Metrics Summary**, which is important for deciding final number of clusters." ] }, { "cell_type": "code", "execution_count": 3, "id": "27121246-744b-4fb4-914c-295d0bc5dc6e", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "mkdir: cannot create directory ‘clustering-radius-output’: File exists\n", " :-) GROMACS - gmx_clusterByFeatures cluster, 2025.0-dev-20250210-6949615-local (-:\n", "\n", "Executable: gmx_clusterByFeatures cluster\n", "Data prefix: /project/external/gmx_installed\n", "Working dir: /home/raj/workspace/gmx_clusterByFeatrues/tutorials/protein-pca\n", "Command line:\n", " 'gmx_clusterByFeatures cluster' -f traj_superimposed.xtc -s inputs/ref_cavity.pdb -n inputs/onlyProtein.ndx -feat cavity_features.xvg -method kmeans -nfeature 23 -cmetric ssr-sst -ssrchange 1.5 -ncluster 15 -cpdb clustering-radius-output/central.pdb -g clustering-radius.log -clid cluster-id-radius.xvg\n", "\n", "\n", " :-) gmx_clusterByFeatures cluster (-:\n", "\n", " Author: Rajendra Kumar\n", "\n", " Copyright (C) 2018-2019 Rajendra Kumar\n", "\n", "\n", "gmx_clusterByFeatures is a free software: you can redistribute it and/or modify\n", "it under the terms of the GNU General Public License as published by\n", "the Free Software Foundation, either version 3 of the License, or\n", "(at your option) any later version.\n", "\n", "gmx_clusterByFeatures is distributed in the hope that it will be useful,\n", "but WITHOUT ANY WARRANTY; without even the implied warranty of\n", "MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n", "GNU General Public License for more details.\n", "\n", "You should have received a copy of the GNU General Public License\n", "along with gmx_clusterByFeatures. If not, see .\n", "\n", "THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n", "\"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n", "LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n", "A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT\n", "OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n", "SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED\n", "TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR\n", "PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF\n", "LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING\n", "NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS\n", "SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n", " \n", "\n", "Back Off! I just backed up clustering-radius.log to ./#clustering-radius.log.2#\n", "WARNING: all CONECT records are ignored\n", "Group 0 ( System) has 8327 elements\n", "Group 1 ( Protein) has 8327 elements\n", "Group 2 ( Protein-H) has 4217 elements\n", "Group 3 ( C-alpha) has 542 elements\n", "Group 4 ( Backbone) has 1626 elements\n", "Group 5 ( MainChain) has 2169 elements\n", "Group 6 ( MainChain+Cb) has 2660 elements\n", "Group 7 ( MainChain+H) has 2668 elements\n", "Group 8 ( SideChain) has 5659 elements\n", "Group 9 ( SideChain-H) has 2048 elements\n", "Group 10 (r_5-535_&_MainChain) has 2124 elements\n", "Group 11 (r_5-535_&_MainChain_&_!r_72-85_&_!r_257-268_&_!r_338-352_&_!r_435-444) has 1920 elements\n", "Group 12 ( r_79_&_CA) has 1 elements\n", "Group 13 ( r_341_&_CA) has 1 elements\n", "Group 14 ( r_293_&_CA) has 1 elements\n", "Group 15 (r_10-535_&_MainChain) has 2104 elements\n", "Group 16 ( r_203) has 11 elements\n", "Group 17 ( r_447) has 18 elements\n", "Group 18 ( oxyanion_hole) has 52 elements\n", "Group 19 ( acyl_pocket) has 84 elements\n", "Group 20 ( choline_site) has 45 elements\n", "Group 21 (peripheral_site) has 106 elements\n", "Select a group: Group 0 ( System) has 8327 elements\n", "Group 1 ( Protein) has 8327 elements\n", "Group 2 ( Protein-H) has 4217 elements\n", "Group 3 ( C-alpha) has 542 elements\n", "Group 4 ( Backbone) has 1626 elements\n", "Group 5 ( MainChain) has 2169 elements\n", "Group 6 ( MainChain+Cb) has 2660 elements\n", "Group 7 ( MainChain+H) has 2668 elements\n", "Group 8 ( SideChain) has 5659 elements\n", "Group 9 ( SideChain-H) has 2048 elements\n", "Group 10 (r_5-535_&_MainChain) has 2124 elements\n", "Group 11 (r_5-535_&_MainChain_&_!r_72-85_&_!r_257-268_&_!r_338-352_&_!r_435-444) has 1920 elements\n", "Group 12 ( r_79_&_CA) has 1 elements\n", "Group 13 ( r_341_&_CA) has 1 elements\n", "Group 14 ( r_293_&_CA) has 1 elements\n", "Group 15 (r_10-535_&_MainChain) has 2104 elements\n", "Group 16 ( r_203) has 11 elements\n", "Group 17 ( r_447) has 18 elements\n", "Group 18 ( oxyanion_hole) has 52 elements\n", "Group 19 ( acyl_pocket) has 84 elements\n", "Group 20 ( choline_site) has 45 elements\n", "Group 21 (peripheral_site) has 106 elements\n", "Select a group: Group 0 ( System) has 8327 elements\n", "Group 1 ( Protein) has 8327 elements\n", "Group 2 ( Protein-H) has 4217 elements\n", "Group 3 ( C-alpha) has 542 elements\n", "Group 4 ( Backbone) has 1626 elements\n", "Group 5 ( MainChain) has 2169 elements\n", "Group 6 ( MainChain+Cb) has 2660 elements\n", "Group 7 ( MainChain+H) has 2668 elements\n", "Group 8 ( SideChain) has 5659 elements\n", "Group 9 ( SideChain-H) has 2048 elements\n", "Group 10 (r_5-535_&_MainChain) has 2124 elements\n", "Group 11 (r_5-535_&_MainChain_&_!r_72-85_&_!r_257-268_&_!r_338-352_&_!r_435-444) has 1920 elements\n", "Group 12 ( r_79_&_CA) has 1 elements\n", "Group 13 ( r_341_&_CA) has 1 elements\n", "Group 14 ( r_293_&_CA) has 1 elements\n", "Group 15 (r_10-535_&_MainChain) has 2104 elements\n", "Group 16 ( r_203) has 11 elements\n", "Group 17 ( r_447) has 18 elements\n", "Group 18 ( oxyanion_hole) has 52 elements\n", "Group 19 ( acyl_pocket) has 84 elements\n", "Group 20 ( choline_site) has 45 elements\n", "Group 21 (peripheral_site) has 106 elements\n", "Reading frame 4 time 400.000 " ] }, { "name": "stdout", "output_type": "stream", "text": [ "=======================\n", " Cluster Log output \n", "=======================\n", "\n", "Command:\n", "=======================\n", "'gmx_clusterByFeatures cluster' -f traj_superimposed.xtc -s inputs/ref_cavity.pdb -n inputs/onlyProtein.ndx -feat cavity_features.xvg -method kmeans -nfeature 23 -cmetric ssr-sst -ssrchange 1.5 -ncluster 15 -cpdb clustering-radius-output/central.pdb -g clustering-radius.log -clid cluster-id-radius.xvg\n", "=======================\n", "\n", "Choose a group for the output:\n", "Selected 0: 'System'\n", "\n", "Choose a group for clustering/RMSD calculation:\n", "Selected 10: 'r_5-535_&_MainChain'\n", "\n", "Choose a group for fitting or superposition:\n", "Selected 11: 'r_5-535_&_MainChain_&_!r_72-85_&_!r_257-268_&_!r_338-352_&_!r_435-444'\n", "\n", "\n", " Input Trajectory dt = 100 ps\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 1 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t22501\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 2 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t11602\n", "2\t\t10899\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 3 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t10730\n", "2\t\t9339\n", "3\t\t2432\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 4 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t6985\n", "2\t\t6795\n", "3\t\t6561\n", "4\t\t2160\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 5 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t6529\n", "2\t\t6088\n", "3\t\t5529\n", "4\t\t2354\n", "5\t\t2001\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 6 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t5775\n", "2\t\t5088\n", "3\t\t4794\n", "4\t\t3109\n", "5\t\t1922\n", "6\t\t1813\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 7 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t5268\n", "2\t\t4624\n", "3\t\t3265\n", "4\t\t3174\n", "5\t\t2646\n", "6\t\t1767\n", "7\t\t1757\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 8 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t4331\n", "2\t\t3408\n", "3\t\t3075\n", "4\t\t2879\n", "5\t\t2867\n", "6\t\t2719\n", "7\t\t1742\n", "8\t\t1480\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 9 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t4341\n", "2\t\t3427\n", "3\t\t3083\n", "4\t\t2782\n", "5\t\t2730\n", "6\t\t2683\n", "7\t\t1495\n", "8\t\t1139\n", "9\t\t821\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 10 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t4230\n", "2\t\t3259\n", "3\t\t3048\n", "4\t\t2659\n", "5\t\t2538\n", "6\t\t2271\n", "7\t\t1482\n", "8\t\t1405\n", "9\t\t811\n", "10\t\t798\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 11 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t3098\n", "2\t\t3006\n", "3\t\t2976\n", "4\t\t2940\n", "5\t\t2501\n", "6\t\t2009\n", "7\t\t1538\n", "8\t\t1443\n", "9\t\t1405\n", "10\t\t794\n", "11\t\t791\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 12 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t3011\n", "2\t\t2996\n", "3\t\t2735\n", "4\t\t2691\n", "5\t\t2073\n", "6\t\t1907\n", "7\t\t1489\n", "8\t\t1396\n", "9\t\t1320\n", "10\t\t1302\n", "11\t\t795\n", "12\t\t786\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 13 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t2946\n", "2\t\t2685\n", "3\t\t2572\n", "4\t\t2233\n", "5\t\t1988\n", "6\t\t1916\n", "7\t\t1596\n", "8\t\t1373\n", "9\t\t1326\n", "10\t\t1191\n", "11\t\t1105\n", "12\t\t792\n", "13\t\t778\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 14 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t2910\n", "2\t\t2698\n", "3\t\t2431\n", "4\t\t2107\n", "5\t\t1946\n", "6\t\t1919\n", "7\t\t1618\n", "8\t\t1393\n", "9\t\t1392\n", "10\t\t1218\n", "11\t\t919\n", "12\t\t734\n", "13\t\t733\n", "14\t\t483\n", "===========================\n", "\n", "\n", "\n", "###########################################\n", "########## NUMBER OF CLUSTERS : 15 ########\n", "###########################################\n", "\n", "===========================\n", "Cluster-ID\tTotalFrames\n", "1\t\t2508\n", "2\t\t2322\n", "3\t\t2069\n", "4\t\t1926\n", "5\t\t1917\n", "6\t\t1676\n", "7\t\t1668\n", "8\t\t1555\n", "9\t\t1350\n", "10\t\t1319\n", "11\t\t1285\n", "12\t\t944\n", "13\t\t752\n", "14\t\t723\n", "15\t\t487\n", "===========================\n", "\n", "\n", "\n", "===========================================================================================================\n", " Cluster Metrics Summary \n", "-----------------------------------------------------------------------------------------------------------\n", "Clusters SSR/SST D(SSR/SST) (Psuedo)F-stat. Silhouette-score Davies-bouldin-score\n", "2 39.50 39.503 14691.213 0.347 1.121 \n", "3 50.59 11.086 11517.190 0.337 1.117 \n", "4 57.15 6.563 10002.451 0.240 1.329 \n", "5 62.30 5.150 9294.526 0.241 1.271 \n", "6 65.75 3.452 8638.371 0.239 1.232 \n", "7 68.68 2.930 8222.448 0.231 1.272 \n", "8 70.67 1.983 7741.149 0.212 1.304 \n", "9 72.57 1.902 7437.596 0.218 1.263 \n", "10 74.22 1.647 7192.800 0.218 1.202 \n", "11 75.34 1.123 6870.415 0.209 1.273 \n", "12 76.28 0.946 6576.296 0.202 1.314 \n", "13 77.03 0.750 6286.124 0.200 1.299 \n", "14 77.87 0.837 6087.276 0.201 1.297 \n", "15 78.54 0.671 5879.271 0.196 1.331 \n", "===========================================================================================================\n", "\n", "\n", "#####################################\n", "Final number of cluster selected: 10\n", "#####################################\n", "\n", "Calculating central structure for cluster-10 ..." ] }, { "name": "stderr", "output_type": "stream", "text": [ "Reading frame 9 time 407500.000 \n", "Back Off! I just backed up cluster-id-radius.xvg to ./#cluster-id-radius.xvg.2#\n", "Reading frame 9 time 407500.000 \n", "GROMACS reminds you: \"Christianity may be OK between consenting adults in private but should not be taught to young children.\" (Francis Crick)\n", "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "===========================================\n", "Cluster-ID\tCentral Frame\tTotal Frames \n", "1\t\t19488\t\t4230\n", "2\t\t9807\t\t3259\n", "3\t\t1873\t\t3048\n", "4\t\t14884\t\t2659\n", "5\t\t20134\t\t2538\n", "6\t\t13371\t\t2271\n", "7\t\t22103\t\t1482\n", "8\t\t2417\t\t1405\n", "9\t\t21652\t\t811\n", "10\t\t4075\t\t798\n", "===========================================\n", "\n", "\n", "\n", "Extracting coordinates of the central structure...\n", "\n", "\n", "Calculating RMSD between central structures...\n", "\n", "\n", "=====================================\n", " Central structurs - RMSD matrix \n", "=====================================\n", " c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 \n", " 0.000 0.162 0.219 0.173 0.115 0.213 0.169 0.242 0.167 0.233 \n", " 0.162 0.000 0.172 0.174 0.160 0.148 0.159 0.184 0.161 0.183 \n", " 0.219 0.172 0.000 0.231 0.221 0.188 0.190 0.142 0.204 0.154 \n", " 0.173 0.174 0.231 0.000 0.166 0.220 0.162 0.241 0.169 0.230 \n", " 0.115 0.160 0.221 0.166 0.000 0.213 0.164 0.247 0.159 0.242 \n", " 0.213 0.148 0.188 0.220 0.213 0.000 0.214 0.175 0.208 0.174 \n", " 0.169 0.159 0.190 0.162 0.164 0.214 0.000 0.213 0.127 0.211 \n", " 0.242 0.184 0.142 0.241 0.247 0.175 0.213 0.000 0.229 0.140 \n", " 0.167 0.161 0.204 0.169 0.159 0.208 0.127 0.229 0.000 0.225 \n", " 0.233 0.183 0.154 0.230 0.242 0.174 0.211 0.140 0.225 0.000 \n", "=====================================\n", "\n", "\n", "Writing central structure to pdb-files...\n" ] } ], "source": [ "%%bash\n", "\n", "mkdir clustering-radius-output\n", "\n", "echo 0 10 11 | gmx_clusterByFeatures cluster -f traj_superimposed.xtc -s inputs/ref_cavity.pdb -n inputs/onlyProtein.ndx \\\n", " -feat cavity_features.xvg -method kmeans -nfeature 23 -cmetric ssr-sst -ssrchange 1.5 -ncluster 15 \\\n", " -cpdb clustering-radius-output/central.pdb -g clustering-radius.log -clid cluster-id-radius.xvg" ] }, { "cell_type": "markdown", "id": "dd57bb94-06c1-4b6b-81a6-06ef5c94cbb1", "metadata": {}, "source": [ "### 7. Analysis and visualizations\n", "\n", "Now, we will perform few analysis and visualizations.\n", "\n", "1. Calculating and plotting average cavity radius cluster-wise\n", "2. Visualizing clustered cavity shapes\n", "\n", "#### 1. Calculating and plotting average cavity radius cluster-wise\n", "\n", "`gmx_clusterByFeatures holeclustersplot` can be used to plot the average radius of each clusters. The options to the sub-commands are as follows:\n", "* `-xmin` - minimum value of cavity/channel axis position to consider\n", "* `-xmax` - maximum value of cavity/channel axis position to consider\n", "* `-endrad` - ignore if radius is greater than this value (mostly at the opening end)\n", "* `-gap` - slab size along the cavity/channel\n", "* `-ax` - X, Y or Z axis along which cavity/channel axis is aligned\n", "* `-dl` - Do not consider last two smallest clusters for calculation and plotting.\n", "* `-csv radius-clusterwise.csv` - Dump the average and standard-deviation of each clusters' radius to a csv file\n", "* `-o radius-cluster.png` - Plot the average radius values from each cluster" ] }, { "cell_type": "code", "execution_count": 4, "id": "bd093476-3795-4806-b6b1-031792886460", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Reading file : radius.dat\n", "Reading frame 0 at time 0.000\n", "Maximum axis range is: (5.500 to 17.000)\n", "Reading frame 20000 at time 2000000.000\n", "Finishid reading.... Total number of frame read = 22501\n", "After removing axis points with any missing data:\n", "------------------------------------------------------------\n", "# Axis Mean radius Std. deviation\n", "------------------------------------------------------------\n", " 5.500 1.629 0.384\n", " 6.000 1.750 0.451\n", " 6.500 1.907 0.534\n", " 7.000 2.081 0.594\n", " 7.500 2.146 0.631\n", " 8.000 2.142 0.637\n", " 8.500 2.127 0.610\n", " 9.000 2.111 0.581\n", " 9.500 2.085 0.569\n", " 10.000 2.036 0.571\n", " 10.500 1.984 0.567\n", " 11.000 1.948 0.559\n", " 11.500 1.925 0.561\n", " 12.000 1.915 0.571\n", " 12.500 1.916 0.574\n", " 13.000 1.925 0.572\n", " 13.500 1.938 0.595\n", " 14.000 1.963 0.646\n", " 14.500 2.001 0.715\n", " 15.000 2.049 0.796\n", " 15.500 2.088 0.884\n", " 16.000 2.113 0.965\n", " 16.500 2.178 1.044\n", "------------------------------------------------------------\n" ] } ], "source": [ "%%bash\n", "\n", "gmx_clusterByFeatures holeclustersplot -i radius.dat -clid cluster-id-radius.xvg -o radius-cluster.png -csv radius-clusterwise.csv \\\n", " -xmin 5.5 -xmax 17 -endrad 6 -gap 0.5 -ax Y -dl 2 -wd 8" ] }, { "cell_type": "markdown", "id": "53a9b5b3-211e-4532-afe3-26489025a8f5", "metadata": {}, "source": [ " " ] }, { "cell_type": "markdown", "id": "d248a665-0391-4589-ab31-ea89470c5f34", "metadata": {}, "source": [ "#### 2. Visualizing clustered cavity shapes\n", "\n", "To visualize the cavity shape of each cluster, we will use central structure pdb file and calculate the radius with same options as we did it originally with whole trajectory.\n", "This step also generate a pdb file with `-pdb` option containg cavity shape. \n", "\n", "Subsequently, we will use `sph_process` and `sos_triangle` tools (provided with HOLE package) to generate a file that can be used to draw the cavity in the **VMD**.\n", "\n", "Note: Remove `%%capture --no-stdout` and `%%capture --no-stderr` to populate all the output and errors generated from these commands. " ] }, { "cell_type": "code", "execution_count": 5, "id": "a358421e-b8ac-492c-b9d3-2e849f70ae70", "metadata": {}, "outputs": [], "source": [ "%%capture --no-stderr\n", "%%capture --no-stdout\n", "%%bash\n", "\n", "for i in 1 2 3 4 5 6 7 8\n", "do\n", "input=central_c${i}\n", "\n", "echo 11 1 | gmx_clusterByFeatures hole -f clustering-radius-output/${input}.pdb -s inputs/ref_cavity.pdb -n inputs/onlyProtein.ndx -rad bondi \\\n", " -catmid 1903 -cvect 0 1 0 -endrad 12 -sample 0.5 -pdb ${input}_sph -o dummy_radius.dat\n", "mv ${input}_sph.pdb ${input}.sph\n", "\n", "rm ${input}.sos ${input}.vmd_plot dummy_radius.dat\n", "sph_process -sos -dotden 15 -color ${input}.sph ${input}.sos\n", "sos_triangle -s < ${input}.sos > cavity-snapshots/${input}.vmd_plot\n", "rm ${input}.sph ${input}.sos\n", "done" ] }, { "cell_type": "markdown", "id": "7dd7b05a-567a-4a38-a244-4a846429f0e3", "metadata": {}, "source": [ "**Visualizing cavity shape using VMD**\n", "\n", "Above commands generate `cluster_.vmd_plot` files that can be loaded in VMD to visualize the cavity.\n", "\n", "1. Go to the cavity-snapshots library\n", "2. At first load the central strucure by using follwoing command: `vmd ../clustering-radius-output/central_c1.pdb -e snap.vmd`. This command will load central structure of first crystal structure and sets several representations.\n", "3. In VMD Main Window, click on `Analysis -> Extensions -> Tk Console`. It will open the Tk console. Write `source central_c1.vmd_plot` and it will plot the cavity." ] }, { "cell_type": "markdown", "id": "d0a4b704-22c7-40cf-b1a7-5feb404095c3", "metadata": {}, "source": [ "" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.18" } }, "nbformat": 4, "nbformat_minor": 5 }