Summer Research Fellowship Programme of India's Science Academies 2017
In silico investigation of the interaction between milk
protein and withaferin-A
Vijeta Srivastava
Sam Higginbottom University of Agriculture, Technology and Sciences, Allahabad, Uttar Pradesh
Guided by
Bhushan Keshav Patwardhan
Savitribai Phule Pune University, Pune, Maharashtra
1. Introduction
Withania somnifera (WS) is an Ayurvedic rasayana botanical. It is an important medicinal
botanical that has been used in Ayurvedic and indigenous medicine for over 3,000 years and
has reported nootropic and anti-cancer properties. It is commonly known as Indian winter
cherry or Ashwagandha. The name Ashwagandha is derived from the word Ashwa, meaning
“horse”, which is a symbol of muscle power and stamina. WS is reported to be beneficial
for health issues such as stress, depression, muscle fatigue, neck and back stiffness,
tiredness, joint pain, arthritis of all types, calf myalgia (lower leg pain), blood pressure,
diabetic neuropathy and weakness due to diabetes, erectile dysfunction, low sperm count and
premature ejaculation [1].
According to Ayurveda, WS has a growth-promoting effect in growing children (8-12 years)
when administered with cow’s milk, and promotes rejuvenation [1]. This study aimed to
investigate this property by evaluating the interaction between milk proteins and the active
component (bioactive) of WS. Pharmacological studies have reported that Withaferin A (WA)
is one of the key bioactives of WS. In view of its varied therapeutic potential, it has also been
the subject of considerable modern scientific attention. The major chemical constituents of
the Withania genus, the withanolides, are a group of naturally occurring C28 steroidal lactone
triterpenoids built on an intact or rearranged ergostane framework, in which C-22 and C-26
are appropriately oxidized to form a six-membered lactone ring [1].
The principal constituents of milk are water, fat, protein, lactose (milk sugar) and minerals
(salts). Milk also contains trace amounts of other substances such as pigments, enzymes,
vitamins, phospholipids (substances with fat-like properties) and gases. Milk contains several
important nutrients; 200 ml of milk provides calcium, protein, iodine, potassium, phosphorus
and vitamins B2 and B12. The protein component of milk is composed of numerous specific
proteins, of which the primary group is the caseins. Casein has an amino acid sequence
suited to supporting the growth and nourishment of an individual. Milk from the species
Bos taurus is commonly used for human consumption. This milk contains 32 g/l of protein,
of which 80% is casein and the remaining 20% is composed of α-lactalbumin and β-lactoglobulin.
Casein exists in particles containing many thousands of individual protein molecules, known
as casein micelles [2]. These particles give milk its unique properties. The micelles are
composed of four casein proteins, α-S1 (38%), α-S2 (10%), β (40%) and κ (12%), which exist
in a ratio of approximately 4:1:4:1, together with calcium phosphate [2,4].
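The composition figures above can be turned into approximate concentrations with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the numbers quoted in the text (32 g/l protein, 80% casein, and the four casein shares); the function and variable names are illustrative and not part of the study.

```python
# Figures quoted in the text: 32 g/l total protein, 80% of it casein,
# and casein split as alpha-S1 38%, alpha-S2 10%, beta 40%, kappa 12%.
TOTAL_PROTEIN_G_PER_L = 32.0
CASEIN_FRACTION = 0.80
CASEIN_SHARES = {"alpha-S1": 0.38, "alpha-S2": 0.10, "beta": 0.40, "kappa": 0.12}


def casein_concentrations(total_protein=TOTAL_PROTEIN_G_PER_L,
                          casein_fraction=CASEIN_FRACTION,
                          shares=CASEIN_SHARES):
    """Return (total casein, {casein name: concentration}) in g/l."""
    total_casein = total_protein * casein_fraction
    per_casein = {name: total_casein * share for name, share in shares.items()}
    return total_casein, per_casein


if __name__ == "__main__":
    total, parts = casein_concentrations()
    print(f"total casein: {total:.1f} g/l")
    for name, grams in parts.items():
        print(f"{name}: {grams:.2f} g/l")
```

Dividing the four shares by the smallest one gives roughly 3.8 : 1 : 4 : 1.2, which is consistent with the approximate 4:1:4:1 ratio stated in the text.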
This study hypothesizes that the milk protein casein acts as a carrier vehicle for the bioactive
WA, which can enhance the bioavailability and function of the latter. This hypothesis is
derived from two observations: it is a common practice in Ayurveda to fortify
milk with powdered WS root, and scientific studies have reported an increase in the
bioavailability of some compounds through the carrier function of casein proteins [2].
Aim: to investigate the interaction between cow’s milk proteins and Withaferin A.
Objectives
o Literature review of WS as a health supplement.
o Understanding of protein structures and protein-ligand interactions.
o Molecular docking evaluation of the interaction between the milk protein casein and
the bioactive Withaferin A.
2. Materials and methods
2.1. Literature survey
The literature survey was carried out using the NCBI database PubMed
(https://www.ncbi.nlm.nih.gov/pubmed). PubMed is a free search engine accessing primarily
the MEDLINE database of references and abstracts on life sciences and biomedical topics.
The United States National Library of Medicine (NLM) at the National Institutes of
Health maintains the database as part of the Entrez system of information retrieval. The
keywords used for the search were [Withania somnifera, growth], [Withania somnifera, health] and
[Withania somnifera, nutrition].
2.2. Software
AutoDock and AutoDock Vina: http://autodock.scripps.edu/downloads
AutoDock Tools (MGLTools): http://mgltools.scripps.edu/downloads
Swiss PDB Viewer: http://spdbv.vital-it.ch/disclaim.html
Open Babel: http://openbabel.org/wiki/Main_Page
Discovery Studio
2.3. Understanding of protein structure and ligand interaction
Protein structure was studied with reference to a textbook of biochemistry [3], covering the
primary, secondary, tertiary and quaternary levels of structure. The mechanisms of
protein-ligand interaction were also studied.
2.4. Molecular docking and visualization
2.4.1. Ligand structure retrieval
The structure of WA was obtained from the PubChem database. The structure and chemical
information are given in Table 1. The 3D structure was downloaded in SDF file format and
converted to PDB (Protein Data Bank) file format using Open Babel.
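The SDF to PDB conversion can be scripted. The sketch below is one possible way to do it, assuming the Open Babel command-line program obabel is installed and on the PATH; the file names passed to the helper are hypothetical, and the report does not state which Open Babel interface was actually used.

```python
import shutil
import subprocess


def sdf_to_pdb_command(sdf_path, pdb_path):
    """Build the Open Babel command line for an SDF -> PDB conversion."""
    # 'obabel <input> -O <output>' infers both formats from the file extensions.
    return ["obabel", sdf_path, "-O", pdb_path]


def convert(sdf_path, pdb_path):
    """Run the conversion, failing early if obabel is not installed."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel executable not found on PATH")
    subprocess.run(sdf_to_pdb_command(sdf_path, pdb_path), check=True)
```

For example, convert("withaferin_a.sdf", "withaferin_a.pdb") would write the PDB file next to the (hypothetically named) SDF file downloaded from PubChem.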
Table 1. Structure of Withaferin A and its chemical property.
Structure: [structure diagram of Withaferin A]
Chemical information:
IUPAC name: (4β,5β,6β,22R)-4,27-Dihydroxy-5,6:22,26-diepoxyergosta-2,24-diene-1,26-dione
Chemical formula: C₂₈H₃₈O₆
Molar mass: 470.61 g·mol⁻¹
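The molar mass in Table 1 can be checked directly from the chemical formula. The sketch below uses standard atomic weights rounded to three decimals and a deliberately simple formula parser (no brackets or hydrates); the function names are illustrative. It also counts the non-hydrogen (heavy) atoms, a number that is commonly used when computing ligand efficiency.

```python
import re

# Standard atomic weights in g/mol, rounded to three decimals.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}


def parse_formula(formula):
    """Parse a simple formula such as 'C28H38O6' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def molar_mass(formula):
    """Molar mass in g/mol for formulas containing only C, H and O."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in parse_formula(formula).items())


def heavy_atom_count(formula):
    """Number of non-hydrogen atoms in the formula."""
    return sum(n for el, n in parse_formula(formula).items() if el != "H")


if __name__ == "__main__":
    print(f"{molar_mass('C28H38O6'):.2f} g/mol")
    print(heavy_atom_count("C28H38O6"), "heavy atoms")
```

With these weights the formula C28H38O6 gives 28 × 12.011 + 38 × 1.008 + 6 × 15.999 = 470.606 g/mol, which agrees with the tabulated 470.61 g·mol⁻¹ after rounding.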
2.4.2. Protein structure retrieval
A search of the RCSB Protein Data Bank for structures of the four casein isoforms (α-S1, α-S2,
β and κ) did not yield any results. Hence the structures were modelled from the FASTA
sequences of these proteins, obtained from the NCBI database, using the online tools
SWISS-MODEL and I-TASSER.
2.4.2.1. Protein modelling: Protein structures were obtained using SWISS-MODEL and
I-TASSER (Fig. 1).
Fig. 1. The software programs used to build protein models.
2.4.2.2. SWISS-MODEL: It is a structural bioinformatics web server. It is an automated
system for modelling the 3D structure of a protein from its amino acid sequence using
homology modelling techniques. Homology modelling, also known as comparative modelling,
constructs an atomic-resolution model of the target protein from its amino acid sequence and
the experimental structure of a related template protein; it is regarded as the most accurate
modelling method when a suitable template is available.
2.4.2.3. I-TASSER: It is a protein structure modelling approach based on
secondary-structure-enhanced profile-profile threading alignment (PPA) and the iterative
implementation of the Threading ASSEmbly Refinement (TASSER) program [5]. It detects
structure templates from the Protein Data Bank by a technique called fold recognition, or
threading. The full-length structure models are constructed by reassembling structural
fragments from the threading templates using replica-exchange Monte Carlo simulations (a
class of computational algorithms) [6]. I-TASSER has been extended to structure-based protein
function prediction, which provides annotations of ligand-binding sites, Gene Ontology terms
and Enzyme Commission numbers by structurally matching models of the target protein against
protein function databases [7].
2.4.3. Ramachandran plot analysis
A Ramachandran plot (also known as a Ramachandran diagram or a [φ,ψ] plot), originally
developed in 1963 by G. N. Ramachandran, C. Ramakrishnan, and V. Sasisekharan, is a way
to visualize energetically allowed regions for the backbone dihedral angles ψ against φ of
amino acid residues in a protein structure.
The best model was selected by Ramachandran plot analysis using the RAMPAGE
application. The choice of best model depended solely on the number of residues in the
favoured region, with the model having the most such residues being preferred.
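The φ and ψ angles plotted in a Ramachandran diagram are ordinary dihedral angles between four consecutive backbone atoms. The sketch below shows one standard way to compute a signed dihedral from Cartesian coordinates using only the Python standard library; it is an illustration of the geometry, not the algorithm used internally by RAMPAGE, and the helper names are made up for this example.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))
```

For residue i, φ is dihedral(C of residue i-1, N, Cα, C) and ψ is dihedral(N, Cα, C, N of residue i+1), with each atom given as an (x, y, z) tuple taken from the model's PDB file.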
2.4.4. Dock preparation
AutoDock is an automated procedure for predicting the interaction of ligands with
biomacromolecular targets. The idea of docking arises from problems in the design of bioactive
compounds, and in particular the field of computer-aided drug design. Progress in biomolecular
X-ray crystallography continues to provide important protein and nucleic acid structures.
These structures could be targets for bioactive agents in the control of animal and plant
diseases, or simply key to the understanding of fundamental aspects of biology. The precise
interaction of such molecules with their targets is important in the development process. The
aim of AutoDock 4 is to provide a computational tool that assists in determining the
structures of biomolecular complexes.
Energy minimization, also known as geometry optimization, is a computational approach for
finding a low-energy arrangement of the atoms in a structure. The main motive for performing
an energy minimization is to obtain a physically meaningful structure: optimized structures
often correspond to a substance as it is found in nature, and the geometry of such a structure
can be used in a variety of experimental and theoretical investigations. It is a pre-docking
step.
All the docking calculations were set up using AutoDock Tools with AutoDock 4.2. The protein
models were first prepared by adding polar hydrogen atoms and Kollman charges and then
selecting the model as the macromolecule, which produced the target PDBQT file. The
macromolecule was kept rigid, while all the torsional bonds of the ligand were set free to
rotate; aromatic carbons were also assigned in the ligand PDBQT file.
For preparation of the grid parameter file (gpf), a grid spacing (resolution) of 0.375 Å was
used, with a grid box of 80 × 80 × 80 points for the casein structure created by SWISS-MODEL
and 126 × 126 × 126 points for the casein structure created by I-TASSER. For the docking
parameter file (dpf), the genetic algorithm was used with default values. For each ligand,
separate docking calculations were performed using the Lamarckian genetic algorithm (LGA)
method and the dpf file was generated [4,8].
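The physical size of the search box follows from the grid settings. The sketch below assumes that the values 80 and 126 are AutoGrid npts values (numbers of grid intervals per axis) rather than lengths, which is how AutoDock grid dimensions are normally specified; under that assumption AutoGrid places npts + 1 points along each axis and the box edge is npts multiplied by the spacing. The function names are illustrative.

```python
def grid_box_edge(npts, spacing=0.375):
    """Edge length of the grid box in angstroms along one axis.

    Assumes npts is the AutoGrid 'npts' value (an even number of
    intervals), so the edge spans npts * spacing.
    """
    return npts * spacing


def grid_points(npts):
    """Total number of grid points for a cubic box with npts per axis."""
    return (npts + 1) ** 3


if __name__ == "__main__":
    for label, npts in (("SWISS-MODEL structure", 80), ("I-TASSER structure", 126)):
        print(label, grid_box_edge(npts), "angstrom edge,", grid_points(npts), "points")
```

Under this reading the SWISS-MODEL box is 30 Å on each side and the I-TASSER box is 47.25 Å on each side.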
2.4.5. Docking
Molecular docking involves computationally exploring a search space that is defined by the
molecular representation used by the method, and ranking the candidate solutions to determine
the best binding mode [9]. Cygwin64 was used to run the docking commands. Cygwin is a Unix-like
environment and command-line interface for Microsoft Windows. It provides native
integration of Windows-based applications, data, and other system resources with
the applications, software tools, and data of the Unix-like environment, and runs as an
application within the Windows operating context. Ten different runs were performed from the
Cygwin command line using autogrid4.exe and autodock4.exe; as a result, grid log (glg) and
docking log (dlg) files were created.
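The Cygwin commands used in docking were:
cd c:/
cd cygwin64
cd home
cd 1
# run AutoGrid (implied by the a.glg log below; the file name a.gpf is assumed)
autogrid4.exe -p a.gpf -l a.glg &
tail -f a.glg &
# run AutoDock and follow its log
autodock4.exe -p a.dpf -l a.dlg &
tail -f a.dlg &
# extract the docked poses and merge them with the target
grep '^DOCKED' a.dlg | cut -c9- > a.pdbqt
cut -c-66 a.pdbqt > a.pdb
cat target.pdb a.pdb | grep -v '^END' > complex.pdb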
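The last three commands extract the docked ligand coordinates from the docking log and merge them with the target. The same text processing can be sketched in Python, which may be convenient where the Unix tools are not available. This is an equivalent of the grep and cut pipeline under the assumption that docked poses are on lines beginning with "DOCKED: " as in AutoDock 4 logs; the function names are hypothetical.

```python
def extract_docked_pdb(dlg_text):
    """Mimic: grep '^DOCKED' a.dlg | cut -c9- | cut -c-66"""
    kept = []
    for line in dlg_text.splitlines():
        if line.startswith("DOCKED"):
            # Drop the 8-character 'DOCKED: ' prefix, then keep the first
            # 66 columns, which removes the PDBQT charge and type fields.
            kept.append(line[8:][:66])
    return "\n".join(kept)


def build_complex(target_pdb_text, ligand_pdb_text):
    """Mimic: cat target.pdb a.pdb | grep -v '^END'"""
    merged = target_pdb_text.splitlines() + ligand_pdb_text.splitlines()
    return "\n".join(line for line in merged if not line.startswith("END"))
```

In use, the contents of a.dlg and target.pdb would be read with open(...).read() and the result of build_complex written to complex.pdb. Like the shell version, this keeps every docked pose in one file, so a single pose still has to be selected before visualization.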
2.4.6. Visualization
2.4.6.1. Identification of best dock runs: The run with the most favourable (lowest, i.e. most
negative) binding energy and with hydrogen bonds present was considered the best complex.
Ligand efficiency and inhibition constant were the other major factors on which the choice
of best complex depended.
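The selection criteria above can be expressed numerically. The sketch below assumes the usual thermodynamic relation Ki = exp(ΔG / RT) at 298.15 K, with ΔG in kcal/mol, and the common definition of ligand efficiency as ΔG divided by the number of heavy atoms. The run data in the example are invented purely for illustration and are not results from this study.

```python
import math

R_CAL = 1.98719  # gas constant in cal/(mol*K)


def inhibition_constant(delta_g_kcal, temperature=298.15):
    """Estimated Ki in mol/l from a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal * 1000.0 / (R_CAL * temperature))


def ligand_efficiency(delta_g_kcal, heavy_atoms):
    """Binding energy per non-hydrogen atom (kcal/mol per atom)."""
    return delta_g_kcal / heavy_atoms


def best_run(runs):
    """Pick the run with the lowest energy among runs with hydrogen bonds."""
    with_hbonds = [r for r in runs if r["hbonds"] > 0]
    return min(with_hbonds or runs, key=lambda r: r["energy"])


if __name__ == "__main__":
    # Hypothetical runs, for illustration only.
    runs = [
        {"run": 1, "energy": -6.2, "hbonds": 1},
        {"run": 2, "energy": -8.0, "hbonds": 2},
        {"run": 3, "energy": -8.4, "hbonds": 0},
    ]
    chosen = best_run(runs)
    print(chosen["run"], inhibition_constant(chosen["energy"]),
          ligand_efficiency(chosen["energy"], 34))
```

The value 34 is the number of heavy atoms in Withaferin A (C28H38O6). A more negative binding energy gives a smaller Ki, which is why the lowest energy is treated as the best score.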
2.4.6.2. Conversion: The best complex file was converted into PDBQT format and used for
visualization and analysis.
2.4.6.3. Visualization with Discovery Studio: Discovery Studio 2017R6 was used for
visualization of the docking results. The ligand interaction mode was chosen to identify the
various interactions between the target and the ligand. Residues of the target were labelled
using the three-letter amino acid code and residue ID.
3. Results
3.1. Result of literature search
The literature search showed the pharmacological importance of WS, which is attributed
mainly to the withanolides present in the root. The efficacy of the medicinally active
withanolide constituents depends on their absorption and transport across the intestinal
epithelium [10].
WS has been reported to exhibit immunomodulatory, anticancer and other activities, and
withanolides have also been reported to have potential in breast cancer [11]. Studies have
shown the effects of WA on the proliferation and metastatic activity of AGS cells. WA exerted
a dose-dependent cytotoxic effect on AGS cells. The effect was associated with cell cycle
arrest at the G2/M phase and the expression of apoptotic proteins. Additionally, WA treatment
resulted in a decrease in the migration and invasion ability of the AGS cells, as demonstrated
using a wound healing assay and a Boyden chamber assay. These results indicated that WA
directly inhibits the proliferation and metastatic activity of gastric cancer cells, and
suggested that WA may be developed as a drug for the treatment of gastric cancer [12].
WS also shows anti-malarial property and antiplasmodial activity [13]. The botanical has been
reported as effective for the treatment of sarcopenia which is the loss of muscle mass, strength
and function with ageing [14]. WS is also reported as an effective remedy for Parkinson's
disease (PD) [15].
3.2. Understanding of the protein structures and ligand interaction
The following points about protein structure and protein-ligand interaction were
studied.
Primary structure: Amino acids are linked to form a polypeptide chain by amide bonds
between the carboxyl group of one amino acid and the amino
group of the next. This linkage, called a peptide bond, has several important properties. First,
it is resistant to hydrolysis, so proteins are remarkably stable kinetically. Second, the
peptide group is planar because the C-N bond has considerable double-bond character. Third,
each peptide bond has both a hydrogen-bond donor (the NH group) and a hydrogen-bond acceptor
(the CO group). Hydrogen bonding between these backbone groups is a distinctive feature of
protein structure. Finally, the peptide bond is uncharged, which allows proteins to form tightly
packed globular structures with a significant amount of the backbone buried within the protein
interior. Because they are linear polymers, proteins can be described as sequences of amino
acids. Such sequences are written from the amino to the carboxyl terminus.
Secondary structure: Polypeptide chains can fold into regular structures such as the alpha
helix, the beta sheet, turns and loops. In the α helix, the polypeptide chain twists into a
tightly packed rod. Within the helix, the CO group of each amino acid is hydrogen bonded to the
NH group of the amino acid four residues further along the polypeptide chain. β strands
connected by NH-to-CO hydrogen bonds come together to form β sheets.
Tertiary structure: Water-soluble proteins fold into compact structures with non-polar cores.
The compact, asymmetric structure that an individual polypeptide attains is called its tertiary
structure. The tertiary structures of water-soluble proteins have features in common: (1) an
interior formed of amino acids with hydrophobic side chains and (2) a surface formed largely
of hydrophilic amino acids that interact with the aqueous environment. In contrast, proteins
that exist in a hydrophobic environment, such as membranes, show the inverse distribution: the
hydrophobic amino acids are on the surface to interact with the environment, whereas
hydrophilic groups are shielded from the environment in the interior of the protein.
Quaternary structure: Polypeptide chains can assemble into multisubunit structures. Proteins
consisting of more than one polypeptide chain display quaternary structure, and each
individual polypeptide chain is called a subunit. Quaternary structure can be as simple as two
identical subunits or as complex as dozens of different subunits. In most cases, the subunits
are held together by non-covalent bonds.
The amino acid sequence of a protein determines its three-dimensional structure: The
amino acid sequence completely determines the three-dimensional structure and hence all
other properties of a protein. Some proteins can be unfolded completely yet refold efficiently
when placed under conditions in which the folded form of the protein is stable. The amino
acid sequence of a protein is determined by the sequence of bases in a DNA molecule. This
one-dimensional sequence information is extended into the three-dimensional
world by the ability of proteins to fold spontaneously. Protein folding is a highly
cooperative process; structural intermediates between the unfolded and folded forms do not
accumulate. The versatility of proteins is further enhanced by covalent modifications. Such
modifications can incorporate functional groups not present in the 20 amino acids. Other
modifications are important to the regulation of protein activity. Through their structural
stability, diversity, and chemical reactivity, proteins make possible most of the key processes
associated with life.
Ligand interaction: Ligand-mediated signal transmission through molecular complementarity
is essential to all life processes; these chemical interactions constitute biological recognition
at the molecular level. The evolution of protein function depends on the development of
specific sites that bind ligand molecules. Ligand-binding capacity is
important for the regulation of biological functions. Protein-ligand interactions occur through
molecular mechanisms involving conformational changes between low-affinity and high-affinity
states. Ligand binding changes the state of the protein and hence its function.
Key concepts of protein ligand interaction
o Every biological reaction is initiated by a protein-ligand interaction step. Such reactions
rarely involve the binding of only a single ligand or only a single step.
o Binding of two or more ligands to the same protein indicates mutual interaction.
o Ligand binding plays an important role in the regulation of biological function.
o Ligand binding may lead to conformational changes in proteins.
o The affinity between ligand and macromolecule determines the strength of the interaction.
3.3. Molecular docking
3.3.1. Amino acid sequences of target proteins
The FASTA sequences used for model building are given below.
a) Alpha S1 casein:
>sp|P02662|CASA1_BOVIN Alpha-S1-casein OS=Bos taurus GN=CSN1S1 PE=1 SV=2
MKLLILTCLVAVALARPKHPIKHQGLPQEVLNENLLRFFVAPFPEVFGKEKVNELSK
DIGSESTEDQAMEDIKQMEAESISSSEEIVPNSVEQKHIQKEDVPSERYLGYLEQLLR
LKKYKVPQLEIVPNSAEERLHSMKEGIHAQQKEPMIGVNQELAYFYPELFRQFYQL
DAYPSGAWYYVPLGTQYTDAPSFSDIPNPIGSENSEKTTMPLW
b) Alpha S2 casein:
>sp|P02663|CASA2_BOVIN Alpha-S2-casein OS=Bos taurus GN=CSN1S2 PE=1 SV=2
MKFFIFTCLLAVALAKNTMEHVSSSEESIISQETYKQEKNMAINPSKENLCSTFCKEV
VRNANEEEYSIGSSSEESAEVATEEVKITVDDKHYQKALNEINQFYQKFPQYLQYLY
QGPIVLNPWDQVKRNAVPITPTLNREQLSTSEENSKKTVDMESTEVFTKKTKLTEEE
KNRLNFLKKISQRYQKFALPQYLKTVYQHQKAMKPWIQPKTKVIPYVRYL
c) Beta casein:
>sp|P02666|CASB_BOVIN Beta-casein OS=Bos taurus GN=CSN2 PE=1 SV=2
MKVLILACLVALALARELEELNVPGEIVESLSSSEESITRINKKIEKFQSEEQQQTEDE
LQDKIHPFAQTQSLVYPFPGPIPNSLPQNIPPLTQTPVVVPPFLQPEVMGVSKVKEAM
APKHKEMPFPKYPVEPFTESQSLTLTDVENLHLPLPLLQSWMHQPHQPLPPTVMFPP
QSVLSLSQSKVLPVPQKAVPYPQRDMPIQAFLLYQEPVLGPVRGPFPIIV
d) Kappa casein:
>sp|P02668|CASK_BOVIN Kappa-casein OS=Bos taurus GN=CSN3 PE=1 SV=1
MMKSFFLVVTILALTLPFLGAQEQNQEQPIRCEKDERFFSDKIAKYIPIQYVLSRYPS
YGLNYYQQKPVALINNQFLPYPYYAKPAAVRSPAQILQWQVLSNTVPAKSCQAQP
TTMARHPHPHLSFMAIPPKKNQDKTEIPTINTIASGEPTSTPTTEAVESTVATLEDSPE
VIESPPEINTVQVTSTAV
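Sequence records in this format are easy to check programmatically before they are submitted for modelling. The following sketch is a minimal FASTA reader that joins wrapped sequence lines and reports the length of each record; the function names are illustrative, and the accession helper assumes the UniProt-style header layout (sp|accession|name) seen in the records above.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


def accession(header):
    """Return the accession from a header of the form 'sp|P02662|NAME ...'."""
    return header.split("|")[1]


def sequence_lengths(text):
    """Map each accession to the number of residues in its sequence."""
    return {accession(h): len(seq) for h, seq in read_fasta(text).items()}
```

Applied to the alpha S1 casein record above, the four sequence lines add up to 214 residues, the same length reported for that protein in the modelling results.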
3.3.2. Models of casein
The models built for the casein isoforms using SWISS-MODEL and I-TASSER are given in Figs. 2-32.
a) Alpha S1 casein: 214 residues were found; 0 templates were found.
b) Alpha S2 casein models obtained from SWISS-MODEL: 222 residues were found and 4 models
were built.
Fig. 2. Model 01.
Fig. 3. Model 02.
Fig. 4. Model 03.
Fig. 5. Model 04.
Table 2. Analysis of best model of alpha S2 casein created by Swiss model.
Alpha S2 casein in Ramachandran plot
Swiss model    Percentage    Sequence identity (%)
Model 01       92.90         20.00
Model 02       100           24.39
Model 03       100           24.39
Model 04       92.90         20.00
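The comparison in Table 2 can be made explicit with a small ranking step. The sketch below uses the four rows of the table and assumes that the "Percentage" column is the percentage of residues in the favoured region of the Ramachandran plot, which is the selection criterion described in the methods; using sequence identity as a tie-breaker is an additional assumption made for this example.

```python
# (model name, Ramachandran percentage, sequence identity %), from Table 2.
MODELS = [
    ("Model 01", 92.90, 20.00),
    ("Model 02", 100.0, 24.39),
    ("Model 03", 100.0, 24.39),
    ("Model 04", 92.90, 20.00),
]


def rank_models(models):
    """Sort by favoured-region percentage, then identity, both descending."""
    return sorted(models, key=lambda m: (-m[1], -m[2], m[0]))


if __name__ == "__main__":
    for name, favoured, identity in rank_models(MODELS):
        print(name, favoured, identity)
```

On these numbers Model 02 and Model 03 tie on both columns, so the table alone does not separate them and a further criterion would be needed to choose between the two.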