mmtk questions

Andrew Dalke dalke@tull.mag.com
Thu, 4 Sep 1997 13:04:25 -0700


From Konrad's response:
> Worse, I don't know enough about typical applications [with nucleic
> acids and lipids] to do the job right!

  Nor, alas, do I.  I just know other people use them.

> Whether
> they should inherit from the protein classes, or maybe both should
> inherit from a common base class "SequenceMolecule", depends on
> what they need to do, which I don't know.

  I don't think either is correct.  A sequence is just a specific
enumeration (and ordering) of residues, rather like what PeptideChain()
returns.
  Let me describe what I did for VMD:
A "molecule" contains a set of residues and atoms.  There could be
  several chemical molecules in a "molecule"
A residue contains references to the atoms
A fragment is a set of residues of the molecule which can be
  reached by following covalent bonds.  This is often equivalent
  to the standard chemical definition of a molecule, but helps
  catch strange things like two pieces of DNA with a protein linkage.
A protein/nucleic fragment is a set of protein/nucleic residues
  which can be reached by following backbone bonds

So there are several parallel indexings of the residues in a molecule.
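
The fragment definition above amounts to finding connected components
over residues.  Here is a minimal Python sketch of that idea.  It is not
VMD or MMTK code: the function name find_fragments, the dictionary
residue_of_atom, and the bond list format are all made up for
illustration.

```python
from collections import defaultdict


def find_fragments(residue_of_atom, bonds):
    """Group residues into fragments reachable through covalent bonds.

    residue_of_atom: dict mapping atom id -> residue id (hypothetical layout)
    bonds: iterable of (atom_a, atom_b) pairs (hypothetical layout)
    Returns a list of fragments, each a sorted list of residue ids.
    """
    # Residue-level adjacency, built from bonds that cross residue boundaries.
    neighbors = defaultdict(set)
    for atom_a, atom_b in bonds:
        res_a = residue_of_atom[atom_a]
        res_b = residue_of_atom[atom_b]
        if res_a != res_b:
            neighbors[res_a].add(res_b)
            neighbors[res_b].add(res_a)

    fragments = []
    seen = set()
    for start in sorted(set(residue_of_atom.values())):
        if start in seen:
            continue
        # Depth-first walk over everything reachable from this residue.
        seen.add(start)
        stack = [start]
        fragment = []
        while stack:
            residue = stack.pop()
            fragment.append(residue)
            for other in neighbors[residue]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        fragments.append(sorted(fragment))
    return fragments


# Two bonded residues (1 and 2) plus one isolated residue (3), e.g. a water.
residue_of_atom = {0: 1, 1: 1, 2: 2, 3: 2, 4: 3}
bonds = [(0, 1), (1, 2), (2, 3)]
print(find_fragments(residue_of_atom, bonds))
```

Passing only the backbone bonds instead of all covalent bonds would give
the protein/nucleic fragments in the same way, which is one reason the
indexings can live side by side without a class hierarchy.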

In general, I think class structures don't map very well to molecular
structures except for Molecule/Residue/Atom.

> Does anyone know of tendencies to agree on a better file format
> [than the PDB]?

  There are a couple I know of.  The "official" replacement for the
PDB is mmCIF (http://www.pdb.bnl.gov./mmcif.html) but I can't say
it is much better.  It still has an 80 column limit, uses continuation
markers and "implicit loops" like Fortran code, and doesn't have a
real idea of objects.
  The other is NCBI's ASN.1 format for MMDB, which provides a
C-based reader.  See http://www.ncbi.nlm.nih.gov/Structure/ .

> >   How can you detect between an all-hydrogen model and a united atom
> > or polar hydrogen model?

> That's not how it works. You specify the amount of hydrogens when
> you create a peptide chain object.

  Then I misunderstood.  I know that xplor lets me pick which
topologies I want, and I didn't see the equivalent option in MMTK,
so I figured it was guessing.  Actually, they are doing somewhat
different things.  By making you pick the topology file, xplor
lets you force the topology any way you want, while MMTK picks
the topology based on the number of hydrogens you ask for.
  Is that correct?

> I was surprised too that Amber treats [charges for terminals]
> differently!

It does make sense but I wonder about the usefulness of it.  I'm
guessing they used QM to get the charges, and when you start doing
that I want to know how much the charges are affected by the residue
neighbors.  For instance, how does G-Y affect the charges of the
tyrosine as compared to H-Y?  Also, I can't see why charmm's
parameters don't do that.

> > bondlist.  How can this be overridden?  For example, one
> This conforms to the Amber94 definition, and it can't be overridden
> other than by defining another force field.

  Again, this might be showing my inexperience in non-charmm force
fields.  However, I'm looking at charmm22/topallh22x.pro and
see entries like:

>  PRESIDUE FHEM ! FIX UP THE HEME BY DELETING UNWANTED AUTOGENERATED ANGLES
>                ! unliganded heme patch
>
>
>  DELE ANGLE  1NA   1FE   1NC
>  DELE ANGLE  1NB   1FE   1ND
>
>
>  END {FHEM}

and special entries for things like the HIS to heme attachment.  These
redefine some of the automatic assignment available in charmm, and
I don't know if the method from amber is smarter than this.
  Ions and metals have always been "strange."  How does MMTK deal with
zinc fingers?  This is a special case similar to S-S bonds.  Thus,
how can you guarantee the auto-generation will be correct for all the
possible systems thrown at it and not need some way to override that?
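
To make the question concrete, here is a rough Python sketch of
auto-generation followed by an explicit override, in the spirit of the
DELE ANGLE lines in the patch above.  It is not how charmm, xplor or
MMTK actually implement this; the function names autogenerate_angles
and apply_deletions are invented for the example, and only the atom
names come from the FHEM patch.

```python
from itertools import combinations


def autogenerate_angles(bonds):
    """Return every angle (first, center, last) implied by a bond list."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    angles = set()
    for center, bonded in neighbors.items():
        # Any two atoms bonded to the same center form an angle.
        for first, last in combinations(sorted(bonded), 2):
            angles.add((first, center, last))
    return angles


def apply_deletions(angles, deletions):
    """Drop the listed angles; A-B-C and C-B-A count as the same angle."""
    def canonical(angle):
        first, center, last = angle
        return (min(first, last), center, max(first, last))

    unwanted = {canonical(angle) for angle in deletions}
    return {angle for angle in angles if canonical(angle) not in unwanted}


# Iron bonded to the four heme nitrogens, named as in the patch.
bonds = [("1FE", "1NA"), ("1FE", "1NB"), ("1FE", "1NC"), ("1FE", "1ND")]
angles = autogenerate_angles(bonds)

# The two trans angles that the patch deletes.
patched = apply_deletions(angles, [("1NA", "1FE", "1NC"), ("1NB", "1FE", "1ND")])
print(len(angles), len(patched))
```

The point is only that the override step is a separate, user-visible
list, so a strange system can be fixed without defining a new force
field.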

> >   With the randomPointInBox function, adding a Monte Carlo-based
> > volume function should be easy.
>
> I don't quite understand what you mean, but I probably agree ;-)

  A way to find the volume of a protein:
    Find the box that encloses the protein
    Extend each side by the length of the largest atom's radius
    Let V be the volume of this box
    Choose N points randomly distributed in the box
    Let P be the number of points "inside" an atom
    The protein volume is (P/N +/- 1/sqrt(N) ) * V
(Unless I misremember my stats.)
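
The steps above can be written out in plain Python roughly as follows.
This is only a sketch: it does not use MMTK or its randomPointInBox
function, and the name monte_carlo_volume and the (x, y, z, radius)
tuple layout are assumptions made for the example.  For the error it
uses the binomial standard error sqrt(p*(1-p)/N) with p = P/N, which
never exceeds 1/(2*sqrt(N)), so the 1/sqrt(N) above is a safe upper
bound.

```python
import math
import random


def monte_carlo_volume(atoms, n_points=100000, seed=None):
    """Estimate the volume covered by a union of spheres.

    atoms: list of (x, y, z, radius) tuples (hypothetical layout)
    Returns (volume_estimate, standard_error), in the cube of the input units.
    """
    rng = random.Random(seed)

    # Bounding box of the atom centers, extended by the largest radius.
    max_radius = max(atom[3] for atom in atoms)
    low = [min(atom[i] for atom in atoms) - max_radius for i in range(3)]
    high = [max(atom[i] for atom in atoms) + max_radius for i in range(3)]
    box_volume = 1.0
    for i in range(3):
        box_volume *= high[i] - low[i]

    # Count the random points that fall inside at least one atom.
    hits = 0
    for _ in range(n_points):
        px, py, pz = (rng.uniform(low[i], high[i]) for i in range(3))
        if any((px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2 <= r * r
               for x, y, z, r in atoms):
            hits += 1

    fraction = hits / n_points
    std_err = math.sqrt(fraction * (1.0 - fraction) / n_points)
    return fraction * box_volume, std_err * box_volume


# Sanity check on one unit sphere, whose exact volume is 4*pi/3.
volume, error = monte_carlo_volume([(0.0, 0.0, 0.0, 1.0)], n_points=20000, seed=1)
print(volume, "+/-", error, "exact:", 4.0 * math.pi / 3.0)
```

The inner loop tests every atom for every point, which is fine for a
check like this but would want a grid or neighbor list for a real
protein.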

> If anyone wants to offer me a job as a full-time MMTK developer,
> I'd probably accept ;-)

  And if they are looking for someone else to assist ... :)

						Andrew
						dalke@mag.com