[MMTK] Partial energy computation

Konrad Hinsen hinsen at cnrs-orleans.fr
Fri Nov 21 11:07:36 CET 2003

On Thursday 20 November 2003 04:14, Dmitriy Morozov wrote:

> As for my question, while your advice does explain the 4 times
> increase in the intersubset calculations, it does not explain
> why it takes so long to compute the terms within the subset (as

It does: for each subset, the list of force field terms is recomputed.

> well as the new timing for the intersubset calculations). The

That is something else indeed.

> adjusted code (that performs 1+20 energy evaluations (and times
> the 20)) is attached. Its output is:
> 2539.67 + 2753.09 + -299.20 = 4993.56
> Time as a whole 18.6157569885
> Time for the first set 12.9738460779
> Time for the second set 15.5259660482
> Time for between the sets 19.7227729559

That is pretty much what I would have expected. Note that most of the runtime 
is for the electrostatics, and the multipole method costs pretty much the 
same for a subset as for the whole system. The computation between the two 
sets is most expensive because some terms have to be subtracted (and are thus 
computed twice).
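
To make the subtraction concrete, here is a toy sketch in plain Python. It is not MMTK code: the inverse-distance pair term, the coordinates, and the function names are all made up for illustration. It assumes the energy between two sets is obtained as E(A+B) - E(A) - E(B), which is one reading of "some terms have to be subtracted": the terms within each set are evaluated once for the whole system and once more for the subset.

```python
import itertools
import math

def pair_energy(p, q):
    """Toy pair term: inverse distance between two points (hypothetical potential)."""
    return 1.0 / math.dist(p, q)

def energy(points):
    """Sum the toy pair term over all distinct pairs in `points`."""
    return sum(pair_energy(p, q) for p, q in itertools.combinations(points, 2))

# Two small, made-up sets of coordinates.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
set_b = [(3.0, 0.0, 0.0), (3.0, 1.0, 0.0)]

e_a = energy(set_a)              # terms within the first set
e_b = energy(set_b)              # terms within the second set
e_all = energy(set_a + set_b)    # whole system: evaluates the intra-set terms again

# Interaction between the sets by subtraction.
e_between = e_all - e_a - e_b

# Direct sum over cross pairs, for comparison only.
direct = sum(pair_energy(p, q) for p in set_a for q in set_b)
```

In the output quoted above, the same decomposition reads 2539.67 + 2753.09 + (-299.20) = 4993.56. A direct sum over cross pairs works for this toy term, but it is not available in the same form for a multipole-type method, which is why the subtraction route matters there.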

> Is the problem with using Collections for subsets? Is there any
> other way to describe subsets?

The collections cause no overhead except in the initial setup. Internally, the 
subset is defined as a bitvector.
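
As an illustration of the idea only, a subset can be represented by one bit per atom. The sketch below uses a plain Python integer as the bitvector; the helper names and the atom numbering are hypothetical and say nothing about MMTK's actual internal layout.

```python
def make_mask(atom_indices):
    """Build a bitvector (as a Python int) with one bit set per atom index."""
    mask = 0
    for i in atom_indices:
        mask |= 1 << i
    return mask

def contains(mask, i):
    """True if atom index i is a member of the subset encoded by mask."""
    return bool((mask >> i) & 1)

# Eight atoms split into two made-up subsets.
first = make_mask([0, 1, 2, 3])
second = make_mask([4, 5, 6, 7])

def is_cross_pair(i, j):
    """True if a pair term couples one atom from each subset."""
    return (contains(first, i) and contains(second, j)) or \
           (contains(first, j) and contains(second, i))
```

Membership tests on such a mask are cheap, which is consistent with the statement that the cost is in the initial setup rather than in the subset description itself.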

> What I'm trying to do is to split the work for computing the
> total energy of the system across multiple (a lot for a large
> system) processors by assigning each processor to compute the
> contributions to the energy due to a subset of atoms.

I see... but that is not what the energy evaluators in MMTK were designed for. 
For parallelization, one would split terms by algorithmic rather than 
physical criteria, and non-local methods (multipole, Ewald) require special 
treatment. The routines in MMTK are designed to study interaction energies.
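
A minimal sketch of what "algorithmic rather than physical" splitting could look like for explicit pair terms: the list of terms is dealt out evenly to workers, regardless of which atoms the terms involve. This is plain Python with a made-up pair term and made-up coordinates, not MMTK's parallelization scheme, and it does not cover non-local methods such as multipole or Ewald sums, which do not reduce to an explicit pair list.

```python
import itertools
import math

def pair_energy(p, q):
    """Toy pair term: inverse distance (hypothetical potential)."""
    return 1.0 / math.dist(p, q)

# Ten made-up points; all x coordinates differ, so no distance is zero.
points = [(float(i), float(i % 3), 0.0) for i in range(10)]

# The full list of pair terms, independent of any physical grouping of atoms.
pairs = list(itertools.combinations(range(len(points)), 2))

# Deal the terms out round-robin so every worker gets nearly the same load.
n_workers = 4
chunks = [pairs[w::n_workers] for w in range(n_workers)]

# Each partial sum could be computed by a separate worker.
partial = [sum(pair_energy(points[i], points[j]) for i, j in chunk)
           for chunk in chunks]
total = sum(partial)

# Serial reference value for comparison.
reference = sum(pair_energy(points[i], points[j]) for i, j in pairs)
```

Splitting by atom subsets instead would give each worker a different number of terms and would leave the cross terms between subsets to be handled separately.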

There is a parallelization scheme in MMTK, but it is efficient only for 
shared-memory machines with a small number of processors (I have used it with 
up to eight). Massive parallelism requires a different design and different 
algorithms, which are usually less efficient on single-processor machines. 
Parallelization is not a trivial task, unfortunately.

> be my next step). So, I would appreciate your opinion? Do you
> think this (splitting energy computation between the processors)
> is doable? Is it doable with MMTK?

Not with MMTK as it is today. I hope to be able to work on better 
parallelization methods in the near future.

Your best bet for massively parallel computations would probably be NAMD or 
Gromacs, but those are MD codes that cannot easily be made to do Monte Carlo.

> As a separate question, in Universe.energyEvaluator() I notice
> that it saves an instance of ForceField.EnergyEvaluator in
> _evaluator dictionary (that's a hashtable, right?) with subsets
> as keys - how does it hash subsets? Perturbations won't affect
> the hash, will they?

Not if the atoms involved are the same. Configuration changes don't matter: 
they change just the values of the terms, not their structure.
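
The behaviour described here can be sketched with a cache keyed on atom identity alone. The `Subset` class, the cache, and the "evaluator" below are hypothetical stand-ins written for this example; how MMTK actually hashes its subset objects is not shown in this message.

```python
class Subset:
    """Hypothetical subset of atoms that hashes by atom identity only."""

    def __init__(self, atom_ids):
        self.atom_ids = frozenset(atom_ids)

    def __hash__(self):
        return hash(self.atom_ids)

    def __eq__(self, other):
        return isinstance(other, Subset) and self.atom_ids == other.atom_ids

# Made-up configuration: atom id -> coordinates.
positions = {0: (0.0, 0.0, 0.0), 1: (1.0, 0.0, 0.0), 2: (0.0, 1.0, 0.0)}

evaluator_cache = {}
stats = {"builds": 0}

def get_evaluator(subset):
    """Return the cached evaluator for subset, building it on first use."""
    if subset not in evaluator_cache:
        stats["builds"] += 1
        # Stand-in for the expensive step of building the term list.
        evaluator_cache[subset] = sorted(subset.atom_ids)
    return evaluator_cache[subset]

before = get_evaluator(Subset([0, 1]))
positions[0] = (0.1, 0.0, 0.0)          # perturb the configuration
after = get_evaluator(Subset([1, 0]))   # same atoms, so the cache is hit
```

Because the key depends only on which atoms are in the subset, moving an atom changes the values the evaluator would compute but not which cached evaluator is used.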

Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-
Rue Charles Sadron                       | Fax:  +33-
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
