PMapper perceives pharmacophoric properties1
of atoms of given molecular
structures. When identified, pharmacophoric features are assigned (mapped) to
corresponding atoms and stored along with the structure. This information
can be used in searching large compound libraries for structures exhibiting
similar therapeutic activity.
In this document a brief overview of pharmacophore feature identification
is given first, while further documents describe how pharmacophores are used
in various software tools developed by ChemAxon (like
ScreenMD, JKlustor).
pmapper [<options>] [<input files>]
Prepare the usage of the pmapper script or batch file
as described in Preparing the Usage of JChem
Batch Files and Shell Scripts.
Alternatively, the PMapper class can be directly invoked:
Win32 / Java 2 (assuming that JChem is installed in c:\jchem):
java -cp "c:\jchem\lib\jchem.jar;%CLASSPATH%" \
chemaxon.pharmacophore.PMapper [<options>] [<input files>]
Unix / Java 2 (assuming that JChem is installed in /usr/local/jchem):
java -cp "/usr/local/jchem/lib/jchem.jar:$CLASSPATH" \
chemaxon.pharmacophore.PMapper [<options>] [<input files>]
-h, --help this help message
-c, --config <filepath> path of the XML configuration file
-o, --output <filepath> output file path (default: stdout)
-t, --tag name of the SDFile tag to store the
Pharmacophore Map (default: PMAP)
-S, --sdf-output SDF output (otherwise only PMAP list)
-g, --ignore-error continue with next molecule on error
-v, --verbose print calculation warnings to the console
The command line parameter --config is mandatory. This
specifies the path and filename of a configuration file without which the
program cannot operate. A detailed description of the format of this
configuration file is given below.
The standardizer configuration can be given under a separate
StandardizerConfiguration element node in the main configuration XML
specified with the --config command line parameter.
These transformations ensure an unambiguous molecule representation among the
possible mezomer and tautomer forms of the same molecule. Other possible
standardization include aromatization and hydrogenization.
The standardization is only used to determine the pharmacophore points, but the
original molecule will be written to the output. If this parameter is missing
and no StandardizerConfiguration node is found in the configuration XML
then no standardization is performed on the molecules. For a complete description
of the standardization procedure and its configuration see the
Standardizer Documentation.
If the command line parameter --ignore-error is specified, then import/export errors
will not stop the processing but the error is written to the console and the molecule is skipped.
By default, the program exits in case of molecule import/export erros.
Most molecular file formats are accepted ( MDL molfile, Compressed molfile, SDfile, Compressed SDfile, SMILES, etc.).
If no input file name is given in the command line the standard input is read.
If no output file name is given, results are written to the standard output.
Unless -P (or --PMAPonly) is specified,
the default output format is SDfile. The pharmacophore map tag is written
into the output file, the default tag name is PMAP. This name can
be changed by specifying the --tag option.
If the only output required is the pharmacophore map generated, then
the --PMAPonly flag has to be specified.
Pharmacophore map data (value associated with the PMAP tag) can be
visualized using
MarvinView. In order to do so a
palette definition property file
has to be provided using the -p flag of mview and the
name of the pharmacophore map flag (which is PMAP by default) has to be specified
using the -t flag.
Pharmacophore maps generated by PMapper are
substantially determined by the configuration file (specified following the
--config mandatory command line parameter). The configuration
file specifies
The configuration file is an XML file.
The main pharmacophore point
definition is given in the <Pharmacophores>
section. Its <AtomSet> subsections define the
pharmacophore point types: its unique id, its symbol used in the output as attributes
and the definition expression in the text subsection.
The <Search> section specifies the substructure
search attributes that are different from the default setting.
See the Molecule Evaluator Documentation on Search Attributes
for a precise description of the possible attributes and their default values.
The <Functions> section enables the user to add
user-defined functions to the set of functions accessible from the expression strings: the
<ID> attribute defines the reference name of the function, the
<Class> specifies the function java class.
The <Mols> section defines molecule constants used in
the pharmacophore definition expressions (e.g. as query structures). These molecule constants
are either given by molecule files (strictly speaking paths and names of molecular files) or
as smiles strings.
The <Plugins> section defines the calculation plugin
modules used in quantitative approximation of chemical properties. The charge, pKa,
logP calculations can be referenced in this way: the <ID> attribute
defines the reference name of the plugin (this name is used in the expressions), the
<Class> specifies the plugin java class and the <Param>
subnodes provide the plugin parameters as name-value pairs. For a complete description of the
plugin mechanism see calculator plugins.
Example
<?xml version="1.0" encoding="UTF-8"?>
<!-- Pharmacophore configuration file -->
<PharmacophoreConfiguration Version ="0.3">
<Mols>
<Mol ID="pos" Structure="../PD/Positive.mol"/>
<Mol ID="amine" Structure="../PD/Amine.mol"/>
<Mol ID="amide" Structure="../PD/Amide.mol"/>
<Mol ID="nitro" Structure="../PD/Nitro.mol"/>
<Mol ID="hydrazide" Structure="../PD/Hydrazide.mol"/>
<Mol ID="amidine" Structure="../PD/Amidine.mol"/>
</Mols>
The molecular files referred in the above pharmacophore configuration XML file contain the molecular structures displayed below.
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
In these structural queries atoms may optionally be labeled with a map indices, which identify certain atoms of the pharmacophore group. Map indices must be unique among the atoms og a given molecule. Map indices can be used in pharmacophore type definition rules.
The <Pharmacophores> section is the pharmacophore
definition section. In this section rules for every pharmacophore types can be
defined. These
rules combine the pharmacophore point feature definitions
(query structures), preconfigured java method calls
(molecule evaluator) including
quantitative approximation of chemical properties by invoking
calculator modules (Calculator Plugins)
in logical expressions.
We provide two sample XML configurations.
<Search StereoCareChecking="false"/>
In this example we set StereoCareChecking to false and leave
all other attributes at their default values.
<Functions>
<Function ID="data" Class="chemaxon.util.expression.function.MolUtil"/>
</Functions>
The chemaxon.util.expression.function.MolUtil java class is configured to
be referenced with the name data from the expression strings. For example,
data("atomcount") returns the number of atoms in the input molecule. Here
"atomcount" is a function argument of
chemaxon.util.expression.function.MolUtil
denoting the type of data enquiried.
<Pharmacophores>
<AtomSet ID="Cationic" Symbol="+">
<![CDATA[ (pos && !nitro:1) ||
((amine:1 || hydrazine || amidine) && !(amide:1 || nitro:1)) ]]>
</AtomSet>
</Pharmacophores>
In the above example only one rule is introduced which combines several
query structures to define the cationic pharmacophore feature. According to the
rule an atom can be considered cationic if it matches the substructure defined in
the Positive.mol or Amine.mol or Hydrazine.mol
or Amidine.mol files, but not
that in the Amide.mol and Nitro.mol files at the
corresponding positions.
More meaningfully, if it has a positive formal charge, or it is the nitrogen in
an aliphatic amine or hydrazine, but it is neither an amide nor notro group
nitrogen, or it is the carbon of either an amidine or guanidine group, but not
carbamide carbon.
In the pharmacophore type definition rules map indices of the atoms can be referred after the ID (symbolic internal name) of the corresponding structure, separated by a semicolon, and in such case only the atom specified by the map index is subject to matching. Multiple atom map indices can also be used, these should be separated with comma characters. In the case when no map index is specified in the logical expression all atoms of the query are matched and evaluated.
A more complex example (pharma-calc.xml) is given below.
<Functions>
<Function ID="formalcharge" Class="chemaxon.util.expression.function.AtomProperties">
<Param Name="property" Value="charge"/>
<Param Name="pH" Value="7"/>
</Function>
</Functions>
<Plugins>
<Plugin ID="acceptor" Class="chemaxon.marvin.calculations.HBDAPlugin">
<Param Name="type" Value="acc"/>
<Param Name="pH" Value="7"/>
</Plugin>
<Plugin ID="donor" Class="chemaxon.marvin.calculations.HBDAPlugin">
<Param Name="type" Value="don"/>
<Param Name="pH" Value="7"/>
</Plugin>
<Plugin ID="ioncharge" Class="chemaxon.marvin.calculations.IonChargePlugin">
<Param Name="pH" Value="7"/>
</Plugin>
</Plugins>
This is an example for calculator plugin decalaration with parameters. The IonCharge plugin
calculates the distribution for the ionic forms of the input molecule, takes those forms for
which the occurrence percentage is at least 1 percent at pH=7
taking into account maximum 8 ionizable atoms to avoid combinatorial
explosure. The plugin sorts the atomic charge values on each ionic form according to descending
occurrence percentage, the above declared min and max functions
enable us to take the minimum or maximum charge value among these on each atom.
Take an example for a complete set of pharmacophore property definitions using this declaration:
<Mols>
<Mol ID="arom" Structure="[*;a]"/>
<Mol ID="cx" Structure="[C,F,Cl,Br,I,At]"/>
</Mols>
<Pharmacophores>
<AtomSet ID="Aromatic" Symbol="r">arom</AtomSet>
<AtomSet ID="Cationic" Symbol="+"> 0) || (ioncharge() > 0.4) ]]></AtomSet>
<AtomSet ID="Anionic" Symbol="-"></AtomSet>
<AtomSet ID="HydrogenBondDonor" Symbol="d"></AtomSet>
<AtomSet ID="HydrogenBondAcceptor" Symbol="a"></AtomSet>
<AtomSet ID="Hydrophobic" Symbol="h"></AtomSet>
</Pharmacophores>
This configuration defines the following pharmacophore properties:
0.4, both taken on the major
microspecies at pH 7, while
-0.4, both taken on the major
microspecies at pH 7.
7.
7.
Note that in this case we reference previously defined atom sets with the atom set ID enclosed in
{...} brackets (e.g. {Cationic}). The reason for this is that atom sets are
java.util.BitSet objects and without this special notation writing the ID of an atom set
in an expression refers to the java.util.BitSet object itself.
In the pharmacophore definitions above by {Cationic} we refer to the elements of the
BitSet object, in other words we refer to those atom indices i for which
Cationic.get(i) returns true. The {Cationic} notation
is added to the standard expression syntax to enable the processing of this additional set-theoretic
functionality.
cd test/pharmacophore pmapper -c Pharmacophores.xml
One map per line is written on the output:
h;h;h;a h;h;h;a/d r;r;r;r;r;r;h;h;h d;h;h;+/d;-;-/a;-/a
nci10000.smiles located in the ./test/pharmacophore
directory and writes results in the file named nci10000.sdf to be
created in the same directory:
pmapper -S -c Pharmacophores.xml nci10000.smiles -o nci10000.sdf
pmapper -S -c Pharmacophores.xml -s Standardizer.xml nci10000.smiles -o nci10000.sdf
pmapper -S -c Pharmacophores.xml med10000.sdf -o med10000-out.sdf mview -t PMAP -p Colors.ini med10000.sdf
A screen-shot of MarvinView showing the structures with colored pharmacophore points:
pmapper -S -c Pharmacophores.xml med10000.sdf | mview -p colors.ini -t PMAP -
Note, that such piping does not work in Windows.