chemaxon.sss.search
Class MCS

java.lang.Object
  extended by chemaxon.sss.search.MCS

public class MCS
extends java.lang.Object

The MCS class finds the largest substructure common to two molecular structures.
It supports various options for matching the graphs of the two molecules against each other: atom type, bond type, charge and hybridization can each be either ignored or taken into account.

The present implementation searches for the connected MCES (maximum common edge subgraph).

The class also provides a simple command-line interface, which takes two molecular structures, then calculates and outputs their MCES.
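The typical call sequence (set the molecules, choose matching options, search, read the result) can be sketched as follows. This is only an illustration under stated assumptions: `CommonSearch` is a hypothetical stand-in interface that mirrors the documented method signatures so that the sketch compiles without JChem, and `ToySearch` is a toy implementation that treats strings as "molecules" and the longest common substring as the "common structure". Real code would call the same methods on an `MCS` object with `Molecule` arguments.

```java
// Hypothetical stand-in mirroring documented MCS methods; M plays the role of
// the Molecule class so that the sketch compiles without JChem.
interface CommonSearch<M> {
    void setMolecules(M query, M target);
    void setIgnoreBondType(boolean ignore);
    void setIgnoreCharge(boolean ignore);
    void setMinimumCommonSize(int minMCSsize);
    boolean search();
    String getResultAsSmiles(boolean unique);
}

class ScaffoldQuery {
    // Configures a search, runs it, and returns the common structure as a
    // string, or null when search() reports that no solution was found.
    static <M> String commonScaffold(CommonSearch<M> mcs, M query, M target, int minSize) {
        mcs.setMolecules(query, target);
        mcs.setIgnoreBondType(false);      // take bond types into account
        mcs.setIgnoreCharge(true);         // ignore formal charges
        mcs.setMinimumCommonSize(minSize); // ignore smaller common parts
        return mcs.search() ? mcs.getResultAsSmiles(true) : null;
    }

    public static void main(String[] args) {
        System.out.println(commonScaffold(new ToySearch(), "CCOCC", "NCOCN", 2));
    }
}

// Toy implementation for demonstration only: plain character matching,
// no chemistry. The matching flags are accepted but have no effect here.
class ToySearch implements CommonSearch<String> {
    private String query = "";
    private String target = "";
    private String result = "";
    private int minSize = 1;

    public void setMolecules(String q, String t) { query = q; target = t; }
    public void setIgnoreBondType(boolean ignore) { }
    public void setIgnoreCharge(boolean ignore) { }
    public void setMinimumCommonSize(int minMCSsize) { minSize = minMCSsize; }

    public boolean search() {
        result = "";
        // Brute-force longest common substring of query and target.
        for (int i = 0; i < query.length(); i++) {
            for (int j = i + 1; j <= query.length(); j++) {
                String s = query.substring(i, j);
                if (s.length() > result.length() && target.contains(s)) {
                    result = s;
                }
            }
        }
        return result.length() >= minSize;
    }

    public String getResultAsSmiles(boolean unique) { return result; }
}
```

Keeping the configuration in one helper makes it easy to see which matching options were active for a given result.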

Since:
JChem 3.0
Author:
Vargyas Miklos

Field Summary
static int DEFAULT_MIN_COMMON_SIZE
          Default minimum size of common structures; smaller common structures are not reported.
static int MODE_EXACT
          Exact MCS search mode.
static int MODE_FAST
          A well balanced compromise between MODE_EXACT and MODE_TURBO.
static int MODE_TURBO
          Fastest search mode.
 
Constructor Summary
MCS()
          Creates a new object for common structure search.
 
Method Summary
 boolean findFirst()
          Performs graph matching according to the mode set.
 boolean findNext()
          Searches for the next hit.
 int[] getResult()
          Gets the result of an MCS/MCES/SSS search.
 Molecule getResultAsMolecule()
          Gets the MCS/MCES as a structure represented by a Molecule object.
 java.lang.String getResultAsSmiles(boolean unique)
          Gets the MCS/MCES as a structure represented as SMILES.
 int[] getResultQueryAtoms()
          Gets query atoms that are part of the MCS/MCES.
 MolBond[] getResultQueryBonds()
          Gets bonds of the query structure that are part of the MCS/MCES found.
 int getResultSize()
          Gets the size (number of atoms) of the MCS/MCES (and SSS) found.
 int[] getResultTargetAtoms()
          Gets target atoms that are part of the MCS/MCES.
 MolBond[] getResultTargetBonds()
          Gets bonds of the target structure that are part of the MCS/MCES found.
static void main(java.lang.String[] args)
          Simple command line interface mainly for testing purposes.
 boolean search()
          Performs graph matching according to the mode set.
 void setDontBreakRingBonds(boolean dontBreakRingBond)
           
 void setFastSearch(boolean fastSearch)
          Deprecated. Use setMode(MODE_TURBO) instead.
 void setIgnoreAtomType(boolean ignore)
          Sets atom type matching mode.
 void setIgnoreBondType(boolean ignore)
          Sets bond type matching mode.
 void setIgnoreCharge(boolean ignore)
          Sets atom formal charge matching mode.
 void setIgnoreHybridization(boolean ignore)
          Sets atom hybridization matching mode.
 void setIgnoreIsotopes(boolean ignore)
          Sets isotope matching mode.
 void setIgnoreQueryProperties(boolean ignore)
          Deprecated. Since version 5.0 query properties are not supported by the MCS search.
 void setMCESMode(boolean mcesMode)
          Deprecated. MCES is the only structure matching mode supported by the MCS class from version 5.0.
 void setMCSMode(boolean mcsMode)
          Deprecated. Strict MCS (vs. MCES) is not available from version 5.0
 void setMinimumCommonSize(int minMCSsize)
          Sets the minimum required size of the MCS found.
 void setMode(int mcsMode)
          Sets searching strategy for the MCS search.
 void setMolecules(Molecule query, Molecule target)
          Sets the two molecular structure graphs to be matched.
 void setQuery(Molecule query)
          Sets the query molecular structure graph to be matched.
 void setSSSMode(boolean sssMode)
          Deprecated. Substructure search is not supported by MCS from version 5.0.
 void setTarget(Molecule target)
          Sets the target molecular structure graph to be matched.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_MIN_COMMON_SIZE

public static final int DEFAULT_MIN_COMMON_SIZE
Default minimum size of common structures; smaller common structures are not reported.


MODE_EXACT

public static final int MODE_EXACT
Exact MCS search mode. Exact in this context means an exhaustive search that does not use heuristics or other ways of pruning potential solutions without proving that they are not solutions. This is the slowest but the most reliable search option for finding the MCS of two molecules.


MODE_TURBO

public static final int MODE_TURBO
Fastest search mode. In some cases this search mode does not find the exact solution. Symmetrical structures are the most likely to lead to a common structure that is not maximal. Use with caution, mainly to obtain a first insight into the dataset, and then carry out a more thorough 'slow search' (i.e. setFastSearch(false), which is the default anyway).


MODE_FAST

public static final int MODE_FAST
A well balanced compromise between MODE_EXACT and MODE_TURBO.

Constructor Detail

MCS

public MCS()
Creates a new object for common structure search.

Method Detail

setQuery

public void setQuery(Molecule query)
Sets the query molecular structure graph to be matched.

Parameters:
query - query molecule

setTarget

public void setTarget(Molecule target)
Sets the target molecular structure graph to be matched.

Parameters:
target - target molecule

setMinimumCommonSize

public void setMinimumCommonSize(int minMCSsize)
Sets the minimum required size of the MCS found. Common parts below this size limit are ignored.

Parameters:
minMCSsize - minimum size of MCS

setIgnoreAtomType

public void setIgnoreAtomType(boolean ignore)
Sets atom type matching mode. By default atom types are considered (not ignored) in matching.

Parameters:
ignore - ignore atom type (true) or take it into account (false)

setMolecules

public void setMolecules(Molecule query,
                         Molecule target)
Sets the two molecular structure graphs to be matched.

Parameters:
query - query molecule
target - target molecule

setIgnoreCharge

public void setIgnoreCharge(boolean ignore)
Sets atom formal charge matching mode. By default atom charges are considered (not ignored) in matching.

Parameters:
ignore - ignore atom charge (true) or take it into account (false)

setIgnoreHybridization

public void setIgnoreHybridization(boolean ignore)
Sets atom hybridization matching mode. By default atom hybridization states are considered (not ignored) in matching.

Parameters:
ignore - ignore atom hybridization (true) or take it into account (false)

setIgnoreIsotopes

public void setIgnoreIsotopes(boolean ignore)
Sets isotope matching mode. By default isotopes of the same element are not examined in matching.

Parameters:
ignore - ignore isotopes (true) or take them into account (false)

setIgnoreBondType

public void setIgnoreBondType(boolean ignore)
Sets bond type matching mode. By default bond types are ignored in matching, that is, a double bond matches a single bond etc.

Parameters:
ignore - ignore bond type (true) or take it into account (false)

setIgnoreQueryProperties

public void setIgnoreQueryProperties(boolean ignore)
Deprecated. Since version 5.0 query properties are not supported by the MCS search.

Sets query mode. By default query properties are ignored in matching.

Parameters:
ignore - ignore/consider query properties

setDontBreakRingBonds

public void setDontBreakRingBonds(boolean dontBreakRingBond)

setMCSMode

public void setMCSMode(boolean mcsMode)
Deprecated. Strict MCS (vs. MCES) is not available from version 5.0

Sets matching mode to Maximum Common Subgraph. In this implementation MCS is always one connected component.
This method does not change atom and bond matching mode flags.

Parameters:
mcsMode - search for maximum common subgraph

setMCESMode

public void setMCESMode(boolean mcesMode)
Deprecated. MCES is the only structure matching mode supported by the MCS class from version 5.0.

Sets matching mode to Maximum Common Edge Subgraph. This is often referred to as the induced subgraph in the literature. In most cheminformatics applications MCES is sought rather than MCS, though often MCES is meant when MCS is said.
This is the default matching mode.
In this implementation MCES is one connected component.
This method does not change atom and bond matching mode flags.

Parameters:
mcesMode - search for maximum common edge subgraph

setSSSMode

public void setSSSMode(boolean sssMode)
Deprecated. Substructure search is not supported by MCS from version 5.0.

Sets matching mode to SubStructure Search. In this mode the roles of the query graph and the target graph are not symmetrical and are NOT interchangeable.
This method sets atom type and bond type related flags to take atom type, charge and bond type into account in search (however, isotope types are not considered).

Parameters:
sssMode - substructure search

setMode

public void setMode(int mcsMode)
Sets searching strategy for the MCS search. Accepted values are MODE_EXACT, MODE_FAST and MODE_TURBO.

Parameters:
mcsMode - search strategy for maximum common substructure
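
The MODE_TURBO description recommends a quick first pass followed by a more thorough search. A minimal sketch of that two-stage pattern is shown below. `ModalSearch` is a hypothetical stand-in interface mirroring three documented methods so that the sketch compiles without JChem, `screeningSize` is an illustrative parameter, and `toy` is a fake implementation with made-up sizes. The sketch assumes that the molecules have already been set and that the same object can be searched again after the mode is changed. In real code the two mode arguments would be MCS.MODE_TURBO and MCS.MODE_EXACT.

```java
// Hypothetical stand-in mirroring three documented MCS methods.
interface ModalSearch {
    void setMode(int mcsMode);
    boolean search();
    int getResultSize();
}

class TwoStageSearch {
    // Runs the quick mode first and repeats the search in the exhaustive mode
    // only when the quick result has at least screeningSize atoms.
    // Returns the size of the result kept, or 0 if the quick search found nothing.
    static int commonSize(ModalSearch mcs, int quickMode, int exhaustiveMode, int screeningSize) {
        mcs.setMode(quickMode);
        if (!mcs.search()) {
            return 0;
        }
        int quick = mcs.getResultSize();
        if (quick < screeningSize) {
            return quick; // not promising enough to pay for the exhaustive search
        }
        mcs.setMode(exhaustiveMode);
        return mcs.search() ? mcs.getResultSize() : quick;
    }

    // Toy implementation with fixed, made-up result sizes; for demonstration only.
    static ModalSearch toy(int quickMode, int quickSize, int exhaustiveSize) {
        return new ModalSearch() {
            private int mode;
            public void setMode(int mcsMode) { mode = mcsMode; }
            public boolean search() { return getResultSize() > 0; }
            public int getResultSize() { return mode == quickMode ? quickSize : exhaustiveSize; }
        };
    }

    public static void main(String[] args) {
        // Mode values 1 and 2 are placeholders, not the real constant values.
        System.out.println(commonSize(toy(1, 5, 7), 1, 2, 4));
    }
}
```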

setFastSearch

public void setFastSearch(boolean fastSearch)
Deprecated. Use setMode(MODE_TURBO) instead.

Sets fast search mode. Fast search mode is an approximate search that in some cases does not find the exact solution. In particular, symmetrical structures are affected. Use with caution, mainly to obtain a first insight into the dataset, and then carry out a more thorough 'slow search' (i.e. setFastSearch(false), which is the default anyway).

Parameters:
fastSearch - fast or normal search speed

search

public boolean search()
Performs graph matching according to the mode set.

Returns:
solution found or not

findFirst

public boolean findFirst()
Performs graph matching according to the mode set. This method does the same as search() except that it allows the subsequent call of findNext().

Returns:
solution found or not

findNext

public boolean findNext()
Searches for the next hit. This method should be called after findFirst().

Returns:
next solution found or not
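
The findFirst()/findNext() pair lends itself to a simple enumeration loop, sketched below. `HitSource` is a hypothetical stand-in interface mirroring three documented methods so that the sketch compiles without JChem, and `replay` is a toy implementation that replays a fixed list. The sketch assumes that, after a successful findFirst() or findNext(), the result getters describe the hit that was just found.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in mirroring three documented MCS methods.
interface HitSource {
    boolean findFirst();
    boolean findNext();
    String getResultAsSmiles(boolean unique);
}

class HitEnumeration {
    // Collects at most maxHits results as SMILES strings.
    static List<String> collectHits(HitSource mcs, int maxHits) {
        List<String> hits = new ArrayList<>();
        if (maxHits <= 0 || !mcs.findFirst()) {
            return hits;
        }
        do {
            hits.add(mcs.getResultAsSmiles(true));
            // The size check comes first, so no further search is started
            // once enough hits have been collected.
        } while (hits.size() < maxHits && mcs.findNext());
        return hits;
    }

    // Toy HitSource that replays a fixed list; for demonstration only.
    static HitSource replay(List<String> smiles) {
        return new HitSource() {
            private Iterator<String> it;
            private String current;
            public boolean findFirst() {
                it = smiles.iterator();
                return findNext();
            }
            public boolean findNext() {
                if (it == null || !it.hasNext()) {
                    return false;
                }
                current = it.next();
                return true;
            }
            public String getResultAsSmiles(boolean unique) { return current; }
        };
    }

    public static void main(String[] args) {
        HitSource demo = replay(Arrays.asList("c1ccccc1", "CCO", "CC"));
        System.out.println(collectHits(demo, 2));
    }
}
```

Capping the number of hits is a design choice of this sketch; the documentation does not say how many hits a search can produce.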

getResult

public int[] getResult()
Gets the result of an MCS/MCES/SSS search. The result is the mapping from query graph atoms to target graph atoms. Internal atom indexes are used; the returned array is indexed by query graph atom indices.

Returns:
mapping between query and target graph atoms
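
A query-indexed mapping array of this kind can be turned into explicit atom pairs as sketched below. The example works on a hand-written array rather than on a real search result. It assumes that query atoms outside the common structure carry a negative entry, which the documentation does not state; `MappingPairs` and `toPairs` are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

class MappingPairs {
    // Converts a query-indexed mapping into {queryAtom, targetAtom} pairs.
    // Assumption (not stated in the documentation): unmatched query atoms
    // are marked with a negative value.
    static List<int[]> toPairs(int[] mapping) {
        List<int[]> pairs = new ArrayList<>();
        for (int q = 0; q < mapping.length; q++) {
            int t = mapping[q];
            if (t >= 0) {
                pairs.add(new int[] {q, t});
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Hand-written array standing in for the value returned by getResult().
        int[] mapping = {2, -1, 0, 1};
        for (int[] p : toPairs(mapping)) {
            System.out.println("query atom " + p[0] + " -> target atom " + p[1]);
        }
    }
}
```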

getResultSize

public int getResultSize()
Gets the size (number of atoms) of the MCS/MCES (and SSS) found.

Returns:
number of atoms of the MCS

getResultAsSmiles

public java.lang.String getResultAsSmiles(boolean unique)
Gets the MCS/MCES as a structure represented as SMILES.

Parameters:
unique - indicates if unique smiles are required to be returned
Returns:
MCS as SMILES string

getResultAsMolecule

public Molecule getResultAsMolecule()
Gets the MCS/MCES as a structure represented by a Molecule object.

Returns:
MCS/MCES as a Molecule

getResultQueryAtoms

public int[] getResultQueryAtoms()
Gets query atoms that are part of the MCS/MCES.

Returns:
query atoms found in the MCS

getResultTargetAtoms

public int[] getResultTargetAtoms()
Gets target atoms that are part of the MCS/MCES.

Returns:
target atoms found in the MCS

getResultQueryBonds

public MolBond[] getResultQueryBonds()
Gets bonds of the query structure that are part of the MCS/MCES found.

Returns:
query structure bonds in the MCS

getResultTargetBonds

public MolBond[] getResultTargetBonds()
Gets bonds of the target structure that are part of the MCS/MCES found.

Returns:
target structure bonds in the MCS

main

public static void main(java.lang.String[] args)
Simple command line interface mainly for testing purposes. This program can take two molecules, the first is referred to as query, the second is the target. Both are defined in structure files, though query can also be specified by a string (e.g. SMILES) in the command line. Some flags are also accepted, see UsageInfo.

Parameters:
args - command line arguments (filenames, flags, options)