These examples demonstrate the use of the Standardizer program.
They show how to run the standardize command and explain
some of its command-line options and its configuration.
For a description of reaction mapping, see the Reaction mapping section of the Reactor Manual.
These examples run the standardize UNIX shell script under UNIX / Linux
or the standardize.bat batch file under Windows.
To run these examples:
The PATH (all systems) and JCHEMHOME (Windows only)
environment variables have to be set as described in the
Preparing and Running JChem's Batch Files and
Shell Scripts manual.
Change to the examples directory. In UNIX / Linux:
cd jchem/examples/standardizer
In Windows:
cd jchem\examples\standardizer
The examples show how to set up a configuration containing several standardization steps.
The input molecule is stored in the file input.mol:
![]()
Run Standardizer by:
standardize -c Standardizer1.xml input.mol
Equivalently, you can give the Standardizer tasks in an action string
and specify the input molecule as a SMILES string on the command line
(the command is wrapped over several lines only for readability; originally it is a single line):
standardize -c "aromatize..dehydrogenize..[O-:2][N+:1]=O>>[O:2]=[N:1]=O..N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]
..C[N+:1][H:2].[F,Cl,Br,I;-:3]>>C[N:1]..[H:4][N:3][C:1]=[C:2]>>[H:4][C:2][C:1]=[N:3]..[H:4][O:3][C:1]=[C:2]>>[H:4][C:2][C:1]=[O:3]..clean"
"N#N=NCC(\C=C\O)C(CC[NH3+])C([H])([H:1])C1=CC(=C(Cl)C(=C1)N(=O)=O)[N+]([O-])=O.[Cl-]"
Note that the action string
is a ".."-separated list of Standardizer tasks:
some of them are Standardizer keywords, others are SMARTS reaction strings.
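The structure of an action string can be illustrated with a short Python sketch. It is not part of Standardizer; the function names are hypothetical, and the rule "a task containing `>>` is a reaction string, anything else is a keyword" is an assumption made only for this illustration.

```python
def split_action_string(action_string):
    """Split a '..'-separated action string into its individual tasks."""
    # Whitespace around tasks (e.g. from line wrapping) is stripped.
    return [task.strip() for task in action_string.split("..") if task.strip()]


def classify_task(task):
    """Label a task as 'reaction' or 'keyword'.

    Assumption for this sketch only: reaction strings contain '>>',
    keywords (such as 'aromatize' or 'clean') do not.
    """
    return "reaction" if ">>" in task else "keyword"


# A shortened version of the action string used in the example above.
action = (
    "aromatize..dehydrogenize..[O-:2][N+:1]=O>>[O:2]=[N:1]=O"
    "..N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]..clean"
)

tasks = split_action_string(action)
for task in tasks:
    print(f"{classify_task(task):8s} {task}")
```

Note that a single "." inside a reaction string (separating disconnected fragments) is not a task separator; only ".." is.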
The result is:
[H:1]C(C(CCN)C(CC=O)CN=[N+]=[N-])c1cc(c(Cl)c(c1)N(=O)=O)N(=O)=O
You can also specify the output format and the output file:
standardize -c Standardizer1.xml -f sdf:-a -o result1.sdf input.mol
In this case the result molecule is saved in result1.sdf in SDF format.
Note that we set sdf:-a as the output format in the -f parameter
because our molecule is aromatized by the standardization,
but the SDF format is supposed to store the dearomatized form.
You can also use the action string and/or the SMILES input molecule string as above (the command is wrapped over several lines only for readability; originally it is a single line):
standardize -c "aromatize..dehydrogenize..[O-:2][N+:1]=O>>[O:2]=[N:1]=O..
N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]..C[N+:1][H:2].[F,Cl,Br,I;-:3]>>C[N:1]..
[H:4][N:3][C:1]=[C:2]>>[H:4][C:2][C:1]=[N:3]..[H:4][O:3][C:1]=[C:2]>>[H:4][C:2][C:1]=[O:3]
..clean" -f sdf -o result1.sdf input.mol
standardize -c Standardizer1.xml -f sdf:-a -o result1.sdf "N#N=NCC(\C=C\O)C(CC[NH3+])C([H])([H:1])C1=CC(=C(Cl)C(=C1)N(=O)=O)[N+]([O-])=O.[Cl-]"
The result molecule is shown below:
![]()
To see what happened, look at the subsections of the Actions section of the XML configuration file
Standardizer1.xml or at the ".."-separated sections of the action string
(the string is wrapped over several lines only for readability; originally it is a single line):
"aromatize..dehydrogenize..[O-:2][N+:1]=O>>[O:2]=[N:1]=O..N=[N:1]#[N:2]>>N=[N+:1]=[N-:2]..
C[N+:1][H:2].[F,Cl,Br,I;-:3]>>C[N:1]..[H:4][N:3][C:1]=[C:2]>>[H:4][C:2][C:1]=[N:3]..
[H:4][O:3][C:1]=[C:2]>>[H:4][C:2][C:1]=[O:3]..clean"
Each section describes a transformation. These transformations are performed on the input molecule in the order in which they appear in the configuration file. We show these transformations below:
Observe that the mapped H (with atom map 1)
has not been removed, because the simple dehydrogenize action removes only explicit H
atoms that are unmapped, non-isotopic, uncharged and non-radical.
In our next example we will see a more sophisticated
method for implicitizing hydrogens.
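The removal condition stated above can be written down as a small predicate. The following Python sketch uses a hypothetical `Atom` record invented for this illustration; it is not the Standardizer data model and only restates the four conditions from the text.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    """Hypothetical, minimal atom record used only for this sketch."""
    symbol: str
    atom_map: int = 0      # 0 means the atom is unmapped
    isotope: int = 0       # 0 means no isotope is specified
    charge: int = 0
    radical: bool = False


def removed_by_simple_dehydrogenize(atom):
    """True if the atom is an explicit H that the simple action would remove."""
    return (
        atom.symbol == "H"
        and atom.atom_map == 0     # unmapped
        and atom.isotope == 0      # non-isotopic
        and atom.charge == 0       # uncharged
        and not atom.radical       # non-radical
    )


plain_h = Atom("H")
mapped_h = Atom("H", atom_map=1)   # like [H:1] in the input molecule

print(removed_by_simple_dehydrogenize(plain_h))
print(removed_by_simple_dehydrogenize(mapped_h))
```

The mapped hydrogen fails the "unmapped" condition, which is why it survives in the result of the first example.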
![]()
![]()
![]()
![]()
![]()
For a description of cleaning, see the <Clean> section
of the Standardizer Configuration.
The next example uses the removeexplicitH action tag,
which also enables us to remove mapped and radical hydrogens. For a description of this type of
dehydrogenization, see the
<RemoveExplicitH> section in the
Standardizer Configuration.
Note that this way of dehydrogenization is not available in
action strings.
The example also uses the Removal action tag, which keeps the largest disconnected
fragment of the molecule and removes all others. For a description of this action, see the
<Removal> section in the
Standardizer Configuration.
Run Standardizer by:
standardize -c Standardizer2.xml input.mol
The result is:
[NH3+]CCC(Cc1cc(c(Cl)c(c1)N(=O)=O)N(=O)=O)C(CC=O)CN=[N+]=[N-]
You can also specify the output format and the output file:
standardize -c Standardizer2.xml -f sdf:-a -o result2.sdf input.mol
In this case the result molecule is saved to the file result2.sdf (SDF format).
Note that we set sdf:-a as the output format in the -f parameter
because our molecule is aromatized by the standardization,
but the SDF format is supposed to store the dearomatized form.
The result molecule is shown below:
![]()
Observe that this time the mapped hydrogen has also been removed, and the positive charge on the ammonium halide nitrogen has remained unchanged.
The use and meaning of command-line options in the above commands:
| Option | Description | Default |
|---|---|---|
| -c | configuration file/string | - |
| -f | specifies the output format (e.g. 'sdf', 'mol') | 'smiles' |
| -o | specifies the output file path | standard output (console) |
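When the command is assembled from a script, the three options above can be combined as in the following Python sketch. The helper name `build_standardize_command` is hypothetical; only the `-c`, `-f` and `-o` options described in the table are used, and the sketch merely builds and prints the command line without running it.

```python
import shlex


def build_standardize_command(config, molecule, output_format=None, output_file=None):
    """Build the argument list for a standardize call.

    config        -- configuration file name or action string (-c)
    molecule      -- input file name or SMILES string
    output_format -- value for -f; omitted to keep the default ('smiles')
    output_file   -- value for -o; omitted to write to standard output
    """
    command = ["standardize", "-c", config]
    if output_format is not None:
        command += ["-f", output_format]
    if output_file is not None:
        command += ["-o", output_file]
    command.append(molecule)
    return command


cmd = build_standardize_command(
    "Standardizer1.xml", "input.mol", output_format="sdf:-a", output_file="result1.sdf"
)
# shlex.join quotes arguments where a POSIX shell would need it,
# which matters for action strings and SMILES given on the command line.
print(shlex.join(cmd))
```

Passing the arguments as a list (for example to `subprocess.run`) avoids shell quoting problems with characters such as `>`, `[` and `#` that occur in action strings and SMILES.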
Let's consider this molecule (alias.mrv):

The methyl group was given the alias label of a carboxylic group, and the hydroxyl group
the alias label of a nitro group. Still, these ligands are taken into account
in their original form: a methyl and a hydroxyl group.
The terms
alias atom and pseudo atom may be confusing since we
are actually handling groups. The word 'atom' only refers to the fact that one atom of the structure
is replaced by an entity, which is not always a functional group.
The pseudo benzoate
is displayed in italics.
The SMILES format of the molecule is: CC1=CC(*)=CC(O)=C1.
In Standardizer, the
AliasToGroup action replaces all alias and pseudo atoms with the corresponding
abbreviated group from the list of abbreviated groups. The abbreviated group may have
only one attachment point. This action results in contracted S-groups.
The first example shows how to convert the atoms with Standardizer. In the Create Configuration panel of the Standardizer application, add the required operation:

The text field at the bottom shows the description of the operations. The Alias to Group command does not need any further settings.
The conversion can also be run from the command line. Run the Standardizer command-line application by:
standardize -c aliastogroup alias.mrv -f mrv -o alias_output.mrv
Here the configuration was given by a simple action string.
The result molecule will contain contracted abbreviated groups:

The atom style changed due to the conversion: all groups are in normal font and the bonds are connected to the chemically relevant atom.
The second example shows how to obtain converted and expanded or ungrouped abbreviated groups. To have expanded groups in your structures, insert the Expand Group command after the conversion:

Note that the operations in Standardizer are executed in the given order, so ungrouping or expanding abbreviated groups will not include alias and pseudo atoms unless these have first been transformed to groups.
After creating the configuration (i.e. the set of operations to apply to your molecules), you can save it in XML format; the saved file can be reloaded later or used on the command line. Here the Standardizer command-line application uses aliastogroup_expand.xml to standardize the molecule:
standardize -c aliastogroup_expand.xml alias.mrv -f mrv -o alias_output.mrv
The result after converting and expanding the groups is:

Template based clean is a special way of cleaning with the help of a predefined template file. The template file contains pre-cleaned sample structures for which the usual clean algorithm fails due to their exceptional spatial arrangements. Examples of such structures are bicycles, bridged polycycles, crown ethers and cycloalkanes. Our example template file contains some of these:
![]()
![]()
![]()
Template based clean works in the following way: templates are searched in the target molecule in the order in which they are specified in the template file. The first match is processed: template atom coordinates are copied to the corresponding target atoms and the remaining atoms are cleaned with partial clean.
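The control flow described above (try templates in file order, process only the first match, copy coordinates, leave the rest to partial clean) can be sketched in Python. This is a toy model under stated assumptions, not the Standardizer implementation: molecules are reduced to dictionaries of atom coordinates, the substructure search is passed in as a hypothetical `find_match` callable, and `match_by_atom_id` is a stand-in matcher invented only to make the example runnable.

```python
def apply_first_matching_template(target_coords, templates, find_match):
    """Copy coordinates from the first matching template.

    target_coords -- dict: target atom id -> (x, y)
    templates     -- list of dicts with 'name' and 'coords', in template file order
    find_match    -- callable(template, target_coords) returning a dict
                     template atom id -> target atom id, or None if no match

    Returns (template name, new coordinates, atoms left for partial clean),
    or None if no template matches.
    """
    for template in templates:
        match = find_match(template, target_coords)
        if match is None:
            continue
        new_coords = dict(target_coords)
        for template_atom, target_atom in match.items():
            new_coords[target_atom] = template["coords"][template_atom]
        remaining = sorted(set(target_coords) - set(match.values()))
        return template["name"], new_coords, remaining  # later templates are not tried
    return None


def match_by_atom_id(template, target_coords):
    """Toy matcher: a template matches if all of its atom ids occur in the target."""
    ids = template["coords"].keys()
    if all(atom_id in target_coords for atom_id in ids):
        return {atom_id: atom_id for atom_id in ids}
    return None


target = {"a": (0.0, 0.0), "b": (0.0, 0.0), "c": (0.0, 0.0)}
templates = [
    {"name": "t1", "coords": {"x": (9.0, 9.0)}},                   # does not match
    {"name": "t2", "coords": {"a": (1.0, 0.0), "b": (0.0, 1.0)}},  # first match
    {"name": "t3", "coords": {"a": (5.0, 5.0)}},                   # never reached
]

result = apply_first_matching_template(target, templates, match_by_atom_id)
print(result)
```

Because only the first match is processed, the order of the structures in the template file determines which template is applied when several would match.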
The corresponding Standardizer task is the
Clean task with
the following attributes: Type="TemplateBased" TemplateFile="clean_templates.sdf" where
clean_templates.sdf is the template file. The template based clean task can also
be specified in the simple action string as "clean:clean_templates.sdf".
Now clean some test molecules with template based clean using the above example template file. The input molecules are stored in clean_test.sdf:
![]()
Run template based clean in any of the following ways:
standardize -c "aromatize..clean:clean_templates.sdf" clean_test.sdf -f sdf -o clean_test_output.sdf
standardize -c StandardizerTBClean.xml clean_test.sdf -f sdf -o clean_test_output.sdf
standardize -c StandardizerTBClean.txt clean_test.sdf -f sdf -o clean_test_output.sdf
Pre-aromatization is important in order to recognize the single-or-aromatic bonds of the templates.
Note that the action string can also be stored in a file, either as it is or with each task on a separate line, as in StandardizerTBClean.txt. The usual XML configuration StandardizerTBClean.xml can also be used.
The result file clean_test_output.sdf is shown below:
![]()
Note that molecule 4 is not cleaned, although it matches the same template as molecule
3. The reason is that it contains some extra bridges between N atoms, and we accept a
template as matching only if there is no shorter path between template matching atoms in the target
molecule than the corresponding path in the template.
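The path criterion can be illustrated with a breadth-first search over bond graphs. The Python sketch below is an interpretation, not the Standardizer code: it assumes that "the corresponding path in the template" means the shortest path between the two atoms in the template, represents molecules as plain adjacency dictionaries, and uses hypothetical function names.

```python
from collections import deque


def shortest_path_length(adjacency, start, goal):
    """Number of bonds on the shortest path from start to goal (None if unreachable)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        atom, distance = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour == goal:
                return distance + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


def match_respects_paths(template_adj, target_adj, match):
    """Reject a match if the target offers a shorter path between two matched atoms.

    match -- dict: template atom -> target atom
    """
    atoms = list(match)
    for i, first in enumerate(atoms):
        for second in atoms[i + 1:]:
            in_template = shortest_path_length(template_adj, first, second)
            in_target = shortest_path_length(target_adj, match[first], match[second])
            if in_template is not None and in_target is not None and in_target < in_template:
                return False
    return True


def hexagon():
    """Six-membered ring with atoms 0..5."""
    return {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}


template = hexagon()
identity = {i: i for i in range(6)}

# Target 1: the ring with a simple substituent (atom 6) on atom 0.
substituted = hexagon()
substituted[0].append(6)
substituted[6] = [0]

# Target 2: the ring with an extra one-atom bridge (atom 6) between atoms 0 and 3.
bridged = hexagon()
bridged[0].append(6)
bridged[3].append(6)
bridged[6] = [0, 3]

print(match_respects_paths(template, substituted, identity))
print(match_respects_paths(template, bridged, identity))
```

In the bridged target, atoms 0 and 3 are two bonds apart via the bridge but three bonds apart in the template ring, so the match is rejected; the substituted target adds no shortcut and is accepted.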