iFragMent User Guide

iFragMent assigns chemical compounds to biological pathways using a profile-based approach described in [paper]. This is conceptually similar to assigning proteins to functional or structural families using sequence profiles.

For a given input structure, iFragMent reports the pathways assigned in the selected database together with their scores (z-score and p-value). It also generates a graphical representation of the molecule that highlights the fragments (substructures) contributing most to each assignment.

Input

The input to iFragMent is the structure of one or more chemical compounds to be matched against the profile databases. Structures can be entered in the text box or uploaded from a file (see Supported input formats below). The user must also select a database of profiles to search against.
There are three sets of chemical profiles available:

Clicking Search matches the input compound(s) against the profiles of the selected database; each compound-pathway match is assigned a p-value and a z-score.
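
For readers unfamiliar with these two scores, the following minimal sketch shows how a z-score and a one-sided p-value relate in general. It is not iFragMent's scoring procedure, which is described in the cited paper. The function names, the background scores and the assumption of normally distributed background scores are all hypothetical and used for illustration only.

```python
import math
from statistics import mean, stdev


def z_score(score, background):
    """Number of standard deviations by which `score` exceeds the
    mean of a set of background scores (hypothetical data)."""
    return (score - mean(background)) / stdev(background)


def upper_tail_p(z):
    """One-sided p-value for a z-score, ASSUMING the background scores
    follow a normal distribution (an illustrative assumption only)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical background scores and a hypothetical match score.
background = [1.0, 2.0, 3.0, 4.0, 5.0]
z = z_score(6.0, background)
p = upper_tail_p(z)
```

Under this assumption, a higher z-score corresponds to a lower p-value, that is, to a match less likely to arise by chance.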

Results

Results are displayed separately for each input compound in tables sorted in descending p-value order. Pathways with a p-value below 0.05 will be omitted if there are more than five results above that value. The table can be sorted by other columns by clicking the corresponding headers.

For a given compound-pathway assignment, click Show to highlight, on the compound's molecule, the three fragments (substructures) that contribute most to the assignment. These are the fragments with the highest values in the pathway profile, and hence the ones contributing most to the score of that compound-pathway match. They are equivalent to the conserved/functional positions in sequence profiles.

Each of the three fragments is shown in its own color: one for the fragment with the highest value, one for the second and one for the third. Where fragments overlap, the intersection is shown as the mixture of the corresponding colors. For clarity, a separate color is used for the intersection of all three fragments, and another for the unmatched parts of the molecule.

Supported input formats

Input structures must be provided in one of three formats: smi (list of SMILES), MDL molfile (mol) or Structure Data File (sdf), by uploading the corresponding file. SMILES can also be entered directly in the text box (one per line, preceded by ID<space>). Multiple structures can be analyzed at once using the sdf or smi formats.
To use your own IDs to identify the molecules in the results page, follow these rules:

  • smi: Any word separated from the structure by whitespace is taken as the ID.
  • mol: The ID for the structure is taken from the name of the file (without extension).
  • sdf: The ID for each structure is taken from its associated data field, but only if the structure has a single data field.
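
The ID rules above can be sketched in Python as a way to check input before submitting it. This is a minimal illustration, not the server's own code: the helper names are hypothetical, and the SDF handling assumes the usual layout (data item headers such as "> <NAME>" followed by the value on the next line, records ending with "$$$$").

```python
from pathlib import Path


def smiles_textbox_lines(compounds):
    """Format a {id: smiles} mapping as 'ID<space>SMILES' lines,
    one compound per line, for pasting into the text box."""
    lines = []
    for cid, smiles in compounds.items():
        # An ID must be a single word, so it cannot contain whitespace.
        if not cid or any(ch.isspace() for ch in cid):
            raise ValueError(f"ID must be a single word: {cid!r}")
        lines.append(f"{cid} {smiles}")
    return "\n".join(lines)


def mol_id(filename):
    """ID for a molfile: the file name without its extension."""
    return Path(filename).stem


def sdf_ids(sdf_text):
    """For each SDF record, return the value of its data field if the
    record has exactly one data field, otherwise None."""
    ids = []
    for record in sdf_text.split("$$$$"):
        if not record.strip():
            continue  # skip the empty chunk after the last delimiter
        lines = record.splitlines()
        values = []
        for i, line in enumerate(lines):
            # Assumed data item header, e.g. "> <NAME>".
            if line.startswith(">") and "<" in line:
                values.append(lines[i + 1].strip() if i + 1 < len(lines) else "")
        ids.append(values[0] if len(values) == 1 else None)
    return ids
```

For example, `smiles_textbox_lines({"ethanol": "CCO"})` produces the line `ethanol CCO`, and `mol_id("aspirin.mol")` gives `aspirin`. An SDF record carrying two data fields yields `None`, mirroring the rule that the ID is only taken when a single field is present.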

Citing iFragMent

When reporting any data or representation obtained with this server, please cite:

Lopez-Ibañez, J., Pazos, F. & Chagoyen, M. Predicting biological pathways of chemical compounds with a profile-inspired approach. BMC Bioinformatics 22, 320 (2021) [link]