Brief Introduction

The Anatomical Therapeutic Chemical (ATC) classification system, developed and maintained by the World Health Organization (WHO) Collaborating Center for Drug Statistics Methodology (WHOCC), is currently the most widely recognized classification system for drugs. It divides drug substances into groups according to the organ or system on which they act and their therapeutic, pharmacological and chemical properties.

Predicting the ATC classification of drugs helps in studying drug utilization and in understanding the therapeutic, pharmacological and chemical properties of drugs. It also provides valuable information for side-effect discovery and drug repositioning. In addition, ATC classification prediction for chemical compounds contributes to new drug development.

SPACE (Similarity-based Predictor of ATC CodE) is designed to predict associations between drugs and ATC classes (ATC codes). It uses a logistic regression framework to integrate multiple heterogeneous data sources, including chemical structures, target proteins, side-effects, drug-induced gene expression and chemical-chemical associations, into a single prediction model.
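The idea of combining several similarity sources through logistic regression can be illustrated with a minimal sketch. The feature names, weights and intercept below are hypothetical values chosen only for illustration; they are not the fitted parameters of SPACE, and treating a missing source as contributing nothing is also an assumption.

```python
import math

# Hypothetical weights, for illustration only (NOT the parameters of SPACE).
WEIGHTS = {
    "chemical_structure": 1.2,
    "target_protein": 0.9,
    "side_effect": 0.7,
    "gene_expression": 0.5,
    "chemical_association": 0.8,
}
INTERCEPT = -2.0  # hypothetical


def probability_score(similarities):
    """Combine per-source similarity scores (each in [0, 1]) with a logistic model.

    `similarities` maps a source name to its score. Sources that are absent
    simply add nothing to the linear term (an assumption of this sketch).
    """
    z = INTERCEPT + sum(WEIGHTS[name] * value for name, value in similarities.items())
    return 1.0 / (1.0 + math.exp(-z))


# A compound with evidence from all sources scores higher than one
# with structural evidence only.
full = probability_score({name: 0.8 for name in WEIGHTS})
structure_only = probability_score({"chemical_structure": 0.8})
```

In this sketch the score always lies strictly between 0 and 1 and grows as any individual similarity grows, which is what allows candidate ATC codes to be ranked by it.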

For each submitted compound, SPACE returns a list of predicted candidate ATC codes ranked by decreasing probability_score, which measures how likely the drug is to belong to each ATC code. Supporting evidence is also provided for each predicted candidate ATC code. In addition, when at least two query compounds are submitted, users can run an enrichment analysis to identify the (predicted candidate) ATC codes that are significantly enriched among the query compounds. A typical application is the analysis of the potential therapeutic, pharmacological and chemical properties of a Traditional Chinese Medicine (TCM) composed of multiple constituent compounds.

Please see the Documents page for more information.

Please cite: Liu Z, Guo F, Gu J, Wang Y, Li Y, Wang D, Lu L, Li D, He F. Similarity-based prediction for Anatomical Therapeutic Chemical classification of drugs by integrating multiple data sources. Bioinformatics. 2015, 31(11):1788-95. [PubMed]

1. Please input your compound list

The currently supported input format is one compound per line. Each compound should be given as a PubChem_CID or as a structure (InChI, SMILES, SDF or MOL2); please see the "Example". We strongly recommend using the PubChem_CID to represent your compound.

Our prediction system can also predict ATC codes for 'new compounds' (potential drugs) for which only structural information is available. For these compounds, users can enter the structure in InChI, SMILES, MOL2 or SDF format, or draw the chemical structure with JSDraw. Because a new compound has no reported non-structural information, its ATC code prediction is based only on the two structure-based features.
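As an illustration of the one-compound-per-line format, the following minimal sketch reads a compound list and guesses the type of each single-line entry. The function names and the heuristics are assumptions made for this example and are not the parser used by SPACE; multi-line SDF and MOL2 records are not handled.

```python
def classify_identifier(line):
    """Guess the identifier type of one input line (heuristic for illustration)."""
    text = line.strip()
    if not text:
        return None  # blank line, ignored
    if text.isdigit():
        return "PubChem_CID"
    if text.startswith("InChI="):
        return "InChI"
    # Assumption: any other single-line entry is treated as a SMILES string.
    return "SMILES"


def read_compound_list(text):
    """Return (identifier, type) pairs for the non-empty lines of a compound list."""
    entries = []
    for line in text.splitlines():
        kind = classify_identifier(line)
        if kind is not None:
            entries.append((line.strip(), kind))
    return entries


# Aspirin given three ways: PubChem_CID, SMILES and InChI.
example = """2244
CC(=O)OC1=CC=CC=C1C(=O)O
InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)
"""
compounds = read_compound_list(example)
```

Checking a list this way before submission can catch stray blank lines or mixed-up entries early.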

2. Parameter setting

  • a. Select the level of predicted ATC codes* :
  • b. Prediction result parameter setting:

    For each query compound, the predicted candidate ATC codes are ranked by decreasing probability_score as predicted by SPACE. You can choose to view only the candidate ATC codes that are ranked in the top N AND have a probability_score greater than P.

       i)  Input the number of top candidate ATC codes to report: N (an integer >= 1)
       ii) Input the probability_score cutoff: P (a value in [0, 1])
  • c. Enrichment analysis:

    When at least two query compounds are submitted, users can run this enrichment analysis to identify the (predicted) candidate ATC codes that are significantly enriched among the query compounds. This function is generally useful for analyzing the potential therapeutic, pharmacological and chemical properties of a Traditional Chinese Medicine (TCM) composed of multiple constituent compounds.

    The enrichment analysis considers only the candidate ATC codes of the query compounds that are ranked in the top N AND have a probability_score greater than P.

       i)   Input the number of top candidate ATC codes used for the enrichment analysis: N (an integer >= 1)
       ii)  Input the probability_score cutoff for the enrichment analysis: P (a value in [0, 1])
       iii) Select the "control" dataset for the enrichment analysis:
       iv)  Input the P_value cutoff for the enrichment analysis: (a value in [0, 1])
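The two parameter groups above can be sketched in a few lines. The first function applies the "top N AND probability_score greater than P" rule. The second computes an over-representation P_value with a hypergeometric test, which is one common choice for enrichment analysis; this page does not state which test SPACE uses, so that part is an assumption. All function names and scores are hypothetical.

```python
from math import comb


def select_candidates(predictions, top_n, min_score):
    """Keep ATC codes ranked in the top N AND with probability_score > P.

    `predictions` maps an ATC code to its probability_score.
    A strict ">" is used, following the wording "greater than P".
    """
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    return [code for code, score in ranked[:top_n] if score > min_score]


def enrichment_p_value(control_size, control_hits, query_size, query_hits):
    """Hypergeometric tail P(X >= query_hits) (assumed test, not confirmed).

    control_size: number of compounds in the "control" dataset
    control_hits: control compounds annotated with the ATC code
    query_size:   number of query compounds
    query_hits:   query compounds whose selected candidates include the code
    """
    upper = min(control_hits, query_size)
    total = comb(control_size, query_size)
    tail = sum(
        comb(control_hits, k) * comb(control_size - control_hits, query_size - k)
        for k in range(query_hits, upper + 1)
    )
    return tail / total


# Made-up scores for one query compound.
scores = {"N02BA": 0.9, "A01AD": 0.4, "B01AC": 0.7}
kept = select_candidates(scores, top_n=2, min_score=0.5)

# Toy numbers: 10 control compounds, 3 carry the code; 4 queries, 2 carry it.
p_value = enrichment_p_value(10, 3, 4, 2)
```

An ATC code would then be reported as enriched when its P_value does not exceed the cutoff entered in step iv. Under this sketch, lowering N or raising P leaves fewer candidate codes per compound and therefore fewer hits in the enrichment step.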

3. Start the prediction and analysis

 Be notified by email. Recommended! (Tick this box if you want to be notified by email. The email will contain a link that you can use to check the progress of your job and to view the analysis result once it has finished.)
If you do not receive this email within 2 minutes, please check your spam folder.