Downloads

Cite us

ChemRxiv - preprint
Nainala VC, Rajan K, Kanakam SRS, Sharma N, Weißenborn V, Schaub J, et al. COCONUT 2.0: A comprehensive overhaul and curation of the collection of open natural products database. ChemRxiv. 2024; doi:10.26434/chemrxiv-2024-fxq2s

Version 1
Sorokina, M., Merseburger, P., Rajan, K. et al. COCONUT online: COlleCtion of Open Natural prodUcTs database. J Cheminform 13, 2 (2021). https://doi.org/10.1186/s13321-020-00478-9

COCONUT data is released under the Creative Commons CC0 license, allowing for free use, modification, and distribution without any restrictions. No attribution is required when utilizing this data.

Complete (active/inactive) COCONUT dataset

Use Cases

Learn how COCONUT data is currently being used

Version: August 2024 | Fragment Analysis

In silico molecule fragmentation

Analyse molecular structures by identifying specific substructures, such as functional groups and scaffolds, to gain insight into chemical diversity and to support drug design, combinatorial chemistry, and molecular fingerprinting.

Fragments_Ertl_algorithm.csv: Download the functional groups of COCONUT natural products identified by the generalized Ertl algorithm, with their frequencies, as canonical SMILES codes (stereochemistry disregarded), ordered by decreasing frequency.

Items_Ertl_algorithm.csv: Download COCONUT natural products (COCONUT ID + canonical SMILES codes; stereochemistry disregarded), each with its functional groups identified by the generalized Ertl algorithm as canonical SMILES codes (stereochemistry disregarded), in CSV format.

Fragments_Scaffold_Generator.csv: Download the molecular scaffolds of COCONUT natural products, with their frequencies, as canonical SMILES codes (stereochemistry disregarded), ordered by decreasing frequency.

Items_Scaffold_Generator.csv: Download COCONUT natural products (COCONUT ID + canonical SMILES codes; stereochemistry disregarded), each with its molecular scaffolds as canonical SMILES codes (stereochemistry disregarded), in CSV format.

Fragments_Scaffold_Generator_Scafold_tree.csv: Download the molecular scaffolds and parent scaffolds of COCONUT natural products, generated following the scaffold tree methodology, with their frequencies, as canonical SMILES codes (stereochemistry disregarded), ordered by decreasing frequency.

Items_Scaffold_Generator_Scafold_tree.csv: Download COCONUT natural products (COCONUT ID + canonical SMILES codes; stereochemistry disregarded), each with its molecular scaffolds and parent scaffolds, generated following the scaffold tree methodology, as canonical SMILES codes (stereochemistry disregarded), in CSV format.
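
As an illustration of how one of the Items files might be processed, the following Python sketch tallies fragments across molecules. The column name `fragments`, the `;` separator between fragments, and the sample rows are all assumptions made for this example, so check the header of the downloaded file and adjust them. The sketch counts each fragment once per molecule, which is only one possible definition of frequency and may differ from how the published Fragments files were computed.

```python
import csv
import io
from collections import Counter


def count_fragments(handle, fragments_column="fragments", separator=";"):
    """Count in how many molecules each fragment SMILES occurs.

    `fragments_column` and `separator` are hypothetical defaults; the real
    files may use a different header and layout.
    """
    counts = Counter()
    for row in csv.DictReader(handle):
        # Use a set so a fragment repeated within one molecule counts once.
        fragments = {
            part.strip()
            for part in row[fragments_column].split(separator)
            if part.strip()
        }
        counts.update(fragments)
    # Sort by decreasing frequency, then alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy rows with made-up identifiers and simplified fragment strings.
SAMPLE = """identifier,smiles,fragments
mol-1,CCO,O
mol-2,CC(=O)O,C(=O)O
mol-3,OCC(=O)O,O;C(=O)O
mol-4,OCCO,O;O
"""

for fragment, frequency in count_fragments(io.StringIO(SAMPLE)):
    print(fragment, frequency)
```

To run it on a downloaded file, pass an open file handle instead of the in-memory sample, for example `open("Items_Ertl_algorithm.csv", newline="")`.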

Version: August 2024 | Drug Discovery, HTS, Deep Learning

Drug discovery

Synthetic feasibility and NP-likeness scores guide drug discovery by prioritizing compounds that are easier to synthesize and structurally similar to natural products, enhancing the efficiency of high-throughput screening (HTS) and deep learning models in identifying promising drug candidates.

Download a text file containing the latest COCONUT IDs, SMILES, NP-likeness scores, synthetic accessibility scores, and QED drug-likeness scores, all calculated using RDKit utilities.
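
One way to use such a file is to filter it by score thresholds. The sketch below assumes a tab-separated file with the hypothetical column names `identifier`, `smiles`, `np_likeness`, `sa_score`, and `qed`; the real header and delimiter may differ, and the thresholds are arbitrary illustrations rather than recommendations. It relies on the usual conventions that the synthetic accessibility score runs from 1 (easy to make) to 10 (hard to make) and that QED lies between 0 and 1.

```python
import csv
import io


def prioritize(handle, min_np=0.0, max_sa=4.0, min_qed=0.5, delimiter="\t"):
    """Keep rows that pass all three thresholds, best QED first.

    Column names and default thresholds are assumptions for this sketch.
    """
    kept = []
    for row in csv.DictReader(handle, delimiter=delimiter):
        np_score = float(row["np_likeness"])
        sa_score = float(row["sa_score"])
        qed = float(row["qed"])
        # Higher NP-likeness and QED are better; lower SA score is better.
        if np_score >= min_np and sa_score <= max_sa and qed >= min_qed:
            kept.append((row["identifier"], np_score, sa_score, qed))
    kept.sort(key=lambda item: item[3], reverse=True)
    return kept


# Toy rows: identifiers and scores are invented, not real computed values.
SAMPLE = (
    "identifier\tsmiles\tnp_likeness\tsa_score\tqed\n"
    "mol-1\tCCO\t0.8\t1.9\t0.41\n"
    "mol-2\tOc1ccccc1\t1.2\t2.5\t0.62\n"
    "mol-3\tCC(=O)O\t-0.5\t1.5\t0.70\n"
    "mol-4\tOCC1OC(O)C(O)C(O)C1O\t2.1\t5.3\t0.55\n"
    "mol-5\tCC(C)Cc1ccccc1\t0.3\t3.0\t0.75\n"
)

for identifier, np_score, sa_score, qed in prioritize(io.StringIO(SAMPLE)):
    print(identifier, np_score, sa_score, qed)
```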

Version: September 2024 | Mass Spec

Small Molecule Mass Spec Research

Leverage the molecular formulas and weights of natural products to identify and characterize novel compounds, enabling the exploration of bioactive molecules with potential therapeutic applications.
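
A minimal sketch of this kind of lookup: compute the monoisotopic mass of each candidate formula and keep those whose protonated ion matches an observed m/z within a ppm tolerance. The small element table, the choice of the [M+H]+ adduct, the 5 ppm default, and the function names are assumptions for illustration; a real workflow would cover more elements and adducts and would read formulas from the downloaded data.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, rounded to 6 decimals.
# Only a few common elements are included in this sketch.
MONOISOTOPIC = {
    "C": 12.000000,
    "H": 1.007825,
    "N": 14.003074,
    "O": 15.994915,
    "P": 30.973762,
    "S": 31.972071,
}
PROTON = 1.007276  # proton mass, added to form the [M+H]+ ion


def monoisotopic_mass(formula):
    """Mass of a simple formula such as 'C8H10N4O2' (no brackets or charges).

    Raises KeyError for elements missing from the table above.
    """
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def ppm_error(observed, theoretical):
    """Signed deviation of the observed value in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_candidates(observed_mz, formulas, tolerance_ppm=5.0):
    """Return (formula, theoretical [M+H]+ m/z, ppm error) for matches."""
    hits = []
    for formula in formulas:
        theoretical_mz = monoisotopic_mass(formula) + PROTON
        error = ppm_error(observed_mz, theoretical_mz)
        if abs(error) <= tolerance_ppm:
            hits.append((formula, round(theoretical_mz, 4), round(error, 2)))
    return hits


# Caffeine (C8H10N4O2) and glucose (C6H12O6) as candidate formulas.
print(match_candidates(195.0877, ["C8H10N4O2", "C6H12O6"]))
```

Matching on mass alone cannot separate isomers, since they share a formula, so a hit list like this is a starting point for further evidence such as fragmentation spectra.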