Command Line Interface
Installing the smartclass package automatically installs the smartclass command.
See smartclass --help for usage details.
smartclass
Smartclass - Classify chemical structures using SMARTS patterns.
A tool for classifying chemical structures against SMARTS-based chemical class definitions from Wikidata or custom sources.
# Query Wikidata for chemical classes
smartclass querywikidata -q query.rq -o output.tsv
Usage
smartclass [OPTIONS] COMMAND [ARGS]...
Options
- --version
Show the version and exit.
- -v, --verbose
Increase verbosity (-v for INFO, -vv for DEBUG).
combinecsvfiles
Combine multiple CSV/TSV files into one.
Merges rows from multiple input files with the same schema.
Usage
smartclass combinecsvfiles [OPTIONS]
Options
- -i, --input-file <input_file>
Required. Input CSV/TSV file(s) to combine.
- -o, --output <output>
Required. Output file path.
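To make "merges rows from multiple input files with the same schema" concrete, here is a minimal Python sketch of that kind of merge. It is an illustration only, not smartclass's implementation: the function name combine_rows is hypothetical, and rejecting files whose headers differ is an assumption about how a schema mismatch might be handled.

```python
import csv
import io


def combine_rows(sources, delimiter="\t"):
    """Concatenate data rows from several delimited texts sharing one header."""
    header = None
    rows = []
    for text in sources:
        reader = csv.reader(io.StringIO(text), delimiter=delimiter)
        file_header = next(reader)
        if header is None:
            # The first file defines the schema.
            header = file_header
        elif file_header != header:
            # Assumption: a differing header is treated as an error.
            raise ValueError(f"schema mismatch: {file_header} != {header}")
        rows.extend(reader)
    return header, rows


# Two tiny in-memory TSV "files" with the same columns.
a = "class\tstructure\nQ1\tC=O\n"
b = "class\tstructure\nQ2\tC#N\n"
header, rows = combine_rows([a, b])
print(header, rows)
```

The header is kept once and the data rows are appended in input order, which is the behavior one would usually expect from a row-wise merge.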
getlatestchembl
Download and process the latest ChEMBL database.
Generates fingerprints for molecules in ChEMBL for use in classification and similarity searches.
Usage
smartclass getlatestchembl [OPTIONS]
Options
- -f, --fp-len <fp_len>
Fingerprint length for molecular fingerprints.
- Default:
2048
- -m, --max-atoms <max_atoms>
Maximum number of atoms in molecules to process.
- Default:
50
- -r, --report-interval <report_interval>
Progress reporting interval (number of molecules).
- Default:
50000
- -t, --tautomer-fingerprints, --no-tautomer-fingerprints
Include tautomer fingerprints in output.
loadpkgdata
Load and display bundled package data.
Shows the chemical classes, mappings, and MIA data included with the smartclass package.
Usage
smartclass loadpkgdata [OPTIONS]
querywikidata
Query Wikidata using SPARQL and export results.
Execute a SPARQL query against Wikidata and optionally apply chemical transformations to the results.
Usage
smartclass querywikidata [OPTIONS]
Options
- -q, --query <query>
Path to SPARQL query file (.rq). If both --query and --output are omitted, the command runs the default class query batch.
- -o, --output <output>
Output file path (CSV or TSV, chosen by file extension). Requires --query. If both --query and --output are omitted, the command runs the default class query batch.
- -r, --remove-prefix, --keep-prefix
Remove Wikidata entity prefix from results.
- -t, --transform <transform>
Apply a transformation to query results (e.g., check_smiles, transform_inchi_to_inchikey, transform_smiles_to_inchi).
- Options:
check_smiles | transform_inchi_to_inchikey | transform_inchi_to_mass | transform_inchi_to_smiles_canonical | transform_inchi_to_smiles_isomeric | transform_smiles_to_formula | transform_smiles_to_inchi | transform_smiles_to_mass | transform_smiles_i_to_smiles_c | transform_formula_to_formula
- -u, --url <url>
SPARQL endpoint URL.
- Default:
'https://query.wikidata.org/sparql'
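As a hedged sketch of how these options fit together, the following Python snippet writes a small SPARQL query to a .rq file and assembles a querywikidata command line from the flags documented above. The query itself (Wikidata property P233, canonical SMILES), the helper names build_query_command and strip_entity_prefix, and the exact prefix string are assumptions made for illustration; they are not taken from smartclass. The command is only printed, not executed.

```python
import tempfile
from pathlib import Path

# Example query (an assumption): items with a canonical SMILES (P233).
QUERY = """\
SELECT ?item ?smiles WHERE {
  ?item wdt:P233 ?smiles .
}
LIMIT 10
"""

# Assumed form of the Wikidata entity prefix that --remove-prefix strips.
WD_ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def build_query_command(query_path, output_path, remove_prefix=True):
    """Assemble the argument list using only the documented flags."""
    cmd = ["smartclass", "querywikidata",
           "-q", str(query_path), "-o", str(output_path)]
    cmd.append("--remove-prefix" if remove_prefix else "--keep-prefix")
    return cmd


def strip_entity_prefix(value):
    """Illustrate prefix removal on one result value (Python 3.9+)."""
    return value.removeprefix(WD_ENTITY_PREFIX)


workdir = Path(tempfile.mkdtemp())
query_file = workdir / "query.rq"
query_file.write_text(QUERY, encoding="utf-8")

cmd = build_query_command(query_file, workdir / "output.tsv")
print(" ".join(cmd))
# To run it for real (requires smartclass and network access):
#   import subprocess; subprocess.run(cmd, check=True)

print(strip_entity_prefix("http://www.wikidata.org/entity/Q2270"))  # -> Q2270
```

The .tsv extension on the output path follows the option description above, which says the output format is chosen by file extension.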
searchclasses
Classify chemical structures against SMARTS-based classes.
Match input SMILES strings against chemical class definitions using substructure searching. Results include the matched class and structural similarity metrics.
Usage
smartclass searchclasses [OPTIONS]
Options
- -c, --classes-file <classes_file>
TSV file with chemical class definitions (class, structure columns).
- -d, --classes-name-id <classes_name_id>
Column name for class identifiers.
- Default:
'class'
- -e, --classes-name-smarts <classes_name_smarts>
Column name for SMARTS patterns.
- Default:
'structure'
- -f, --include-hierarchy, --no-hierarchy
Use the chemical class hierarchy for faster breadth-first (BFS) searching.
- -i, --input-smiles <input_smiles>
Input file containing SMILES (CSV/TSV with ‘smiles’ column).
- -s, --smiles <smiles>
SMILES string(s) to classify. Can be specified multiple times.
- -z, --closest-only, --all-matches
Return only closest matching class per structure.
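The --include-hierarchy option mentions BFS-based searching without describing the traversal. The Python sketch below is a generic illustration of why a class hierarchy can reduce the number of pattern tests, not a description of smartclass's internals. It assumes that a structure which fails a parent class cannot match any of its subclasses, and it replaces real SMARTS substructure matching with a toy stand-in (a set of required feature tokens). All class names and data are hypothetical.

```python
from collections import deque

# Hypothetical toy hierarchy: parent class -> child classes.
CHILDREN = {
    "root": ["alcohol", "amine"],
    "alcohol": ["primary alcohol", "phenol"],
    "amine": ["primary amine"],
}

# Stand-in for SMARTS patterns: the feature tokens each class requires.
REQUIRED = {
    "alcohol": {"OH"},
    "primary alcohol": {"OH", "CH2"},
    "phenol": {"OH", "aromatic"},
    "amine": {"N"},
    "primary amine": {"N", "NH2"},
}


def matches(class_id, features):
    """Toy replacement for a substructure match."""
    return REQUIRED[class_id] <= features


def bfs_classify(features):
    """Breadth-first search that descends only into matching classes."""
    matched, tested = [], 0
    queue = deque(CHILDREN["root"])
    while queue:
        class_id = queue.popleft()
        tested += 1
        if matches(class_id, features):
            matched.append(class_id)
            # Children are only worth testing when the parent matched.
            queue.extend(CHILDREN.get(class_id, []))
    return matched, tested


matched, tested = bfs_classify({"OH", "aromatic"})
print(matched, tested)  # -> ['alcohol', 'phenol'] 4
```

In this toy run, 4 of the 5 classes are tested: "primary amine" is skipped because its parent "amine" did not match. With a deep hierarchy, pruning of this kind is what would make a hierarchy-aware search faster than testing every class.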
searchclasses-all-sources
Classify structures against SMARTS, SMILES, and CXSMILES class sources.
This command runs three classification passes and combines the results, adding a class_source field (smarts, smiles, cxsmiles).
Usage
smartclass searchclasses-all-sources [OPTIONS]
Options
- --classes-smarts-file <classes_smarts_file>
TSV file with SMARTS-based class definitions.
- Default:
'scratch/wikidata_classes_smarts.tsv'
- --classes-smiles-file <classes_smiles_file>
TSV file with SMILES-based class definitions.
- Default:
'scratch/wikidata_classes_smiles.tsv'
- --classes-cxsmiles-file <classes_cxsmiles_file>
TSV file with CXSMILES-based class definitions.
- Default:
'scratch/wikidata_classes_cxsmiles.tsv'
- -f, --include-hierarchy, --no-hierarchy
Use the chemical class hierarchy for faster breadth-first (BFS) searching.
- -i, --input-smiles <input_smiles>
Input file containing SMILES (CSV/TSV with ‘smiles’ column).
- -s, --smiles <smiles>
SMILES string(s) to classify. Can be specified multiple times.
- -z, --closest-only, --all-matches
Return only closest matching class per structure.
- --output-dir <output_dir>
Directory for combined output files.
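The description of searchclasses-all-sources says that results from three passes are combined and tagged with a class_source field. A minimal Python sketch of that tagging step is shown below. Only the class_source field and its three values (smarts, smiles, cxsmiles) come from the text above; the function name combine_passes and the other row fields are hypothetical, and the actual output layout of smartclass may differ.

```python
def combine_passes(results_by_source):
    """Merge per-source result rows, tagging each row with its source."""
    combined = []
    for source, rows in results_by_source.items():
        for row in rows:
            # Copy the row and record which pass produced it.
            combined.append({**row, "class_source": source})
    return combined


# Hypothetical results from the three classification passes.
results = {
    "smarts": [{"smiles": "CCO", "class": "Q1"}],
    "smiles": [],
    "cxsmiles": [{"smiles": "CCO", "class": "Q2"}],
}

combined = combine_passes(results)
for row in combined:
    print(row)
```

Keeping the source as a column, rather than writing three separate tables, lets downstream code filter or group matches by the kind of class definition that produced them.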