Abstract
Authentication of Cannabis products is increasingly important for assuring manufacturing quality as consumption and regulation grow. In this report, a two-stage pipeline was developed for high-throughput screening and chemotyping of spectra from two sets of botanical extracts from the Cannabis genus. The first set contains different marijuana samples, which have higher concentrations of tetrahydrocannabinol (THC). The other set comprises samples of hemp, a variety of Cannabis sativa with a THC concentration below 0.3%. The first stage applies class modeling to determine whether a spectrum belongs to marijuana or hemp and to reject novel spectra that may be neither. An automatic soft independent modeling of class analogy (aSIMCA) that self-optimizes the number of principal components and the decision threshold is used in this first stage to achieve excellent efficiency and efficacy. Once spectra are recognized by the aSIMCA as marijuana or hemp, they are routed to the appropriate classifiers in the second stage for chemotyping, i.e., assigning the spectra to chemotypes so that the pharmacological properties and cultivars of the samples can be recognized. Three multivariate classifiers, a fuzzy rule-building expert system (FuRES), super partial least squares-discriminant analysis (sPLS-DA), and support vector machine tree type entropy (SVMtreeH), are employed for chemotyping. The discriminant ability of the pipeline was evaluated with different spectral datasets of these two botanical sets, including proton nuclear magnetic resonance, mass, and ultraviolet spectra. All evaluations gave accuracies greater than 95%, which demonstrates the promise of the pipeline for automated high-throughput screening and chemotyping of marijuana and hemp, as well as other botanical products.
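The two-stage routing described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class model below is a simplified SIMCA-style PCA model with a hand-set number of components and a hypothetical residual threshold rule (the aSIMCA in the text optimizes both automatically), and a nearest-centroid classifier stands in for FuRES, sPLS-DA, and SVMtreeH. The spectra, peak positions, and chemotype labels (`chemotype_1`, `chemotype_2`) are synthetic and invented purely for illustration.

```python
import numpy as np


class ClassModel:
    """Simplified SIMCA-style class model: a PCA subspace plus a residual limit.

    Assumption: n_components and the threshold rule are fixed by hand here,
    whereas the aSIMCA described in the text optimizes both automatically.
    """

    def __init__(self, n_components=2, quantile=0.95, slack=3.0):
        self.n_components = n_components
        self.quantile = quantile
        self.slack = slack

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        # Rows of Vt are the principal-component loadings of the centered data.
        _, _, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.loadings_ = Vt[: self.n_components]
        # Hypothetical rule: a multiple of a training-residual quantile.
        self.threshold_ = self.slack * np.quantile(self.residual(X), self.quantile)
        return self

    def residual(self, X):
        """Sum of squared residuals after projection onto the class subspace."""
        Xc = X - self.mean_
        recon = Xc @ self.loadings_.T @ self.loadings_
        return ((Xc - recon) ** 2).sum(axis=1)


class NearestCentroid:
    """Placeholder second-stage classifier (stand-in for FuRES, sPLS-DA, SVMtreeH)."""

    def fit(self, X, y):
        self.labels_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.labels_])
        return self

    def predict(self, X):
        d = ((X[:, None, :] - self.centroids_[None]) ** 2).sum(axis=2)
        return self.labels_[d.argmin(axis=1)]


def screen_and_chemotype(X, class_models, chemotypers):
    """Stage 1: accept or reject each spectrum; stage 2: chemotype accepted ones."""
    results = []
    for x in X:
        x = x[None, :]
        # Residual relative to each class threshold; a ratio <= 1 means accepted.
        ratios = {n: m.residual(x)[0] / m.threshold_ for n, m in class_models.items()}
        best = min(ratios, key=ratios.get)
        if ratios[best] > 1.0:
            results.append(("rejected", None))  # novel: neither class accepts it
        else:
            results.append((best, str(chemotypers[best].predict(x)[0])))
    return results


# Synthetic "spectra": Gaussian peaks on a 40-point axis (illustrative only).
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 40)


def peak(center, width=0.05):
    return np.exp(-0.5 * ((axis - center) / width) ** 2)


def make_class(base, markers, n=30):
    X, y = [], []
    for label, marker in markers.items():
        scale = rng.uniform(0.8, 1.2, size=(n, 1))
        X.append(base + scale * marker + rng.normal(0.0, 0.01, size=(n, axis.size)))
        y += [label] * n
    return np.vstack(X), np.array(y)


profiles = {
    "marijuana": (peak(0.2), {"chemotype_1": peak(0.35), "chemotype_2": peak(0.45)}),
    "hemp": (peak(0.8), {"chemotype_1": peak(0.6), "chemotype_2": peak(0.7)}),
}
class_models, chemotypers = {}, {}
for name, (base, markers) in profiles.items():
    X_train, y_train = make_class(base, markers)
    class_models[name] = ClassModel().fit(X_train)
    chemotypers[name] = NearestCentroid().fit(X_train, y_train)

queries = np.vstack([
    peak(0.2) + peak(0.35),  # marijuana-like, first chemotype
    peak(0.8) + peak(0.7),   # hemp-like, second chemotype
    peak(0.5),               # resembles neither class
])
results = screen_and_chemotype(queries, class_models, chemotypers)
print(results)
```

In this sketch the rejection step is what distinguishes class modeling from ordinary classification: a spectrum that fits neither PCA subspace is flagged as novel rather than forced into the closer class, and only accepted spectra reach the class-specific chemotyping model.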