Laboratory Data Transparency

We at Confidence Analytics are huge fans of Transparency. While we take CONFIDENTIALITY of our customer data very seriously, we also want our database to be a valuable tool for our industry. Here, we present some anonymized data in a series of visual representations that we hope will help educate and prompt research questions among our viewers. If nothing else, we hope you will appreciate that marijuana data can be beautiful.

If you have questions for us, have a comment you’d like us to post here, or just want to start a dialogue (we even welcome criticism), contact us at:

Please check back on this page, as we will continue to update it with new information.

Flower Potency Over Time


This chart depicts the THC Total values for all flower samples submitted to Confidence Analytics and reported to the WSLCB’s seed-to-sale traceability system to date. As you can see, the average holds steady at about 18%, with a slight dip each year in the late fall. Typically, the central 50% of the distribution falls between 15% and 20%.
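For readers who want to compute the same kind of summary on their own results, here is a minimal sketch of how an average and a “central 50%” range (the span between the first and third quartiles) can be calculated. The values in thc_total are made up for illustration and are not taken from our database, and the helper name central_half is our own invention.

```python
from statistics import mean, quantiles

# Hypothetical THC Total results (percent by weight); NOT real lab data.
thc_total = [12.1, 14.8, 15.5, 16.9, 17.6, 18.2, 18.9, 19.4, 20.3, 22.7, 24.0]


def central_half(values):
    """Return (Q1, Q3), the bounds of the central 50% of the values."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


q1, q3 = central_half(thc_total)
average = mean(thc_total)
print(f"average {average:.1f}%, central 50% from {q1:.1f}% to {q3:.1f}%")
```

Half of the samples fall between the two quartiles, which is why that range is a handy companion to the average when describing a spread of potency values.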


Microbiology and Failure Rates


Of principal concern to many growers is the risk of failure in the microbiology test required by the legal marijuana market. Products are required to be tested for certain microbiological contaminants, and values above a certain threshold result in failure of the material to meet the “fit for consumption” criteria. Failures can lead to dramatic loss of revenue and can put a business out of production. Rather than look for ways around the microbial test, we at Confidence look to our data for clues as to why the product has failed, and we work closely with our clients to help them develop plans to avoid more failures in future lots.

When samples are failing, it’s usually caused by a systemic problem in the production or processing facility. In our database, microbial values are highly correlated with producer name, indicating that different marijuana production and processing facilities have different average levels of contamination and therefore different probabilities of failure. Some growers never fail… they never even come close to failing. Others struggle with it repeatedly.

In our experience, the contamination usually takes place at the time of trimming and so that’s where we tend to start our investigations. When crops are failing, we send our scientists out to grow operations all around the state to work collaboratively with the marijuana manufacturing staff to identify possible sources of contamination. Sometimes it’s easy to find and fix; sometimes it’s not. We’ve observed that by thinking critically about the issue, most producer/processors have been able to make adjustments to their facility or process that lead to reductions in failure rates over time, sometimes drastic reductions.


Failure rates for the microbial test have fallen dramatically over time. There are certainly numerous factors that contribute to the improvement in Pass rates, but probably the biggest contributor is simply that producer/processors have been made aware of their microbial bioburdens and they’ve taken steps to reduce their risk of contamination. That’s Quality Assurance at work.

The first thing that jumps out about these scatter plots is that the distribution of microbial values is very wide. Some flower samples have microbial burdens so high that we can’t even count them, and we just record those values as “999,999” or “too numerous to count.” You can see those samples clustered up at the very top of the scatter plots. Other samples have so few microbes on them that we aren’t able to detect any. That’s not a surprise. It’s just the nature of the beast.

Because the distribution is so wide compared to the failure threshold, it might look like a lot of the samples are failing. In reality, more than 85% of the samples are below the failure threshold and are a PASS in the State traceability system. In fact, most of the samples are clustered near the zero mark because most samples are pretty clean.

Visualizing the trend in the scatter plots might take a little imagination at first, but you’ll notice that the left side of the graph has more high values than the right side. To make the trend easier to see, we’ve included a moving 60-day failure rate to give you a rough idea of the change over time. We’ve also fit our data to a logistic regression model, which uses test date to predict probability of failure. The model estimates that the failure rate decays by 0.271% of its existing value per day (p-value = 1.35e-46; 95% confidence interval = 0.234% to 0.308%). We’ve gone from a failure rate of roughly 12% to a failure rate of roughly 3.5% in just over 15 months, and we predict it’ll get better still as the industry improves its practices.
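To make the two trend measures a little more concrete, here is a rough sketch in Python. It is not our production analysis. The first function computes a failure rate over a trailing 60-day window from (test date, failed) pairs, and the second treats the estimated decay as a constant proportional drop per day, which is a simplification of a logistic model that works reasonably well when failure rates are small. The function names, the example dates, and the 456-day figure standing in for “just over 15 months” are all assumptions made for illustration.

```python
import math
from datetime import date, timedelta

DAILY_DECAY = 0.00271  # 0.271% of the existing failure rate per day (from the text)


def moving_failure_rate(results, end_day, window_days=60):
    """Share of failed tests in the `window_days` ending on `end_day` (inclusive).

    `results` is an iterable of (test_date, failed) pairs.
    Returns None when no samples fall inside the window.
    """
    start_day = end_day - timedelta(days=window_days - 1)
    in_window = [failed for day, failed in results if start_day <= day <= end_day]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)


def projected_failure_rate(start_rate, days, daily_decay=DAILY_DECAY):
    """Failure rate after `days`, shrinking by `daily_decay` of its value each day."""
    return start_rate * (1.0 - daily_decay) ** days


def halving_time(daily_decay=DAILY_DECAY):
    """Days needed for the failure rate to fall to half of its current value."""
    return math.log(0.5) / math.log(1.0 - daily_decay)


# Hypothetical test results; NOT real lab data.
example_results = [
    (date(2015, 1, 1), True),
    (date(2015, 1, 20), False),
    (date(2015, 2, 10), False),
    (date(2015, 6, 1), True),
]
window_rate = moving_failure_rate(example_results, date(2015, 2, 15))
print(f"60-day failure rate: {window_rate:.0%}")

days = 456  # assumption: roughly 15 months
print(f"{projected_failure_rate(12.0, days):.1f}% after {days} days")
print(f"failure rate halves roughly every {halving_time():.0f} days")
```

Starting from 12% and applying the estimated daily decay for about 15 months lands close to the 3.5% figure quoted above, which is a useful sanity check on the model.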

Cannabinoid Total Compared to Individual Cannabinoids


Unlike most pharmaceuticals, marijuana is not a single-molecule drug. It contains a vast collection of pharmacologically active constituents, which interact inside the body to enhance, inhibit, and modify each other’s effects. How the marijuana industry measures potency or dosage of a product has repeatedly been a matter of contention, and it’s clear that the way cannabinoids are expressed on package labels is a point of confusion for many consumers. As scientists who spend our time measuring hundreds of marijuana samples, we often find ourselves scratching our heads about what exactly people mean when they say “potency”.

One thing’s for sure: marijuana contains cannabinoids (and terpenoids) and those things affect your body. The marijuana market is naturally interested in understanding how those cannabinoids are distributed among the population of cannabis plants, and how to measure the effects of each varietal so as to advertise a potency for each strain.

Take a look at the distribution of cannabinoid values for all flower samples tested by Confidence Analytics to date (2016-01-04). Below is that distribution in a scatterplot which depicts THC, THCA, Available THC (also known as “Total Active THC” or “THCmax”), Available CBD, and Cannabinoid Total for all flower samples tested. Beneath the scatterplot, we’ve included an excerpt from one of our Certificates of Analysis and we point to where on the scatterplot the values from that one certificate reside.

Figure: certificate distribution

Notice first that every point on the scatterplot is unique… just like every sample is unique. These samples come from far and wide and they represent many different strains grown by many different producers at many different points in time.

Next, notice that each cannabinoid appears to have a normal distribution of values. Not only that, the middle three values are clearly two distributions each. THC is probably also two distributions, but it’s hard to see that given the extremely small lower limit. Cannabinoid Total, which is a sum of all the measurable cannabinoids, is the only result that is a single distribution. The dual distributions are caused by the distinction between THC-heavy strains (most of the data) and CBD-heavy strains. Cannabinoid Total is a measure of cannabinoid resin content in the plant material, and that value is distributed normally across all cannabis plants. The other four are measures of discrete molecules and their distributions are much more strain dependent.
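For anyone curious how the summary values relate to the individual measurements, here is a small sketch. It assumes the commonly used convention that an “available” value adds the neutral cannabinoid to the acid form multiplied by about 0.877, the molecular weight ratio that accounts for mass lost when THCA or CBDA decarboxylates. It also assumes that Cannabinoid Total is a plain sum of everything measured. Both are assumptions made for illustration, not a statement of how any particular certificate is calculated, and the sample values are hypothetical.

```python
# Assumption: fraction of mass remaining when an acidic cannabinoid (THCA, CBDA)
# decarboxylates, i.e. the ratio of molecular weights (about 314.5 / 358.5).
DECARB_FACTOR = 0.877


def available(neutral_pct, acid_pct, factor=DECARB_FACTOR):
    """Neutral cannabinoid plus the portion of the acid form left after heating."""
    return neutral_pct + factor * acid_pct


def cannabinoid_total(measured):
    """Assumed here to be a simple sum of every measured cannabinoid (percent)."""
    return sum(measured.values())


# Hypothetical flower sample (percent by weight); NOT a real certificate.
sample = {"THC": 0.8, "THCA": 20.5, "CBD": 0.1, "CBDA": 0.2, "CBG": 0.4}

available_thc = available(sample["THC"], sample["THCA"])
available_cbd = available(sample["CBD"], sample["CBDA"])
total = cannabinoid_total(sample)

print(f"Available THC:     {available_thc:.2f}%")
print(f"Available CBD:     {available_cbd:.2f}%")
print(f"Cannabinoid Total: {total:.2f}%")
```

Under these assumptions, a THC-heavy sample and a CBD-heavy sample can share a similar Cannabinoid Total while their Available THC and Available CBD differ widely, which fits the single versus dual distributions described above.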

In case you’re wondering, the two numbers on the certificate that have boxes around them are the only two cannabinoid values required to be on the package label at retail. You are not prohibited from putting other numbers from the cert on your label, so long as you label them accurately, but the two numbers in boxes are required. We encourage our customers to also include Cannabinoid Total on their label and highlight it as the summary measure of cannabinoid content.

That said, we’ll also offer up our strong opinion that none of the numbers on the Certificate of Analysis are yet well correlated with “potency”. Potency is a subjective judgement that depends on the biochemistry of the user. It is controlled by complex interactions between the human and dozens of different molecules, including both phyto- and endo-cannabinoids, and is probably also modified by terpenoids. Labs like ours can measure cannabinoid/terpenoid content of a sample material, but potency is something you feel in your mind and body. The two things are different. The complicated dose-response relationship of marijuana is not yet well characterized and that presents a major obstacle to patients, care providers, and recreational users alike. More on that in a future post.