Cannabis Strain Study

Brian420pm

I pulled data from 3 websites early this year... Leafly, CannaSOS and AllBud. Short of hands-on experience, three variables are useful for deciding what deserves attention and further investigation, and ultimately for driving future decisions, be it purchases, stocking shelves or home grow candidates.

- Popularity
- Positive Ratings
- Negative Ratings


Popularity by itself is little more than following the herd... you'll win some and lose some. As part of a larger process, Popularity can be seen as a filter... a way to increase surety in your decisions down the road.

Data Rule #1 = Look at strains that have more than 1,000 reviews at Leafly.

Positive Ratings alone also tell a partial tale. Let's look at strains that are BOTH Popular and have Positive Ratings.

Data Rule #2 = Of the strains that pass Data Rule #1, keep all those that have at least a 4.4 review score.

This is a pretty exclusive and interesting group... around 30 strains make the grade. If we stop here, chances are good you'll have a good experience, right? Well, not so fast!

Let's look at the last variable that ties it all together... Negative Ratings. Leafly ratings measure how many people report a strain as "Below Average" and "Would Not Recommend". I pulled this data for each of the 30 strains and calculated what percentage of reviewers reported those Negative Ratings.

Data Rule #3 = Of the strains that pass Data Rule #1 and #2, keep all those that have 10% or less "Below Average" and "Would Not Recommend".

NOW the cream is really rising to the top! Can you believe there were strains that passed Rule #1 and #2 AND had a negative rating of over 30%! Think about that... almost a third of all reviewers gave some of these popular and highly rated strains a thumbs down! This is the perfect example of the saying, "You either LOVE it or HATE it".
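
For anyone who wants to automate the three Data Rules, here is a rough Python sketch. It is only an illustration, not the procedure used for the study (that was done by hand). The `Strain` class, its field names and every number in `sample` are made up for the example; only the thresholds (more than 1,000 reviews, at least 4.4 stars, 10% or less negative) come from the rules above.

```python
from dataclasses import dataclass


@dataclass
class Strain:
    name: str
    reviews: int           # total number of reviews at one site
    avg_rating: float      # average star rating
    negative_reviews: int  # "Below Average" + "Would Not Recommend" counts

    @property
    def negative_pct(self) -> float:
        # Share of reviewers who gave a negative rating, as a percentage
        return 100.0 * self.negative_reviews / self.reviews


def passes_rules(s, min_reviews=1000, min_rating=4.4, max_negative_pct=10.0):
    """Apply Data Rules #1, #2 and #3 in order."""
    return (
        s.reviews > min_reviews                  # Rule #1: popularity
        and s.avg_rating >= min_rating           # Rule #2: positive ratings
        and s.negative_pct <= max_negative_pct   # Rule #3: negative ratings
    )


# Hypothetical strains and numbers, for illustration only
sample = [
    Strain("Strain A", 2500, 4.6, 50),   # 2% negative, passes all three
    Strain("Strain B", 1800, 4.5, 600),  # about 33% negative, fails Rule #3
    Strain("Strain C", 900, 4.8, 9),     # too few reviews, fails Rule #1
    Strain("Strain D", 3000, 4.3, 90),   # rating too low, fails Rule #2
    Strain("Strain E", 1200, 4.4, 120),  # exactly 10% negative, passes
]

# Keep the strains that pass, sorted by the smallest negative score
shortlist = sorted(
    (s for s in sample if passes_rules(s)),
    key=lambda s: s.negative_pct,
)

for s in shortlist:
    print(f"{s.name}: {s.negative_pct:.1f}% negative")
```

The "Strain B" row is the "LOVE it or HATE it" case: popular and highly rated, but dropped by Rule #3.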

That leads us to the list below, which shows the method above applied to all three review websites, sorted by the smallest negative score. Also, since Leafly has more data than the other two, the scores are weighted accordingly.
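
The exact weighting isn't spelled out here, so the following Python sketch is just one plausible reading: each site's negative percentage is weighted by how many reviews that site has for the strain. The function name `weighted_negative_pct` and the example counts are assumptions for illustration, not the figures or formula behind the list.

```python
def weighted_negative_pct(site_data):
    """Combine per-site negative percentages, weighted by review count.

    site_data is a list of (review_count, negative_pct) pairs, one per site.
    Assumption: weights are proportional to each site's review count.
    """
    total_reviews = sum(count for count, _ in site_data)
    if total_reviews == 0:
        raise ValueError("no reviews to weight")
    return sum(count * pct for count, pct in site_data) / total_reviews


# Hypothetical numbers: one large site and two smaller ones
example = [(2000, 1.0), (300, 4.0), (200, 6.0)]
score = weighted_negative_pct(example)
print(f"weighted negative score: {score:.2f}%")
```

With these made-up inputs the site with 2,000 reviews dominates, so the combined score stays much closer to its 1% than a plain average of the three percentages would.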

The "3 Month Growth" column is simply the percentage increase in reviews at Leafly over a recent, three month period.
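
As a small Python sketch, the growth figure is just the percentage change in review count between two snapshots. The function name `pct_growth` and the sample counts are hypothetical.

```python
def pct_growth(start_count, end_count):
    """Percentage increase in review count between two snapshots."""
    if start_count <= 0:
        raise ValueError("start_count must be positive")
    return 100.0 * (end_count - start_count) / start_count


# Hypothetical example: 1,000 reviews at the start, 1,140 three months later
growth = pct_growth(1000, 1140)
print(f"3 month growth: {growth:.1f}%")
```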

Notes:

- GG4 This strain has the lowest negative score at Leafly, by far, at around 1%! The next closest was many multiples higher at 5%. I initially did this study before a trip to Colorado, so of course I bought GG4. After 35+ years away from consuming, I picked a winner to come back to. :)

- GELATO I read an article recently about how this strain is coming on strong. Well, look at the growth rate: it towers above all in this list at 14% popularity growth. These spikes may be explained by supply/demand surges, for example a temporary glut on the market. Seeing this metric over the course of a year will be most interesting.

- The Top 5 in 3-month popularity growth from this list are GELATO, PINEAPPLE EXPRESS, DURBAN POISON, SUPER LEMON HAZE and CHERRY PIE. Consumers and retailers TAKE NOTE, these strains are WINNERS and currently HOT!

- I'm NOT commercially affiliated with any brand or product, just a retired Data Analyst/Programmer keeping my nose in what I find interesting :)

Did YOUR favorite make the cut?

[Attached image 1775168]
 
Love this type of stuff.

Woo! I do this in most aspects of my life... use data to help direct my decisions with confidence.

On this particular study I did it manually, from the site pages and viewing the HTML source, but I have the knowledge and programming tools to automate if needed.
 
Just checked and it looks like there are some published APIs, at least going by Google hits (didn’t dig in though). Could provide some easier analytics fun down the line. If you’re really into programming, another rabbit hole you might enjoy is stuff like the Arduino hackathon grow room automation and HydroPi projects - very interesting. I got some good deals on BlueLab gear, otherwise I probably would’ve tinkered more in that arena. Lots of white space for affordable grow room automation for those with the know-how IMO.

Fun to see the data you pulled, definitely keep it up!

P.S. please don’t waste your time or weed water curing (previous post on separate thread)
 
Practical strain info is always useful, especially in this day and age. I read so much over the years I kind of have a head full of that kind of info. With as many strains as there are from the 80's and 90's still around, it is easy to know that there must be something to them.

Crowdsourcing strain info is going to be helpful in ways. I just don't know how correct it will be. You mention that you are averaging out info. You are also considering lots of pertinent information.

The problem is that over 50 percent of the people growing a certain strain are growing a copy or inferior genetics. Let alone the differences in strains by different breeders. Hard to compare something that has so many variables. Plus there is a huge amount of difference between a good pheno of a strain and a great one. Not everyone gets a great pheno but they still have the same strain.

With that said, it is a good idea to compare all the info out there together as a group. It is going to help. In the end at least some of the cream is going to rise to the top.
 
The problem is that over 50 percent of the people growing a certain strain are growing a copy or inferior genetics. Let alone the differences in strains by different breeders. Hard to compare something that has so many variables. Plus there is a huge amount of difference between a good pheno of a strain and a great one. Not everyone gets a great pheno but they still have the same strain.

You can't really consider this stuff in the same regard that you could with standard data sets, either, IMHO. NONE of the people writing reviews have tried and reviewed every strain. Most people have tried relatively few of them (under 25 for most, probably under 150 for almost all).

Did you give more weight to the people who've reviewed the most strains? Did you give less weight to individual reviews in which "this was the first strain I ever personally grew" (or words to that effect) appeared, lol? Did you track those people who gave less-glowing reviews to sativa-leaning strains to find out if they just generally prefer indicas? And vice versa?

Speaking of sativa-leaning strains, did you give a 1.n multiplier to them so as to balance their awesomeness against their lengthier flowering periods and the fact that a lot of people seem to be really impatient?

Do you have some kind of mathematical formula to account for the fact that lots of people seem to be more inclined to give a review when there was a positive experience than when there was nothing especially great - or, for that matter, especially bad - about it, and that it's technically possible that a lot of those higher-ranked strains might be receiving high praise from what could actually turn out to be a vocal minority but have a high (or at least statistically significant) majority of people who have tried the strain and merely thought, "Meh..." and, therefore, weren't motivated to bother posting a review? And that, with this being (again, technically) possible, some lower-ranked strains might have a smaller percentage of people who've tried it and had that reaction?

And then there are the strains that hit you really well, you write a glowing review of... and by the end of the week, or the week after, you're either kind of bored with or burnt out / overly tolerant of, versus the strains that you don't find yourself wanting to go out and rent billboard space to advertise to the world... but you end up growing time after time.

There's the "popularity train," too. Half the f*ck*ng world wants to climb aboard it at any one time. All the more power to 'em, lol (maybe the bridge will be out ;) ). But that phenomenon probably affects the ranking, too. Because those SOBs can't seem to sit still, and oh look, here comes another popularity train rolling into the station. Better step back - it looks like another stampede. . . .
 
You can't really consider this stuff...
None of the people writing reviews have tried...
Most people have tried relatively few...
Did you give more weight to...
Did you give less weight to...
Did you track those people...
Did you give a 1.n multiplier to...
Do you have some kind of mathematical formula to...

Wow, just wow! lol I've been in the data analyst business a LONG time, and I can tell you with certainty there IS no such thing as a "standard data set". In a brand new industry, or any business really, you work with the data you have.

Trends will propel the industry forward on many levels. That's why it's important to use a more complete picture, combining the popularity, positive and negative data points that I present in the OP.

Despite your apparent disdain for data driven decisions and all the negative energy it took you to write your criticisms, I'm still very glad I was able to share one method that helps drive progress in this space. I was actually contacted by a dispensary owner that wants to see the results going forward, (never have the following words been delivered in such a PERFECT atmosphere), so put THAT in your pipe and smoke it! ha

With that I will now perform the Steve Martin break up ceremony, on YOU...

I break with thee
I break with thee
I break with thee
<throws dog poop on your shoes>
 
Sorry, Jack, for the misunderstanding. I quoted your post as a starting point for my reply to the OP and then built on it. I probably should have added something to make that clear, but didn't think to do so at the time. Then, when I returned to the thread to see about posting some clarifications of what I was getting at, I saw that the OP appeared to be absolutely sure of himself (zealots have faith, lol, while doubters look for evidence in order to figure out whether they're right or wrong) and in no mood to respond to criticism other than defensively - which would have probably worked better if he had stated that he'd been in the data analytics field for a long time (as a person in the data analyst business would appear to be involved in dealing with the people who study the data (as opposed to actually dealing with the data), which strikes me as being more of a human resources position). My taking note of that would be classed as nit picking if it was a statement made by someone who was just doing this kind of thing as a hobby... but most people who actually work or who have worked in a particular and somewhat specialized field tend to know what said field is called :rolleyes: .

So at that point, I didn't know whether I was responding to someone who had been involved in it at some point in the past and had just been away from it for so long that he both misspoke and failed to "check his work before submitting it," or was just someone who was trying to do a thing and wished to give people the impression that he'd done this kind of work professionally before.

I can respond to either, of course, but like to know which it is because I tend to be harsher on a professional than an amateur; it's like if I ask my buddy to help me with the plumbing repair vs. asking a professional plumber. If I turn on your kitchen faucet afterwards and water sprays the ceiling, I'll be a lot more forgiving towards my friend than I would be towards the professional plumber because of the assumption that the latter is supposed to know how to do plumbing repairs (that being his job and all).

While I was still trying to figure that out, I read on and saw where the OP felt the need to tie on his little boy bib and post some kind of drivel about putting feces on my shoes. At that point I gave up on even attempting to help the guy and decided that if he had ever had some kind of job working with data, it was probably while being employed in someone's marketing department and looked for a thread started by someone less likely to come down with a serious case of Butt-Hurt over a little criticism about some of the more obvious problems with what he was trying to accomplish. Or, alternatively, someone who was capable of insulting me at greater than a fourth grade level :p.

It's kind of a shame, really. I think that what the guy is trying to accomplish is an admirable task, if he would approach the subject as something inherently different than, for example, peak river crest data where that data isn't affected by people's perceptions (which wouldn't be an issue if the same set of people had reviewed all of those strains, but that's not how this stuff works, lol, which most people would have taken for granted from the start), there isn't a good chance that different people's "measuring sticks" would be of different sizes with different scales, the people who are measuring their very first river crest don't unconsciously inflate the height because, for the first time, they managed to measure it from start to finish instead of having to pay some street dealer to <COUGH> hand them a prepackaged sack of measurement, et cetera.

If I rank a set of strains, and you (you in this instance being anyone who cares to) rank a completely different set of strains... then HtH does someone else come along, throw all of those rankings into one list, and expect the thing to be useful? "Oh, I see that one of mine is ranked higher than two of yours. Does that mean that the one of mine is a better strain?" Well, first it might be a good idea to define "better," lol. But, aside from that, we cannot use such a ranking to compare the strains in any kind of "overall" manner because we did not each review the same set of strains.

In theory, it'd help somewhat if one or two strains appeared on both of our lists. Might be able to suggest that, if I rank two strains fourth and sixth, and you ranked the same two strains as third and fourth, that a strain that one of us ranked as second would also be considered by the other person to be better than the two strains that our lists had in common. But in practice, that doesn't always work. Many of the reasons for and against a strain are going to be subjective and, again, it's made even worse when you're trying for some kind of correlation of subjective data when there's no absolute commonality (in other words, when different people are each submitting only portions of the complete set of data you're trying to rank).
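
To make the overlap point concrete, here is a rough Python sketch that compares two strains using only the reviewers who rated both of them. The function name `paired_difference`, the reviewer names and all of the scores are hypothetical; it only shows how little there is to compare when the lists don't overlap, and it does nothing about the subjectivity problem.

```python
def paired_difference(ratings, strain_a, strain_b):
    """Compare two strains using only reviewers who rated both.

    ratings maps reviewer -> {strain: score}.
    Returns (number of shared reviewers, mean of score_a - score_b),
    or (0, None) when nobody rated both strains.
    """
    diffs = [
        scores[strain_a] - scores[strain_b]
        for scores in ratings.values()
        if strain_a in scores and strain_b in scores
    ]
    if not diffs:
        return 0, None
    return len(diffs), sum(diffs) / len(diffs)


# Hypothetical reviewers and scores, for illustration only
ratings = {
    "reviewer_1": {"Strain A": 5, "Strain B": 4},
    "reviewer_2": {"Strain A": 4, "Strain B": 4},
    "reviewer_3": {"Strain C": 5},
}

print(paired_difference(ratings, "Strain A", "Strain B"))  # two shared reviewers
print(paired_difference(ratings, "Strain A", "Strain C"))  # no shared reviewers
```

In the second call there is no reviewer who tried both strains, so there is no paired evidence at all, which is the situation described above.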

<SHRUGS> That's why in blind taste tests, they don't stop one person and have them taste sample A, stop another person and have them taste sample B, stop yet another person and have them taste sample C... and then ask the three people which sample tasted best. They cannot possibly be expected to know - because none of the three people have actually tasted all three samples!

This guy is trying to rank strains based on the opinions of people who have not "tasted" all of the strains that he is ranking. FFS.

But it's cool. In about two weeks or so someone will probably release another strain with a catchy name, some sparkly pictures, and a nice description and it'll hit the top ten most popular strain lists about five minutes later. Or a new breeder/seedbank will become a sponsor on a few cannabis forums, pass out a bunch of free seeds, and the owner's strains will get a nice bump in their popularity.

Fawk, even with one person the subjectivity of the thing comes into play. I thought many of the strains I was smoking 30 years ago were the cat's ass. Now... only two of them would rank on my personal top ten list (would be three, but that Thai strain my buddy brought back during one of his breaks from stirring sh!t in other governments' pots for ol' Uncle Sam kept producing opposite-sex flowers and I finally gave up on trying to get it to settle down and stop turning into a room full of shims - it sure was a dreamy strain, though ;) ).

Anyway, y'all have fun. I'm going to unsubscribe so you can play with the somewhat questionable data to your heart's content. I mainly just posted this time in order to explain to Jack that I wasn't debating his points but, instead, using them as a preface to what I wanted to say (type).
 
I completed another 3 month study of how many reviews were entered for each of the 18 strains that made my initial cut, as explained in the OP. I was surprised how little change there was compared to the prior three months... the most popular strains are STILL the most popular by this specific measure.

GELATO is crushing everything else out there, with review numbers growing by a whopping 25% in the last 6 months. The next closest is DURBAN POISON at 9%, PINEAPPLE EXPRESS at 8% and all the rest are within 4% of that.

Gelato 25.1%
Durban Poison 8.7%
Pineapple Express 8.4%
Super Lemon Haze 7.1%
Cherry Pie 7.1%
GG4 6.9%
Jack Herer 6.6%
Northern Lights 6.5%
Tahoe OG Kush 5.5%
Death Star 5.4%
Sour Diesel 5.2%
Blue Dream 5.1%
Green Crack 4.9%
Platinum GSC 4.9%
GSC 4.8%
Purple Urkle 4.3%
Super Silver Haze 4.2%
Hindu Kush 4.0%
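
For anyone who wants to play with these numbers, the list above drops straight into a few lines of Python. This is just a convenience sketch; the variable names are arbitrary and the figures are copied from the list as posted.

```python
# 6-month growth in Leafly review counts (percent), as listed above
growth = {
    "Gelato": 25.1,
    "Durban Poison": 8.7,
    "Pineapple Express": 8.4,
    "Super Lemon Haze": 7.1,
    "Cherry Pie": 7.1,
    "GG4": 6.9,
    "Jack Herer": 6.6,
    "Northern Lights": 6.5,
    "Tahoe OG Kush": 5.5,
    "Death Star": 5.4,
    "Sour Diesel": 5.2,
    "Blue Dream": 5.1,
    "Green Crack": 4.9,
    "Platinum GSC": 4.9,
    "GSC": 4.8,
    "Purple Urkle": 4.3,
    "Super Silver Haze": 4.2,
    "Hindu Kush": 4.0,
}

# Rank by growth, highest first (ties keep their listed order)
ranked = sorted(growth, key=growth.get, reverse=True)
top5 = ranked[:5]

# Gap between the leader and the runner-up, in percentage points
lead = round(growth[ranked[0]] - growth[ranked[1]], 1)

print("Top 5:", ", ".join(top5))
print(f"Leader is ahead by {lead} points")
```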
 
That is the social media thing showing up. What's new is always what is best lol. What is really funny is how the older strains are still on the list and doing well. Some over 20 years old.

There may be greatness hidden in new strains. Hard to beat the solid genetics from the old days. Back when strains were made and not just some feminized phenos.
 
Here is the 9 month update, tracking the growth of number of reviews at Leafly of some top strains explained in the OP. Still very consistent with the 3 and 6 month growth patterns. If you need to know what strains are generating the most reviews while having the least negatives, this is your golden ticket!


Screenshot_3.jpg

 
No way to really rate strains when you have personal preference involved. Like you mentioned, the cream should rise to the top. In this day and age, with 100's of seed banks out there, it makes it harder.

Problem is you have lots of breeders releasing strains by the same name. Most are all made with different plants so the pheno thing changes the outcome. Plus not all seeds in a pack are going to be great plants. The hope is you will find one great plant to mother and clone. So few strains are stabilized these days it makes them hard to hybridize with any kind of stability. Then there are strains by the same name built with totally different genetics. One White Cookies is not like the others.

I stick with my statement about the old school strains. They hold good ratings because they were made right to begin with.

Don't get me wrong. I think your rating system does have lots of merit. It might help if you can rate one Blue Dream by one breeder. You are going to find out one Blue Dream will have better ratings than the others. Just using strain names is too general. You have to remember 30 or 40 companies at least have a Blue Dream. They are not all the same.

The task before you is larger than you thought LMAO. Probably bigger than anyone could accomplish anymore. Good job on all the work it took to get the ratings you have. Not many people will stick with it long enough for solid results.
 
I have now captured a full year of data from Leafly and I'm glad to see clear and consistent trends. The most popular strains with the lowest negatives reported continue to lead the pack across all time frames: 3 months, 6 months and the full year.

Of the strains chosen as described in the OP, here they are ranked by the growth of how many people are reviewing them.

The last 3 Months...

Screenshot_1.jpg


The last 6 months...

Screenshot_2.jpg


For the entire year of 2019...

Screenshot_3.jpg


For those in cannabis retail and those that want to explore new strains, these leaders are well worth your attention!
 
good list. it seems to fall in line with many, many of the grow journals here. in fact i've had my finger on the trigger on a lot of those strains and just pick something else that isn't so popular lol. Northern Lights is the only one on that list i have. I'm surprised grand daddy purple isn't on there. it gets a lot of mention.
 
I'm surprised grand daddy purple isn't on there.

Thanks for the comment! Granddaddy Purple didn't make the original cut because even though it's very popular, a year ago it was below the 4.4 star average requirement... it's currently exactly 4.4.
 
Thanks for the comment! Granddaddy Purple didn't make the original cut because even though it's very popular, a year ago it was below the 4.4 star average requirement... it's currently exactly 4.4.
Well i think it's a useful list, thank you for sharing it. i saved it to my weed folder lol.
 