GPT-2 became available in 2019, followed by GPT-3 and ChatGPT. These generative AI models can produce large amounts of written work in a short span of time. They can also write reviews that may or may not represent an actual human user’s authentic experience with a product or service.
We analyzed 245,000 reviews of thousands of companies and products at Capterra, G2, and TrustRadius, popular sites for B2B software reviews. Our AI detector examined the text of each review to estimate how much of the content was likely written by humans and how much was likely generated by AI.
All three sites have seen the percentage of AI-generated reviews rise in the past four years. Before that time, all three sites averaged under 10%. The increases begin with the introduction of GPT-2 in February 2019, particularly at TrustRadius. However, TrustRadius began a notable decline in the period between the introduction of GPT-3 and ChatGPT, gaining some control over the issue and coming to look more like G2 and Capterra. A stark difference then appeared once ChatGPT was released near the end of 2022: TrustRadius stayed well below the other two sites, both of which rose to more than 25% of reviews being AI-generated. Perhaps the lower rate at TrustRadius is not surprising, as the site says it rejects 47% of reviews submitted and that “suspicious user” is the #1 reason for rejection.
At all three sites, the highest ratings consistently had the most AI-generated reviews. This finding suggests that AI-generated reviews are bolstering products’ average ratings rather than denigrating or “review bombing” them. The association between higher ratings and a higher percentage of AI-generated reviews held true even at TrustRadius, where AI-generated reviews have dropped overall. Glowing reviews with top scores are more likely to be AI-generated than more critical ones.
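The rating comparison above amounts to grouping reviews by star rating and computing the share flagged as AI-generated within each group. The sketch below illustrates that calculation only; the function name, the 0.5 score threshold, and the toy records are hypothetical and are not taken from the study or its detector.

```python
from collections import defaultdict

def ai_share_by_rating(reviews, threshold=0.5):
    """Return {rating: percent of reviews whose AI score >= threshold}.

    `reviews` is an iterable of (rating, ai_score) pairs. The threshold
    is an assumed cutoff for illustration, not the study's actual rule.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for rating, ai_score in reviews:
        totals[rating] += 1
        if ai_score >= threshold:
            flagged[rating] += 1
    return {r: 100.0 * flagged[r] / totals[r] for r in sorted(totals)}

# Made-up toy records: (star rating, detector score in [0, 1]).
toy_reviews = [
    (5, 0.9), (5, 0.2), (5, 0.7),
    (4, 0.1), (4, 0.8),
    (3, 0.1), (3, 0.3),
]

shares = ai_share_by_rating(toy_reviews)
for rating, pct in shares.items():
    print(f"{rating} stars: {pct:.1f}% flagged as AI-generated")
```

In a real analysis the input would be the detector's output for each review on a given site, and the same grouping could be repeated per site to check whether the pattern holds on each one.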
The three sites did not all exhibit the same patterns in the rates of AI-generated reviews over time. While they all saw a rise with the early releases of GPT-2 and GPT-3, things changed after ChatGPT launched on November 30, 2022. Capterra’s AI-generated reviews more than doubled, from 15% for the first 11 months of 2022 to 33.6% for December 2022 through September 2023. G2 leapt from 15.2% for the same first 11 months of 2022 to an average of 25.6% since then. In contrast, TrustRadius had a 17.4% relative drop in AI-generated reviews after ChatGPT compared to the first part of 2022 (from 12.9% to 10.7%). The G2 post-ChatGPT average would be even higher except that there was a drop from a high of 34.6% in June 2023 to under 20% for August, September, and the first half of October 2023. These drops at G2 and TrustRadius suggest that some sites recognize the problem and have made efforts to combat it.
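The before-and-after comparisons above mix percentage points with relative change, so it can help to work the arithmetic explicitly. The following is a minimal sketch using only the rounded averages quoted in the text; the function and variable names are illustrative.

```python
def relative_change(before: float, after: float) -> float:
    """Percent change from `before` to `after`, relative to `before`."""
    return (after - before) / before * 100.0

# Average AI-generated share (%) as quoted in the text:
# (first 11 months of 2022, after the ChatGPT launch).
quoted_rates = {
    "Capterra": (15.0, 33.6),
    "G2": (15.2, 25.6),
    "TrustRadius": (12.9, 10.7),
}

changes = {site: relative_change(b, a) for site, (b, a) in quoted_rates.items()}

for site, pct in changes.items():
    print(f"{site}: {pct:+.1f}%")
```

With these rounded inputs, Capterra comes out at about +124% (consistent with "more than doubled"), G2 at about +68%, and TrustRadius at about -17.1%. The small gap from the 17.4% quoted for TrustRadius is presumably because the published figure was computed from unrounded averages, though that is an assumption.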
These findings collectively suggest that AI-generated reviews are becoming more prevalent across these platforms, particularly at G2 and Capterra since the launch of ChatGPT. The platforms have seen fluctuations in the prevalence of AI-generated reviews over time, with Capterra experiencing a particularly significant increase. Our study also highlights how the proportion of AI-generated content can vary based on ratings and reviewer anonymity. The problem these reviews present for review sites and their users is clear: What use are reviews if they don’t come from actual users? Some AI-generated text could come from users who rely on generative AI to write reviews based on input they provide from their own experiences. But is that what is happening for 1 in 4 reviews at G2 and 1 in 3 at Capterra? Verification procedures have not been enough to contain the problem, but other tools, such as AI detectors, could help.