
Can Upgraded Claude 3.5 Sonnet Be Detected by AI-Content Detectors?

Anthropic launched its latest AI model, the upgraded Claude 3.5 Sonnet, in October 2024. Review a brief study of Originality.ai’s accuracy in detecting upgraded Claude 3.5 Sonnet AI-generated content.

Anthropic has launched its new AI model, the upgraded Claude 3.5 Sonnet.

Anthropic announced the upgrade to Claude 3.5 Sonnet on October 22, 2024. The announcement came just a few months after Claude 3.5 Sonnet was initially released on June 20, 2024. 

With the initial release of Claude 3.5 Sonnet (in June 2024), Anthropic claimed that it outperformed its peers, such as OpenAI’s GPT-4o, Google’s Gemini-1.5 Pro, Meta’s Llama-400b, and even Anthropic’s own earlier models, Claude 3 Haiku and Claude 3 Opus.

Anthropic now notes that the upgraded Claude 3.5 Sonnet (October 2024) outperforms its predecessor, delivering wide-ranging improvements on industry benchmarks, with particularly strong gains in agentic coding and tool use tasks.

So, we put the Originality.ai AI detector to the test to determine its accuracy in detecting AI-generated text created by this upgraded Claude 3.5 Sonnet model. 

To establish the AI checker’s accuracy, this brief study generated 1,000 upgraded Claude 3.5 Sonnet text samples and then ran them through the Originality.ai AI checker.
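For readers who want to reproduce this kind of scoring step, the sketch below shows one way to submit a sample to the Originality.ai API with Python. The endpoint path, header name, and response fields are assumptions based on the public API documentation rather than details from this study, so check the current API reference before relying on them.

```python
import os
import requests

# Hedged sketch of scoring one text sample with the Originality.ai API.
# Endpoint path, header name, and response structure are assumptions based on
# the public documentation at the time of writing.
API_URL = "https://api.originality.ai/api/v1/scan/ai"  # assumed AI-scan endpoint

def ai_score(text: str) -> dict:
    """Submit a text sample and return the detector's response payload."""
    response = requests.post(
        API_URL,
        headers={"X-OAI-API-KEY": os.environ["ORIGINALITY_API_KEY"]},
        json={"content": text},
    )
    response.raise_for_status()
    return response.json()  # typically includes AI vs. original probability scores
```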

Is the upgraded version of Claude 3.5 Sonnet AI Content Detectable?

Yes, upgraded Claude 3.5 Sonnet text is still detectable: the Originality.ai Model 3.0.1 Turbo identified it with 99.0% accuracy.

Try our AI Detector here.

Dataset

To evaluate the detectability of upgraded Claude 3.5 Sonnet, we prepared a dataset of 1000 Claude 3.5 Sonnet generated text samples.

AI-Generated Text Data

For AI-text generation, we used upgraded Claude 3.5 Sonnet with the three approaches described below (a brief sketch of these approaches follows the list):

  1. Rewrite prompts: We generated content by providing the model with a customized prompt along with reference articles (likely LLM-generated) to rewrite. (450 Samples)

  2. Rewrite human-written text: For the second approach, we prompted upgraded Claude 3.5 Sonnet to rewrite human-written text in a way intended to bypass the AI detection tool. The human-written text was sourced from an open-source dataset (325 Samples):
    • One-Class Learning for AI-Generated Essay Detection
      • Paper: https://www.mdpi.com/2076-3417/13/13/7901
      • Dataset: https://github.com/rcorizzo/one-class-essay-detection
  3. Write articles from scratch: Finally, for the third approach, we generated the articles from scratch based on a set of topics (fiction and nonfiction) such as history, medicine, mental health, content marketing, social media, literature, robots, the future, etc. (225 Samples).
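The sketch below illustrates how these three approaches could be driven programmatically through Anthropic’s Python SDK. The prompt wording, placeholder inputs, and helper function are illustrative assumptions, not the exact prompts used in the study; the model identifier shown is the one Anthropic published for the October 2024 upgrade.

```python
import anthropic

# Minimal sketch of the three generation approaches described above.
# Prompt wording and placeholder inputs are illustrative assumptions.
client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment
MODEL = "claude-3-5-sonnet-20241022"  # upgraded Claude 3.5 Sonnet (October 2024)

def generate(prompt: str, max_tokens: int = 1024) -> str:
    """Send one prompt to the model and return the generated text."""
    response = client.messages.create(
        model=MODEL,
        max_tokens=max_tokens,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

reference_article = "..."          # an LLM-generated reference article (approach 1)
human_essay = "..."                # a human-written essay from the open-source dataset (approach 2)
topic = "the history of robotics"  # example topic (approach 3)

prompts = [
    # 1. Rewrite prompts: rewrite a reference article with a customized instruction.
    f"Rewrite the following article in your own words:\n\n{reference_article}",
    # 2. Rewrite human-written text, aiming to evade AI detection.
    f"Rewrite this human-written essay so that it reads naturally:\n\n{human_essay}",
    # 3. Write an article from scratch on a given topic.
    f"Write a detailed article about {topic}.",
]

samples = [generate(p) for p in prompts]
```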

Evaluation

To evaluate detection efficacy, we used our Open Source AI Detection Efficacy tool.

Originality.ai offers three models for AI text detection: Model 3.0.1 Turbo, Model 1.0.0 Lite, and Multi Language.

  • Version 3.0.1 Turbo — If your risk tolerance for AI is ZERO! It is designed to identify any use of AI, even light AI.
  • Version 1.0.0 Lite — If you are okay with slight use of AI (i.e., AI editing).
  • Multi Language — Detect AI content across 15 languages.

Learn more about which AI detection model is best for you and your use case.

The open-source testing tool returns a variety of metrics for each detector tested, each of which reports on a different aspect of that detector’s performance, including:

  • Sensitivity (True Positive Rate): The percentage of the time the detector identifies AI text correctly.
  • Specificity (True Negative Rate): The percentage of the time the detector identifies human-written text correctly.
  • Accuracy: The percentage of the detector’s predictions that were correct.
  • F1: The harmonic mean of Precision and Recall (Sensitivity), often used as an aggregate metric when ranking the performance of multiple detectors.

If you'd like a detailed discussion of these metrics, what they mean, how they're calculated, and why we chose them, check out our blog post on AI detector evaluation. For a succinct snapshot, the confusion matrix is an excellent representation of a model's performance.
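As a reference point, the following minimal sketch shows how these metrics are derived from raw confusion-matrix counts. The counts in the example are placeholders chosen for illustration, not the study’s actual figures.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive the reported metrics from raw confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate (recall)
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if (precision + sensitivity) else 0.0)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "f1": f1}

# Placeholder counts on an AI-only dataset (no human samples, so tn = fp = 0):
print(detection_metrics(tp=990, fp=0, tn=0, fn=10))
```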

Below is an evaluation of the models on the above dataset.

Confusion Matrix

Figure 1. Confusion Matrix on AI-only dataset with Model 3.0.1 Turbo

Evaluation Metrics

For this small test to reflect the Originality.ai detector’s ability to identify Claude 3.5 Sonnet content, we looked at the True Positive Rate, i.e., the percentage of the time the model correctly identified AI text as AI across the 1,000 Claude 3.5 Sonnet samples.

Model 3.0.1 Turbo:

  • Recall (True Positive Rate) = 99.0%
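Put another way, and assuming the 99.0% figure is computed directly over the 1,000 AI-generated samples (the per-sample counts are implied by the reported rate rather than listed separately in this post), the recall works out to roughly:

```latex
\text{Recall} = \frac{TP}{TP + FN} \approx \frac{990}{990 + 10} = 0.990
```

In other words, about 990 of the 1,000 samples were correctly flagged as AI, with roughly 10 missed.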

Conclusion

Our study confirms that text generated by the upgraded Claude 3.5 Sonnet is highly detectable with our AI detector. The Model 3.0.1 Turbo excelled with 99.0% accuracy.

These results highlight the effectiveness of the Originality.ai AI detector in identifying AI-generated content, ensuring reliable detection across various text generation approaches.

Interested in learning more about AI detection? Check out our guides:

Jonathan Gillham

Founder / CEO of Originality.ai. I have been involved in the SEO and content marketing world for over a decade. My career started with a portfolio of content sites; recently, I sold two content marketing agencies, and I am the Co-Founder of MotionInvest.com, the leading place to buy and sell content websites. Through these experiences, I understand what web publishers need when it comes to verifying that content is original. I am not for or against AI content; I think it has a place in everyone's content strategy. However, I believe you, as the publisher, should be the one making the decision on when to use AI content. Our Originality checking tool has been built with serious web publishers in mind!
