If you are trying to catch out a chatbot, take care, because one cutting-edge tool is showing signs it knows what you are up to.
Anthropic, a San Francisco-based artificial intelligence company, has released a safety analysis of its latest model, Claude Sonnet 4.5, revealing that the model had become suspicious it was being tested in some way.
Evaluators said that during a “somewhat clumsy” test for political sycophancy, the large language model (LLM) – the underlying technology that powers a chatbot – raised suspicions that it was being tested and asked the testers to come clean.
Source: The Guardian