Anthropic, a company that develops large language models (LLMs) for various applications, has announced the launch of a new version of its entry-level LLM, called Claude Instant 1.2. The new model is available to businesses through an API and offers improved performance, lower price, and greater safety than its predecessor.
Claude Instant 1.2 handles a range of tasks, including casual dialogue, text analysis, summarization, and document comprehension. It accepts inputs of up to 100,000 tokens, which lets it process hundreds of pages of content at once, and it can produce outputs of up to a few thousand tokens in one response, such as memos, letters, and stories.
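To see how 100,000 tokens translates into "hundreds of pages", here is a back-of-the-envelope sketch. Both conversion ratios are assumptions chosen for illustration (roughly 0.75 English words per token and 300 words per page); they are not figures published by Anthropic, and real documents will vary.

```python
# Rough estimate only: both ratios below are assumptions, not Anthropic figures.
WORDS_PER_TOKEN = 0.75  # assumed average for English text
WORDS_PER_PAGE = 300    # assumed word count of a typical page


def estimate_pages(tokens: int) -> float:
    """Convert a token count to an approximate page count."""
    words = tokens * WORDS_PER_TOKEN
    return words / WORDS_PER_PAGE


pages = estimate_pages(100_000)
print(f"100,000 tokens is roughly {pages:.0f} pages")
```

Under these assumptions the full context window works out to about 250 pages; denser pages or a different tokenizer ratio would shift that number.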
A notable result for Claude Instant 1.2 is that it scores the best of all Claude models in an automated red-teaming evaluation. According to Anthropic, this means the model offers greater safety, hallucinates less, and is more resistant to jailbreaks, which are attempts to trick the model into revealing sensitive information or producing harmful outputs.
Claude Instant 1.2 also improves on its predecessor in coding and mathematics. On Codex HumanEval, a Python coding test, it scored 58.7%, up from 52.8% for Claude Instant 1.1. On the GSM8K benchmark of grade-school math problems, it scored 86.7%, up from 80.9%.
While the term “hallucinations” may seem dramatic, AI developers use it to describe cases where a model deviates from facts, logic, or both. Essentially, the AI provides false information and may keep repeating its own falsehood when challenged. Hallucinations can appear plausible enough to confuse readers, or they can be completely wild and nonsensical.
Anthropic said that it has an “exciting roadmap” for capability improvements planned for Claude Instant 1.2 and will be deploying them slowly and iteratively over the coming months.
Differences Between Claude Instant and Claude 2
The company also has a flagship model, Claude 2, which is available via an API and via the beta chat experience on Anthropic's website. Claude Instant is currently available exclusively as an API for businesses.
Claude 2 is similar to Claude Instant 1.2 but has more features and capabilities. For example, it can parse documents such as PDFs and give feedback or suggestions based on the content. It can also generate longer responses and is better at coding than Claude Instant 1.2.
Unveiled in July, Claude 2 made significant improvements in coding, math, and reasoning over its predecessor, Claude 1.3. Anthropic also offers it to businesses at the same price as Claude 1.3.
The AI model also scored 76.5% on the multiple-choice section of the bar exam, up from 73.0% with Claude 1.3. In terms of coding skills, Claude 2 scored 71.2% on the Codex HumanEval, up from 56.0%. Claude 2 also shows better performance in math and document comprehension: it scored 88.0% on the GSM8K benchmark of grade-school math problems, up from 85.2% for the prior version.
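The size of these gains reads differently depending on whether you look at percentage points or relative improvement. The short sketch below computes both from the Claude 2 scores quoted above; the function name `gains` and the dictionary layout are illustrative choices, not anything from Anthropic.

```python
# Scores in percent, as quoted in the article: (previous version, Claude 2).
scores = {
    "Bar exam (multiple choice)": (73.0, 76.5),
    "Codex HumanEval": (56.0, 71.2),
    "GSM8K": (85.2, 88.0),
}


def gains(old: float, new: float) -> tuple[float, float]:
    """Return (gain in percentage points, gain relative to the old score in percent)."""
    points = new - old
    relative = points / old * 100
    return points, relative


for name, (old, new) in scores.items():
    points, relative = gains(old, new)
    print(f"{name}: +{points:.1f} points ({relative:.1f}% relative)")
```

The coding benchmark shows by far the largest jump, at 15.2 percentage points, while the math and bar exam gains are only a few points each.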