The Facts
Anthropic recently announced Claude 2, their latest and most capable language model.
Claude 2 appears to be the second most capable model released to date on standard benchmarks, behind only GPT-4. It is roughly 4-5x cheaper per token than GPT-4 and supports a 100k-token context window.
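To make that price gap concrete, here's a minimal sketch of per-request cost math. The per-1k-token prices (`GPT4_PER_1K`, `CLAUDE2_PER_1K`) are hypothetical placeholders chosen only to reflect the ~4-5x ratio above, not official pricing:

```python
# Illustrative cost comparison. The prices below are assumed
# placeholders reflecting the ~4-5x gap, not real list prices.
GPT4_PER_1K = 0.05      # assumed blended $ per 1k tokens
CLAUDE2_PER_1K = 0.011  # assumed blended $ per 1k tokens (~4.5x cheaper)

def request_cost(total_tokens: int, price_per_1k: float) -> float:
    """Cost of a single request at a flat per-1k-token price."""
    return total_tokens / 1000 * price_per_1k

# One request using 8k tokens (prompt + completion combined).
tokens = 8_000
print(f"GPT-4:    ${request_cost(tokens, GPT4_PER_1K):.3f}")
print(f"Claude 2: ${request_cost(tokens, CLAUDE2_PER_1K):.3f}")
```

At high volume this ratio compounds: the same workload costs a few hundred dollars on one model and over a thousand on the other, which is often the difference between a viable and non-viable product.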
Why it matters
With Claude 2 potentially the strongest competitor to GPT-4 yet, it's worth poking at the competitive landscape of language models. Let's slice it a few ways:
The most obvious trade-off is simple: "smarter" models are essentially always larger, which makes them slower and more expensive. Which model makes sense for an application will likely be highly use-case dependent:
Fine-tuning can move points around on this curve in an interesting way:
There are also a handful of other criteria you might measure or care about:
Steerability — if you're creating AI companions, the alignment tuning baked into hosted models may interfere with the personas you can build. Open-source models are innately advantaged here.
Compliance / Security / Privacy — if you need to host a model inside your own VPC, hosted APIs are off the table, which again favors open-source models.
Most of the decision criteria so far sit on the cost/performance curve, though, and OpenAI has dominated that market.
My thoughts
Claude 2 sits at an interesting place on the curve — it won't win the use cases that need the smartest models (like reasoning agents), but it could carve out a niche among use cases that need a model that is both smart and low-latency.
There are a lot of other angles to attack to produce better models (with fine-tuning being the most exciting of them, IMO).
For now, I'm excited to have another competitive entrant!