The Facts
OpenAI was planning to sunset old versions of their model endpoints in September but quickly reversed course last week:
We previously communicated to developers that gpt-3.5-turbo-0301, gpt-4-0314, and gpt-4-32k-0314 models were scheduled for sunset on Sept 13, 2023. After reviewing feedback from customers and our community, we are extending support for those models until at least June 13, 2024.
This came shortly after Matei Zaharia (a cofounder of Databricks) released a controversial research report demonstrating significant behavior changes (in some cases, regressions) between those older endpoints and the newer model versions.
The sunsetting policy also came under fire when it was announced; I recommend checking out this thread to see some developers’ complaints.
Why it matters
We’re seeing a symptom of a deeper challenge for teams offering LLM endpoints as a service: LLM endpoints are fundamentally underdocumented APIs. The problem is simple: prompts. LLM APIs are incredibly sensitive to prompt variations, yet it’s impossible to document every possible prompt permutation. OpenAI’s prompt “cookbook” is their best attempt at closing that gap.
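Since no provider can document every prompt, one pragmatic defense on the developer side is to treat your own prompts as a regression-test suite: snapshot outputs for your key prompts against the model version you depend on, then diff them when a new version appears. A minimal sketch, where `call_model` is a hypothetical stand-in for a real API client and the canned responses are illustrative, not real model outputs:

```python
def call_model(model: str, prompt: str) -> str:
    # Hypothetical stub standing in for a real LLM API call; in practice
    # this would hit an actual endpoint. The outputs below are invented
    # to illustrate the kind of format drift a version bump can cause.
    canned = {
        ("gpt-4-0314", "What is 2+2?"): "4",
        ("gpt-4-0613", "What is 2+2?"): "The answer is 4.",
    }
    return canned[(model, prompt)]

def snapshot(model: str, prompts: list[str]) -> dict[str, str]:
    """Record each prompt's output so later versions can be diffed against it."""
    return {p: call_model(model, p) for p in prompts}

def drifted(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Return the prompts whose output changed between model versions."""
    return [p for p in old if old[p] != new.get(p)]

prompts = ["What is 2+2?"]
baseline = snapshot("gpt-4-0314", prompts)
candidate = snapshot("gpt-4-0613", prompts)
changed = drifted(baseline, candidate)
print(changed)  # prompts whose behavior shifted across versions
```

Even when the new answer is arguably “smarter,” a shift in output format like this one can break downstream parsers, which is exactly the kind of undocumented change developers were complaining about.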
When you upgrade an undocumented API, you are making undocumented, likely breaking changes to an API endpoint. Undocumented changes to APIs are a big no-no! OpenAI’s response is that they are making their models “smarter,” but smarter is only a small fraction of what developers want out of a language model API.
If you have to make breaking API changes (and I think you do if you’re OpenAI), the best practice is to give your users a long time to migrate to the new endpoint — with models as large as GPT-4, that can be an expensive proposition.
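In the meantime, the standard way to buy yourself migration time is to pin the dated model snapshot rather than the floating alias, and keep that name in one place so a forced migration is a one-line change. A minimal sketch, assuming the request shape of a typical chat-completions API (the helper name is mine):

```python
# Dated snapshots (like those in OpenAI's notice) freeze behavior;
# the bare alias silently tracks whatever version is newest.
PINNED_MODEL = "gpt-4-0314"  # behavior frozen to this snapshot
FLOATING_MODEL = "gpt-4"     # alias that can change under you

def chat_request(prompt: str, model: str = PINNED_MODEL) -> dict:
    """Build a chat-completion request body with an explicit model version."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

req = chat_request("Summarize this ticket.")
print(req["model"])  # the pinned snapshot, not the floating alias
```

When the deprecation deadline finally arrives, swapping `PINNED_MODEL` to the new snapshot (and rerunning your prompt regression suite) is the whole migration, rather than a hunt through every call site.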
My thoughts
I see these challenges as another boon for open-source LLMs — the only way to avoid challenging, potentially unnecessary migrations is to host your own LLM endpoint. While that may not be important for everyone, it is necessary for some. Migrations are expensive!