Chat vs. Completion Endpoints
And how it's all just a sign of how early we are on this journey -- 9/20/2023
The Facts
Earlier this week, OpenAI announced the refresh of its “completion” endpoint, “gpt-3.5-turbo-instruct.”
For those not familiar with the OpenAI APIs: until ~March 2023, there was only one API for interacting with OpenAI’s GPT models, the “Completion API.” In March, they rolled out their “Chat API.”
Since launch, the Chat API has gotten cheaper and smarter, while GPT-4 can (still) only be accessed via the Chat API. This update brings those GPT-3.5 upgrades back to the Completion API.
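To make the difference concrete, here is a sketch of the two request shapes as of the September 2023 API: the Completion API takes a single free-form prompt string, while the Chat API takes a list of role-tagged messages. The payloads are shown as plain dicts; the specific prompt text and parameter values are illustrative.

```python
# Completion API: one free-form prompt string, model completes the text.
completion_request = {
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "Translate to French: Hello, world.",
    "max_tokens": 32,
}

# Chat API: a conversation as a list of {"role", "content"} messages.
chat_request = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a translator."},
        {"role": "user", "content": "Translate to French: Hello, world."},
    ],
    "max_tokens": 32,
}
```

The prompt string is the whole interface in the first case; in the second, the model only ever sees a structured conversation, which is exactly the constraint the post is about.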
Why it matters
The completion API is more flexible than the chat API. For the last ~7 months of development, developers have had to shoehorn their applications into a “chat” format in order to access the fastest and smartest models. For “chat”-native applications, this was a huge boost; for others, it severely limited what was possible (or at least required a lot of extra engineering to work around).
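In practice, that shoehorning often amounted to a small shim like the following. `as_chat_request` is a hypothetical helper, not anything from OpenAI's SDK: it repackages a non-chat workload, say continuing a half-written document, as a single user message so it can reach the chat-only models.

```python
def as_chat_request(prompt: str, model: str = "gpt-3.5-turbo") -> dict:
    """Wrap a raw completion-style prompt in the chat message format.

    Hypothetical shim: many apps wrote some variant of this to reach
    the chat-only models with workloads that aren't conversations.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

req = as_chat_request("Once upon a time,")
```

The shim works, but the model now treats the prompt as a chat turn rather than text to continue, which is the kind of subtle mismatch the extra engineering had to paper over.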
Having flexible access to the completion functionality of powerful models will accelerate developers and allow them to build new applications with the most powerful models — I expect the release of GPT-4-instruct to be really impactful for application builders.
My thoughts
This announcement serves as a reminder to me of just how early we are in the LLM-native wave. Most developers I know preferred the completion API to the chat API but, for the last seven months, have been forced to use an inferior tool, mostly because OpenAI was focused on ChatGPT (or safety?)!
Competition will eventually drive model providers to focus on developer experience, which will be a huge boon to the community. These OpenAI releases come as the world anticipates Google’s Gemini, which should prove to be the biggest competition to OpenAI yet.