OpenAI, which is backed by Microsoft, announced its new large multimodal model, “GPT-4,” which accepts image and text inputs.
“We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning,” the company said in a blog post on Tuesday.
“We spent 6 months iteratively aligning GPT-4, using lessons from our adversarial testing program as well as ChatGPT, resulting in our best-ever results on factuality, steerability, and refusing to go outside of guardrails.”
Compared to GPT-3.5, the new model is more reliable, more creative, and able to handle more nuanced instructions.
GPT-4 outperforms existing large language models (LLMs), including most state-of-the-art (SOTA) models, which may rely on benchmark-specific crafting or additional training protocols.
“In 24 of the 26 languages tested, GPT-4 outperformed the English-language performance of GPT-3.5 and other LLMs (Chinchilla, PaLM), including for low-resource languages such as Latvian, Welsh, and Swahili,” according to the company.
OpenAI has also been using the new model internally, with major implications for functions such as support, sales, content moderation, and programming.
Unlike text-only models, GPT-4 accepts prompts containing both text and images, allowing the user to specify any vision or language task.
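For developers, a combined text-and-image prompt might look roughly like the minimal sketch below. It assumes the OpenAI Python SDK, a vision-enabled GPT-4-family model name, and a placeholder image URL; none of these specifics come from the announcement itself.

```python
# Hypothetical sketch: sending a text-and-image prompt with the OpenAI Python SDK.
# The model name and image URL are placeholders, not details from the announcement.
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: any vision-enabled GPT-4-family model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is happening in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```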
The base GPT-4 model, like previous GPT models, is trained to predict the next word in a document, using both licensed and publicly available data.
ChatGPT Plus subscribers can access GPT-4 at chat.openai.com with a usage cap, while developers can sign up for the GPT-4 API waitlist.
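Once granted API access, a developer could call the model with a plain text prompt roughly as sketched here; this assumes the OpenAI Python SDK and an API key set in the environment, and is an illustration rather than official onboarding code.

```python
# Hypothetical sketch: a plain-text GPT-4 request via the Chat Completions API.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Summarize what makes GPT-4 multimodal."},
    ],
)
print(response.choices[0].message.content)
```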
“We hope that GPT-4 will become a valuable tool to improve people’s lives by supporting many applications,” the company said.