OpenAI says GPT-4 shows human-level performance on various professional and academic benchmarks, such as a simulated bar exam, and that it handles creative writing tasks better than its predecessor. But what does this mean for the future of AI and society?
GPT-4 is the successor to GPT-3.5, which OpenAI describes as a first "test run" of its new deep learning stack and of the supercomputer it co-designed with Azure. OpenAI presents GPT-4 as more capable than GPT-3.5, but it has not disclosed the model's size: the GPT-4 technical report withholds details such as parameter count and training compute, so any parameter figures in circulation are unconfirmed estimates.
— OpenAI (@OpenAI) March 14, 2023
Bing Chat already uses GPT-4
“We are happy to confirm that the new Bing is running on GPT-4, which we've customized for search. If you've used the new Bing preview at any time in the last five weeks, you've already experienced an early version of this powerful model. As OpenAI makes updates to GPT-4 and beyond, Bing benefits from those improvements. Along with our own updates based on community feedback, you can be assured that you have the most comprehensive copilot features available.”
Human-Level Performance on Various Benchmarks
According to OpenAI's announcement, GPT-4 exhibits human-level performance on various professional and academic benchmarks, such as passing a simulated bar exam with a score around the top 10% of test takers or solving math problems from Olympiads and AP exams. It also shows improved factuality, steerability, and alignment with human values compared to GPT-3.5.
OpenAI says that GPT-4 is more reliable, more creative, and able to handle much more nuanced instructions than GPT-3.5. For example, it can summarize news articles and, when given an image as input, write a caption that describes its content.
OpenAI is releasing GPT-4's text input capability via ChatGPT Plus and the API (via waitlist). The image input capability is still in preview mode and will be available to a single partner initially. OpenAI is also open-sourcing OpenAI Evals, its framework for automated evaluation of AI model performance.
GPT-3 vs. GPT-3.5 vs. GPT-4
GPT-3, GPT-3.5 and GPT-4 are successive versions of the large language models developed by OpenAI. GPT-3 was released in 2020 and had 175 billion parameters, making it the largest language model at that time. It could generate coherent text across a wide range of topics and tasks, but it also had limitations such as factual errors, bias and a lack of multimodality.
GPT-3.5, released in 2022 as an intermediate version between GPT-3 and GPT-4, improved on some aspects of GPT-3, such as factuality, steerability and guardrails. OpenAI has not officially published its parameter count. It also powers ChatGPT, the conversational AI platform that lets users chat with the model on various topics and domains.
GPT-4 is a multimodal model that accepts both text and image inputs and generates text outputs. OpenAI has not disclosed how many parameters it uses. It passes a simulated bar exam with a score around the top 10% of test takers, whereas GPT-3.5's score was around the bottom 10%. GPT-4 is also more reliable, more creative, and able to handle much more nuanced instructions than GPT-3.5. However, it is also more expensive to use than GPT-3.5. As OpenAI acknowledges, there are still many challenges and risks associated with scaling up AI systems and ensuring their safety and alignment. You can read more about GPT-4 on OpenAI's website or try it out on ChatGPT Plus, which already uses GPT-4.