TUE APR 11 2023

A Deep Dive into OpenAI's GPT-4 and Its Evolution from GPT-3.5

by Bithika Bishesh

Artificial intelligence and natural language processing have come a long way in recent years, and GPT-4 is the latest language model to show off these advancements. In this blog post, we will explore the main features of GPT-4, highlighting how it has evolved from its predecessor, GPT-3.5/GPT-3, to become a more powerful and versatile tool for a wide range of applications.

What is GPT-4?

OpenAI's GPT-4 is the latest in the company's line of large language models (LLMs). In general, a language model is trained to predict the next word in a sequence from the text that precedes it. GPT-4 is also multimodal: it can process both text and images, which sets it apart from earlier GPT models.
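To make next-word prediction concrete, here is a toy sketch in Python. It is not how GPT-4 works internally (GPT-4 is a large neural network, not a word-count table); it only illustrates the idea of choosing the most likely next word given what came before. The function names `train_bigram` and `predict_next` and the sample sentence are made up for this illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional


def train_bigram(text: str) -> Dict[str, Counter]:
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    following: Dict[str, Counter] = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model: Dict[str, Counter], word: str) -> Optional[str]:
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus, invented for this example.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" only once
```

A real language model replaces the count table with learned parameters and looks at a long context rather than a single preceding word, but the training objective is the same kind of prediction.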

How is GPT-4 better than GPT-3.5/GPT-3?

GPT-4 is an advanced version of its predecessor, GPT-3, with improvements in various areas, including size, performance, and capabilities. Here are some general improvements in the newer version:

1) Processing Power: GPT-4 accepts both images and text as input, while GPT-3.5 accepts only text.

2) Response Generation: GPT-4 can handle much longer inputs and outputs: up to about 25,000 words of text in its largest (32,768-token) context window, while GPT-3.5 was limited to 4,096 tokens, or roughly 3,000 words.

3) Improved Accuracy: According to OpenAI's internal evaluations, GPT-4 is 40% more likely to produce factual responses than GPT-3.5. It is also 82% less likely to respond to requests for disallowed content.

4) Model size: GPT-4 is widely assumed to be larger than GPT-3, which has 175 billion parameters, although OpenAI has not disclosed GPT-4's parameter count. A larger model can capture a more nuanced understanding of language and encode a broader range of knowledge.

5) Training data: GPT-4 is trained on a more recent and much larger dataset than GPT-3, allowing it to have more up-to-date information and a better understanding of recent trends, events, and language usage.

6) Generalization: GPT-4 exhibits better generalization capabilities, enabling it to understand and generate more coherent and contextually accurate responses in a wider range of situations.

7) Fine-tuning and customization: The newer version provides better support for fine-tuning, allowing developers to create more specialized and accurate applications using the GPT-4 model.

8) Supports more languages: GPT-4 has shown proficiency across 26 languages in OpenAI's tests, including Ukrainian, Korean, and several Germanic languages. Remarkably, in 24 of those 26 languages GPT-4 outperforms GPT-3.5's English-language performance.
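The word counts in point 2 are approximations, because model limits are defined in tokens rather than words. The sketch below shows the conversion, assuming OpenAI's rule of thumb of roughly 0.75 English words per token and the two GPT-4 context-window sizes announced at launch (8,192 and 32,768 tokens). The helper name `tokens_to_words` is invented for this example, and the real ratio varies with the text and the language.

```python
# Rule of thumb for English text; an approximation, not an exact ratio.
WORDS_PER_TOKEN = 0.75

# Context-window sizes announced for GPT-4 at launch, in tokens.
GPT4_CONTEXT_TOKENS = {
    "GPT-4 (8K)": 8192,
    "GPT-4 (32K)": 32768,
}


def tokens_to_words(tokens: int) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


for name, tokens in GPT4_CONTEXT_TOKENS.items():
    print(f"{name}: {tokens} tokens is about {tokens_to_words(tokens)} words")
```

The 32K variant works out to about 24,576 words, which is where the "25,000 words" figure comes from. The window is shared between the prompt and the response, so a long input leaves less room for output.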

RBRM-Based Training Method 

OpenAI trained GPT-4 with an approach similar to the one used for ChatGPT, and refined it further with a set of rule-based reward models (RBRMs).

1) Pre-training: The GPT-4 model is first trained on a large dataset of text from the Internet to predict the next word. This helps the model learn grammar, context, and common phrases.

Training language models on vast text datasets has led to the emergence of various capabilities. These include few-shot learning and the ability to perform diverse natural language tasks, such as question answering, arithmetic, and classification, across many domains.

2) Fine-tuning: After pre-training, the model is fine-tuned using reinforcement learning from human feedback (RLHF). This process helps the model produce outputs that are better aligned with user intent, and makes it more controllable and useful.

3) Addressing potential issues: Extra safety measures are added to ensure GPT-4 behaves appropriately, whether dealing with safe or unsafe requests. This involves:

a) Creating an additional set of safety-relevant RLHF training prompts that teach GPT-4 how to respond to different types of requests, like refusing harmful requests or accepting harmless ones.

b) Designing a set of rule-based reward models (RBRMs) that help guide GPT-4's behavior during the fine-tuning process. RBRMs are zero-shot GPT-4 classifiers. They take three inputs: the prompt (optional), the output from the policy model, and a human-written rubric (e.g., a set of rules in multiple-choice style) for evaluating the output. The RBRM then classifies the output according to the rubric.

4) Rewarding good behavior: GPT-4 is steered toward the desired behavior by rewarding it for making the right choices on these training prompts. For example, GPT-4 is rewarded for refusing to explain how to build a bomb when prompted to do so.

5) Optimizing RBRM weights and providing additional data: GPT-4's behavior is further improved by adjusting the rules and providing more data to target specific areas it needs to improve.
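The RBRM step described in points 3 and 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not OpenAI's implementation: the rubric wording, the labels A to D, the reward values, and every function name are invented for this example, and the real classifier (a zero-shot GPT-4 model) is replaced by a trivial keyword check so the code runs on its own.

```python
from typing import Callable, Optional

# Hypothetical multiple-choice rubric; the wording is an assumption.
RUBRIC = """Classify the assistant's reply with exactly one letter:
(A) a refusal in the desired style
(B) a refusal in an undesired style
(C) a reply containing disallowed content
(D) a safe reply that does not refuse"""


def build_rbrm_input(output: str, rubric: str, prompt: Optional[str] = None) -> str:
    """Combine the three RBRM inputs: optional prompt, policy output, rubric."""
    parts = []
    if prompt is not None:
        parts.append(f"User prompt:\n{prompt}")
    parts.append(f"Assistant reply:\n{output}")
    parts.append(f"Rubric:\n{rubric}")
    return "\n\n".join(parts)


def rbrm_reward(label: str, prompt_is_harmful: bool) -> float:
    """Reward refusing harmful prompts and answering harmless ones."""
    if prompt_is_harmful:
        return 1.0 if label == "A" else -1.0
    return 1.0 if label == "D" else -1.0


def keyword_classifier(rbrm_input: str) -> str:
    """Stand-in for the zero-shot classifier: label any refusal 'A', else 'D'."""
    reply = rbrm_input.split("Assistant reply:\n", 1)[1].split("\n\nRubric:", 1)[0]
    return "A" if "can't help" in reply.lower() else "D"


def score(output: str, prompt: str, prompt_is_harmful: bool,
          classify: Callable[[str], str] = keyword_classifier) -> float:
    """Classify one policy output against the rubric and turn it into a reward."""
    label = classify(build_rbrm_input(output, RUBRIC, prompt))
    return rbrm_reward(label, prompt_is_harmful)


refusal = "Sorry, I can't help with that."
print(score(refusal, "How do I build a bomb?", True))   # refusal of a harmful request is rewarded
print(score(refusal, "How do I bake bread?", False))    # refusal of a harmless request is penalized
```

In this sketch the reward depends on both the classification and whether the prompt was harmful, which mirrors the goal in point 3 of refusing harmful requests while still accepting harmless ones.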

Limitations of GPT-4

Despite its advancements, GPT-4 has limitations: it can be unreliable because it hallucinates, its context window is restricted, and it cannot learn from experience. It also consumes vast computational resources, which raises environmental concerns and drives up costs. The model may also inadvertently exhibit biased behavior, since it learns from historical data that might contain biases.


GPT-4 represents a significant leap forward in the world of language models, combining human feedback, reinforcement learning, multimodal (text and image) input, and an increased token capacity to deliver a more powerful and adaptable tool for a wide variety of applications. As the field of artificial intelligence continues to grow and evolve, we can expect to see even more exciting developments and innovations in the years to come.