
How to Use GPT-5 Effectively

GPT-5 is OpenAI’s latest flagship model, and it possesses powerful and helpful features. The model has a variety of parameters and options to choose from, which you have to select correctly to optimize GPT-5’s performance for your application area.

In this article, I’ll deep-dive into the different options you have when using GPT-5, and help you choose the optimal settings to make it work well for your use case. I’ll discuss the different input modalities you can use, the available features GPT-5 has, such as tools and file upload, and I’ll discuss the parameters you can set for the model.

This article is not sponsored by OpenAI, and is simply a summary of my experiences from using GPT-5, discussing how you can use the model effectively.

[Infographic] Overview of the article’s main contents: GPT-5’s multimodal inputs, tool calling, the reasoning effort and verbosity settings, structured output, and file uploads. Image by ChatGPT.

Why you should use GPT-5

GPT-5 is a very powerful model you can utilize for a wide variety of tasks. You can, for example, use it for a chatbot assistant or to extract important metadata from documents. However, GPT-5 also has many different options and settings, which you can read more about in OpenAI’s guide to GPT-5. I’ll discuss how to navigate all of these options and optimally utilize GPT-5 for your use case.

Multimodal abilities

GPT-5 is a multimodal model, meaning you can input text, images, and audio, and the model will output text. You can also mix different modalities in the input, for example, inputting an image and a prompt asking about the image, and receive a response. Inputting text is, of course, expected from an LLM, but the ability to input images and audio is very powerful.

As I’ve discussed in previous articles, VLMs are extremely powerful because of their ability to directly understand images, which usually works better than performing OCR on an image and then understanding the extracted text. The same concept applies to audio: you can, for example, send in an audio clip directly and analyze not only the words in the clip, but also the pitch, talking speed, and so on. Multimodal understanding simply gives you a deeper understanding of the data you’re analyzing.
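
To make this concrete, here’s a minimal sketch of sending an image together with a text prompt through the Responses API (the image URL is a placeholder you would replace with your own):

from openai import OpenAI

client = OpenAI()

# Mix text and image content in a single user message.
response = client.responses.create(
    model="gpt-5",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "What is shown in this image?"},
                {"type": "input_image", "image_url": "https://example.com/photo.jpg"},
            ],
        }
    ],
)
print(response.output_text)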

Tools

Tools are another powerful feature you have available. You can define tools that the model can utilize during execution, which turns GPT-5 into an agent. An example of a simple tool is the get_weather() function:

def get_weather(city: str) -> str:
    # Stubbed example; a real implementation would call a weather API.
    return "Sunny"

You can then make your custom tools available to your model, along with a description and the parameters for your function:

tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get today's weather.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city you want the weather for",
                },
            },
            "required": ["city"],
        },
    },
]
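
With the tool defined, you pass it to the model, run any function calls it requests, and send the results back so the model can produce a final answer. Here’s a minimal sketch of that round trip with the Responses API, using the get_weather function and tools list from above (the user message is just an example):

import json

from openai import OpenAI

client = OpenAI()

input_items = [{"role": "user", "content": "What's the weather in Oslo?"}]

response = client.responses.create(model="gpt-5", input=input_items, tools=tools)

# Execute each function call the model requested and append the result.
for item in response.output:
    if item.type == "function_call" and item.name == "get_weather":
        args = json.loads(item.arguments)
        input_items.append(item)  # keep the model's call in the conversation
        input_items.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": get_weather(**args),
        })

final = client.responses.create(model="gpt-5", input=input_items, tools=tools)
print(final.output_text)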

It’s important to write detailed, descriptive information in your function definitions, including a description of what the function does and what each parameter means.

You can define a lot of tools to make available to your model, but it’s important to remember the core principles for AI tool definitions:

  • Describe each tool well
  • Make sure tools do not overlap
  • Make it obvious to the model when to use each function; ambiguity makes tool usage ineffective

Parameters

There are three main parameters you should care about when using GPT-5:

  • Reasoning effort
  • Verbosity
  • Structured output

I’ll now describe the different parameters and how to approach selecting them.

Reasoning effort

Reasoning effort is a parameter where you select from four levels: minimal, low, medium, and high.

Minimal reasoning essentially makes GPT-5 a non-reasoning model and should be used for simpler tasks, where you need quick responses. You can, for example, use minimal reasoning effort in a chat application where the questions are simple to answer, and the users expect rapid responses.

The more difficult your task is, the more reasoning you should use, though you should keep in mind the cost and latency of additional reasoning. Reasoning tokens count as output tokens, which at the time of writing cost 10 USD per million tokens for GPT-5.

I usually experiment with the model, starting from the lowest reasoning effort. If I notice the model struggles to give high-quality responses, I move up a reasoning level, first from minimal to low. I then continue testing the model to see how well it performs. You should strive to use the lowest reasoning effort that gives acceptable quality.

You can set the reasoning effort with:

from openai import OpenAI

client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,  # your conversation messages
    "reasoning": {"effort": "medium"},  # can be: minimal, low, medium, high
}
response = client.responses.create(**request_params)

Verbosity

Verbosity is another important configurable parameter, and you can choose from three levels: low, medium, and high.

Verbosity controls how many output tokens (excluding reasoning tokens) the model produces. The default is medium verbosity, which OpenAI has stated is essentially the setting used for its previous models.

If you want the model to generate longer and more detailed responses, you should set verbosity to high. However, I mostly find myself choosing between low and medium verbosity.

  • For chat applications, medium verbosity is good because a very concise model may make the users feel the model is less helpful (a lot of users prefer some more details in responses).
  • For extraction purposes, however, where you only want to output specific information, such as the date from a document, I set the verbosity to low. This helps ensure the model only responds with the output I want (the date), without providing additional reasoning and context.

You can set the verbosity level with:

from openai import OpenAI

client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,  # your conversation messages
    "text": {"verbosity": "medium"},  # can be: low, medium, high
}
response = client.responses.create(**request_params)

Structured output

Structured output is a powerful setting you can use to ensure GPT-5 responds in JSON format. This is again useful if you want to extract specific datapoints, and no other text, such as a date from a document. This guarantees that the model responds with a valid JSON object, which you can then parse. All the metadata extraction I do uses structured output, as it is extremely useful for ensuring consistency. You can use structured output by adding the “text” key to the request params for GPT-5, as shown below.

from openai import OpenAI

client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,  # your conversation messages
    "text": {"format": {"type": "json_object"}},
}
response = client.responses.create(**request_params)

Make sure to mention “JSON” in your prompt; otherwise, you’ll get an error when using structured output.
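
Since the response is guaranteed to be valid JSON, you can parse it directly with Python’s json module. Here’s a minimal sketch of a date-extraction request (the prompt and document text are just examples):

import json

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input='Respond in JSON as {"date": ...}. Extract the date from: "The contract was signed on 2024-03-01."',
    text={"format": {"type": "json_object"}},
)

# The output is guaranteed to be valid JSON, so parsing is safe.
data = json.loads(response.output_text)
print(data["date"])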

File upload

File upload is another powerful feature available through GPT-5. I discussed the model’s multimodal abilities earlier; however, in some scenarios it’s useful to upload a document directly and have OpenAI parse it. For example, if you haven’t performed OCR or extracted images from a document yet, you can upload the document directly to OpenAI and ask questions about it. From experience, uploading files is fast, and you’ll usually get rapid responses, depending mostly on the reasoning effort you ask for.

If you need quick responses from documents and don’t have time to use OCR first, file upload is a powerful feature you can use.
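
Here’s a minimal sketch of how this can look with the Responses API: you upload the file first, then reference it in your input (the filename and question are placeholders):

from openai import OpenAI

client = OpenAI()

# Upload the document; the "user_data" purpose marks it as model input.
uploaded = client.files.create(
    file=open("document.pdf", "rb"),  # placeholder filename
    purpose="user_data",
)

response = client.responses.create(
    model="gpt-5",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_file", "file_id": uploaded.id},
                {"type": "input_text", "text": "What date was this document signed?"},
            ],
        }
    ],
)
print(response.output_text)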

Downsides of GPT-5

GPT-5 also has some downsides. The main downside I’ve noticed during use is that OpenAI does not share the thinking tokens when you use the model. You can only access a summary of the thinking.

This is very restrictive in live applications: if you want to use higher reasoning efforts (medium or high), you cannot stream any information from GPT-5 to the user while the model is thinking, which makes for a poor user experience. The alternative is to use lower reasoning efforts, which leads to lower-quality outputs. Other frontier model providers, such as Anthropic and Google, make their models’ thinking tokens available.

There’s also been a lot of discussion about GPT-5 being less creative than its predecessors, though this is usually not a big problem for the applications I work on, since creativity typically isn’t a requirement for API usage of GPT-5.

Conclusion

In this article, I’ve provided an overview of GPT-5 with the different parameters and options, and how to most effectively utilize the model. If used right, GPT-5 is a very powerful model, though it naturally also comes with some downsides, the main one from my perspective being that OpenAI doesn’t share the reasoning tokens. Whenever working on LLM applications, I always recommend having backup models available from other frontier model providers. This could, for example, be having GPT-5 as the main model, but if it fails, you can fall back to using Gemini 2.5 Pro from Google.
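
As a sketch of that fallback pattern (call_gemini is a hypothetical helper you would implement with Google’s SDK):

from openai import OpenAI

client = OpenAI()

def call_gemini(messages) -> str:
    # Hypothetical fallback helper; implement with Google's Gemini SDK.
    raise NotImplementedError

def generate(messages) -> str:
    try:
        response = client.responses.create(model="gpt-5", input=messages)
        return response.output_text
    except Exception:
        # GPT-5 failed (e.g., an outage or rate limit), so fall back to Gemini 2.5 Pro.
        return call_gemini(messages)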

👉 Find me on socials:

📩 Subscribe to my newsletter

🧑‍💻 Get in touch

🔗 LinkedIn

🐦 X / Twitter

✍️ Medium
