Ever since ChatGPT was released in November 2022, it has been leading the way in AI innovation and democratization. Thanks to ChatGPT, everyone from school-going kids to grandparents has access to some of the most advanced AI models ever built, in a very easy-to-use web and mobile app. It’s the most widely adopted AI product created so far. And believe it or not, it just got much better for everyday users with its new features, so if you weren’t already using ChatGPT, now is the time to try it out. First, let’s talk about the new model, GPT-4o, and why it’s better than ever.
What is GPT-4o?
GPT-4o (“o” for “omni”) is the latest multimodal AI model created by OpenAI. Multimodal means it accepts any combination of text, audio, image, and video as input and generates any combination of text, audio, and image as output. That way people can interact with AI more naturally, much like we would with another human. Does that mean it costs more than the previous models? NO! That’s the great part about omni. It uses a new tokenizer that makes it much more efficient. Developers who want to build apps with GPT-4o get access to the API, which is half the price and twice as fast as GPT-4 Turbo, the previous flagship model, with 5 times higher rate limits. If you don’t know what tokens are or how to use the API, don’t worry: ChatGPT is there for you.
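To make the “half the price” claim concrete, here is a minimal sketch of the arithmetic. The per-million-token prices in it are assumptions based on the list prices around the GPT-4o launch and may well have changed since, and `estimate_cost` is a made-up helper for illustration, not part of any OpenAI library.

```python
# Assumed prices in USD per 1 million tokens (illustrative only;
# check OpenAI's pricing page for current numbers).
PRICES = {
    "gpt-4-turbo": {"input": 10.00, "output": 30.00},
    "gpt-4o": {"input": 5.00, "output": 15.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of a single request."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Same request size for both models, so only the price differs.
    for model in PRICES:
        cost = estimate_cost(model, input_tokens=2_000, output_tokens=500)
        print(f"{model}: ${cost:.4f}")
```

With these assumed prices, the same request costs exactly half as much on GPT-4o as on GPT-4 Turbo, which is all the claim really says.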
All ChatGPT users will have access to the GPT-4o model from the beginning. Paid users get 5 times higher usage limits than free users. That’s not as bad as it sounds, because free users still have the GPT-3.5 model as a backup. When your omni quota for the current window runs out, ChatGPT automatically falls back to GPT-3.5 until omni becomes available again after the waiting period. My suggestion would be to save omni for your most important discussions and problems, and use GPT-3.5 for regular tasks that don’t need the smarter model or the larger context window.
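If you are curious what that fallback behaviour could look like in code, here is a rough sketch. OpenAI has not published how ChatGPT implements it, so everything here is an assumption: the `ModelPicker` class, the quota of 3 messages and the one-hour window are all invented for illustration.

```python
import time


class ModelPicker:
    """Hypothetical quota tracker: use GPT-4o until the window's quota is spent."""

    def __init__(self, quota: int, window_seconds: float, clock=time.monotonic):
        self.quota = quota
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested without waiting
        self.window_start = clock()
        self.used = 0

    def pick(self) -> str:
        """Return the model name to use for the next message."""
        now = self.clock()
        # Start a fresh window once the waiting period has passed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.used = 0
        if self.used < self.quota:
            self.used += 1
            return "gpt-4o"
        # Quota spent: fall back to the older model.
        return "gpt-3.5"


if __name__ == "__main__":
    picker = ModelPicker(quota=3, window_seconds=3600)
    for i in range(5):
        print(f"message {i + 1}: {picker.pick()}")
```

Run as a script, the first three messages go to omni and the rest fall back to GPT-3.5 until the window resets, which is why it pays to spend your omni messages on the questions that matter.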
As of now, only the text and image features are available with omni; the audio features will be rolling out soon. The app will be able to act as a Her-like voice assistant, responding in real time and observing the world around you. OpenAI and some of its partners have already released many demo videos to show off how good it is, and it is indeed mind-blowing!
On another curious note, Scarlett Johansson, who voiced the AI in the movie Her, took issue with OpenAI over one of ChatGPT’s voices, called Sky. She and many others couldn’t help but notice the similarity between Sky’s voice and hers, and she threatened legal action. Though OpenAI released a statement saying that Sky wasn’t created from Scarlett’s voice, it has since taken the voice down, at least temporarily.
Nevertheless, without going into too many technicalities, GPT-4o matches or outperforms other state-of-the-art multimodal models on the benchmarks OpenAI has published, and the best part is, it’s available for free. If you want me to write about how the model works and how it compares to other SOTA models through a more technical lens, just let me know in the comments. But now let’s talk about some more new free features.
Previously Paid Features, Now Free!
Web Search
When ChatGPT was first released, it had a training data cut-off of 2021, so it didn’t know anything more recent and couldn’t access the internet to look it up. Over time, the web search feature was added for Plus users. It opens up so many possibilities and reduces issues like hallucinating weird stuff when the model doesn’t know something. Well, now web search is available for free users too, and it takes ChatGPT’s capability to a new level even if you are using just GPT-3.5.
Data Analysis
Now you can upload your Excel, CSV, PDF or JSON files to ChatGPT and get comprehensive analytics about your data. It can create interactive tables, visualize data with charts, summarize your findings and much more, all for free. Of course there are limits, like how many files can be added to a single conversation or the maximum size of a single file, but used smartly, it can be the most useful tool in your arsenal and set you apart from your colleagues or classmates.
Photo Analysis
You can upload any photo you take and have a conversation about it. With this, the only limitation is your imagination. You can troubleshoot why your grill won’t start, take a picture of a menu in a foreign language and get food recommendations, explore the contents of your fridge to plan a meal, take a picture of an ongoing match on TV and ask it to explain the rules, or analyze a complex graph for work. To focus on a specific part of the image, you can use the drawing tool in the mobile app. Like I said, if you can imagine it, you can do it, all for free.
Chat with your Files
You can upload other files to ChatGPT and have a conversation about them. It supports all common file extensions for text files, spreadsheets, presentations, and documents. You can upload any document and ask ChatGPT to retrieve information from it, summarize it, compare it to another document, turn a presentation into a report and more, tasks that were previously available only to Plus users.
Custom GPTs and GPT Store
Custom GPTs are like preprogrammed GPT apps that are focused on a particular task. You don’t need to write a detailed and specific prompt to make ChatGPT behave in a certain way; the GPT Store probably already has a custom GPT for your need, created by someone in the community or by OpenAI engineers. Some GPTs are just textual prompts, while others are connected to the internet or to the APIs of other services, which turns ChatGPT into a far more powerful tool. Some of my favorite GPTs are:
- Tutor Me
- Code Copilot
- DALL-E
- Data Analyst
- Scholar GPT
- Canva
There are just so many options that it would take an entire article to review all the great GPTs out there.
Memories
Not so long ago, ChatGPT used to drive me crazy with its short context window and memory. I would be having an important discussion about something, and after a couple of rounds of back and forth it would forget everything and ask, “How can I help you today?” Nothing is more infuriating than a dumb AI. Well, to tackle this issue, OpenAI added memories to ChatGPT, which is much easier than creating a new model with a larger context window. During any conversation, you can ask ChatGPT to remember certain things, and it will remember them across all your conversations. But it was introduced as a paid feature. In the new release, this feature has also been made available for free! You can also customize ChatGPT by telling it about yourself and how it should respond to or address you. Combine the memory feature with GPT-4o’s longer context window, and ChatGPT’s ability to carry on long, meaningful conversations is way better than before.
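For the technically curious, here is one way to picture a memory feature like this. OpenAI has not published how ChatGPT’s memory works internally, so treat this purely as a mental model: the `MemoryStore` class and its prompt format are invented for illustration. The idea is simply that saved facts live outside any single chat and get placed in front of each new message.

```python
class MemoryStore:
    """Hypothetical stand-in for a memory feature: facts kept across chats."""

    def __init__(self):
        self.facts = []

    def remember(self, fact: str) -> None:
        """Save a fact once; repeated requests do not create duplicates."""
        if fact not in self.facts:
            self.facts.append(fact)

    def build_prompt(self, user_message: str) -> str:
        """Prepend saved facts so a brand-new conversation still has them."""
        if not self.facts:
            return user_message
        notes = "\n".join(f"- {fact}" for fact in self.facts)
        return f"Known about the user:\n{notes}\n\nUser: {user_message}"


if __name__ == "__main__":
    memory = MemoryStore()
    memory.remember("Prefers short answers")
    memory.remember("Is learning Python")
    # A later, unrelated conversation still carries the saved facts.
    print(memory.build_prompt("Explain what a context window is."))
```

Because the facts are stored separately and re-sent with each message, they survive even when an individual conversation grows past what the model can keep in its context window.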
I believe that OpenAI is democratizing AI by monopolizing it. Does that make any sense? In a nutshell, no other AI product capable of doing so many things at this level has reached as many people as ChatGPT. But OpenAI is able to do that only because of its dominant position in the AI market. Is that a good thing or a bad thing? I will discuss that in a future article. For now, just open ChatGPT, explore all these new features and make the best use of them. Which feature do you think would be the most useful to you? Share your ideas down in the comments!