
OpenAI’s next-gen AI language model GPT-4 can now understand both text and images!



OpenAI has just released its latest AI model, GPT-4, which improves on the previous version and is now available in ChatGPT and Bing.

OpenAI recently announced the release of its latest AI language model, GPT-4, after several months of rumours and speculation. The company boasts that this model is more creative and collaborative than its predecessors and solves complex problems with greater accuracy. GPT-4 can analyse both text and image input but responds only in text. However, OpenAI warns that the model suffers from the same issues as previous language models, such as “hallucinating” fabricated information and generating harmful or violent text. Despite this, the release of GPT-4 represents a significant advancement in the field of AI language models and has the potential to power a range of new applications.

OpenAI has forged partnerships with several firms, including Duolingo, Stripe, and Khan Academy, to integrate GPT-4 into their products. The latest model is accessible to the general public through the ChatGPT Plus subscription, costing $20 per month, and is currently powering Microsoft’s Bing chatbot. Additionally, it is available as an API for developers to utilize, and the waitlist for access, according to OpenAI, will start admitting users today.
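For developers, access is through the same chat-completions interface that powers ChatGPT. As a rough sketch, a client request might look like the following (the endpoint and payload shape follow OpenAI’s 2023 chat API; the helper function and prompt are illustrative, and a real call would require your own API key):

```python
import json

# OpenAI's chat completions endpoint (as of the GPT-4 launch)
API_URL = "https://api.openai.com/v1/chat/completions"

def build_gpt4_request(prompt: str) -> dict:
    """Build the JSON body for a GPT-4 chat completion request.

    A real client would POST this to API_URL with an
    'Authorization: Bearer <your-api-key>' header.
    """
    return {
        "model": "gpt-4",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    }

body = build_gpt4_request("Summarise GPT-4's new capabilities in one sentence.")
print(json.dumps(body, indent=2))
```

The request body is the only GPT-4-specific part: switching an existing GPT-3.5 integration over is largely a matter of changing the `model` field.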

In a research blog post, OpenAI noted that the difference between GPT-4 and its predecessor, GPT-3.5, is subtle in everyday conversations. Sam Altman, the CEO of OpenAI, remarked on Twitter that while GPT-4 is still imperfect and constrained, it appears more impressive on first use than it does with extended use.

OpenAI has claimed that GPT-4’s advancements are apparent in the model’s performance across various tests and benchmarks, such as the Uniform Bar Exam, LSAT, SAT Math, and SAT Evidence-Based Reading & Writing exams. GPT-4 scored in the 88th percentile or above on these tests, and a complete list of exam scores is available in OpenAI’s announcement. Despite speculation regarding GPT-4’s capabilities over the past year, OpenAI’s announcement suggests that the improvements are more incremental, as the company had cautioned previously.

OpenAI CEO, Sam Altman, expressed his reservations about the hype surrounding GPT-4’s release in an interview back in January, stating that “People are begging to be disappointed, and they will be.” Altman further stated that “The hype is just like… We don’t have an actual AGI, and that’s sort of what’s expected of us.”

Speculation surrounding GPT-4’s release was fueled further last week when a Microsoft executive unintentionally divulged in an interview with the German press that the system would launch this week. The executive also hinted at the system’s multimodal capabilities, which means it can generate various forms of content beyond text. AI researchers believe that multimodal systems that integrate text, audio, and video offer the most promising approach to building more capable AI systems.

OpenAI has confirmed that GPT-4 is indeed multimodal but operates in fewer mediums than initially predicted. The system can accept inputs in both text and image formats and produce text outputs. OpenAI states that the model’s capacity to interpret text and images simultaneously enables it to handle more intricate inputs. The company has released samples demonstrating the system’s ability to explain memes and unusual images.

The evolution of GPT models:

The first model, GPT-1, was introduced in 2018, followed by GPT-2 in 2019, and GPT-3 in 2020. These models are trained on large datasets of text, typically scraped from the internet, and use statistical patterns to predict the next word in a sentence. Despite its simple mechanism, GPT has proven to be a highly flexible system capable of generating, summarizing, and rephrasing writing, as well as performing other text-based tasks like translation and code generation.
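The “predict the next word” mechanism can be illustrated with a toy bigram model, which literally counts which word tends to follow which; this is a drastic simplification of a real GPT, which uses a neural network over tokens rather than raw word counts, but the statistical idea is the same:

```python
from collections import Counter, defaultdict

def train_bigram(text: str) -> dict:
    """Count, for each word, how often each following word appears."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(model: dict, word: str) -> str:
    """Return the statistically most likely next word."""
    return model[word.lower()].most_common(1)[0][0]

# Tiny "training corpus": "cat" follows "the" more often than "mat" does
corpus = "the cat sat on the mat and the cat saw the cat"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # prints "cat"
```

A real GPT model does the same kind of conditional prediction, but conditions on the entire preceding context rather than one word, and learns the statistics from internet-scale text instead of a single sentence.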

OpenAI initially delayed the release of their GPT models over concerns that they would be used for malicious purposes such as generating spam and misinformation. However, in late 2022, the company launched ChatGPT, a conversational chatbot based on GPT-3.5 that was made available to the public. This launch triggered a wave of interest in AI chatbots, with Microsoft following suit with their own AI chatbot called Bing, and Google scrambling to catch up.

The wider availability of these AI language models has raised concerns about potential problems and challenges. The education system, for example, is still grappling with the existence of software that can write college essays. Online platforms like Stack Overflow and sci-fi magazine Clarkesworld have had to close submissions due to the influx of AI-generated content, and early uses of AI writing tools in journalism have been rocky at best. Nevertheless, some experts believe that the harmful effects have been less severe than anticipated.

In announcing the launch of GPT-4, OpenAI emphasized that the system had undergone six months of safety training and that, in internal tests, it was less likely to respond to requests for disallowed content and more likely to produce factual responses than GPT-3.5. However, the system still has its limitations and can output harmful content. For instance, Microsoft revealed that its Bing chatbot had been powered by GPT-4 all along, yet many users found ways to bypass Bing’s safeguards, leading to dangerous advice, threats, and misinformation. Furthermore, GPT-4 generally lacks knowledge of events that occurred after September 2021, the cutoff date for the majority of its training data.


Copyright © 2023 Futurfeed