【AI with Oliver】OpenAI GPT-4 Breaking News: Visual ChatGPT coming with picture and video editing. Stay Tuned!




Is OpenAI going to publicly release GPT-4 or Visual ChatGPT? A new research paper posted by artificial intelligence engineers from Microsoft Research Asia details Visual ChatGPT, which lets people enter prompts telling ChatGPT to edit images and transform them into entirely different pictures or styles. Its 22 Visual Foundation Models (VFMs) can be sequenced step by step into a chain of thought that lets ChatGPT check itself until it gets the answer. Through natural language processing, ChatGPT understands the changes in the image descriptions and keeps editing until the request is complete. Is this the future of AI video? Are graphic designers even needed anymore? Will this replace jobs or boost productivity to new heights?
AI with Oliver explores this in detail. Subscribe for more breaking news about AI and technology.

Visual ChatGPT (potentially GPT-4) is built by designing a series of prompts that inject visual information from pictures into ChatGPT, which can then solve visual questions step by step.
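
To make that concrete, here is a rough Python sketch of such a tool-using prompt loop. The helper names (ask_chatgpt, run_vfm) and the canned replies are made up for illustration; they are not the actual API of the Microsoft repo.

# Minimal sketch of the idea behind Visual ChatGPT's prompt loop.
# ask_chatgpt and run_vfm are illustrative stand-ins, not the repo's real functions.

def ask_chatgpt(history):
    # A real system would call the OpenAI API here; this stub fakes one tool call.
    if "Observation:" not in history:
        return ("Thought: Do I need to use a tool? Yes\n"
                "Action: Replace Something From The Photo\n"
                "Action Input: image/abc.png, apple, pear")
    return ("Thought: Do I need to use a tool? No\n"
            "AI: Here is image/xyz_replace-something_abc.png with the apple replaced by a pear.")

def run_vfm(action, action_input):
    # Stand-in for a Visual Foundation Model; returns the name of the edited image.
    return "image/xyz_replace-something_abc.png"

def visual_chat(user_request):
    history = f"Human: {user_request}"
    for _ in range(5):  # cap the number of VFM calls
        reply = ask_chatgpt(history)
        if "Do I need to use a tool? No" in reply:
            return reply.split("AI:")[-1].strip()
        # Parse the tool call, run the chosen VFM, and feed the result back in as language.
        action = reply.split("Action:")[1].splitlines()[0].strip()
        action_input = reply.split("Action Input:")[1].splitlines()[0].strip()
        observation = run_vfm(action, action_input)
        history += f"\n{reply}\nObservation: {observation}"
    return "Could not finish within the tool-call budget."

print(visual_chat("Replace the apple in image/abc.png with a pear"))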

Researchers from Microsoft Research Asia released a paper and a GitHub repository on March 8, 2023 outlining Visual ChatGPT and the math behind it.

It uses forced thinking to invoke additional Visual Foundation Models (VFMs) to get the desired result and self-checks by generating “thoughts” like: Thought: does this picture show an apple and a water glass? If yes, proceed. If no, use a different VFM to regenerate this step.

“By forcing Visual ChatGPT to say “Thought: Do I need to use a tool?”, M(Q) makes it easier to pass regex match correctly. In contrast, without force thinking, A3 may wrongly generate the end of thoughts token and directly consider all of its ChatGPT outputs as the final response.”
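
Here is a rough illustration of why that forced prefix matters for parsing. The regex and the example replies are assumptions made for this sketch; the actual repo differs in detail.

import re

# With the forced "Thought: Do I need to use a tool?" prefix, the system can
# match a predictable pattern and dispatch the right VFM. Without it, free-form
# text falls through and gets treated as the final answer.
TOOL_CALL = re.compile(
    r"Thought: Do I need to use a tool\? Yes\s*"
    r"Action: (?P<tool>.+?)\s*"
    r"Action Input: (?P<args>.+)",
    re.S,
)

forced = ("Thought: Do I need to use a tool? Yes\n"
          "Action: Instruct Image Using Text\n"
          "Action Input: image/abc.png, make it look like a watercolor painting")

unforced = "Sure! I will repaint the image in a watercolor style for you."

for reply in (forced, unforced):
    match = TOOL_CALL.search(reply)
    if match:
        print("Dispatch VFM:", match.group("tool"), "|", match.group("args"))
    else:
        print("No tool call parsed; treated as final response:", reply)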

These sequenced thoughts occur behind the scenes until a self-check passes and the user's request about the picture is fulfilled.

“With the chained naming rule, Visual ChatGPT can recognize the file type, trigger the correct VFM, and conclude the file dependency relationship naming rule”
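
The naming scheme itself is simple to sketch: each generated file carries a short new ID, the operation that produced it, and the name of the image it was derived from, so ChatGPT can read the file type and the dependency chain straight out of the filename. The format below is an assumption based on that description, not the repo's exact convention.

import uuid

def chained_name(operation, source_image):
    # Build a new image name that records which operation produced it and from
    # which source image, e.g. image/5f3a_replace-something_abc.png.
    # The exact field order and abbreviations are assumptions for this sketch.
    source_stem = source_image.split("/")[-1].rsplit(".", 1)[0]
    new_id = uuid.uuid4().hex[:4]
    return f"image/{new_id}_{operation}_{source_stem}.png"

print(chained_name("replace-something", "image/abc.png"))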

The new idea is to convert the VFMs' visual outputs into language and hold a multi-step dialogue with ChatGPT, describing the changes made at each step in words while recording the dependencies in the new image names.
This process uses more tokens and may take longer, because a single question can consume as many tokens and as much compute as 20 ordinary questions.

Perhaps a limit on the number of VFMs will be imposed, with a more costly premium tier providing access to more of them. The Visual ChatGPT they outline can be run on 4 NVIDIA V100 GPUs.

Will OpenAI release Visual ChatGPT or GPT-4 this week, and will they charge $50 a month for the new picture editor? Since video is just motion pictures, around 60 frames per second, will video generation be released as well? Are graphic designers out of a job, or will they triple their productivity?

GitHub: Visual ChatGPT
https://github.com/microsoft/visual-c…

Microsoft Research Asia Paper: Visual ChatGPT
https://arxiv.org/pdf/2303.04671.pdf

#ai #aitools #aijourney #chatgpt #gpt4 #openai #microsoft #visualeffects #visualGPT #graphicdesign #news #ainews #future @AIwithOliver