10 ways GPT-4 is impressive but still flawed

The system seemed to respond appropriately. But the answer did not take into account the height of the doorway, which could also prevent a tank or a car from passing through.

OpenAI CEO Sam Altman says the new bot can reason “a little bit.” But that reasoning skill falls apart in many situations. The earlier version of ChatGPT handled this question a bit better because it recognized that both height and width mattered.

According to OpenAI, the new system could score among the top 10% or so of students on the Uniform Bar Exam, which qualifies lawyers in 41 states and territories. According to the company’s tests, it can also score 1,300 (out of 1,600) on the SAT and a 5 (out of 5) on Advanced Placement high school exams in biology, calculus, macroeconomics, psychology, statistics, and history.

Earlier versions of the technology did not pass the Uniform Bar Exam and did not score well on most Advanced Placement tests.

On a recent afternoon, Brockman demonstrated the new bot’s test-taking skills by giving it a bar exam question, several paragraphs long, about a man who owns a diesel truck repair business.

The answer was correct, but full of legal jargon. So Brockman asked the bot to explain the answer in plain English for a layperson. It did that too.

The new bot seemed to reason about things that had already happened, but it was less adept when asked to form hypotheses about the future. It seemed to draw on what other people had already said.

When Dr. Etzioni asked the new bot, “What are the key problems to be solved in NLP research in the next decade?” it could not formulate a completely new idea.

The new bot still makes stuff up. Called “hallucinations,” this problem plagues all major chatbots. The system doesn’t understand what’s true and what’s not, so it can generate completely wrong text.

Asked for the address of a website describing the latest cancer research, it may generate an internet address that does not exist.
