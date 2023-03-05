OpenAI's ChatGPT has taken the world by storm with the AI chatbot managing to answer complex questions and crack tests. However, ChatGPT has most recently failed to crack the Indian Civil Services (UPSC) exam of 2022.

ChatGPT, nonetheless, has succeeded in several other tests including a Wharton MBA exam, the United States Medical Licensing Examination (USMLE), and the Multistate Bar Exam (MBE).

The ChatGPT chatbot has managed to pass some of the tougher tests out there, but they have also exposed its weaknesses. Here is a look at the major tests ChatGPT passed and an analysis of its results.

Wharton's MBA exam

Wharton professor Christian Terwiesch tested ChatGPT on the Operations Management course and, in his paper, concluded, "OpenAI’s Chat GPT3 has shown a remarkable ability to automate some of the skills of highly compensated knowledge workers in general and specifically the knowledge workers in the jobs held by MBA graduates including analysts, managers, and consultants."

In his analysis, the professor noted "First, it does an amazing job at basic operations management and process analysis questions including those that are based on case studies. Not only are the answers correct, but the explanations are excellent. Second, Chat GPT3 at times makes surprising mistakes in relatively simple calculations at the level of 6th grade Math. These mistakes can be massive in magnitude. Third, the present version of Chat GPT is not capable of handling more advanced process analysis questions, even when they are based on fairly standard templates."

ChatGPT's performance was rated B to B-, with the Wharton professor noting, "Until Wharton allowed students more flexibility in which courses they take, this Operations Management course was a required course that every student had to take. However, we did allow students to waive this course if they could demonstrate content mastery on a waiver exam. The performance of Chat GPT3 reported above would have been sufficient to pass the waiver exam, though by a very small margin."

United States Medical Licensing Examination (USMLE)

The United States Medical Licensing Examination is considered one of the toughest tests in the country and is conducted in three stages. Passing it usually requires four years of preparation, and the pass threshold, though it changes every year, is approximately 60 per cent.

Initially, ChatGPT managed 46 per cent accuracy without any prompting. However, with more model training, performance improved to over 50 per cent in all examinations, crossing 60 per cent in most analyses.

"Therefore, ChatGPT is now comfortably within the passing range. Being the first experiment to reach this benchmark, we believe this is a surprising and impressive result," the paper noted.

ChatGPT scored lowest in Step 1, followed by Step 2CK, and then Step 3. Most of those who take the test also consider the first step to be the toughest of the lot.

Multistate Bar Exam (MBE)

The MBE is the first of three tests one must clear to be eligible to practice law, and ChatGPT managed 50 per cent accuracy on it. A paper analysing its performance notes ChatGPT "significantly outperformed the baseline rate of random guessing", and adds, "Without any fine-tuning, it currently achieves a passing rate on two categories of the Bar and achieves parity with human test-takers on one."

ChatGPT is constantly trying to improve, with fine-tuned algorithms, more dataset training, and an attempt to interact with humans better. When asked what ChatGPT is doing to better itself, it notes down the points below.

OpenAI has also stated that ChatGPT has only been fed data till 2021, so anything after that falls beyond the scope of its core knowledge. Further, ChatGPT is expected to improve more once it has access to the internet.