ChatGPT passes radiology board exam, but still has limitations

The latest version of the artificial intelligence chatbot ChatGPT has passed a radiology board-style examination.

Researchers put ChatGPT to the test using 150 multiple-choice questions modelled on the Canadian Royal College and American Board of Radiology exams.

This breakthrough underscores the vast potential of AI in medical fields, yet it also reveals certain limitations that affect its dependability, two studies said.

ChatGPT, a deep-learning model developed by OpenAI, is known for generating humanlike responses based on the input it receives.

Its pattern recognition abilities allow it to interpret and respond to vast amounts of data, but it sometimes produces factually incorrect responses because of the absence of a source of truth in its training data.

“The use of large language models like ChatGPT is rapidly expanding and will only continue to grow,” said Dr Rajesh Bhayana, an abdominal radiologist and technology lead at University Medical Imaging Toronto, Toronto General Hospital.

“Our research offers valuable insight into how ChatGPT performs in a radiology setting, emphasising its immense potential while shedding light on current reliability issues.”

ChatGPT's usage and influence have been growing significantly. Notably, it was recently named the fastest-growing consumer application in history. It is also being integrated into popular search engines like Google and Bing, which both physicians and patients use for medical inquiries.

ChatGPT, based on GPT-3.5, correctly answered 69 per cent of the questions, just short of the passing grade of 70 per cent.

However, it showed a noticeable gap in performance between lower-order thinking questions (84 per cent) and higher-order thinking questions (60 per cent). It struggled in particular with descriptions of imaging findings, calculations and classifications, and the application of concepts.

Given that the AI has not received any radiology-specific training, these struggles were not unexpected.

A newer version, GPT-4, was released in March with improved advanced reasoning capabilities. In a follow-up study, GPT-4 answered 81 per cent of the same questions correctly, exceeding the passing threshold and outperforming its predecessor, GPT-3.5.

Despite these improvements, GPT-4 did not show any progress on lower-order thinking questions and answered 12 questions incorrectly that GPT-3.5 had answered correctly. This inconsistency raises questions about the AI's reliability in information gathering.

“ChatGPT gave accurate and confident answers to some challenging radiology questions, but then made some very illogical and inaccurate assertions,” said Dr Bhayana.

“Given how these models function, the inaccurate responses should not be surprising.”

The studies noted a tendency of ChatGPT to produce inaccurate responses, termed hallucinations. Although less frequent in GPT-4, this tendency still limits the chatbot's current usability in medical education and practice.

Despite the limitations, the researchers see potential in using ChatGPT to spark ideas and aid in the medical writing process and data summarisation, as long as the information is fact-checked.

“To me, this is its biggest limitation. At present, ChatGPT is best used to spark ideas, help start the medical writing process and in data summarisation. If used for quick information recall, it always needs to be fact-checked,” Dr Bhayana said.
