Visualizing the Training Costs of AI Models Over Time


Training advanced AI models like OpenAI’s ChatGPT and Google’s Gemini Ultra requires millions of dollars, with costs escalating rapidly.

As computational demands increase, the cost of the computing power needed to train these models is soaring. In response, AI companies are rethinking how they train generative AI systems, in many cases adopting strategies to reduce computational costs given current growth trajectories.

This graphic shows the surge in training costs for advanced AI models, based on analysis from Stanford University’s 2024 Artificial Intelligence Index Report.

How Training Cost is Determined

The AI Index collaborated with research firm Epoch AI to estimate AI model training costs, which were based on cloud compute rental prices. Key factors that were analyzed include the model’s training duration, the hardware’s utilization rate, and the value of the training hardware.
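
To make the role of these factors concrete, here is a minimal sketch of a rental-style cost estimate. It is not the AI Index's or Epoch AI's actual methodology, which the article does not spell out. The function names, parameters, and all numbers in the example are hypothetical and chosen purely for illustration.

```python
def training_hours_from_compute(total_flop, num_chips, peak_flop_per_sec, utilization):
    """Wall-clock hours needed to deliver `total_flop` of training compute.

    `utilization` is the fraction of peak throughput actually achieved (0 to 1].
    All parameters are illustrative assumptions, not values from the report.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    if num_chips <= 0 or peak_flop_per_sec <= 0 or total_flop < 0:
        raise ValueError("chips and throughput must be positive, compute non-negative")
    effective_flop_per_sec = num_chips * peak_flop_per_sec * utilization
    return total_flop / effective_flop_per_sec / 3600


def estimate_training_cost(num_chips, training_hours, price_per_chip_hour):
    """Rental-style estimate: chip-hours multiplied by an hourly rental price."""
    if num_chips < 0 or training_hours < 0 or price_per_chip_hour < 0:
        raise ValueError("inputs must be non-negative")
    return num_chips * training_hours * price_per_chip_hour


if __name__ == "__main__":
    # Made-up inputs: 1,000 accelerators, 1e23 FLOP of training compute,
    # 1e14 FLOP/s peak per chip, 40% utilization, $2.00 per chip-hour.
    hours = training_hours_from_compute(1e23, 1000, 1e14, 0.4)
    cost = estimate_training_cost(1000, hours, 2.00)
    print(f"Estimated duration: {hours:,.0f} hours")
    print(f"Estimated cost: ${cost:,.0f}")
```

In this sketch, a lower utilization rate lengthens the run and raises the bill even though the hardware and rental price stay the same, which is why utilization appears alongside duration and hardware value among the key factors.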

While many have speculated that training AI models has become increasingly costly, there is a lack of comprehensive data supporting these claims. The AI Index is one of the rare sources for these estimates.

Ballooning Training Costs

Below, we show the training cost of major AI models, adjusted for inflation, since 2017:

Year  Model Name                Model Creators/Contributors  Training Cost (USD)
2019  RoBERTa Large             Meta                         $160,018
2020  GPT-3 175B (davinci)      OpenAI                       $4,324,883
2021  Megatron-Turing NLG 530B  Microsoft/NVIDIA             $6,405,653
2022  PaLM (540B)               Google                       $12,389,056
2023  Llama 2 70B               Meta                         $3,931,897
2023  Gemini Ultra              Google                       $191,400,000

In 2023, OpenAI’s GPT-4 cost an estimated $78.4 million to train, a steep rise from Google’s PaLM (540B) model, which cost $12.4 million just a year earlier.

For perspective, the training cost for Transformer, an early AI model developed in 2017, was $930. This model plays a foundational role in shaping the architecture of many large language models used today.

Google’s AI model, Gemini Ultra, cost even more, at a staggering $191 million. As of early 2024, the model outperforms GPT-4 on several metrics, most notably on the Massive Multitask Language Understanding (MMLU) benchmark. This benchmark serves as a crucial yardstick for gauging the capabilities of large language models. For instance, it’s known for evaluating knowledge and problem-solving proficiency across 57 subject areas.
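
The cost figures quoted in this section can be compared directly with a little arithmetic. The sketch below uses only the estimates given in this article (the GPT-4 value is the rounded $78.4 million from the text). The helper names are hypothetical, and the annualized factor is a simple geometric average rather than anything reported by the AI Index.

```python
# Training cost estimates (USD, inflation-adjusted) as quoted in this article.
COSTS = {
    "Transformer": (2017, 930),
    "RoBERTa Large": (2019, 160_018),
    "GPT-3 175B (davinci)": (2020, 4_324_883),
    "Megatron-Turing NLG 530B": (2021, 6_405_653),
    "PaLM (540B)": (2022, 12_389_056),
    "Llama 2 70B": (2023, 3_931_897),
    "GPT-4": (2023, 78_400_000),  # rounded figure from the text
    "Gemini Ultra": (2023, 191_400_000),
}


def fold_increase(earlier_model, later_model):
    """How many times more the later model cost than the earlier one."""
    _, earlier_cost = COSTS[earlier_model]
    _, later_cost = COSTS[later_model]
    return later_cost / earlier_cost


def annual_growth_factor(earlier_model, later_model):
    """Geometric-average yearly cost multiplier between two models."""
    earlier_year, _ = COSTS[earlier_model]
    later_year, _ = COSTS[later_model]
    years = later_year - earlier_year
    if years <= 0:
        raise ValueError("later model must come from a later year")
    return fold_increase(earlier_model, later_model) ** (1 / years)


if __name__ == "__main__":
    print(f"PaLM -> GPT-4: {fold_increase('PaLM (540B)', 'GPT-4'):.1f}x")
    print(f"Transformer -> Gemini Ultra: {fold_increase('Transformer', 'Gemini Ultra'):,.0f}x")
    print(f"Average yearly multiplier, 2017 to 2023: "
          f"{annual_growth_factor('Transformer', 'Gemini Ultra'):.1f}x")
```

Comparing only the most expensive models overstates the trend for the field as a whole: Llama 2 70B, also from 2023, cost less than GPT-3 did in 2020.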

Training Future AI Models

Given these challenges, AI companies are finding new solutions for training language models to combat rising costs.

These include a number of approaches, such as creating smaller models designed to perform specific tasks. Other companies are experimenting with creating their own synthetic data to feed into AI systems. However, a clear breakthrough has yet to be seen.

Today, AI models trained on synthetic data have been shown to produce nonsense in response to certain prompts, a failure referred to as “model collapse”.

