Blogs
Inferencing vs Learning: Why a Shift in the Way AI is Being Used will Shape the Infrastructure that Supports It
Summary: This blog explains the difference between AI learning and inferencing, why we predict inferencing will soon dominate AI workloads, and how this shift may reshape data center infrastructure, power demand, and facility design.
Inferencing is the 'doing' task of an AI model. It occurs once an AI model has been trained. The AI tool (i.e., Claude, ChatGPT, Co-Pilot, etc.) is trained in how it pieces together sentences, where to find information, how to format the information, and how to give it to you.
So, when you log in to your favorite AI tool's interface, you are leveraging the extensive energy-consuming training the model has been performing in the background. The responses it generates are considered inferences. Inferences are the "easy" task of the model's job.
Learning and Inferencing - Prominent in AI tools and football
I’m a football fan. And it doesn’t take a fan to know a lot goes into winning a game.
Learning in Football
Practices, weight room days, and film sessions are all part of the team getting better, preparing to execute a game plan. You may run a play thousands of times to get it right, only to run it once on Sunday. That’s a ton of energy and effort put into training to succeed on gameday. Training, in the context of AI tools, is the Learning phase.
Inference in Football
What happens on the field on gameday is the result of the training. The thousands of repetitions of a play leads to scoring a touchdown. Scoring a touchdown is Inferencing.
This is similar to the processes required by an AI model. Training, or the learning phase of an AI model, is an energy-intensive, power-demanding process. Like football, AI requires extensive repetition (oftentimes millions of repetitions) of learning steps in the background to give you a clear answer or result. The response AI gives you is its touchdown moment. Or, inferencing.
As a whole, the AI model is your offense. You're the coach who calls plays, and you expect to see a great result.
What happens on the field on gameday is the result of the training. The thousands of repetitions of a play leads to scoring a touchdown. Scoring a touchdown is Inferencing.
How Inference Affects Data Center Infrastructure
To continue this football analogy, the data center industry has experienced a massive boom in newly built factories, which we can think of as the offseason training camps. Now, we're inching closer to the regular season, where the efforts of our training can be executed.
That's how I view the upcoming split of AI's tasks and, by extension, energy consumption. Right now, we can estimate that AI tasks are 80% learning and 20% inference. Industry experts believe the balance will flip over the next year or so: 80% for inference and 20% for learning and refinement.
When I heard this information, I had plenty of questions:
Why will AI tools infer more than learn in the near future?
Because it's time! This was the plan all along: to train these models rigorously and observe real-time improvements in the responses as training deepens. The two tasks will continue to coexist, but the information sent into the model is primarily for refinement and updating rather than for learning everything from scratch.
Adoption of AI across different use cases is also increasing the models' versatility. If a model has been trained solely on coding, it would take a long time to train it on a payroll model. However, since many leaders leverage the same models for different tasks, they do not need to be trained as much when you are trying to throw a curveball at them.
How will data center power demands be impacted by inferencing becoming more prominent than learning in AI response models?
Since the industry hasn’t made the shift yet, there are competing theories.
- Some believe energy demand will decrease because large training events won't be held as often. AI factories are designed for training. They are powerful enough to handle the massive demand on the grid that learning requires, so a relief on training activity could result in relief on the grid. If nothing changes in an AI factory during this transition, the inference workload will be a walk in the park, as it imposes a lighter, more consistent workload.
- Others believe efficiency gains will instead be used to increase density. More efficient chips and better‑optimized models allow more compute to be deployed in the same space, maximizing the investment in the real estate. This would result in either a maintained or increased power demand as performance improves.
How can newly built facilities prepare for this shift?
Be flexible. Flexibility in the type of cooling needed, where the racks are located on the floor, density, and more. A modular approach future-proofs a new build. Additionally, requiring a higher inlet temperature for upcoming chip models would allow a facility to adopt liquid cooling more easily without the need for a chiller. This is a solid strategy because of the unpredictability of this vertical. I do, however, believe that site selection will be important moving forward. This is because latency is a critical part of inferencing. A model can train from anywhere, but users expect a speedy response, which means location is critical.
Retrofitting vs. Modular New Builds for an Inference-Heavy AI Landscape
So, it sounds like the ever-changing data center industry is going to change yet again. We can never truly predict how the market will react to this change, but we can make some guesses.
The industry continues to face the same challenges: lack of power, limited space to build, and a need for quick response. Could we start seeing retrofit projects popping up? Repurposing existing or decommissioned facilities in or near cities could address latency issues. These facilities are also connected to the grid, so you could deploy very quickly without much red tape. However, these are older facilities and have limitations on the amount of compute they can house, not to mention that retrofitting for power requirements can take as long as it would for a greenfield facility in some places.
Maybe this is another use case for a modular system. As stated above, flexibility is key, and maybe the response to fluctuating demands is fluctuating infrastructure. With these systems, we’d be bumping up against those who want to generate as much compute capacity as possible in any envelope they can.
Retrofits, greenfield, or modular deployments can all address part of the challenge, but maybe there is something else that's yet to come that will check every box. If that's the case, I hope they're using Mitsubishi Electric hardware inside.
Want notifications for new Blogs, White Papers, Case Studies & Webinars?
Contributors
Pete Byrnes
Industry Marketing Manager (Data Center)
Mitsubishi Electric Automation, Inc.
-
Inquiries
-
Select
& Quote -
Share
-
Partners