From the editor's desk: Groq – the future of AI processing?

28 March 2025


Peter Howells, Editor

For the past few years, the world has been hit by a storm of AI-generated information, mostly produced by generative pre-trained transformer (GPT) models performing the AI inferencing. These large language models (LLMs) are excellent at answering natural-language requests, but they do have one drawback: the response time, or lag, is noticeable. This is largely due to the hardware these models are processed on, namely GPUs.

Many of the GPUs running AI models in professional data centres are Nvidia’s A100 (or similar) series, which contain thousands of CUDA cores, many more than the handful of processing cores in a standard CPU. Designed for parallel processing, these CUDA cores work together to answer the language requests directed at them, having been optimised for tasks like scientific simulations. But are they really optimised for LLM inference?
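To make that parallelism concrete, the short sketch below times the same large matrix multiplication, the operation that dominates LLM inference, first on a CPU and then on a GPU. It is a minimal illustration only, assuming PyTorch and a CUDA-capable card; the actual speed-up will vary with hardware and matrix size.

```python
import time
import torch  # pip install torch

def time_matmul(device: str, n: int = 4096) -> float:
    """Time one n x n matrix multiplication on the given device."""
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    if device == "cuda":
        torch.cuda.synchronize()  # ensure setup has finished before timing
    start = time.perf_counter()
    _ = a @ b
    if device == "cuda":
        torch.cuda.synchronize()  # wait for the GPU kernel to complete
    return time.perf_counter() - start

print(f"CPU: {time_matmul('cpu'):.3f} s")
if torch.cuda.is_available():
    print(f"GPU: {time_matmul('cuda'):.3f} s")
else:
    print("No CUDA GPU available")
```

On typical hardware the GPU wins by a wide margin, which is exactly why data centres run inference on thousands of CUDA cores rather than a handful of CPU cores.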

Groq seems to be the new kid in all this AI hoopla, but the company has been around since 2016, when it was founded by a group of former Google employees led by Jonathan Ross, one of the designers of the Tensor Processing Unit (TPU), and Douglas Wightman, an engineer at Google X. The TPU is an AI accelerator application-specific integrated circuit (ASIC), a custom-designed chip tailored to a specific task. For that task, an ASIC offers better performance and efficiency than a general-purpose processor.

And this is where the story gets exciting. Groq runs LLM inference on its own ASIC, the Language Processing Unit (LPU), as opposed to a GPU architecture, and delivers responses comparable to the current slew of GPT models in use. Groq’s architecture was developed specifically to expedite machine learning workloads, and the result is a big deal – Groq needs much less energy to answer the same requests and, more importantly, does so with seemingly no lag. This last property is down to the speed at which an ASIC performs its ‘application-specific’ task.
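Readers can try Groq’s hosted inference for themselves, as it is exposed through an OpenAI-style chat API. The snippet below is a minimal sketch using Groq’s official Python SDK; the model name is an assumption, since the hosted model catalogue changes over time.

```python
from groq import Groq  # pip install groq

# The client reads the GROQ_API_KEY environment variable by default.
client = Groq()

# Model name is illustrative only; check Groq's current model list.
completion = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Briefly, what is an ASIC?"}],
)
print(completion.choices[0].message.content)
```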

My own real-world testing bears this out. I put exactly the same technical question to both the ChatGPT 3.5 and 4.0 models and to Groq, and compared the response times; without a doubt, Groq has minimal lag compared to the GPT models. Its response is almost immediate, whereas the other models take a few seconds before beginning to display an answer. The responses are formatted differently, but their content compares favourably.
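This comparison is straightforward to reproduce. The hedged sketch below streams the same prompt to both services and measures the time to first token, which corresponds to the ‘lag’ described above rather than total generation time. It assumes the official openai and groq Python SDKs, API keys in environment variables, and illustrative model names.

```python
import os
import time

from groq import Groq        # pip install groq
from openai import OpenAI    # pip install openai

PROMPT = "Explain how a buck converter regulates its output voltage."

def time_to_first_token(create_fn, model: str) -> float:
    """Stream a chat completion and return seconds until the first token arrives."""
    start = time.perf_counter()
    stream = create_fn(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
        stream=True,
    )
    for chunk in stream:
        # Early chunks may carry no content (e.g. role-only deltas), so guard.
        if chunk.choices and chunk.choices[0].delta.content:
            return time.perf_counter() - start
    return float("nan")  # no tokens received

openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
groq_client = Groq(api_key=os.environ["GROQ_API_KEY"])

# Model names are illustrative and change over time.
print(f"OpenAI: {time_to_first_token(openai_client.chat.completions.create, 'gpt-3.5-turbo'):.2f} s")
print(f"Groq:   {time_to_first_token(groq_client.chat.completions.create, 'llama-3.1-8b-instant'):.2f} s")
```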

The introduction of Groq’s ASIC-based approach to AI inferencing marks a significant shift in the LLM landscape. By prioritising speed and efficiency, Groq is challenging the current dominance of GPU-driven AI, offering near-instantaneous responses while consuming less power. As AI applications continue to expand, this technology could redefine the way we interact with AI systems, setting a new benchmark for responsiveness.

Whether this signals a broader industry shift remains to be seen, but one thing is clear – Groq has introduced a compelling alternative that demands attention.



Further reading:

Development kit for AI and edge applications
TRX Electronics AI & ML
Mouser Electronics is now shipping the new Digi ConnectCore MP255 development kit, which boasts a versatile, secure, and cost-effective wireless system-on-module (SOM), designed for maximum power efficiency to support battery-powered and industrial AI applications.

From the editor's desk: A challenging manufacturing landscape
Technews Publishing News
Electronic manufacturing in South Africa faces many challenges that limit its ability to compete effectively in the global market and impede its development.

New platforms that deliver advanced edge AI capabilities
AI & ML
The SOM-5000, VAB-5000, and ARTiGO A5000 from VIA Technologies are powered by MediaTek Genio and designed for industrial, commercial, and consumer applications.

Ryzen-based computer on module
Altron Arrow AI & ML
SolidRun announced the launch of its new Ryzen V3000 CX7 COM module, configurable with the eight-core/16-thread Ryzen Embedded V3C48 processor.

What is an NPU?
AI & ML
A neural processing unit is a specialised hardware accelerator designed to efficiently process tasks related to artificial intelligence, in particular deep learning models.

Low SWaP SoM for AI applications
RFiber Solutions AI & ML
Matchstiq’s G20 and G40 are low SWaP-C SDRs tailored for AI and ML applications by combining an RF module, SDR, FPGA, CPU, and GPU into a single transceiver platform.

Powering the intelligent edge
EBV Electrolink AI & ML
STMicroelectronics released new devices from the second generation of its industrial MPUs, the STM32MP2 series, to drive future progress in smart factories, smart healthcare, smart buildings, and smart infrastructure.

From the editor's desk: Trekkie on my mind
Technews Publishing Editor's Choice
This year’s exciting announcement was in the non-terrestrial network sector with many NTN chips being released, promising communications from anywhere on Earth.

A perfect match for next-gen computing
Vepac Electronics AI & ML
Teguar’s collaboration with Hailo marks a significant step forward in their mission to provide powerful and reliable computing solutions for a wide range of industries.

High-performance low SWaP SDR
RFiber Solutions AI & ML
The Matchstiq X40 from Epiq is a high-performance low SWaP SDR optimised for AI and ML and the RF edge.
