AMD reports that its previous-generation Ryzen 7040 "Phoenix" and current Ryzen 8040 series mobile processors outperform Intel's Core Ultra "Meteor Lake" CPUs by up to 79% in various large language model (LLM) workloads. The CPU maker unveiled a raft of benchmarks pitting its Ryzen 7 7840U against Intel's Core Ultra 7 155H; both chips sport hardware-based Neural Processing Units (NPUs).
AMD put together several slides featuring performance results in Mistral 7B, Llama v2, and Mistral Instruct 7B on the two CPUs. In Llama v2 Chat at a Q4 bit size, the Ryzen chip delivered 14% more tokens per second than the Core Ultra 7 155H; at the same bit size in Mistral Instruct, it delivered 17% more tokens per second. Measuring time to first token for the sample prompt in the same LLMs, AMD's chip was 79% faster than the Core Ultra 7 in Llama v2 and 41% faster in Mistral Instruct.
AMD showed another chart of Llama 2 7B Chat covering a range of bit sizes, block sizes, and quality levels. On average, the Ryzen 7 7840U was 55% quicker than its Intel counterpart, and up to 70% faster in the Q8 results. Although Q8 posted the best numbers, AMD recommends 4-bit K M quantization for real-world LLM use and 5-bit K M for tasks that demand higher accuracy, such as coding.
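AMD's "4-bit K M" and "5-bit K M" recommendations correspond to the Q4_K_M and Q5_K_M quantization formats used by llama.cpp-style runtimes. As a rough sketch only (the model file, thread count, and prompt below are placeholders, not AMD's test setup), this is how you could load a Q4_K_M build of Llama 2 7B Chat with the llama-cpp-python bindings and measure tokens per second on a laptop CPU:

```python
import time

from llama_cpp import Llama  # pip install llama-cpp-python

# Hypothetical GGUF file name; swap in a Q5_K_M file for accuracy-sensitive work.
MODEL_PATH = "llama-2-7b-chat.Q4_K_M.gguf"

# Load the quantized model; n_threads should roughly match physical CPU cores.
llm = Llama(model_path=MODEL_PATH, n_ctx=2048, n_threads=8, verbose=False)

prompt = "Explain in one sentence what an NPU does."

start = time.perf_counter()
result = llm(prompt, max_tokens=128)
elapsed = time.perf_counter() - start

text = result["choices"][0]["text"]
generated_tokens = result["usage"]["completion_tokens"]

print(text.strip())
print(f"{generated_tokens} tokens in {elapsed:.2f}s "
      f"({generated_tokens / elapsed:.1f} tokens/s)")
```

Time to first token, the other metric in AMD's slides, would be measured the same way, except with generation streamed and the clock stopped at the first returned token.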
We are not surprised that AMD is currently winning the AI performance war with Intel. Even though its Ryzen 7040 series silicon has the same level of performance (in TOPS) as Meteor Lake, we discovered late last year that AMD often outperforms Meteor Lake in AI-based workloads. This appears to be a matter of LLM software optimization rather than a hardware or driver issue. We noticed that AMD notably wins in AI workloads that don't take advantage of Intel's OpenVINO framework, which is optimized solely for Intel hardware. OpenVINO appears to be vital to significantly boosting Intel's AI performance; Intel's Arc A770, for instance, gains a tremendous 54% performance improvement purely from OpenVINO optimizations.
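To illustrate why OpenVINO matters so much here (this is a generic sketch with a placeholder model file, not Intel's benchmark setup), an application typically has to convert its model to OpenVINO's format and compile it for a specific Intel device before it sees those gains:

```python
import openvino as ov  # pip install openvino

core = ov.Core()
# Lists the Intel devices OpenVINO can target on this machine,
# e.g. ['CPU', 'GPU', 'NPU'] on a Meteor Lake laptop.
print("Available devices:", core.available_devices)

# Hypothetical model converted to OpenVINO's IR format (.xml + .bin pair).
model = core.read_model("model.xml")

# Compiling for "GPU" (e.g., an Arc A770) or "NPU" is the step where the
# Intel-specific optimizations the article describes get applied.
compiled = core.compile_model(model, device_name="NPU")

# `compiled` can now be called with input tensors to run inference.
```

Workloads that skip this step fall back to generic code paths, which is exactly the kind of scenario where AMD pulls ahead in these slides.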
Don't expect this performance gap to last long; we are only at the beginning of NPU development, after all. If more apps don't embrace OpenVINO, we expect Intel to switch gears and pursue an optimization route that more developers will adopt. Intel is also getting ready to unleash its next-generation Lunar Lake mobile CPU architecture later this year, which will reportedly feature 3x the AI performance of Meteor Lake (on top of huge IPC improvements for the CPU cores).
For now, AMD's slides demonstrate that it has the edge in NPU performance, especially with its Ryzen 8040 series CPUs, which pack an even more capable NPU than the Ryzen 7 7840U. But by the end of this year, the tables could turn, depending on how successful Intel is with Lunar Lake and its AI optimization plans.