Intel's LLM-Scaler-vLLM 0.14.0-b8.1 adds support for Qwen3.5-27B, Qwen3.5-35B-A3B, and Qwen3.5-122B-A10B models on Arc Graphics hardware.
Intel has released an updated version of its LLM-Scaler-vLLM software, expanding support for Qwen models on Arc Graphics hardware. The new release, version 0.14.0-b8.1, builds on Intel's ongoing work to optimize large language model deployment on its GPUs through the Project Battlematrix driver enhancements.
Expanded Model Support
The latest update adds support for several Qwen3.5 models, including:
- Qwen3.5-27B
- Qwen3.5-35B-A3B
- Qwen3.5-122B-A10B (available in both FP8 and INT4 precision)
Additionally, Qwen3-ASR-1.7B has been added to the supported model list. Together, these additions broaden the range of AI workloads that can run efficiently on Intel Arc hardware.
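Inside the container, the standard vLLM Python API should apply for loading one of these models for offline inference. Here is a minimal sketch; note that the Hugging Face model ID ("Qwen/Qwen3.5-27B") is an illustrative assumption, not an identifier confirmed by the release notes:

```python
from vllm import LLM, SamplingParams

# Hypothetical model ID for illustration; substitute the actual
# Hugging Face repository name or a local checkpoint path.
llm = LLM(model="Qwen/Qwen3.5-27B", dtype="float16")

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain what vLLM's PagedAttention does."], params)
for out in outputs:
    print(out.outputs[0].text)
```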
Technical Foundation
LLM-Scaler-vLLM leverages the vLLM framework, which has become a popular choice for deploying large language models thanks to its optimized inference engine. The software ships as a Docker container image, making it accessible to users who want to experiment with LLMs without complex configuration.
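Once the container is running, vLLM exposes an OpenAI-compatible HTTP API, so any standard client library can talk to it. A minimal sketch, assuming the server listens on vLLM's default port 8000 and serves the same hypothetical model ID used above:

```python
from openai import OpenAI

# Endpoint and port are assumptions: vLLM's server defaults to port 8000
# and does not require a real API key for local deployments.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="Qwen/Qwen3.5-27B",  # hypothetical model ID for illustration
    messages=[{"role": "user", "content": "Summarize PagedAttention in one sentence."}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```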
This release continues Intel's strategy of building out its Arc Graphics ecosystem for AI workloads. The improvements build on Project Battlematrix, Intel's driver enhancement initiative that has been running for the past year with a focus on optimizing graphics driver performance for machine learning tasks.
Availability
The updated LLM-Scaler-vLLM 0.14.0-b8.1 is available through Intel's GitHub repository, where users can find download links and detailed setup instructions. The release underscores Intel's commitment to providing competitive AI acceleration options for its hardware platform, particularly as demand for local LLM deployment continues to grow.
For developers and researchers working with Qwen models, this update provides a streamlined path to deploying these models on Intel Arc hardware, potentially offering an alternative to NVIDIA's CUDA ecosystem for certain workloads.
