Fibocom Launches 8-Core USB AI Dongle for On-Device LLMs on PCs

AI-generated Image for Illustration Only (Credit: Jacky Lee)

Fibocom Wireless, a Shenzhen-headquartered maker of wireless communication modules, has unveiled a compact AI dongle designed to run large language models (LLMs) and other AI workloads directly on devices such as laptops and network-attached storage (NAS) systems.

Announced on 20 November 2025, the USB-powered “AI Dongle” is pitched as a plug-and-play way to add local AI acceleration without relying on cloud infrastructure, aiming to reduce latency and improve data privacy for both consumers and small businesses.

Qualcomm QCS6490 at the Core

The dongle is built around Qualcomm’s QCS6490 system-on-chip, an octa-core processor with an integrated Hexagon AI engine and Adreno GPU, positioned for mid-range edge AI applications. Fibocom says the device can handle tasks ranging from question-answering assistants and text-to-image search to meeting transcription and summarisation, with real-time on-device LLM inference as a key use case.

Physically, the unit connects via a standard USB port, drawing power directly from the host and requiring no separate power brick or dedicated drivers on mainstream operating systems. According to Fibocom, it works with typical PCs and NAS devices to add a dedicated AI compute path alongside general compute and storage.

On the software side, the dongle is tightly coupled to the Fibocom AI Stack, a software platform the company introduced at CES 2025. The stack combines model management, SDKs and tooling to deploy audio, vision and language models on Fibocom’s AI modules, and can access popular models through OpenAI APIs while also supporting on-device open-source models and agent frameworks such as Tencent’s Youtu-Agent.
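For developers, an OpenAI-style access path typically means building standard chat-completion requests and pointing them at a local endpoint instead of a cloud one. The sketch below illustrates that pattern; the endpoint URL, port and model name are assumptions for illustration, not documented values from Fibocom's stack:

```python
import json

# Hypothetical local endpoint exposed by the dongle's runtime; the real
# host/port and model identifier are assumptions, not published values.
LOCAL_ENDPOINT = "http://localhost:8000/v1/chat/completions"
MODEL_NAME = "local-llm"

def build_chat_request(prompt: str, max_tokens: int = 256) -> str:
    """Build an OpenAI-style chat-completion payload as a JSON string."""
    payload = {
        "model": MODEL_NAME,
        "messages": [
            {"role": "system", "content": "You are a local, on-device assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
        "stream": False,
    }
    return json.dumps(payload)

# The payload could then be POSTed to LOCAL_ENDPOINT with any HTTP
# client; because the endpoint is local, no prompt data leaves the network.
request_body = build_chat_request("Summarise today's meeting notes.")
```

The appeal of the OpenAI-compatible convention is that existing client code can switch between a cloud model and the on-device one by changing only the base URL.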

Willson Liu, general manager of Fibocom’s AI business unit, said in the launch announcement that the product is intended to make it easier for developers and device makers to add “portable, user-friendly” AI features to existing hardware, while keeping data local.

From Connectivity Modules to Edge AI

Founded in 1999, Fibocom built its business supplying cellular and wireless modules for laptops, industrial devices and IoT equipment, and listed on the Shenzhen Stock Exchange in 2017 (stock code 300638.SZ). The company also trades in Hong Kong under the code 0638.HK.

Over the past several years the company has repositioned itself as an “AI + connectivity” vendor. At Computex 2024 and other shows, Fibocom showcased on-device AI solutions based on Qualcomm’s QCS8550 and QCM6490 platforms for robots, smart retail, industrial automation and in-vehicle systems, capable of running multi-billion-parameter open-source models locally.

The new USB dongle extends that strategy from embedded and industrial systems toward more general-purpose PCs and NAS devices, offering a way to experiment with on-device LLMs without redesigning the host hardware.

How It Compares With Other Edge AI Add-Ons

Fibocom’s dongle enters a growing niche of small form-factor accelerators for local AI inference:

  • Google Coral USB Accelerator – A USB stick that adds Google’s Edge TPU co-processor (around 4 TOPS INT8) to an existing system, primarily optimised for vision and classification workloads using TensorFlow Lite models. It has seen wide use in cameras, sensors and hobbyist projects, but is not specifically packaged for LLM workloads.

  • Intel Neural Compute Stick 2 (NCS2) – Based on Intel’s Movidius Myriad X VPU and launched in 2018 as a USB accelerator for prototyping deep-learning models at the edge. Intel has since discontinued the product, and its architecture and performance are now considered entry-level compared with more recent edge AI systems.

  • Silex EP-200Q module – A recently introduced system-on-module built on the same Qualcomm QCS6490 silicon, aimed at OEMs designing smart cameras, robots and medical devices. It offers similar 12-TOPS-class AI performance but is delivered as an embedded module rather than a general-purpose USB dongle.

At the higher end, Nvidia’s Jetson Orin Nano platforms deliver up to around 40 TOPS of INT8 performance (and, in the newer “Super” configuration, up to 67 TOPS) at higher power budgets, targeting robotics, industrial systems and more complex generative AI workloads.

Cisco, meanwhile, has moved in a different direction with its Unified Edge platform – a rack-scale, on-premises edge system with up to 120 TB of storage and integrated networking to run AI workloads locally at sites such as retail stores and factories. It is priced and specified for enterprise deployments rather than individual devices.

In that landscape, Fibocom’s dongle sits between hobbyist-style USB accelerators and full embedded systems: it uses similar silicon to industrial modules but comes in a consumer-friendly USB form factor, with Fibocom’s AI Stack positioned as the differentiator for deployment and model management.

Edge AI on the Rise

Although hard shipment numbers for “AI dongles” are not broken out in public datasets, multiple analyst houses agree that spending on edge computing and edge AI infrastructure is rising quickly. IDC estimates that global spending on edge computing solutions will reach around US$261 billion in 2025, with continued double-digit growth through 2028 as organisations push more analytics and AI closer to where data is generated.

Separate forecasts from Global Insight Services suggest that the installed base of edge AI devices could grow from hundreds of millions today to around 1.2 billion by 2033, spanning industrial, automotive and consumer electronics segments.

For Fibocom, the AI Dongle is a way to tap into that trend among smaller customers and developers who may not yet be ready to roll out dedicated AI servers or redesign device mainboards.

What It Means for Users and Developers

For end users, the main promise of Fibocom’s dongle is access to AI-assisted features, such as meeting transcription, summarisation, translation and content search, without sending sensitive data to remote servers. Because inference happens on the device or on a local NAS, data remains within the local network, which can help with privacy and compliance requirements.

For developers and integrators, Fibocom currently offers demonstration applications for knowledge-base question answering and audio/video transcription, with plans to introduce more local knowledge-base and agent-style solutions. Support for Windows and Linux hosts, along with SDKs and open-source frameworks like Tencent Youtu-Agent, is intended to reduce the effort needed to prototype and deploy applications that mix local compute with cloud-hosted models when needed.

There are still practical constraints. Like other mid-range edge AI systems, the QCS6490-class hardware is best matched to small and medium language models or heavily optimised pipelines, rather than very large frontier-scale models. Developers are likely to rely on quantisation, pruning and hybrid approaches that offload heavier workloads to the cloud.
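Quantisation is the most common of those optimisations: weights are mapped from 32-bit floats to 8-bit integers, cutting memory and bandwidth roughly fourfold at a small accuracy cost. The following minimal sketch of symmetric per-tensor INT8 quantisation is illustrative only; the values and helper names are not taken from Fibocom's toolchain:

```python
# Minimal sketch of symmetric INT8 weight quantisation, the kind of
# optimisation mid-range edge hardware typically relies on.

def quantize_int8(weights):
    """Map float weights to INT8 values plus a per-tensor scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from INT8 values."""
    return [v * scale for v in q]

weights = [0.82, -1.27, 0.05, 0.63, -0.91]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)

# Round-trip error is bounded by half a quantisation step (scale / 2),
# which is why small models often survive INT8 with little accuracy loss.
max_err = max(abs(a - b) for a, b in zip(weights, approx))
```

Real toolchains add refinements such as per-channel scales and calibration data, but the storage saving (one byte per weight instead of four) is what makes multi-billion-parameter models feasible on 12-TOPS-class hardware.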

Outlook for Portable Edge AI

The AI Dongle fits into a broader shift toward distributing AI inference across cloud, data centre and edge layers. High-end systems such as Jetson Orin and Cisco’s Unified Edge target complex, multi-application deployments, while lower-power devices, from smart cameras to USB accelerators, aim to make single-purpose or small-scale AI features ubiquitous.

Fibocom has already signalled plans to expand its “AI-first” portfolio with products such as its AI Buddy mobile computing platform and AI-enabled hotspot devices that blend connectivity with on-device intelligence. The dongle extends that narrative to any PC or NAS with a free USB port, offering a relatively simple on-ramp for organisations and individuals experimenting with on-device LLMs and other generative AI tools.

TheDayAfterAI News

We are a leading AI-focused digital news platform, combining AI-generated reporting with human editorial oversight. By aggregating and synthesizing the latest developments in AI — spanning innovation, technology, ethics, policy and business — we deliver timely, accurate and thought-provoking content.
