Abstract: Hybrid language models (HLMs) are inference-time architectures that combine the low-latency efficiency of small language models (SLMs) on clients (edge devices) with the high accuracy of ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results