Abstract: The introduction of the transformer architecture has propelled the growth of natural language processing algorithms and models, spanning language models, large language models (LLMs), and pretraining ...