Abstract: Multimodal Foundation Models (MFMs), such as ChatGPT and Gemini, have emerged as powerful tools owing to their exceptional natural language processing capabilities and their emerging potential in ...