A Multimodal LLM-based Assistant for User-Centric Interactive Machine Learning

Abstract

This paper proposes a system based on a multimodal large language model (MLLM) that assists non-expert users with no prior experience in machine learning (ML) development. The MLLM assistant in our system interactively helps users compile their requirements and create appropriate training data while building an ML model. Prior reports indicate that users often struggle to define training data that comprehensively covers all samples or that aligns with their needs. To prevent such failures, the MLLM assistant monitors the user's interaction process and, through chat, translates the user's vague needs into concrete ML formulations, ultimately facilitating the creation of appropriate training data.

Publication
SIGGRAPH Asia 2024 Posters