All Tags
Browse through all available tags to find articles on topics that interest you.
Showing 2 results for this tag.
SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing
SLAM-LLM is an open-source deep learning framework designed to train customized Multimodal Large Language Models (MLLMs), with a focus on speech, language, audio, and music processing. It provides a modular configuration, detailed training and inference recipes, and high-performance checkpoints for mainstream tasks, aiming to accelerate research in audio-language models.
AudioFab: Building A General and Intelligent Audio Factory through Tool Learning
AudioFab is an open-source agent framework designed to create a unified and efficient audio-processing ecosystem by addressing the fragmentation and complex integration issues of existing audio AI tools. It offers a modular design and intelligent tool learning strategies to simplify complex audio tasks for both experts and non-experts.