AudioCraft Features
MusicGen
Generates diverse, long music samples from user-provided text inputs, catering to users' specific creative demands.
AudioGen
A text-to-sound model that produces environmental sounds from text descriptions, useful for representing written information as audio.
EnCodec
Maps raw audio signals to discrete audio tokens, contributing to effective audio compression and subsequent analysis.
Autoregressive Language Model
Models sequences of audio tokens efficiently, capturing the long-term dependencies that high-quality audio generation relies on.
Token Interleaving Pattern
Arranges the parallel streams of audio tokens so that a single model can process them efficiently while still capturing long-term dependencies in the audio.
Text-to-Sound Generation
Converts textual inputs into corresponding sound output, enhancing accessibility and inclusivity.
Text-to-Music Generation
Converts textual inputs into corresponding music output, allowing users to create customized musical tracks.
Conditioning Models
Steer the generation process using pre-trained encoders, enabling applications such as text-to-audio that follow the user's input.
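
The EnCodec and Token Interleaving Pattern entries above can be illustrated with a small, self-contained sketch. It is a toy under stated assumptions, not AudioCraft's implementation: the scalar codebooks, the `PAD` value, and the function names `residual_quantize`, `delay_interleave`, and `undo_delay` are all hypothetical. The "delay" arrangement shown is only one possible interleaving pattern, chosen here for illustration. The idea is that each audio frame becomes several discrete tokens (one per codebook), and the resulting parallel streams are offset so that a single sequence model can read them step by step.

```python
from typing import List, Sequence

PAD = -1  # hypothetical placeholder meaning "no token at this step"


def residual_quantize(frame: float, codebooks: Sequence[Sequence[float]]) -> List[int]:
    """Turn one scalar frame into one token per codebook (toy residual quantizer).

    Each codebook quantizes what the previous codebooks left unexplained.
    """
    tokens = []
    residual = frame
    for codebook in codebooks:
        # Pick the index of the codebook entry closest to the current residual.
        index = min(range(len(codebook)), key=lambda i: abs(residual - codebook[i]))
        tokens.append(index)
        residual -= codebook[index]
    return tokens


def delay_interleave(streams: List[List[int]]) -> List[List[int]]:
    """Shift stream k to the right by k steps, padding the gaps with PAD."""
    num_streams = len(streams)
    return [
        [PAD] * k + list(stream) + [PAD] * (num_streams - 1 - k)
        for k, stream in enumerate(streams)
    ]


def undo_delay(delayed: List[List[int]]) -> List[List[int]]:
    """Reverse delay_interleave and recover the aligned token streams."""
    num_streams = len(delayed)
    length = len(delayed[0]) - (num_streams - 1)
    return [row[k:k + length] for k, row in enumerate(delayed)]


# Illustrative data only: three scalar "frames" and two tiny codebooks.
frames = [0.9, -0.4, 0.1]
codebooks = [[-1.0, 0.0, 1.0], [-0.25, 0.0, 0.25]]

per_frame = [residual_quantize(f, codebooks) for f in frames]
# Regroup so that streams[k] holds the tokens of codebook k over time.
streams = [[tokens[k] for tokens in per_frame] for k in range(len(codebooks))]
delayed = delay_interleave(streams)

print(streams)  # [[2, 1, 1], [1, 0, 1]]
print(delayed)  # [[2, 1, 1, -1], [-1, 1, 0, 1]]
```

Reading `delayed` column by column, the model sees the first codebook's token for a frame one step before the second codebook's token for that same frame, so later codebooks can depend on earlier ones without flattening everything into one long sequence.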