multimodal
Work involving models that combine text with images, audio, video, or other modalities in training, inference, or evaluation.