A Visual Question Answering application using the BLIP model.
An Image Captioning application using the Salesforce BLIP model.
An English Audio Transcription application using the OpenAI Whisper model.