Generating metadata for AVID using Speech-to-Text & Video Intelligence