Fetch a video from a URL and transcribe it to text.


This runs on a very low-performance server, so processing may take roughly 1.5x the length of the original video.
Try it yourself locally if you have an Apple silicon MacBook or a high-performance GPU:
https://gist.github.com/yanyaoer/5cc7b0dd6729f306ad3cb740d501cabd#file-0-video2text-py
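For a rough idea of the fetch-then-transcribe flow, here is a minimal sketch. It is not the linked script. It assumes the `yt-dlp` and `whisper` (openai-whisper) command-line tools and `ffmpeg` are installed and on your PATH. The function names, the `audio.mp3` file name and the `small` model choice are illustrative assumptions.

```python
import subprocess
import sys
from pathlib import Path


def build_download_cmd(url: str, audio_path: Path) -> list[str]:
    """Build a yt-dlp command that keeps only the audio track, converted to MP3."""
    # yt-dlp substitutes %(ext)s itself once the audio has been converted.
    template = str(audio_path.with_suffix("")) + ".%(ext)s"
    return ["yt-dlp", "-x", "--audio-format", "mp3", "-o", template, url]


def build_transcribe_cmd(audio_path: Path, out_dir: Path, model: str = "small") -> list[str]:
    """Build a whisper CLI command that writes a plain-text transcript to out_dir."""
    return [
        "whisper", str(audio_path),
        "--model", model,
        "--output_format", "txt",
        "--output_dir", str(out_dir),
    ]


def video_to_text(url: str, workdir: Path = Path("."), model: str = "small") -> str:
    """Download the audio for `url`, transcribe it, and return the transcript text."""
    audio_path = workdir / "audio.mp3"  # hypothetical file name
    subprocess.run(build_download_cmd(url, audio_path), check=True)
    subprocess.run(build_transcribe_cmd(audio_path, workdir, model), check=True)
    # The whisper CLI names its output after the input file: audio.mp3 -> audio.txt
    return (workdir / "audio.txt").read_text(encoding="utf-8")


if __name__ == "__main__":
    print(video_to_text(sys.argv[1]))
```

Save it under any name (for example `sketch.py`) and run `python sketch.py <video-url>`. Smaller models such as `tiny` or `base` trade accuracy for speed, which may help on slower machines.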