Fetch a video from a URL and transcribe it to text.
This runs on a very low-performance server, so processing may take roughly 1.5x the length of the original video.
Try it yourself locally if you have a MacBook with Apple silicon or a high-performance GPU:
https://gist.github.com/yanyaoer/5cc7b0dd6729f306ad3cb740d501cabd#file-0-video2text-py
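
For orientation, here is a minimal sketch of the fetch-then-transcribe flow described above. It is not the linked script. It assumes the `yt-dlp` command-line tool, `ffmpeg`, and the `openai-whisper` Python package are installed; the function names, the `downloads` directory, and the `base` model size are illustrative choices, not taken from the gist.

```python
import subprocess


def build_download_command(url: str, out_dir: str = "downloads") -> list[str]:
    """Build a yt-dlp command that saves only the audio track as WAV.

    The fixed output name "audio" is an assumption made to keep the sketch simple.
    """
    return [
        "yt-dlp",
        "--extract-audio",            # keep audio only (needs ffmpeg)
        "--audio-format", "wav",      # convert the audio to WAV
        "--output", f"{out_dir}/audio.%(ext)s",
        url,
    ]


def audio_path(out_dir: str = "downloads") -> str:
    """Path where the extracted WAV is expected after the download step."""
    return f"{out_dir}/audio.wav"


def fetch_audio(url: str, out_dir: str = "downloads") -> str:
    """Run yt-dlp and return the path of the extracted audio file."""
    subprocess.run(build_download_command(url, out_dir), check=True)
    return audio_path(out_dir)


def transcribe(path: str, model_name: str = "base") -> str:
    """Transcribe an audio file with openai-whisper and return plain text."""
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["text"].strip()


def video_to_text(url: str) -> str:
    """Fetch the audio for a video URL, then transcribe it."""
    return transcribe(fetch_audio(url))
```

Calling `video_to_text(url)` with a video URL would download the audio and return the transcript as a string. Larger Whisper models are generally more accurate but slower, which is where Apple silicon or a strong GPU helps.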