How to download YouTube Video captions in SRT using Pytube in Python?
In this section, we will learn how to download YouTube captions in SRT format using Pytube in Python.
Prerequisite
Pytube is a lightweight, dependency-free Python library (and command-line utility) for downloading YouTube videos.
Required Module
In this tutorial, we use the Pytube module. You can install it with pip:
pip install pytube
Approach
- Import the pytube module.
- Create and initialize a YouTube() object, e.g. video_src = YouTube('https://youtu.be/mBJMkFNRVek').
- To get the captions for a particular language, look up its language code in the video's captions and store the result in a variable such as en_caption_data, e.g. en_caption_data = video_src.captions['a.en'].
- XML is the default output format of the captions, so the XML must be converted to SRT using Pytube's xml_caption_to_srt() function.
- To get the captions in SRT format, call xml_caption_to_srt() on the caption object and pass it the XML captions, e.g. srt_format = en_caption_data.xml_caption_to_srt(en_caption_data.xml_captions).
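To make the XML-to-SRT step more concrete, here is a minimal sketch of what such a conversion involves. It is not Pytube's own implementation: the function names seconds_to_srt_time and simple_xml_to_srt are hypothetical, and the simplified XML layout (text elements with start and dur attributes given in seconds) is an assumption for illustration. The XML that YouTube actually returns may be structured differently.

```python
import html
import xml.etree.ElementTree as ET


def seconds_to_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def simple_xml_to_srt(xml_text: str) -> str:
    """Convert a simplified caption XML string to SRT text.

    Assumes (hypothetically) that every caption is a <text> element
    with 'start' and 'dur' attributes expressed in seconds.
    """
    root = ET.fromstring(xml_text)
    blocks = []
    for index, node in enumerate(root.iter("text"), start=1):
        start = float(node.attrib["start"])
        duration = float(node.attrib.get("dur", 0.0))
        # Decode any remaining HTML entities and flatten line breaks.
        caption = html.unescape(node.text or "").replace("\n", " ").strip()
        blocks.append(
            f"{index}\n"
            f"{seconds_to_srt_time(start)} --> {seconds_to_srt_time(start + duration)}\n"
            f"{caption}\n"
        )
    # SRT entries are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical sample input, not taken from a real video.
SAMPLE_XML = (
    "<transcript>"
    '<text start="0.0" dur="1.5">Hello there</text>'
    '<text start="1.5" dur="2.25">Welcome back</text>'
    "</transcript>"
)

if __name__ == "__main__":
    print(simple_xml_to_srt(SAMPLE_XML))
```

Each SRT entry consists of a running index, a line with the start and end timestamps, and the caption text, which is why the sketch numbers the entries and adds the duration to the start time.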
The Pytube program implementing these steps is given below.
Program: Get Caption Using Pytube In Python
# Import Pytube module to use API
from pytube import YouTube
video_url = 'https://youtu.be/mBJMkFNRVek'
# create a YouTube() object and pass it the URL of the YouTube video
video_src = YouTube(video_url)
# print the list of all available captions to see their language codes
print("All Available Captions : \n",video_src.captions)
# to get the captions for a particular language, pass its language code, e.g. captions['a.en']
en_caption_data = video_src.captions['a.en']
print("\nCaption Data in SRT Format: \n")
# call .xml_caption_to_srt() and pass the XML captions as its argument
srt_format = en_caption_data.xml_caption_to_srt(en_caption_data.xml_captions)
# print caption in SRT format
print(srt_format)
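Printing the captions is enough to inspect them, but "downloading" usually means keeping them in a file. The following is a minimal sketch of that last step, assuming the SRT text is already held in a string such as srt_format from the program above. The helper name save_srt and the file name used in the comment are hypothetical and not part of Pytube.

```python
from pathlib import Path


def save_srt(srt_text: str, destination: str) -> Path:
    """Write SRT text to a file and return the path that was written.

    Hypothetical helper: appends '.srt' when the destination lacks it.
    """
    path = Path(destination)
    if path.suffix.lower() != ".srt":
        path = path.with_name(path.name + ".srt")
    # UTF-8 keeps non-ASCII caption text intact.
    path.write_text(srt_text, encoding="utf-8")
    return path


# Usage, assuming srt_format was produced by the program above:
# saved_path = save_srt(srt_format, "captions")
# print("Saved to", saved_path)
```

Writing with an explicit UTF-8 encoding is a deliberate choice here, since captions often contain accented or non-Latin characters.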
Output
