
Downloading encrypted or unencrypted m3u8 videos with multiprocessing

Published: 2024/3/26


1. How the two differ

2. Crawler source code

3. Walking through the crawler

1. How the two differ

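m3u8 is a playlist file format used by HTTP Live Streaming (HLS). It mainly stores basic information about the whole video and the list of segments that make it up.
You have probably seen the contents of an m3u8 file before. Let's compare the two kinds directly, and then I'll show how to download and merge the segments with Python multiprocessing.
[Figure: an unencrypted m3u8 file]

[Figure: an encrypted m3u8 file]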

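Sharp-eyed readers will already have spotted the difference between the two: it is the #EXT-X-KEY entry on line 5 of the encrypted file.
This entry carries what is needed to decrypt the video content: the encryption method and, usually, a URI from which the KEY itself is fetched.
We'll leave the decryption for later and first go through what each kind of line means:
#EXTM3U: declares that this is an m3u8 file
#EXT-X-VERSION: the protocol version number
#EXT-X-MEDIA-SEQUENCE: the sequence number of the first segment; every media URI in the playlist has a unique sequence number, and each one is 1 greater than the one before it
#EXT-X-KEY: records the encryption method, usually AES-128, together with the KEY information
#EXTINF: the duration of the video segment that follows
sA3LRa6g.ts: the name of a video segment; when fetching it, it has to be joined onto the playlist's base URL to get the correct path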
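To make the comparison concrete, here is a minimal sketch that reads a playlist, reports whether it has an #EXT-X-KEY entry, and collects the segment names. The playlist text, the file names and the helper names (`inspect_playlist`, `parse_attributes`) are made up for illustration; they are not taken from the original screenshots.

```python
import re

# Hypothetical playlist text; tag values and file names are invented for illustration.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-KEY:METHOD=AES-128,URI="key.key"
#EXTINF:10.0,
seg0.ts
#EXTINF:8.5,
seg1.ts
#EXT-X-ENDLIST
"""


def parse_attributes(attr_text):
    """Split 'A=1,B="x,y"' into a dict, keeping commas inside quoted values."""
    pairs = re.findall(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)', attr_text)
    return {name: value.strip('"') for name, value in pairs}


def inspect_playlist(text):
    """Return (key_attributes_or_None, segment_uris) for a media playlist."""
    key_attrs = None
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#EXT-X-KEY:"):
            key_attrs = parse_attributes(line[len("#EXT-X-KEY:"):])
        elif not line.startswith("#"):
            # Any non-blank line that is not a tag or comment is a segment URI.
            segments.append(line)
    return key_attrs, segments


key_attrs, segments = inspect_playlist(SAMPLE_PLAYLIST)
print("encrypted:", key_attrs is not None, key_attrs)
print("segments:", segments)
```

An unencrypted playlist simply has no #EXT-X-KEY line, so `key_attrs` comes back as `None`.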

2. Crawler source code

#!/usr/bin/env python
# encoding: utf-8
'''
#-------------------------------------------------------------------
#                   CONFIDENTIAL --- CUSTOM STUDIOS
#-------------------------------------------------------------------
#
#                   @Project Name : Multiprocess M3U8 video download helper
#
#                   @File Name    : main.py
#
#                   @Programmer   : Felix
#
#                   @Start Date   : 2020/7/30 14:42
#
#                   @Last Update  : 2020/7/30 14:42
#
#-------------------------------------------------------------------
'''
import requests, os, platform, time
from Crypto.Cipher import AES
import multiprocessing
from retrying import retry


class M3u8:
    '''
    This is a main Class, the file contains all documents.
    One document contains paragraphs that have several sentences
    It loads the original file and converts the original file to new content
    Then the new content will be saved by this class
    '''

    def __init__(self):
        '''
        Initial the custom file by self
        '''
        self.encrypt = False
        self.headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0"
        }

    def hello(self):
        '''
        This is a welcome speech
        :return: self
        '''
        print("*" * 50)
        print(' ' * 15 + 'm3u8 link download helper')
        print(' ' * 5 + 'Author: Felix Date: 2020-05-20 13:14')
        print(' ' * 10 + 'Works for unencrypted | encrypted links')
        print("*" * 50)
        return self

    def checkUrl(self, url):
        '''
        Determine if it is a available link of m3u8
        :return: bool
        '''
        if '.m3u8' not in url:
            return False
        elif not url.startswith('http'):
            return False
        else:
            return True

    def parse(self, url):
        '''
        Analyze a link of m3u8
        :param url: string, the link need to analyze
        :return: list
        '''
        container = list()
        response = self.request(url).text.split('\n')
        for ts in response:
            if '.ts' in ts:
                # strip() drops the '\r' left behind by CRLF line endings
                container.append(ts.strip())
            if '#EXT-X-KEY:' in ts:
                self.encrypt = True
        return container

    def getEncryptKey(self, url):
        '''
        Access to the secret key
        :param url: string, Access to the secret key by the url
        :return: string
        '''
        # Assumes the key is served as key.key next to the playlist
        encryptKey = self.request("{}/key.key".format(url)).content
        return encryptKey

    def aesDecode(self, data, key):
        '''
        Decode the data
        :param data: stream, the data need to decode
        :param key: secret key
        :return: decode the data
        '''
        # The key doubles as the IV here, and trailing NUL bytes are stripped
        crypt = AES.new(key, AES.MODE_CBC, key)
        plain_text = crypt.decrypt(data)
        return plain_text.rstrip(b'\0')

    def download(self, queue, sort, file, downPath, url):
        '''
        Download the debris of video
        :param queue: the queue
        :param sort: which number debris
        :param file: the link of debris
        :param downPath: the path to save debris
        :param url: the link of m3u8
        :return: None
        '''
        queue.put(file)

        baseUrl = '/'.join(url.split("/")[:-1])

        if self.encrypt:
            self.encryptKey = self.getEncryptKey(baseUrl)

        if not file.startswith("http"):
            file = baseUrl + '/' + file

        debrisName = "{}/{}.ts".format(downPath, sort)

        if not os.path.exists(debrisName):
            response = self.request(file)
            with open(debrisName, "wb") as f:
                if self.encrypt:
                    data = self.aesDecode(response.content, self.encryptKey)
                    f.write(data)
                    f.flush()
                else:
                    f.write(response.content)
                    f.flush()

    def progressBar(self, queue, count):
        '''
        Show progress bar
        :param queue: the queue
        :param count: the number count of debris
        :return: None
        '''
        print('---{} segments in total...'.format(count))
        offset = 0
        while True:
            offset += 1
            file = queue.get()
            rate = offset * 100 / count
            print("\r%s downloaded, progress %0.2f%%, %s/%s" % (file, rate, offset, count))
            if offset >= count:
                break

    @retry(stop_max_attempt_number=3)
    def request(self, url, params=None):
        '''
        Send a request
        :param url: the url of request
        :param params: the params of request
        :return: the result of request
        '''
        # params defaults to None because every caller passes only the url
        response = requests.get(url, params=params, headers=self.headers, timeout=10)
        assert response.status_code == 200
        return response

    def run(self):
        '''
        program entry, Input basic information
        '''
        downPath = str(input("Path for saving segments, default ./Download: ")) or "./Download"
        savePath = str(input("Path for saving the video, default ./Complete: ")) or "./Complete"
        # bool() of any non-empty string is True, so compare the text explicitly
        clearDebris = input("Delete segments afterwards, default True: ").strip().lower() not in ("false", "no", "n", "0")
        saveSuffix = str(input("Video format, default ts: ")) or "ts"

        while True:
            url = str(input("Please enter a valid m3u8 link: "))
            if self.checkUrl(url):
                break

        # create the folders that do not exist yet
        if not os.path.exists(downPath):
            os.mkdir(downPath)

        if not os.path.exists(savePath):
            os.mkdir(savePath)

        # start analyze a link of m3u8
        print('---Analyzing the link...')
        container = self.parse(url)
        print('---Link analyzed successfully...')

        # run processing to do something
        print('---Processes starting...')
        po = multiprocessing.Pool(30)
        queue = multiprocessing.Manager().Queue()
        size = 0
        for file in container:
            sort = str(size).zfill(5)
            po.apply_async(self.download, args=(queue, sort, file, downPath, url,))
            size += 1

        po.close()
        self.progressBar(queue, len(container))
        # download() reports to the queue before it fetches anything, so wait
        # for the workers to finish before merging
        po.join()
        print('---Processes finished...')

        # handler debris
        sys = platform.system()
        saveName = time.strftime("%Y%m%d_%H%M%S", time.localtime())

        print('---Merging files and cleaning up...')
        if sys == "Windows":
            os.system("copy /b {}/*.ts {}/{}.{}".format(downPath, savePath, saveName, saveSuffix))
            if clearDebris:
                os.system("rmdir /s/q {}".format(downPath))
        else:
            os.system("cat {}/*.ts>{}/{}.{}".format(downPath, savePath, saveName, saveSuffix))
            if clearDebris:
                os.system("rm -rf {}".format(downPath))
        print('---Merge and cleanup complete...')
        print('---Download task complete...')
        print('---See you next time...')


if __name__ == "__main__":
    M3u8().hello().run()

3. Walking through the crawler

Instantiating the m3u8 download class

if __name__ == "__main__":
    M3u8().hello().run()

The hello method

def hello(self):
    '''
    This is a welcome speech
    :return: self
    '''
    print("*" * 50)
    print(' ' * 15 + 'm3u8 link download helper')
    print(' ' * 5 + 'Author: Felix Date: 2020-05-20 13:14')
    print(' ' * 10 + 'Works for unencrypted | encrypted links')
    print("*" * 50)
    return self

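The hello method is just a welcome message that presents some basic information.
For method chaining to work, it must return self; beginners should take note of this.
The run method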

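def run(self):
    '''
    program entry, Input basic information
    '''
    downPath = str(input("Path for saving segments, default ./Download: ")) or "./Download"
    savePath = str(input("Path for saving the video, default ./Complete: ")) or "./Complete"
    # bool() of any non-empty string is True, so compare the text explicitly
    clearDebris = input("Delete segments afterwards, default True: ").strip().lower() not in ("false", "no", "n", "0")
    saveSuffix = str(input("Video format, default ts: ")) or "ts"

    while True:
        url = str(input("Please enter a valid m3u8 link: "))
        if self.checkUrl(url):
            break

    # create the folders that do not exist yet
    if not os.path.exists(downPath):
        os.mkdir(downPath)

    if not os.path.exists(savePath):
        os.mkdir(savePath)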

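It prompts for the directories where the segments and the finished video are saved, and asks whether the segments should be deleted once merging is done.
The output video format defaults to ts, because most video players can open ts files; if in doubt, you can enter mp4 instead. Note that this only changes the file extension: the merged data is still the concatenated ts stream.
For the link, it calls checkUrl, which simply checks whether the input is an acceptable m3u8 link.
It then creates any directories that don't exist yet.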
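One detail worth care with the delete-segments prompt: in Python, `bool()` of any non-empty string is `True`, so typing "False" does not turn the option off unless the text is parsed explicitly. A minimal sketch, where the helper name `parse_yes_no` and the accepted spellings are my own choices:

```python
def parse_yes_no(answer, default=True):
    """Turn a typed answer into a bool; an empty answer falls back to the default."""
    answer = answer.strip().lower()
    if not answer:
        return default
    return answer in ("y", "yes", "true", "1")


# bool() only tests for emptiness, so it cannot tell "False" from "True".
print(bool("False"))            # non-empty string
print(parse_yes_no("False"))    # parsed as a negative answer
print(parse_yes_no(""))         # empty input keeps the default
```

It would be used as `clearDebris = parse_yes_no(input(...))`.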

def checkUrl(self, url):
    '''
    Determine if it is a available link of m3u8
    :return: bool
    '''
    if '.m3u8' not in url:
        return False
    elif not url.startswith('http'):
        return False
    else:
        return True

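Here I do a simple check of whether the link is an m3u8 link:
First, the link has to contain .m3u8 (the code tests for the substring anywhere in the link, not only at the end)
Second, the link has to start with http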
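If you want a somewhat stricter check than a substring test, one option is to parse the link with the standard library's `urllib.parse` and look at the scheme and the path separately. This is only a sketch; the function name `looks_like_m3u8` and the example URLs are hypothetical.

```python
from urllib.parse import urlparse


def looks_like_m3u8(url):
    """True for http(s) links with a host whose path ends in .m3u8."""
    parts = urlparse(url)
    return (
        parts.scheme in ("http", "https")
        and bool(parts.netloc)
        and parts.path.lower().endswith(".m3u8")
    )


# A query string after the path does not get in the way.
print(looks_like_m3u8("https://example.com/video/index.m3u8?token=1"))
# '.m3u8' appearing only inside the query string is rejected.
print(looks_like_m3u8("https://example.com/index.mp4?next=.m3u8"))
```

Some sites serve playlists from paths that do not end in .m3u8, so a check like this can be too strict; treat it as a starting point.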
Parsing the input link

# start analyze a link of m3u8
print('---Analyzing the link...')
container = self.parse(url)
print('---Link analyzed successfully...')

def parse(self, url):
    '''
    Analyze a link of m3u8
    :param url: string, the link need to analyze
    :return: list
    '''
    container = list()
    response = self.request(url).text.split('\n')
    for ts in response:
        if '.ts' in ts:
            # strip() drops the '\r' left behind by CRLF line endings
            container.append(ts.strip())
        if '#EXT-X-KEY:' in ts:
            self.encrypt = True
    return container

It requests the link and works out whether the m3u8 is encrypted or not
It returns all the segment entries
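The segment entries returned here may be bare file names, root-relative paths, or full URLs. A small sketch of turning them into absolute URLs with the standard library's `urljoin`, which handles all three cases; the function name `resolve_segments` and the URLs are made up for illustration.

```python
from urllib.parse import urljoin


def resolve_segments(playlist_url, segment_uris):
    """Resolve each segment entry against the URL of the playlist it came from."""
    return [urljoin(playlist_url, uri) for uri in segment_uris]


playlist_url = "https://example.com/video/abc/index.m3u8"
entries = ["seg0.ts", "/other/seg1.ts", "https://cdn.example.net/seg2.ts"]
for resolved in resolve_segments(playlist_url, entries):
    print(resolved)
```

Joining the base URL and the entry with a plain '/' works for bare file names, but gives the wrong address for an entry that starts with '/'.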
Starting a process pool to speed up the download

# run processing to do something
print('---Processes starting...')
po = multiprocessing.Pool(30)
queue = multiprocessing.Manager().Queue()
size = 0
for file in container:
    sort = str(size).zfill(5)
    po.apply_async(self.download, args=(queue, sort, file, downPath, url,))
    size += 1

po.close()

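The zfill method pads a number with leading zeros. I want the downloaded files to be named 00000.ts, 00001.ts and so on, so that they stay in order and the final merge doesn't get scrambled.
queue is one way of sharing data between processes; here it is used to drive the progress display.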
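Why the padding matters: the merge step picks the files up in name order, and names sort as text, not as numbers. A quick illustration using twelve made-up segment names:

```python
# Twelve hypothetical segment names, first unpadded and then zero-padded.
names = [str(i) + ".ts" for i in range(12)]
padded = [str(i).zfill(5) + ".ts" for i in range(12)]

# Sorted as text, "10.ts" and "11.ts" land before "2.ts".
print(sorted(names))

# With a fixed width, text order and numeric order agree.
print(sorted(padded) == padded)
```

Five digits covers up to 99999 segments, which is plenty for an ordinary video.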
The download method

def download(self, queue, sort, file, downPath, url):
    '''
    Download the debris of video
    :param queue: the queue
    :param sort: which number debris
    :param file: the link of debris
    :param downPath: the path to save debris
    :param url: the link of m3u8
    :return: None
    '''
    queue.put(file)

    baseUrl = '/'.join(url.split("/")[:-1])

    if self.encrypt:
        self.encryptKey = self.getEncryptKey(baseUrl)

    if not file.startswith("http"):
        file = baseUrl + '/' + file

    debrisName = "{}/{}.ts".format(downPath, sort)

    if not os.path.exists(debrisName):
        response = self.request(file)
        with open(debrisName, "wb") as f:
            if self.encrypt:
                data = self.aesDecode(response.content, self.encryptKey)
                f.write(data)
                f.flush()
            else:
                f.write(response.content)
                f.flush()

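The file is put on the queue right at the start so that a segment which already exists on disk, and is therefore skipped, still gets counted; otherwise the count would come up short. A side effect is that progress is reported when a task starts, not when its download finishes.
If the m3u8 is encrypted, getEncryptKey fetches the KEY (it requests key.key under the playlist's base URL)
When writing the file, encrypted data is first decrypted with the aesDecode method; see the source for the details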
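A note on the decryption parameters. As far as I know, the HLS specification (RFC 8216) describes AES-128 segments as AES-CBC with PKCS7 padding, and says that when #EXT-X-KEY carries no IV attribute, the segment's media sequence number is used as the IV, written as a 16-byte big-endian value. The source above takes a simpler route (the key doubles as the IV, and trailing NUL bytes are stripped), which may be enough for some sites. The sketch below shows the two spec-style pieces using only the standard library; the function names are my own, and it does not perform the AES step itself.

```python
def iv_from_sequence(sequence_number):
    """Media sequence number as a 16-byte big-endian IV (left-padded with zeros)."""
    return sequence_number.to_bytes(16, "big")


def pkcs7_unpad(data, block_size=16):
    """Remove PKCS#7 padding from decrypted data, validating it first."""
    if not data or len(data) % block_size:
        raise ValueError("data length is not a multiple of the block size")
    pad_len = data[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError("invalid PKCS#7 padding")
    if data[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid PKCS#7 padding")
    return data[:-pad_len]


print(iv_from_sequence(7).hex())
print(pkcs7_unpad(b"hello world!" + b"\x04" * 4))
```

With the `Crypto` package the source already imports, these would slot in as `AES.new(key, AES.MODE_CBC, iv_from_sequence(n))` followed by `pkcs7_unpad(cipher.decrypt(data))`, where `n` is the segment's sequence number.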
Showing progress

def progressBar(self, queue, count):
    '''
    Show progress bar
    :param queue: the queue
    :param count: the number count of debris
    :return: None
    '''
    print('---{} segments in total...'.format(count))
    offset = 0
    while True:
        offset += 1
        file = queue.get()
        rate = offset * 100 / count
        print("\r%s downloaded, progress %0.2f%%, %s/%s" % (file, rate, offset, count))
        if offset >= count:
            break

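This simply compares the number of segments reported so far with the total number of segments
As soon as that number is greater than or equal to the total, the loop exits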
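The same counting logic can be written as a small generator, which makes it easy to try out without starting any processes. This is a sketch, not the class's method: `progress_updates` is a hypothetical name, and an ordinary `queue.Queue` stands in for the manager queue.

```python
import queue


def progress_updates(done_queue, total):
    """Yield (name, finished_count, percent) for each item taken off the queue."""
    for finished in range(1, total + 1):
        name = done_queue.get()  # blocks until a worker reports
        yield name, finished, finished * 100 / total


# Stand-in for the workers: four reports placed on the queue up front.
q = queue.Queue()
for name in ["a.ts", "b.ts", "c.ts", "d.ts"]:
    q.put(name)

for name, finished, percent in progress_updates(q, 4):
    print("%s reported, progress %0.2f%%, %s/%s" % (name, percent, finished, 4))
```

Because the loop runs exactly `total` times, it also returns immediately when the segment list is empty instead of waiting on the queue.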
Merging the files and clearing the segments
The merge and cleanup commands here cover both Windows and Linux
Whether the segments are deleted is controlled by the choice made at the start

# handler debris
sys = platform.system()
saveName = time.strftime("%Y%m%d_%H%M%S", time.localtime())

print('---Merging files and cleaning up...')
if sys == "Windows":
    os.system("copy /b {}/*.ts {}/{}.{}".format(downPath, savePath, saveName, saveSuffix))
    if clearDebris:
        os.system("rmdir /s/q {}".format(downPath))
else:
    os.system("cat {}/*.ts>{}/{}.{}".format(downPath, savePath, saveName, saveSuffix))
    if clearDebris:
        os.system("rm -rf {}".format(downPath))
print('---Merge and cleanup complete...')
print('---Download task complete...')
print('---See you next time...')
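If you would rather not depend on shell commands, the merge can also be done in Python itself, which behaves the same on every platform. A minimal sketch using `shutil` and `pathlib`; the function name `merge_segments` and its parameters are my own, not part of the source above.

```python
import shutil
from pathlib import Path


def merge_segments(segment_dir, output_file, remove_segments=False):
    """Concatenate every .ts file in segment_dir, in name order, into output_file."""
    segment_dir = Path(segment_dir)
    parts = sorted(segment_dir.glob("*.ts"))  # zero-padded names sort correctly
    with open(output_file, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # streams in chunks, low memory use
    if remove_segments:
        shutil.rmtree(segment_dir)
    return len(parts)
```

It would be called as, for example, `merge_segments(downPath, "{}/{}.{}".format(savePath, saveName, saveSuffix), clearDebris)`. The output file should live outside the segment directory so it is not picked up as a segment.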

