
Python: Traversing the Files in a Folder and Searching Them for Chinese Text

Published: 2023/12/9 · python · 豆豆


Preface

The requirement: traverse the files in a folder and check whether any of them contains a specific keyword (in Chinese).

Code

import os
import re
from os import path


def cn_to_unicode(in_str, need_str=True, debug=False):
    out = []
    for s in in_str:
        # Get the code point of the character
        val = ord(s)
        # Code points up to 0xff (ASCII / Latin-1) are built by hand in \u00xx form
        if val <= 0xff:
            hex_str = hex(val).replace('0x', '').zfill(4)
            # Do not encode this with unicode_escape, or the backslash is escaped again
            res = bytes('\\u' + hex_str, encoding='utf-8')
        else:
            res = s.encode("unicode_escape")
        out.append(res)
    # Debug output
    if debug:
        print(out)
        print(len(out), len(out[0]), len(out[-1]))
    # Convert to str
    if need_str:
        out_str = ''
        for s in out:
            out_str += str(s, encoding='utf-8')
        return out_str
    else:
        return out


def scaner_file(url, key):
    file = os.listdir(url)
    for f in file:
        real_url = path.join(url, f)
        if path.isfile(real_url):
            file_path = path.abspath(real_url)
            with open(file_path, encoding='utf8') as file_obj:
                contents = file_obj.read()
            # key is a string of \uXXXX escapes, which re interprets as the characters themselves
            res = re.findall(key, contents)
            if res:
                print(contents)
                exit('Success!')
        elif path.isdir(real_url):
            # Recurse into subdirectories; the key must be passed along
            scaner_file(real_url, key)
        else:
            print("Other case")
        print(real_url)


chinese_key = "你好"
unicode_key = cn_to_unicode(chinese_key)
check_dir = "./result"
scaner_file(check_dir, unicode_key)
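
The script above exits at the first match and assumes every file can be read as UTF-8. As a rough alternative sketch (not the author's method), the same search can be written with os.walk and a plain substring test, which needs no conversion of the keyword to \u escapes. The function name find_files_containing, the encoding parameter, and the choice to skip unreadable files are assumptions made for illustration.

```python
import os


def find_files_containing(root, keyword, encoding="utf-8"):
    """Return the sorted paths of files under root whose text contains keyword."""
    matches = []
    # os.walk visits root and every subdirectory, so no manual recursion is needed
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            try:
                with open(file_path, encoding=encoding) as file_obj:
                    contents = file_obj.read()
            except (UnicodeDecodeError, OSError):
                # Assumption: files that cannot be read as text are simply skipped
                continue
            # A plain substring test works directly on Chinese text in Python 3
            if keyword in contents:
                matches.append(file_path)
    return sorted(matches)
```

Calling find_files_containing("./result", "你好") would return every matching path as a list instead of stopping at the first hit, so the caller decides what to print.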
