A demo of parallel computation with pandas apply
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Date : 2018-10-11 17:55:26
# @Author : Sheldon (thisisscret@qq.com)
# @blogs : 謝耳朵的派森筆記
# @Link : https://www.cnblogs.com/shld/

import pandas as pd
from joblib import Parallel, delayed, cpu_count


def apply_parallel(df, func, n=-2):
    """
    Run an apply in parallel with joblib's Parallel and delayed.
    The idea is to split the DataFrame into chunks and feed each chunk
    to a function that itself calls apply.

    @params df: the DataFrame to apply over
    @params func: a user-defined function that contains the apply call
                  (not the function that is passed to apply)
    @params n: number of parallel jobs, passed to joblib as n_jobs;
               the default -2 means CPU count minus 1, -1 means CPU count,
               and any positive integer can be given
    @return DataFrame: the concatenated results of func over all chunks
    """
    if n is None:
        n = -1
    elif n == 0:
        n = 1  # joblib rejects n_jobs=0, so fall back to a single job
    dflength = len(df)
    if dflength == 0:
        return func(df)  # nothing to split
    cpunum = cpu_count()
    # Number of chunks: a negative n counts back from the CPU count
    spnum = cpunum + n + 1 if n < 0 else n
    # Never more chunks than rows, and always at least one
    spnum = max(1, min(spnum, dflength))
    # Chunk boundaries: a start offset every `step` rows, plus the end
    step = max(1, int(dflength / spnum + 0.5))
    sp = list(range(0, dflength, step))
    sp.append(dflength)
    slice_gen = (slice(*idx) for idx in zip(sp[:-1], sp[1:]))
    results = Parallel(n_jobs=n)(delayed(func)(df.iloc[slc]) for slc in slice_gen)
    return pd.concat(results)
Reposted from: https://www.cnblogs.com/shld/p/9774180.html
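To see how apply_parallel cuts up the row range, the boundary computation can be pulled out and run on its own, without pandas or joblib. The sketch below is only an illustration: the helper name chunk_slices is hypothetical, a plain list stands in for the DataFrame rows, and the max(1, ...) guard against a zero step is an assumption added here. It mirrors the sp and slice_gen lines of the function.

```python
def chunk_slices(length, nchunks):
    """Return the row slices produced by the step int(length / nchunks + 0.5).

    Hypothetical helper for illustration only. It mirrors the sp / slice_gen
    lines of apply_parallel, with an added guard so the step is never zero.
    """
    step = max(1, int(length / nchunks + 0.5))  # rows per chunk, rounded half up
    bounds = list(range(0, length, step))       # start offset of every chunk
    bounds.append(length)                       # closing boundary
    return [slice(a, b) for a, b in zip(bounds[:-1], bounds[1:])]


rows = list(range(10))  # stand-in for the rows of a DataFrame

# Ask for 3 chunks over 10 rows and show what each slice selects
for s in chunk_slices(len(rows), 3):
    print(s, rows[s])
```

Because the step is rounded, the number of slices can come out one higher than requested, with a short tail chunk: 10 rows and 3 requested chunks give slices of 3, 3, 3 and 1 rows. That should be harmless in practice, since joblib accepts more tasks than workers and runs the extra one when a worker frees up.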