

Python data analysis: using Python & SQL for data analysis in Jupyter Notebook

Published: 2019-01-14 21:14 (reposted on 生活随笔, 2024/7/23)

Tags: python, jupyter, notebook, SQL

Just as you can run R in Jupyter, you can also run SQL statements there.

For details, see the GitHub project: https://github.com/catherinedevlin/ipython-sql

Installing ipython-sql

pip install ipython-sql

Loading the extension

%load_ext sql

Connecting to a database (same URL format as SQLAlchemy)

* postgresql://will:<password>@<host>/shakes
* mysql+pymysql://scott:<password>@<host>/foo
* oracle://scott:<password>@<host>:1521/sidname
* sqlite://
* sqlite:///foo.db
* mssql+pyodbc://username:<password>@<host>/database?driver=SQL+Server+Native+Client+11.0
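To make the pieces of these URLs easier to see, here is a minimal sketch that splits one apart using only the standard library. It is not how SQLAlchemy or ipython-sql parse URLs internally. The helper name `describe_db_url` and the credentials are hypothetical, chosen for illustration.

```python
from urllib.parse import urlsplit


def describe_db_url(url: str) -> dict:
    """Split a SQLAlchemy-style database URL into its named parts."""
    parts = urlsplit(url)
    # The scheme is "dialect" or "dialect+driver", e.g. "mysql+pymysql".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,  # None when no driver is named
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/") or None,
    }


# Hypothetical credentials, for illustration only.
info = describe_db_url("mysql+pymysql://demo_user:demo_pass@localhost:3306/test")
print(info)
```

A bare `sqlite://` has no host, user, or database part, which is why it refers to an in-memory database.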

I am using MySQL over a local connection, with username ffzs, password 666666, and the test database:

%sql mysql+pymysql://ffzs:666666@localhost/test

Basic usage

%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('bmh')

1. Show the tables

%%sql
show tables;

2. Select the first 5 rows of the steam_users table

df = %sql select * from steam_users limit 5
df.DataFrame()
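For readers without a MySQL server at hand, the same "first five rows" query can be tried outside the notebook. This is a rough sketch using the standard-library sqlite3 module with an in-memory database. The sample rows and the column types are assumptions made for illustration; only the table and column names come from the queries in this post.

```python
import sqlite3

# Hypothetical sample rows; the real steam_users data is not reproduced here.
ROWS = [
    (1, "Dota 2", "purchase", 1.0),
    (1, "Dota 2", "play", 100.0),
    (2, "Dota 2", "play", 50.0),
    (2, "Portal", "purchase", 1.0),
    (2, "Portal", "play", 7.0),
    (3, "Obscure Sim", "play", 150.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table steam_users (UserID integer, Game text, Action text, Hours real)"
)
conn.executemany("insert into steam_users values (?, ?, ?, ?)", ROWS)

cursor = conn.execute("select * from steam_users limit 5")
columns = [col[0] for col in cursor.description]  # column names, like a table header
first_rows = cursor.fetchall()

print(columns)
for row in first_rows:
    print(row)
```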

3. Count how many games and how many players the table contains

%%sql
select count(distinct Game) gameCount, count(distinct UserID) userCount
from steam_users
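The effect of `count(distinct ...)` can be checked on a small table. The sketch below assumes an in-memory SQLite database and hypothetical sample rows, and compares the SQL result with Python sets, which also keep one copy of each value.

```python
import sqlite3

# Hypothetical sample rows: (UserID, Game, Action, Hours).
ROWS = [
    (1, "Dota 2", "play", 100.0),
    (2, "Dota 2", "play", 50.0),
    (3, "Dota 2", "play", 30.0),
    (1, "Portal", "play", 5.0),
    (2, "Portal", "play", 7.0),
    (3, "Obscure Sim", "play", 150.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table steam_users (UserID integer, Game text, Action text, Hours real)"
)
conn.executemany("insert into steam_users values (?, ?, ?, ?)", ROWS)

game_count, user_count = conn.execute(
    "select count(distinct Game), count(distinct UserID) from steam_users"
).fetchone()

# The same counts computed in plain Python.
distinct_games = {game for _user, game, _action, _hours in ROWS}
distinct_users = {user for user, _game, _action, _hours in ROWS}

print(game_count, user_count)
```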

4. Top ten games by number of users

%%sql data <<
select Game, count(1) as count
from steam_users
where Action = 'play'
group by Game
order by count desc
limit 10

In the next cell, plot the result:

data.DataFrame()[::-1].plot.barh("Game", "count")
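The "group by, order by count desc, limit N" pattern used here is a top-N frequency count. As a sketch of what the query computes, the same ranking can be done in plain Python with `collections.Counter`. The function name `top_games_by_plays` and the sample rows are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows: (UserID, Game, Action, Hours).
ROWS = [
    (1, "Dota 2", "purchase", 1.0),
    (1, "Dota 2", "play", 100.0),
    (2, "Dota 2", "play", 50.0),
    (3, "Dota 2", "play", 30.0),
    (1, "Portal", "play", 5.0),
    (2, "Portal", "play", 7.0),
    (3, "Obscure Sim", "play", 150.0),
]


def top_games_by_plays(rows, n):
    """Count 'play' rows per game and return the n most frequent games."""
    plays = Counter(
        game for _user, game, action, _hours in rows if action == "play"
    )
    return plays.most_common(n)


top2 = top_games_by_plays(ROWS, 2)
print(top2)
```

The `where Action = 'play'` filter corresponds to the `if action == "play"` condition, so purchase rows do not inflate the counts.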

5. Top ten games by total play time

%%sql playHour <<
select Game, sum(Hours) as playHour
from steam_users
where Action = 'play'
group by Game
order by playHour desc
limit 10

In the next cell, plot the result:

playHour.DataFrame()[::-1].plot.barh('Game', 'playHour')

6. Top ten games by average play time

%%sql avgHour <<
select Game, avg(Hours) as avgHour
from steam_users
where Action = 'play'
group by Game
order by avgHour desc
limit 10

In the next cell, plot the result:

avgHour.DataFrame()[::-1].plot.barh('Game', 'avgHour')
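Ranking by `sum(Hours)` and ranking by `avg(Hours)` can give different winners, because a game with a single heavy player has a high average but a low total. The sketch below shows this on hypothetical sample rows in an in-memory SQLite database. The helper `top_games` is an illustrative name, not part of ipython-sql.

```python
import sqlite3

# Hypothetical sample rows: (UserID, Game, Action, Hours).
ROWS = [
    (1, "Dota 2", "play", 100.0),
    (2, "Dota 2", "play", 50.0),
    (3, "Dota 2", "play", 30.0),
    (1, "Portal", "play", 5.0),
    (2, "Portal", "play", 7.0),
    (3, "Obscure Sim", "play", 150.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table steam_users (UserID integer, Game text, Action text, Hours real)"
)
conn.executemany("insert into steam_users values (?, ?, ?, ?)", ROWS)


def top_games(conn, aggregate, n):
    """Rank games by an aggregate over Hours; aggregate is 'sum' or 'avg'."""
    if aggregate not in ("sum", "avg"):
        raise ValueError("aggregate must be 'sum' or 'avg'")
    sql = (
        f"select Game, {aggregate}(Hours) as score from steam_users "
        "where Action = 'play' group by Game order by score desc limit ?"
    )
    return conn.execute(sql, (n,)).fetchall()


by_total = top_games(conn, "sum", 1)
by_avg = top_games(conn, "avg", 1)
print(by_total, by_avg)
```

In this sample the widely played game leads on total hours, while the game with one dedicated player leads on average hours.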

7. Number of players for the top ten games by average play time

%%sql
select Game, avg(Hours) as avgHour, count(1) as count
from steam_users
where Action = 'play'
group by Game
order by avgHour desc
limit 10

The same result, written as practice with join ... on:

%%sql
select a.Game, avgHour, count
from (
    select Game, avg(Hours) as avgHour
    from steam_users
    where Action = 'play'
    group by Game
    order by avgHour desc
    limit 10
) a
left join (
    select Game, count(1) as count
    from steam_users
    where Action = 'play'
    group by Game
) b on a.Game = b.Game
order by avgHour desc

The counts show that most games with a long average play time are niche titles.
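To check that the single-pass query and the join version agree, both can be run against the same small table. This sketch uses an in-memory SQLite database and hypothetical sample rows. The count column is named `players` here purely for readability.

```python
import sqlite3

# Hypothetical sample rows: (UserID, Game, Action, Hours).
ROWS = [
    (1, "Dota 2", "play", 100.0),
    (2, "Dota 2", "play", 50.0),
    (3, "Dota 2", "play", 30.0),
    (1, "Portal", "play", 5.0),
    (2, "Portal", "play", 7.0),
    (3, "Obscure Sim", "play", 150.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table steam_users (UserID integer, Game text, Action text, Hours real)"
)
conn.executemany("insert into steam_users values (?, ?, ?, ?)", ROWS)

SINGLE_PASS = """
select Game, avg(Hours) as avgHour, count(1) as players
from steam_users
where Action = 'play'
group by Game
order by avgHour desc
limit 2
"""

WITH_JOIN = """
select a.Game, a.avgHour, b.players
from (
    select Game, avg(Hours) as avgHour
    from steam_users
    where Action = 'play'
    group by Game
    order by avgHour desc
    limit 2
) a
left join (
    select Game, count(1) as players
    from steam_users
    where Action = 'play'
    group by Game
) b on a.Game = b.Game
order by a.avgHour desc
"""

single = conn.execute(SINGLE_PASS).fetchall()
joined = conn.execute(WITH_JOIN).fetchall()
print(single)
print(joined)
```

In the sample, the game with the highest average has a single player, which mirrors the observation about niche titles.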

8. Number of games with more than 500 players (using having)

%%sql
select count(1) as count
from (
    select Game, count(1) as count
    from steam_users
    where Action = 'play'
    group by Game
    having count > 500
) a
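`having` filters groups after aggregation, whereas `where` filters rows before it. The sketch below repeats the pattern on hypothetical sample rows in an in-memory SQLite database. It writes the aggregate out in the `having` clause instead of using the alias, since referring to a select alias there is an extension that MySQL allows but standard SQL does not. The helper name `count_popular_games` is illustrative.

```python
import sqlite3

# Hypothetical sample rows: (UserID, Game, Action, Hours).
ROWS = [
    (1, "Dota 2", "play", 100.0),
    (2, "Dota 2", "play", 50.0),
    (3, "Dota 2", "play", 30.0),
    (1, "Portal", "play", 5.0),
    (2, "Portal", "play", 7.0),
    (3, "Obscure Sim", "play", 150.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table steam_users (UserID integer, Game text, Action text, Hours real)"
)
conn.executemany("insert into steam_users values (?, ?, ?, ?)", ROWS)


def count_popular_games(conn, min_players):
    """Count games that have more than min_players 'play' rows."""
    sql = """
    select count(1)
    from (
        select Game
        from steam_users
        where Action = 'play'
        group by Game
        having count(1) > ?
    ) popular
    """
    return conn.execute(sql, (min_players,)).fetchone()[0]


print(count_popular_games(conn, 1))
```

Note that `count(1)` counts play rows, so it equals the number of players only if each user has at most one play row per game.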

9. Top ten users by number of games

%%sql games <<
select UserID, count(1) count
from steam_users
where Action = 'play'
group by UserID
order by count desc
limit 10

In the next cell, plot the result:

games.DataFrame()[::-1].plot.barh('UserID', 'count')

10. The five users with the most total play time and the five with the least (using union)

%%sql
(
    select UserID, sum(Hours) as allHour
    from steam_users
    where Action = 'play'
    group by UserID
    order by allHour desc
    limit 5
)
union
(
    select UserID, sum(Hours) as allHour
    from steam_users
    where Action = 'play'
    group by UserID
    order by allHour
    limit 5
)
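The parenthesized form above is MySQL syntax. As far as I know, SQLite does not accept parenthesized selects as union operands, so this sketch wraps each ordered, limited select in a subquery instead. The sample rows are hypothetical, and the limit is reduced to 1 to fit the small table.

```python
import sqlite3

# Hypothetical sample rows: (UserID, Game, Action, Hours).
ROWS = [
    (1, "Dota 2", "play", 100.0),
    (2, "Dota 2", "play", 50.0),
    (3, "Dota 2", "play", 30.0),
    (1, "Portal", "play", 5.0),
    (2, "Portal", "play", 7.0),
    (3, "Obscure Sim", "play", 150.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table steam_users (UserID integer, Game text, Action text, Hours real)"
)
conn.executemany("insert into steam_users values (?, ?, ?, ?)", ROWS)

# Each branch is ordered and limited inside its own subquery, then combined.
QUERY = """
select * from (
    select UserID, sum(Hours) as allHour
    from steam_users
    where Action = 'play'
    group by UserID
    order by allHour desc
    limit :n
)
union
select * from (
    select UserID, sum(Hours) as allHour
    from steam_users
    where Action = 'play'
    group by UserID
    order by allHour asc
    limit :n
)
"""

rows = conn.execute(QUERY, {"n": 1}).fetchall()
print(rows)
```

Because `union` removes duplicate rows, a user who falls in both the top and bottom groups appears only once. The order of the combined rows is not specified unless an outer `order by` is added.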
