SQLite | The Group By and Order By Clauses

Published: 2025/3/15


Contents

1. Group by and Order by
- 1.1 Group Records
- 1.2 Ordering Records
- 1.3 Aggregate Functions
- 1.4 The Having Statement
- 1.5 Getting Distinct Records
- References

1. Group by and Order by

The previous article introduced the where clause. This one uses the group by and order by clauses to aggregate and sort data.

Running SQL statements in a Jupyter Notebook requires the ipython-sql package.
The %sql (single line) and %%sql (whole cell) magics run SQL inside the notebook; they are not needed in the SQLite command line or in SQLiteStudio.

Load the SQL extension and connect to SQLite:

    %load_ext sql
    %sql sqlite:///DataBase/weather_stations.db

'Connected: @DataBase/weather_stations.db'

This article uses the weather_stations.db database, which contains the STATION_DATA table.

First, take a look at the data in the STATION_DATA table:

    %sql select * from station_data limit 0,5; -- show the first five rows

 * sqlite:///DataBase/weather_stations.db
Done.

station_number	report_code	year	month	day	dew_point	station_pressure	visibility	wind_speed	temperature	precipitation	snow_depth	fog	rain	hail	thunder	tornado
143080	34DDA7	2002	12	21	33.8	987.4	3.4	0.2	36	0	None	1	1	1	1	1
766440	39537B	1998	10	1	72.7	1014.6	5.9	6.7	83.3	0	None	0	0	0	0	0
176010	C3C6D5	2001	5	18	55.7	None	7.3	4.3	69.1	0	None	0	0	0	0	0
125600	145150	2007	10	14	33	None	6.9	2.5	39.7	0	None	0	0	0	0	0
470160	EF616A	1967	7	29	65.6	None	9.2	1.2	72.4	0.04	None	0	0	0	0	0

1.1 Group Records

Let's start with the simplest aggregation, counting records:

    %%sql
    select count(*) as record_count
    from station_data;

 * sqlite:///DataBase/weather_stations.db
Done.

record_count
28000

count(*) counts the number of records returned. It can be combined with other SQL clauses such as where. For example, this counts the records in which a tornado occurred:

    %%sql
    select count(*) as record_count
    from station_data
    where tornado == 1;

 * sqlite:///DataBase/weather_stations.db
Done.

record_count
3000

That gives 3000 records with a tornado. To break the count down by year, add group by:

    %%sql
    select year, count(*) as record_count
    from station_data
    where tornado == 1
    group by year
    limit 0,3; -- show only the first three rows

 * sqlite:///DataBase/weather_stations.db
Done.

year	record_count
1937	3
1941	3
1942	3

Now there is one count per year. Here is the same query again, annotated clause by clause:

    select year,                     -- 1. return the year of each group
           count(*) as record_count  -- 2. count(*) counts the records in each group
    from station_data
    where tornado == 1               -- 3. keep only the records where tornado is true (applied before grouping)
    group by year                    -- 4. form one group per year
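The clause-by-clause breakdown above can be tried without the real database. The following is a minimal sketch using Python's built-in sqlite3 module; the in-memory table is a hypothetical, cut-down stand-in for station_data and its rows are invented for illustration only.

```python
import sqlite3

# Hypothetical miniature stand-in for station_data (values invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("create table station_data (year integer, month integer, tornado integer)")
conn.executemany(
    "insert into station_data values (?, ?, ?)",
    [
        (1937, 7, 1),
        (1937, 7, 1),
        (1937, 8, 0),   # no tornado: dropped by the where clause
        (1941, 8, 1),
        (1942, 10, 0),  # the only 1942 row has no tornado, so 1942 produces no group
    ],
)

# where filters the rows first; group by then forms one group per remaining year,
# and count(*) is evaluated once per group.
counts_by_year = conn.execute(
    """
    select year, count(*) as record_count
    from station_data
    where tornado == 1
    group by year
    order by year
    """
).fetchall()

print(counts_by_year)  # [(1937, 2), (1941, 1)]
```

Note that 1942 is missing from the result entirely: a year whose rows are all filtered out by where never forms a group, so it is not reported with a count of 0.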

We can also aggregate on more than one field:

    %%sql
    select year, month, count(*) as record_count
    from station_data
    where tornado == 1
    group by year, month
    limit 0,3;

 * sqlite:///DataBase/weather_stations.db
Done.

year	month	record_count
1937	7	3
1941	8	3
1942	10	3

In addition, group by accepts ordinal positions in place of column names:

    %%sql
    select year, month, count(*) as record_count
    from station_data
    where tornado == 1
    group by 1, 2 -- ordinal positions
    limit 0,5;

 * sqlite:///DataBase/weather_stations.db
Done.

year	month	record_count
1937	7	3
1941	8	3
1942	10	3
1943	1	3
1943	4	3

Not every platform supports ordinal positions in group by. On Oracle and SQL Server, for example, the column names have to be written out in full.
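As a quick check that ordinal positions are only a shorthand, the sketch below runs the same aggregation both ways in SQLite through Python's sqlite3 module and compares the results. The table contents are invented for illustration; only the column names follow the article.

```python
import sqlite3

# Hypothetical sample rows; only the column names match the article's table.
conn = sqlite3.connect(":memory:")
conn.execute("create table station_data (year integer, month integer, tornado integer)")
conn.executemany(
    "insert into station_data values (?, ?, ?)",
    [(1943, 4, 1), (1943, 1, 1), (1943, 1, 1), (1937, 7, 1)],
)

# Positions 1 and 2 refer to the first and second items of the select list.
by_position = conn.execute(
    """
    select year, month, count(*) as record_count
    from station_data
    where tornado == 1
    group by 1, 2
    order by 1, 2
    """
).fetchall()

# The portable spelling: name the columns explicitly.
by_name = conn.execute(
    """
    select year, month, count(*) as record_count
    from station_data
    where tornado == 1
    group by year, month
    order by year, month
    """
).fetchall()

print(by_position == by_name)  # True
print(by_name)                 # [(1937, 7, 1), (1943, 1, 2), (1943, 4, 1)]
```

Because the positions are tied to the select list, reordering the selected columns silently changes what `group by 1, 2` means, which is one reason to prefer explicit names in queries that will be maintained.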

1.2 Ordering Records

Note that group by on its own does not promise any particular row order, so the grouped years and months are not guaranteed to come back in calendar order. It is best to add an order by clause as well. To sort by year first and then by month, add:

    %%sql
    select year, month, count(*) as record_count
    from station_data
    where tornado == 1
    group by 1, 2 -- ordinal positions
    order by 1, 2 -- order by also accepts ordinal positions
    limit 0,5;

 * sqlite:///DataBase/weather_stations.db
Done.

year	month	record_count
1937	7	3
1941	8	3
1942	10	3
1943	1	3
1943	4	3

order by sorts in ascending order (ASC) by default. If you are more interested in recent data, add DESC after a column to sort that column in descending order:

    %%sql
    select year, month, count(*) as record_count
    from station_data
    where tornado == 1
    group by year, month
    order by year DESC, month
    limit 0,5;

 * sqlite:///DataBase/weather_stations.db
Done.

year	month	record_count
2010	3	6
2009	1	3
2009	2	3
2009	4	2
2009	5	6
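One detail worth seeing in isolation is that DESC applies only to the column it follows. In the sketch below (Python's sqlite3 module, with invented rows in a hypothetical cut-down table), years run from newest to oldest while the months inside each year still ascend.

```python
import sqlite3

# Hypothetical sample rows, inserted deliberately out of order.
conn = sqlite3.connect(":memory:")
conn.execute("create table station_data (year integer, month integer, tornado integer)")
conn.executemany(
    "insert into station_data values (?, ?, ?)",
    [(2009, 2, 1), (2009, 1, 1), (2010, 3, 1), (2009, 1, 1)],
)

# year is sorted descending; month keeps the default ascending order.
recent_first = conn.execute(
    """
    select year, month, count(*) as record_count
    from station_data
    where tornado == 1
    group by year, month
    order by year desc, month
    """
).fetchall()

print(recent_first)  # [(2010, 3, 1), (2009, 1, 2), (2009, 2, 1)]
```

To reverse the months as well, each column needs its own keyword, as in `order by year desc, month desc`.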

1.3 Aggregate Functions

We have already used count(*) to count records, but there are other aggregate functions as well, such as sum(), min(), max(), and avg(). They can be applied to a specific column to compute a value from it.

Figure 1. SQLite built-in aggregate functions

First, though, count() has another use. If you pass a column name instead of *, it counts only the non-null values in that column. For example, this counts the non-null values of snow_depth:

    %%sql
    select count(snow_depth) as recorded_snow_depth_count
    from station_data;

 * sqlite:///DataBase/weather_stations.db
Done.

recorded_snow_depth_count
1552
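The difference between count(*) and count(column) is easy to verify on a handful of rows. This is a small sketch with Python's sqlite3 module; the four snow_depth values are invented, and Python's None is stored as SQL NULL.

```python
import sqlite3

# Hypothetical one-column stand-in for station_data; None becomes SQL NULL.
conn = sqlite3.connect(":memory:")
conn.execute("create table station_data (snow_depth real)")
conn.executemany(
    "insert into station_data values (?)",
    [(None,), (1.2,), (None,), (0.0,)],
)

# count(*) counts rows; count(snow_depth) skips rows where snow_depth is NULL.
all_rows, recorded = conn.execute(
    "select count(*), count(snow_depth) from station_data"
).fetchone()

print(all_rows, recorded)  # 4 2
```

The row holding 0.0 is counted: a zero is a recorded value, whereas NULL means no value was recorded at all.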

Moving on to the other aggregate functions: to see the average temperature for each month since 2000, filter the records from 2000 onward, group them by month, and take the average:

    %%sql
    select month, avg(temperature) as avg_temp
    from station_data
    where year >= 2000
    group by month
    limit 0,3;

 * sqlite:///DataBase/weather_stations.db
Done.

month	avg_temp
1	41.55585443037976
2	38.98063127690104
3	48.975062656641576
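The filter, group, average sequence can be reproduced on a few rows to see which records actually feed each average. The sketch below uses Python's sqlite3 module with invented temperatures in a hypothetical cut-down table.

```python
import sqlite3

# Hypothetical sample rows (temperatures invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("create table station_data (year integer, month integer, temperature real)")
conn.executemany(
    "insert into station_data values (?, ?, ?)",
    [
        (1999, 1, 10.0),  # before 2000: excluded by the where clause
        (2000, 1, 40.0),
        (2001, 1, 42.0),
        (2000, 2, 38.0),
    ],
)

# Only rows from 2000 onward reach the grouping step, so the 1999 reading
# does not pull the January average down.
avg_by_month = conn.execute(
    """
    select month, avg(temperature) as avg_temp
    from station_data
    where year >= 2000
    group by month
    order by month
    """
).fetchall()

print(avg_by_month)  # [(1, 41.0), (2, 38.0)]
```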

sum() is another common aggregate function. To get the total snow depth for each year since 2000:

    %%sql
    select year, sum(snow_depth) as total_snow
    from station_data
    where year >= 2000
    group by year
    limit 0,3;

 * sqlite:///DataBase/weather_stations.db
Done.

year	total_snow
2000	685.8999999999999
2001	391.90000000000003
2002	437.69999999999993

Several aggregate functions can appear in one query. This one reports the total snow depth, total precipitation, and maximum precipitation for each year since 2000, each rounded to two decimal places:

    %%sql
    select year,
           round(sum(snow_depth), 2) as total_snow,
           round(sum(precipitation), 2) as total_precipitation,
           round(max(precipitation), 2) as max_precipitation
    from station_data
    where year >= 2000
    group by year
    limit 0,3;

 * sqlite:///DataBase/weather_stations.db
Done.

year	total_snow	total_precipitation	max_precipitation
2000	685.9	27.57	0.87
2001	391.9	38.15	2.95
2002	437.7	43.06	5.0
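The sketch below combines several aggregates on a tiny invented table, using Python's sqlite3 module. It also shows two behaviours that matter for a column like snow_depth, which is mostly NULL in the sample rows shown earlier: sum() skips NULL values, and in SQLite it returns NULL when a group has no non-NULL value at all. The table and its numbers are hypothetical.

```python
import sqlite3

# Hypothetical sample rows; None becomes SQL NULL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table station_data (year integer, snow_depth real, precipitation real)"
)
conn.executemany(
    "insert into station_data values (?, ?, ?)",
    [
        (2000, None, 0.1),
        (2000, 1.5, 0.2),
        (2000, 2.25, 0.87),
        (2001, None, 2.95),  # 2001 has no recorded snow depth at all
    ],
)

# Three aggregates over the same groups; round() trims floating-point noise.
totals = conn.execute(
    """
    select year,
           round(sum(snow_depth), 2)    as total_snow,
           round(sum(precipitation), 2) as total_precipitation,
           round(max(precipitation), 2) as max_precipitation
    from station_data
    where year >= 2000
    group by year
    order by year
    """
).fetchall()

for row in totals:
    print(row)
```

For 2001 the total_snow value comes back as None in Python, meaning "nothing recorded" rather than 0. If a zero is preferred, SQLite's total() aggregate or coalesce(sum(snow_depth), 0) are options.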

1.4 The Having Statement

Suppose you want to filter records based on an aggregated value. Your first instinct may be to use a where clause. where does filter records, but it cannot be applied to aggregated values. For example, trying to use where to keep only the years whose total precipitation exceeds 30 produces the following error:

    %%sql
    select year, sum(precipitation) as total_precipitation
    from station_data
    where total_precipitation > 30
    group by year
    limit 0,3;

 * sqlite:///DataBase/weather_stations.db
(sqlite3.OperationalError) misuse of aggregate: sum()
[SQL: select year, sum(precipitation) as total_precipitation from station_data where total_precipitation > 30 group by year limit 0,3;]
(Background on this error at: http://sqlalche.me/e/e3q8)

Why does this fail? Consider how aggregation works: the engine first scans the table row by row and keeps the rows that satisfy the where clause, and only then aggregates them. total_precipitation does not exist yet when where is evaluated, hence the error.

To filter on an aggregated value, use the having keyword instead:

    %%sql
    select year, sum(precipitation) as total_precipitation
    from station_data
    group by year
    having total_precipitation > 30
    limit 0,3

 * sqlite:///DataBase/weather_stations.db
Done.

year	total_precipitation
1973	35.07999999999996
1974	42.209999999999994
1975	48.25999999999997

having is the aggregate counterpart of where. Not every platform allows aliases in having, however. Oracle, for example, does not (nor does it allow them in group by), so there the aggregate function has to be repeated in the having clause, like this:

    %%sql
    select year, sum(precipitation) as total_precipitation
    from station_data
    group by year
    having sum(precipitation) > 30
    limit 0,3

 * sqlite:///DataBase/weather_stations.db
Done.

year	total_precipitation
1973	35.07999999999996
1974	42.209999999999994
1975	48.25999999999997
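The contrast between where and having can be reproduced outside the notebook. The sketch below uses Python's sqlite3 module on an invented two-year table: the where version is expected to be rejected with sqlite3.OperationalError (the exact message text may differ between SQLite versions, so it is only printed, not relied on), while the having version filters the groups after aggregation.

```python
import sqlite3

# Hypothetical sample rows (precipitation values invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("create table station_data (year integer, precipitation real)")
conn.executemany(
    "insert into station_data values (?, ?)",
    [(1973, 20.0), (1973, 15.0), (1974, 10.0), (1974, 5.0)],
)

# Filtering on the aggregate alias in where: rejected, because where runs
# before any group (and therefore any sum) exists.
try:
    conn.execute(
        """
        select year, sum(precipitation) as total_precipitation
        from station_data
        where total_precipitation > 30
        group by year
        """
    ).fetchall()
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# The same condition in having: evaluated once per group, after aggregation.
# Repeating sum(precipitation) instead of the alias is the portable form.
wet_years = conn.execute(
    """
    select year, sum(precipitation) as total_precipitation
    from station_data
    group by year
    having sum(precipitation) > 30
    order by year
    """
).fetchall()

print(where_error)
print(wet_years)  # [(1973, 35.0)]
```

where and having are not mutually exclusive: a query can use where to discard individual rows first and having to discard whole groups afterwards.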

1.5 Getting Distinct Records

The rows returned by **select from** may contain duplicate values. To work with **distinct records** only, add the **distinct** keyword. In station_data, for example, the station_number column holds 28000 values, but counting with **distinct** shows that they are just 6368 different values repeated over and over:

    %%sql
    select count(station_number) as duplicate_num
    from station_data;

 * sqlite:///DataBase/weather_stations.db
Done.

duplicate_num
28000

    %%sql
    select count(distinct station_number) as distinct_num
    from station_data;

 * sqlite:///DataBase/weather_stations.db
Done.

distinct_num
6368
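The two counts above use count(distinct ...); the plain select distinct form returns the unique values themselves. The sketch below shows both with Python's sqlite3 module. The five station numbers are taken from the sample rows near the top of the article, but the duplication pattern is invented for illustration.

```python
import sqlite3

# Hypothetical one-column table with repeated station numbers.
conn = sqlite3.connect(":memory:")
conn.execute("create table station_data (station_number text)")
conn.executemany(
    "insert into station_data values (?)",
    [("143080",), ("143080",), ("766440",), ("176010",), ("766440",)],
)

# count(column) counts every non-null value, duplicates included.
total = conn.execute(
    "select count(station_number) from station_data"
).fetchone()[0]

# count(distinct column) counts each different value once.
unique_count = conn.execute(
    "select count(distinct station_number) from station_data"
).fetchone()[0]

# select distinct returns the unique values themselves.
unique_stations = [
    row[0]
    for row in conn.execute(
        "select distinct station_number from station_data order by station_number"
    )
]

print(total, unique_count)  # 5 3
print(unique_stations)      # ['143080', '176010', '766440']
```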

References

[1] Thomas Nield. Getting Started with SQL[M]. US: O'Reilly, 2016: 29-37.
