概要
本文主要介绍分析窗口函数的使用方法,通过获取相邻两个股票交易日收盘价的例子,来介绍分析窗口函数在获取相邻值的用法。
分析窗口函数
本文主要介绍FIRST_VALUE和LAG两个分析窗口函数,与其类似的LAST_VALUE和LEAD两个函数,因为使用方法完全类似,所以不赘述,读者可以根据已有例子实现类似的操作。
FIRST_VALUE函数是获取有序集合的第一个值。
FIRST_VALUE ( [scalar_expression ] ) [ IGNORE NULLS | RESPECT NULLS ] OVER ( [ partition_by_clause ] order_by_clause [ rows_range_clause ] )
- 有序由order by语句决定
- 集合由具体窗口决定,包含partition和Framing子句
LAG 函数 获取当前数据的前一条数据内容,可以省去表的自联操作。
LAG (scalar_expression [,offset] [,default]) OVER ( [ partition_by_clause ] order_by_clause )
- 什么是前一条由order by决定
- 集合由具体窗口决定,包含partition和Framing子句
注意该方法在SQL Server 2012或其之后的版本才有支持。
需求介绍
本文以一个存储股票信息的数据表来说明分析窗口函数的使用方法。数据初始化见附录,建表语句如下:
create table MicrosiftStockHistory(
Id int primary key identity(1,1),
StockDate date not null ,
OpenPrice decimal(18,2) not null,
High decimal(18,2) not null,
Low decimal(18,2) not null,
Volumn int not null,
ClosePrice decimal(18,2) not null,
AdjClose decimal(18,2) not null,
)
该表只记录MS一只股票的数据,每天生成一条数据,节假日没有数据。ClosePrice 表示股票的收盘价,StockDate 是记录生成的日期。
当前需求是生成一个新列PreviousClosePrice 用于显示上一个交易日的收盘价。
解决方案
表的自联接解决方案
如果没有分析函数,我我们首先想到的是表格自联接,通过StockDate列去左联自己StockDate + 1 天的数据。
具体实现如下:
SELECT
m1.[StockDate], m1.[ClosePrice],
m2.[StockDate], m2.[ClosePrice] AS PreviousClosePrice
FROM [dbo].[MicrosiftStockHistory] m1
LEFT JOIN [dbo].[MicrosiftStockHistory] m2
ON m1.[StockDate] = DATEADD(DD,1, m2.[StockDate])
ORDER BY m1.[StockDate]
执行结果如下:
从结果我们看出很多行没有取到上一个交易日的收盘价,原因很明显,因为不是每天都有股票数据,周末和公共假期是没有的,所以我们改进一下左联的操作,增加更多的左联以应对多天的假期问题。
SELECT
m1.[StockDate], m1.[ClosePrice],
m2.[StockDate], m2.[ClosePrice],
m3.[StockDate], m3.[ClosePrice],
m4.[StockDate], m4.[ClosePrice],
COALESCE(m2.[ClosePrice],m3.[ClosePrice], m4.[ClosePrice]) AS PreviousClosePrice
FROM [dbo].[MicrosiftStockHistory] m1
LEFT JOIN [dbo].[MicrosiftStockHistory] m2
ON m1.[StockDate] = DATEADD(DD,1, m2.[StockDate])
LEFT JOIN [dbo].[MicrosiftStockHistory] m3
ON m1.[StockDate] = DATEADD(DD,2, m3.[StockDate])
LEFT JOIN [dbo].[MicrosiftStockHistory] m4
ON m1.[StockDate] = DATEADD(DD,3, m4.[StockDate])
ORDER BY m1.[StockDate]
我们增加了两天假期和三天假期的处理。执行结果如下:
从执行结果上看好了很多,但是如果假期更多,例如像国庆或者春节,可能要休息七天,难道要写7个左联操作 ?
表的自联加窗口函数的解决方案
现在最大的问题是因为假期,日期是不连续的,造成左联操作产生空值,如果我们认为制造一个连续的列代替日期列,问题也就解决了。
;WITH PreviousPrice as (
SELECT
StockDate,
ClosePrice,
ROW_NUMBER() over (PARTITION BY 1 ORDER BY StockDate) as RN
from [dbo].[MicrosiftStockHistory]
)
SELECT
c.[StockDate]
,c.[ClosePrice]
,p. ClosePrice as PreviousPrice
from PreviousPrice c
LEFT JOIN PreviousPrice p
on c.RN = p.RN +1
- 我们定义了一个CTE,以该CTE进行左联操作,避免嵌套子查询。
- CTE中包含一个通过窗口函数ROW_NUMBER,生成序号列,该列是按照日期升序排序,且连续的。PARTITION BY 1表示窗口是全表数据。
- PreviousPrice左联自己,获得前一个交易日的收盘数据,左联条件是序号相差1的,也就是前一个交易日的。
分析函数FIRST_VALUE的解决方案
上面的解决方案相比于第一种解决方案,不再需要考虑具体是几天假期了,但是还是需要联表操作。通过分析函数FIRST_VALUE我们可以省去该操作。
SELECT
[StockDate],
[ClosePrice],
FIRST_VALUE([ClosePrice]) over(
PARTITION BY 1
ORDER BY [StockDate] ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
) as PreviousClosePrice
FROM [dbo].[MicrosiftStockHistory]
- 分析函数FIRST_VALUE表示取当前窗口内有序数据的第一个
- 我们将当前Framing设置为从当前行的上一日到当前行,也就是每个窗口两条数据,按照日期升序排列,我们取第一条,执行结果如下:
因为7月1号的数据就是最早的了,它没有前面一条,所以它的窗口只有一条,也就是它自己。
分析函数LAG的解决方案
上面的解决方案还需要设置Framing,使用LAG函数可以更加简单
SELECT
[StockDate],
[ClosePrice],
LAG([ClosePrice]) OVER(
PARTITION BY 1
ORDER BY [StockDate]
) AS PreviousClosePrice
FROM [dbo].[MicrosiftStockHistory]
LAG函数本身就具有取上面一条数据的功能,所以我们只需要设置好按照StockDate升序排列即可。
更复杂的需求
现在需求变更了,MicrosiftStockHistory表更名为StockHistory,不再只存MS的股票数据了,而是可能存储多只股票的数据,增加了一个StockId列表示不同的股票数据,建表语句如下:
create table StockHistory( Id int primary key identity(1,1), StockId int not null, StockDate date not null , OpenPrice decimal(18,2) not null, High decimal(18,2) not null, Low decimal(18,2) not null, Volumn int not null, ClosePrice decimal(18,2) not null, AdjClose decimal(18,2) not null, )
初始化数据见附录。
需求变为获取每只股票每个交易日和前一个交易日的收盘价。
表的自联加窗口函数的解决方案
;WITH PreviousPrice as (
SELECT
[StockId],
[StockDate],
[ClosePrice],
ROW_NUMBER() OVER (PARTITION BY [StockId] ORDER BY StockDate) as RN
FROM [dbo].[StockHistory]
)
SELECT
c.[StockId]
,c.[StockDate]
,c.[ClosePrice]
,p. ClosePrice as PreviousClosePrice
FROM PreviousPrice c
LEFT JOIN PreviousPrice p
ON c.RN = p.RN +1 and c.StockId = p.StockId
- CTE中的窗口不再是整个数据表而是每个STockId一个窗口。
- 左联时候,增加STockId相等的限制。
分析函数FIRST_VALUE的解决方案
SELECT
[StockId],
[StockDate],
[ClosePrice],
FIRST_VALUE([ClosePrice]) over(
PARTITION BY [StockId]
ORDER BY [StockDate] ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
) as PreviousClosePrice
FROM [dbo].[StockHistory]
分析函数LAG的解决方案
SELECT
[StockId],
[StockDate],
[ClosePrice],
LAG([ClosePrice]) OVER(
PARTITION BY [StockId]
ORDER BY [StockDate]
) AS PreviousClosePrice
FROM [dbo].[StockHistory]
附录
添加MS股票数据
insert into MicrosiftStockHistory (StockDate,OpenPrice, High, Low , ClosePrice, Volumn, AdjClose) values
('2010-07-01', 23.09,23.32, 23.73,23.16, 92239400, 19.67),
('2010-07-02', 23.36,23.48, 23.05,23.27, 62485100, 19.76),
('2010-07-06', 23.70,24.09, 23.58,23.82, 73592000, 20.23),
('2010-07-07', 23.82,24.32, 23.61,24.30, 79965300, 20.64),
('2010-07-08', 24.60,24.62, 23.97,24.41, 50758100, 20.73),
('2010-07-09', 24.33,24.41, 24.15,24.27, 53806100, 20.61),
('2010-07-12', 24.43,24.89, 24.42,24.83, 49854200, 21.09),
('2010-07-13', 25.14,25.30, 24.90,25.13, 61928700, 21.34),
('2010-07-14', 25.50,25.61, 25.12,25.44, 72808100, 21.61),
('2010-07-15', 25.50,25.59, 24.98,25.51, 56934700, 21.67),
('2010-07-16', 25.51,25.64, 24.88,24.89, 65064800, 21.14),
('2010-07-19', 24.96,25.30, 24.91,25.23, 38181800, 21.43),
('2010-07-20', 24.86,25.48, 24.70,25.48, 45530700, 21.64),
('2010-07-21', 25.60,25.65, 24.98,25.12, 73297300, 21.33),
('2010-07-22', 25.51,25.99, 25.47,25.84, 73016400, 21.95),
('2010-07-23', 25.84,26.02, 25.25,25.81, 108520000, 19.67),
('2010-07-26', 25.86,26.20, 25.80,26.10, 67249900, 22.17),
('2010-07-27', 26.14,26.24, 25.96,26.16, 60672100, 22.22)
添加所有股票数据
insert into StockHistory (StockId, StockDate,OpenPrice, High, Low , ClosePrice, Volumn, AdjClose) values
(1,'2010-07-01', 23.09,23.32, 23.73,23.16, 92239400, 19.67),
(1,'2010-07-02', 23.36,23.48, 23.05,23.27, 62485100, 19.76),
(1,'2010-07-06', 23.70,24.09, 23.58,23.82, 73592000, 20.23),
(1,'2010-07-07', 23.82,24.32, 23.61,24.30, 79965300, 20.64),
(1,'2010-07-08', 24.60,24.62, 23.97,24.41, 50758100, 20.73),
(1,'2010-07-09', 24.33,24.41, 24.15,24.27, 53806100, 20.61),
(1,'2010-07-12', 24.43,24.89, 24.42,24.83, 49854200, 21.09),
(1,'2010-07-13', 25.14,25.30, 24.90,25.13, 61928700, 21.34),
(1,'2010-07-14', 25.50,25.61, 25.12,25.44, 72808100, 21.61),
(1,'2010-07-15', 25.50,25.59, 24.98,25.51, 56934700, 21.67),
(1,'2010-07-16', 25.51,25.64, 24.88,24.89, 65064800, 21.14),
(1,'2010-07-19', 24.96,25.30, 24.91,25.23, 38181800, 21.43),
(1,'2010-07-20', 24.86,25.48, 24.70,25.48, 45530700, 21.64),
(1,'2010-07-21', 25.60,25.65, 24.98,25.12, 73297300, 21.33),
(1,'2010-07-22', 25.51,25.99, 25.47,25.84, 73016400, 21.95),
(1,'2010-07-23', 25.84,26.02, 25.25,25.81, 108520000, 19.67),
(1,'2010-07-26', 25.86,26.20, 25.80,26.10, 67249900, 22.17),
(1,'2010-07-27', 26.14,26.24, 25.96,26.16, 60672100, 22.22),
(2,'2010-07-01', 23.09,23.32, 23.73,13.16, 92239400, 19.67),
(2,'2010-07-02', 23.36,23.48, 23.05,13.27, 62485100, 19.76),
(2,'2010-07-06', 23.70,24.09, 23.58,13.82, 73592000, 20.23),
(2,'2010-07-07', 23.82,24.32, 23.61,14.30, 79965300, 20.64),
(2,'2010-07-08', 24.60,24.62, 23.97,14.41, 50758100, 20.73),
(2,'2010-07-09', 24.33,24.41, 24.15,14.27, 53806100, 20.61),
(2,'2010-07-12', 24.43,24.89, 24.42,14.83, 49854200, 21.09),
(2,'2010-07-13', 25.14,25.30, 24.90,15.13, 61928700, 21.34),
(2,'2010-07-14', 25.50,25.61, 25.12,15.44, 72808100, 21.61),
(2,'2010-07-15', 25.50,25.59, 24.98,15.51, 56934700, 21.67),
(2,'2010-07-16', 25.51,25.64, 24.88,14.89, 65064800, 21.14),
(2,'2010-07-19', 24.96,25.30, 24.91,15.23, 38181800, 21.43),
(2,'2010-07-20', 24.86,25.48, 24.70,15.48, 45530700, 21.64),
(2,'2010-07-21', 25.60,25.65, 24.98,15.12, 73297300, 21.33),
(2,'2010-07-22', 25.51,25.99, 25.47,15.84, 73016400, 21.95),
(2,'2010-07-23', 25.84,26.02, 25.25,15.81, 108520000, 19.67),
(2,'2010-07-26', 25.86,26.20, 25.80,16.10, 67249900, 22.17),
(2,'2010-07-27', 26.14,26.24, 25.96,16.16, 60672100, 22.22),
(3,'2010-07-01', 23.09,23.32, 23.73,33.16, 92239400, 19.67),
(3,'2010-07-02', 23.36,23.48, 23.05,33.27, 62485100, 19.76),
(3,'2010-07-06', 23.70,24.09, 23.58,33.82, 73592000, 20.23),
(3,'2010-07-07', 23.82,24.32, 23.61,34.30, 79965300, 20.64),
(3,'2010-07-08', 24.60,24.62, 23.97,34.41, 50758100, 20.73),
(3,'2010-07-09', 24.33,24.41, 24.15,34.27, 53806100, 20.61),
(3,'2010-07-12', 24.43,24.89, 24.42,34.83, 49854200, 21.09),
(3,'2010-07-13', 25.14,25.30, 24.90,35.13, 61928700, 21.34),
|