DGCN Paper Reading Notes
1. Network Architecture
The main idea of the paper: 
The DGCN network framework: 
The input consists of three parts: 1. the recent-period data X_Th; 2. the daily-period data X_Td; 3. the weekly-period data X_Tw.
The network consists of two main parts: 1. the LMLN (blue part) and 2. the GCN (gray part).
1.1 LMLN: Laplace Matrix Latent Network
See the blue part of Fig. 2: it consists of one Global Laplace Matrix Learning Layer and two Laplace matrix prediction units.
The structure of a Laplace matrix prediction unit is as follows; it includes three parts: 1. feature sampling; 2. spatial attention; 3. an LSTM unit: 
(1) Global Laplace Matrix Learning Layer*
(Reference [17] is cited here; the idea presumably comes from there.)
The following mathematical operation is performed here: 
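As a rough illustration, here is a minimal PyTorch sketch of such a trainable global Laplace matrix: it assumes the layer just re-normalizes a free N×N parameter into a normalized Laplacian that is shared across all time steps and optimized during training; the paper's exact parameterization may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalLaplaceLearningLayer(nn.Module):
    """Sketch: a trainable N x N matrix optimized jointly with the network
    ("global-temporal-sharing"). The normalization used here is an assumption."""
    def __init__(self, num_nodes: int):
        super().__init__()
        # Free parameter shared across all time steps.
        self.W = nn.Parameter(torch.randn(num_nodes, num_nodes) * 0.01)

    def forward(self) -> torch.Tensor:
        # Symmetrize and keep weights non-negative so a graph interpretation holds.
        A = F.relu(self.W + self.W.t())
        # Normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
        d = A.sum(dim=1).clamp(min=1e-6)
        d_inv_sqrt = d.pow(-0.5)
        L = torch.eye(A.size(0)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
        return L  # the "global Laplace matrix" L_res
```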
(2) Feature Sampling
The purpose of this module is to reduce the data dimension of the traffic features according to the importance of different time intervals.
The input data is divided into k groups along the temporal dimension, each of length T. Except for the most recent T features, each of the remaining k−1 groups fuses its T time-step features into one.
The operation on the first k−1 groups can be expressed as:  After Feature Sampling, the data dimensions become:  The fused k−1 features and the last T features then continue to be learned through two different Spatial Attention modules, as sketched below.
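A minimal sketch of this grouping-and-fusing step, assuming the fusion of each T-step group is a learnable 1×T convolution (the paper's actual fusion operator may differ):

```python
import torch
import torch.nn as nn

class FeatureSampling(nn.Module):
    """Sketch: keep the most recent T steps, fuse each earlier T-step group
    into a single step with a strided 1 x T convolution (an assumption)."""
    def __init__(self, channels: int, T: int):
        super().__init__()
        self.T = T
        # Stride T yields exactly one output step per group of T inputs.
        self.fuse = nn.Conv2d(channels, channels, kernel_size=(1, T), stride=(1, T))

    def forward(self, x: torch.Tensor):
        # x: (batch, channels, N, k*T) -- k groups of length T along time.
        recent = x[..., -self.T:]   # the most recent T steps, kept as-is
        past = x[..., :-self.T]     # the earlier (k-1) groups
        fused = self.fuse(past)     # (batch, channels, N, k-1)
        return fused, recent        # later fed to two spatial attentions
```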
(3) Spatial Attention
The computation of Spatial Attention borrows the self-attention idea from the Transformer, but the concrete computation differs:  The fused k−1 features pass through the attention mechanism to produce k−1 adjacency matrices:  The last T features pass through the attention mechanism to produce T adjacency matrices:  The final output of this layer is the k+T−1 adjacency matrices: ![](https://img-blog.csdnimg.cn/9029eb833b11495781b3a3bdca37e6ae.png)
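A hedged sketch of how per-step attention can yield one N×N adjacency matrix per time step. The generic query/key score used here is an assumption: the paper's score function deliberately departs from Transformer self-attention, and its exact form is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAttention(nn.Module):
    """Sketch: node-to-node attention that yields one row-normalized
    N x N adjacency matrix per time step."""
    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Linear(channels, channels)
        self.key = nn.Linear(channels, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, steps, N, channels); one adjacency matrix per step.
        q, k = self.query(x), self.key(x)
        scores = q @ k.transpose(-1, -2) / (x.size(-1) ** 0.5)  # (batch, steps, N, N)
        return F.softmax(scores, dim=-1)  # row-normalized adjacency matrices
```

Applying this to the fused k−1 features and the last T features separately gives the k−1 and T adjacency matrices described above.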
(4) LSTM Unit
The k+T−1 N×N Laplace matrices are fused along the time axis by an LSTM; the LSTM is just a standard LSTM: 
Through the LSTM we obtain the final output of the Laplace matrix prediction unit, h_k+T-1.
Combining it with the global Laplace matrix L_res output by the Global Laplace Matrix Learning Layer gives the output L_p of the Laplace matrix prediction network:  The LMLN algorithm can be summarized in the following pseudocode, and a sketch of the fusion step is given below: 
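A minimal sketch of this fusion, assuming each N×N Laplace matrix is flattened into a vector for a standard LSTM and that h_{k+T-1} is combined with L_res by simple addition (the combination rule is an assumption):

```python
import torch
import torch.nn as nn

class LaplacePredictionUnit(nn.Module):
    """Sketch: fuse the k+T-1 Laplace matrices along time with a plain LSTM,
    then combine the last hidden state with the global matrix L_res."""
    def __init__(self, num_nodes: int):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_nodes * num_nodes,
                            hidden_size=num_nodes * num_nodes,
                            batch_first=True)

    def forward(self, laplace_seq: torch.Tensor, L_res: torch.Tensor) -> torch.Tensor:
        # laplace_seq: (batch, k+T-1, N, N); L_res: (N, N) from the global layer.
        b, s, n, _ = laplace_seq.shape
        h_seq, _ = self.lstm(laplace_seq.reshape(b, s, n * n))
        h_last = h_seq[:, -1].reshape(b, n, n)  # h_{k+T-1}
        return h_last + L_res                   # predicted Laplace matrix L_p
```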
1.2 GCN
The GCN is the gray part of Fig. 2 and consists of four main parts.
(1) Temporal Convolution Layer
The purpose of this module is to extract the high-dimensional local temporal information from the original traffic data.
For the input, the TCL operation can be expressed as: 
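A minimal sketch of such a temporal convolution, assuming a standard 2D convolution along the time axis; the kernel width and channel count here are hypothetical choices, not the paper's settings.

```python
import torch.nn as nn

# Sketch: convolve along the time axis only (kernel height 1 keeps nodes
# independent), lifting the raw traffic features to a higher dimension.
temporal_conv = nn.Conv2d(in_channels=1, out_channels=64,
                          kernel_size=(1, 3), padding=(0, 1))
# input  x: (batch, 1, N, steps)  ->  output: (batch, 64, N, steps)
```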
(2) Graph Temporal Convolution Layer*

(3) Temporal Attention*
This borrows the Temporal Attention from reference [19], whose purpose is to explore the long-range time relation.
Its computation formula is:  where: 
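Since the module is borrowed from [19] (ASTGCN), here is a sketch following ASTGCN's published temporal attention, E = V_e · σ((Xᵀ U1) U2 (U3 X) + b_e), followed by a softmax over the time axis; treat it as an approximation of this layer rather than the paper's exact code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalAttention(nn.Module):
    """Sketch of ASTGCN-style temporal attention (reference [19])."""
    def __init__(self, num_nodes: int, channels: int, steps: int):
        super().__init__()
        self.U1 = nn.Parameter(torch.randn(num_nodes))
        self.U2 = nn.Parameter(torch.randn(channels, num_nodes))
        self.U3 = nn.Parameter(torch.randn(channels))
        self.Ve = nn.Parameter(torch.randn(steps, steps))
        self.be = nn.Parameter(torch.zeros(steps, steps))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, N, C, T)
        lhs = (x.permute(0, 3, 2, 1) @ self.U1) @ self.U2   # (batch, T, N)
        rhs = torch.einsum('c,bnct->bnt', self.U3, x)       # (batch, N, T)
        E = self.Ve @ torch.sigmoid(lhs @ rhs + self.be)    # (batch, T, T)
        E = F.softmax(E, dim=1)
        b, n, c, t = x.shape
        # Reweight the input along the time axis with the attention scores.
        return (x.reshape(b, -1, t) @ E).reshape(b, n, c, t)
```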
(4) Batch Norm

The overall computation of the GCN module is as follows: 
1.3 Loss Function
The Output Layer processes the recent data, the daily-period data, and the weekly-period data separately.
The reason is that the recent data is continuous, whereas the daily-period and weekly-period data have been through the sampling step and are split into k−1 units that are discontinuous along the time axis.
Therefore, Conv_1×i is used to deal with the recent data, and Conv_1×1 to handle the k−1 feature units independently in the daily-period and weekly-period data.
Finally, all the convolution outputs are taken as the model's prediction, and the loss is computed with l2_loss. 
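A sketch of this output stage under the assumptions above. The kernel width of the recent-data head (the "i" in Conv_1×i) and the channel sizes are hypothetical, since the paper's values are not given here.

```python
import torch
import torch.nn as nn

# Two separate heads: a wider temporal kernel for the continuous recent data,
# and a 1x1 kernel that treats each sampled daily/weekly unit independently.
head_recent = nn.Conv2d(64, 1, kernel_size=(1, 3))  # "Conv_1xi"; i = 3 is assumed
head_period = nn.Conv2d(64, 1, kernel_size=(1, 1))  # "Conv_1x1"

def l2_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Squared-error (l2) loss between prediction and ground truth.
    return ((pred - target) ** 2).mean()
```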
2. Experiments
2.1 Experimental Results
GCRN: Y. Seo, M. Defferrard, P. Vandergheynst, and X. Bresson, “Structured sequence modeling with graph convolutional recurrent networks,” in Proc. Int. Conf. Learn. Represent. (ICLR), 2017, pp. 362–373.
Gated-STGCN: B. Yu, H. Yin, and Z. Zhu, “Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting,” in Proc. Int. Joint Conf. Artif. Intell. (IJCAI), Jul. 2018, pp. 1–9.
ASTGCN: S. Guo, Y. Lin, N. Feng, C. Song, and H. Wan, “Attention based spatial-temporal graph convolutional networks for traffic flow forecasting,” in Proc. Assoc. Advance Artif. Intell. Conf. (AAAI), 2019, pp. 922–933.
2.2 Ablation Study

3. Introduction
The paper's research motivation: Thus, a data-driven method [17] was proposed to optimize a parameterized global-temporal-sharing Laplace matrix in the network training phase, and it obtained richer space connections compared to the empirical one [15], [16]. But, this graph's Laplace matrix is still fixed in the prediction phase, which cannot capture the dynamic information of the graph to improve the forecasting accuracy.
An important source of inspiration for this paper, the AAAI'19 ASTGCN: an attention mechanism based Spatial-Temporal Graph Convolutional Network (ASTGCN) [18] was presented, in which a dynamic Laplace matrix of the graph was constructed at each input sequence data by the spatial attention mechanism [19].
Two shortcomings of ASTGCN: ① it still utilized the empirical Laplace matrix [15], [16] as a mask; ② it ignored the inner temporal connection between Laplace matrices of the adjacent periods.