[AI] Memory-Associated Differential Learning: Paper and Code Walkthrough
Paper source:
Paper PDF: Memory-Associated Differential Learning paper
Paper code: Memory-Associated Differential Learning code

Paper walkthrough:

1. Abstract

Conventional Supervised Learning approaches focus on the mapping from input features to output labels. After training, the learnt models alone are applied to testing features to predict testing labels in isolation, with the training data wasted and their associations ignored. To take full advantage of the vast number of training data and their associations, we propose a novel learning paradigm called Memory-Associated Differential (MAD) Learning. We first introduce an additional component called Memory to memorize all the training data. Then we learn the differences of labels as well as the associations of features through the combination of a differential equation and some sampling methods. Finally, in the evaluating phase, we predict unknown labels by inferencing from the memorized facts plus the learnt differences and associations in a geometrically meaningful manner. We gently build this theory in unary situations and apply it to Image Recognition, then extend it into Link Prediction as a binary situation, in which our method outperforms strong state-of-the-art baselines on three citation networks and the ogbl-ddi dataset.

2. Introduction
3. Related Works

Instead of treating External Memory as a way to add more learnable parameters that store uninterpretable hidden states, we try to memorize the facts as they are, and then learn the differences and associations between them. Most of the experiments in this article are designed to solve the Link Prediction problem: predicting whether a pair of nodes in a graph are likely to be connected, how much weight their edge bears, or what attributes their edge should have. Although our method is derived from a different perspective, we point out that Matrix Factorization can be seen as a simplification of MAD Learning with no memory and no sampling.

4. Proposed Approach

4.1 Memory-Associated Differential Learning

By applying the Mean Value Theorem for Definite Integrals [Comenetz, 2002], we can estimate the unknown y from a known y0 if x0 is close enough to x:

$$y - y_0 = \int_{x_0}^{x} f'(t)\,dt = (x - x_0)\,f'(\xi), \quad \xi \in (x_0, x),$$

so that, approximating the derivative f'(ξ) with a learnable function g, the estimation referenced on a memorized pair (x0, y0) is

$$\hat{y}\big|_{y_0} = y_0 + (x - x_0)\,g(x).$$

(A runnable sketch of this estimator is given in the code walkthrough below.)

4.2 Inferencing from Multiple References

To get a steady and accurate estimation of y, we can sample n references x1, x2, ..., xn to get n estimations ŷ|y1, ŷ|y2, ..., ŷ|yn and combine them with an aggregator such as the mean:

$$\hat{y} = \frac{1}{n}\sum_{i=1}^{n}\hat{y}\big|_{y_i}.$$

4.3 Soft Sentinels and Uncertainty

We introduce a mechanism on top of Softmin named the Soft Sentinel. A Soft Sentinel is a dummy element mixed into the array of estimations that carries no information (e.g. its logit is 0) but has a set distance (e.g. 1). The estimation after k Soft Sentinels at distance 1 are added is:

$$\hat{y} = \frac{\sum_{i=1}^{n} e^{-d_i}\,\hat{y}|_{y_i}}{\sum_{j=1}^{n} e^{-d_j} + k\,e^{-1}},$$

where d_i is the distance of the i-th reference to the query; when every real reference is far away, the sentinels absorb most of the weight and pull the aggregate back toward the uninformative logit 0.

4.4 Other Details

To adapt to this situation, we generally wrap the memory with an adaptor function m such as a one-layer MLP, getting

$$\hat{y}\big|_{y_0} = m(y_0) + \big(f(x) - f(x_0)\big)\cdot g(x),$$

where g(x) stands for the gradient. When the encodings of nodes are dynamic and no features are provided, we usually adopt Random Mode in the training phase for efficiency and Dynamic NN Mode in the evaluation phase for performance.

4.5 Binary MAD Learning

5. Experiments

In the training phase, we sample arbitrary pairs of nodes to construct negative samples [Grover and Leskovec, 2016] and compare the scores of connected pairs against negative samples with Cross-Entropy as the loss function:

$$\mathcal{L} = -\frac{1}{y}\sum_{i=1}^{y}\log p_y(i) - \frac{1}{n}\sum_{i=1}^{n}\log\big(1 - p_n(i)\big),$$

where y is the number of positive samples, n the number of negative samples, p_y(i) is the predicted probability of the i-th positive sample and p_n(i) that of the i-th negative sample.

In the evaluating phase, we record the scores not only in Dynamic NN Mode but also in Random Mode. We have three experimental settings to examine the contribution of Softmin and Soft Sentinels:

- mean: estimations are aggregated by the mean function;
- softmin: estimations are aggregated by Softmin;
- softmin + sentinels: estimations are aggregated by Softmin with Soft Sentinels mixed in.

We repeat that MAD Learning does not predict directly. From another point of view, this experiment implies that indirect references can be beneficial on par with direct information.

6. Discussion

By extending the output from a scalar to a vector, MAD Learning can be used for graphs with featured edges. We also point out that MAD Learning can learn relations in heterogeneous graphs where nodes belong to different types (usually represented by encodings of different lengths). The only requirement is that the positions of the source nodes should match the gradients of the destination nodes, and vice versa.

7. Conclusion

In this work, we explore a novel learning paradigm which is flexible, effective and interpretable. The outstanding results, especially on Link Prediction, open the door for several research directions:
Code walkthrough:
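The repository linked above contains the authors' implementation, which is not reproduced in this post. In its place, here are minimal PyTorch sketches of the main ideas. They are illustrative assumptions, not the paper's actual code: class and argument names (`MADRegressor`, `n_refs`, `dim`) are invented, and the encoder `f`, gradient function `g` and adaptor `m` are reduced to single linear layers. The first sketch covers the unary estimator of Sections 4.1, 4.2 and 4.4:

```python
import torch
import torch.nn as nn

class MADRegressor(nn.Module):
    """Unary MAD Learning sketch: memorize all training pairs (x, y), then
    estimate y-hat|y0 = m(y0) + (f(x) - f(x0)) . g(x) from sampled references."""

    def __init__(self, x_train, y_train, dim=16):
        super().__init__()
        # Memory: the training data stored as-is, not as learnable parameters.
        self.register_buffer("mem_x", x_train)  # [N, 1]
        self.register_buffer("mem_y", y_train)  # [N, 1]
        self.f = nn.Linear(1, dim)  # position encoding f(x)
        self.g = nn.Linear(1, dim)  # learnable "gradient" g(x)
        self.m = nn.Linear(1, 1)    # one-layer MLP adaptor over memorized labels

    def forward(self, x, n_refs=8):
        # Random Mode: draw n_refs references from memory for each query.
        idx = torch.randint(0, self.mem_x.size(0), (x.size(0), n_refs))
        x0, y0 = self.mem_x[idx], self.mem_y[idx]        # [B, n_refs, 1]
        # Differential estimation referenced on each memorized (x0, y0).
        diff = self.f(x).unsqueeze(1) - self.f(x0)       # f(x) - f(x0)
        est = self.m(y0) + (diff * self.g(x).unsqueeze(1)).sum(-1, keepdim=True)
        # Aggregate the per-reference estimations with the mean (Section 4.2).
        return est.mean(dim=1)                           # [B, 1]
```

In the paper's terms, `mem_x`/`mem_y` play the role of the Memory, resampling `idx` at every forward pass is Random Mode, and replacing the random `idx` with nearest neighbours of `f(x)` would give Dynamic NN Mode.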
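Section 4.3's aggregation can replace the plain mean above. A sketch of Softmin with Soft Sentinels, assuming the distances are computed elsewhere (e.g. from the norm of f(x) − f(x_i)); the function name is invented:

```python
import torch

def softmin_with_sentinels(estimates, distances, k=1, sentinel_dist=1.0):
    """Softmin aggregation with k Soft Sentinels: dummy estimations with
    logit 0 at a fixed distance, which soak up weight whenever all real
    references are far away from the query.

    estimates: [B, n] per-reference estimations (logits)
    distances: [B, n] distance of each reference to the query
    """
    b = estimates.size(0)
    sent_est = estimates.new_zeros((b, k))                # logit 0: no information
    sent_dist = distances.new_full((b, k), sentinel_dist) # fixed distance
    all_est = torch.cat([estimates, sent_est], dim=1)     # [B, n+k]
    all_dist = torch.cat([distances, sent_dist], dim=1)   # [B, n+k]
    weights = torch.softmax(-all_dist, dim=1)             # Softmin over distances
    return (weights * all_est).sum(dim=1)                 # [B]
```

With k = 0 this reduces to plain Softmin; as every real distance grows, the k·e^{-1} term dominates the denominator and the aggregate falls back toward 0, exactly the behaviour of the formula in Section 4.3.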
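Finally, the training objective of Section 5. The cross-entropy over positive pairs and sampled negative pairs is standard; this sketch assumes the model emits raw scores that are squashed by a sigmoid:

```python
import torch

def link_prediction_loss(pos_scores, neg_scores, eps=1e-9):
    """L = -mean(log p_y) - mean(log(1 - p_n)) over positive/negative pairs."""
    p_pos = torch.sigmoid(pos_scores)   # p_y(i): probability of positive pairs
    p_neg = torch.sigmoid(neg_scores)   # p_n(i): probability of negative pairs
    return -(torch.log(p_pos + eps).mean()
             + torch.log(1.0 - p_neg + eps).mean())
```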