Information Retrieval Academic Digest [1.10]
cs.IR Information Retrieval, 4 papers in total
【1】 MGAE: Masked Autoencoders for Self-Supervised Learning on Graphs
Link: https://arxiv.org/abs/2201.02534
Abstract: We introduce a novel masked graph autoencoder (MGAE) framework to perform
effective learning on graph-structured data. Taking insights from
self-supervised learning, we randomly mask a large proportion of edges and try
to reconstruct these missing edges during training. MGAE has two core designs.
First, we find that masking a high ratio of the input graph structure, e.g.,
70%, yields a nontrivial and meaningful self-supervisory task that benefits
downstream applications. Second, we employ a graph neural network (GNN) as an
encoder to perform message propagation on the partially-masked graph. To
reconstruct the large number of masked edges, a tailored cross-correlation
decoder is proposed. It captures the cross-correlation between the head
and tail nodes of an anchor edge at multiple granularities. Coupling these two designs
enables MGAE to be trained efficiently and effectively. Extensive experiments
on multiple open datasets (Planetoid and OGB benchmarks) demonstrate that MGAE
generally performs better than state-of-the-art unsupervised learning
competitors on link prediction and node classification.
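
The edge-masking step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `mask_edges`, the seed, and the toy edge list are hypothetical, and only the 70% ratio comes from the abstract. The visible edges would be fed to the GNN encoder, and the masked edges would serve as reconstruction targets for the decoder.

```python
import random

def mask_edges(edges, mask_ratio=0.7, seed=0):
    """Randomly split an edge list into (visible, masked) edges.

    mask_ratio=0.7 follows the ratio quoted in the abstract; the name,
    signature, and seed are assumptions made for this illustration.
    """
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    num_masked = int(round(mask_ratio * len(shuffled)))
    masked = shuffled[:num_masked]    # targets to reconstruct
    visible = shuffled[num_masked:]   # edges the encoder is allowed to see
    return visible, masked

# Toy graph: all 10 undirected edges among 5 nodes (hypothetical data).
edges = [(i, j) for i in range(5) for j in range(i + 1, 5)]
visible, masked = mask_edges(edges)
print(len(visible), "visible edges,", len(masked), "masked edges")
```

Shuffling once and slicing keeps the two sets disjoint, so no masked edge can leak into the encoder input.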
【2】 SaL-Lightning Dataset: Search and Eye Gaze Behavior, Resource Interactions and Knowledge Gain during Web Search
Link: https://arxiv.org/abs/2201.02339
Comments: To be published at the 2022 ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR '22)
Abstract: The emerging research field of Search as Learning (SAL) investigates how the Web
facilitates learning through modern information retrieval systems. SAL research
requires significant amounts of data that capture both search behavior of users
and their acquired knowledge in order to obtain conclusive insights or train
supervised machine learning models. However, the creation of such datasets is
costly and requires interdisciplinary efforts in order to design studies and
capture a wide range of features. In this paper, we address this issue and
introduce an extensive dataset based on a user study, in which 114
participants were asked to learn about the formation of lightning and thunder.
Participants' knowledge states were measured before and after Web search
through multiple-choice questionnaires and essay-based free recall tasks. To
enable future research in SAL-related tasks we recorded a plethora of features
and person-related attributes. Besides the screen recordings, visited Web
pages, and detailed browsing histories, a large number of behavioral features
and resource features were monitored. We underline the usefulness of the
dataset by describing three already published use cases.
【3】 On the Effectiveness of Sampled Softmax Loss for Item Recommendation
Link: https://arxiv.org/abs/2201.02327
Comments: 10 pages, 1 figure, 5 tables
Abstract: Learning objectives of recommender models remain largely unexplored. Most
methods routinely adopt either pointwise or pairwise loss to train the model
parameters, while rarely paying attention to softmax loss due to its high
computational cost. Sampled softmax loss emerges as an efficient substitute for
softmax loss. Its special case, InfoNCE loss, has been widely used in
self-supervised learning and exhibited remarkable performance for contrastive
learning. Nonetheless, few studies use sampled softmax loss as the learning
objective to train the recommender. Worse still, none of them explores its
properties or answers "Does sampled softmax loss suit item recommendation?"
and "What are the conceptual advantages of sampled softmax loss, as compared
with the prevalent losses?", to the best of our knowledge. In this work, we aim
to better understand sampled softmax loss for item recommendation.
Specifically, we first theoretically reveal three model-agnostic advantages:
(1) mitigating popularity bias, which is beneficial to long-tail
recommendation; (2) mining hard negative samples, which offers informative
gradients to optimize model parameters; and (3) maximizing the ranking metric,
which facilitates top-K performance. Moreover, we probe the model-specific
characteristics on top of various recommenders. Experimental results
suggest that sampled softmax loss is friendlier to history- and graph-based
recommenders (e.g., SVD++ and LightGCN), but performs poorly for ID-based
models (e.g., MF). We ascribe this to its shortcoming in learning
representation magnitude: combined with models that are also incapable of
adjusting representation magnitude, it learns poor representations. In
contrast, the history- and graph-based models, which naturally adjust
representation magnitude according to node degree, are able to compensate for
the shortcoming of sampled softmax loss.
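
For readers unfamiliar with the loss discussed in this abstract, here is a minimal Python sketch of sampled softmax in its basic form: the negative log of the positive item's softmax probability, computed over the positive plus a handful of sampled negatives. It is an illustration under stated assumptions, not the paper's code: the function names, the uniform negative sampler, and the toy scores are hypothetical, and the sampling-probability correction and temperature that practical variants often add are left out.

```python
import math
import random

def sampled_softmax_loss(pos_score, neg_scores):
    """-log( exp(pos) / (exp(pos) + sum_j exp(neg_j)) ), computed stably.

    Basic form only: no sampling-probability correction and no temperature
    (both omitted here as a simplifying assumption).
    """
    logits = [pos_score] + list(neg_scores)
    m = max(logits)  # subtract the max before exponentiating (log-sum-exp trick)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - pos_score

def sample_negatives(num_items, positive_item, num_negatives, rng):
    """Uniformly sample distinct negative item ids (hypothetical helper)."""
    candidates = [i for i in range(num_items) if i != positive_item]
    return rng.sample(candidates, num_negatives)

rng = random.Random(0)
negative_ids = sample_negatives(num_items=10, positive_item=3, num_negatives=4, rng=rng)

# Toy scores: a high-scoring ("hard") negative raises the loss far more than
# an easy one, which is the hard-negative-mining behavior the abstract mentions.
loss_easy = sampled_softmax_loss(2.0, [-3.0, -3.0, -3.0])
loss_hard = sampled_softmax_loss(2.0, [1.9, -3.0, -3.0])
print(negative_ids, loss_easy, loss_hard)
```

Only the sampled negatives enter the denominator, which is what makes the loss cheaper than a full softmax over the whole item catalog.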
【4】 Multi-Behavior Enhanced Recommendation with Cross-Interaction Collaborative Relation Modeling
Link: https://arxiv.org/abs/2201.02307
Comments: Published at ICDE 2021
Abstract: Many previous studies aim to augment collaborative filtering with deep neural
network techniques, so as to achieve better recommendation performance.
However, most existing deep learning-based recommender systems are designed for
modeling a single type of user-item interaction behavior, which can hardly
distill the heterogeneous relations between user and item. In practical
recommendation scenarios, there exist multiple types of user behavior, such as
browsing and purchasing. Because they overlook users' multi-behavior patterns over
different items, existing recommendation methods are insufficient to capture
heterogeneous collaborative signals from user multi-behavior data. Inspired by
the strength of graph neural networks for structured data modeling, this work
proposes a Graph Neural Multi-Behavior Enhanced Recommendation (GNMR) framework
which explicitly models the dependencies between different types of user-item
interactions under a graph-based message passing architecture. GNMR devises a
relation aggregation network to model interaction heterogeneity, and
recursively performs embedding propagation between neighboring nodes over the
user-item interaction graph. Experiments on real-world recommendation datasets
show that our GNMR consistently outperforms state-of-the-art methods. The
source code is available at https://github.com/akaxlh/GNMR.
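
The idea of propagating embeddings separately per behavior type and then combining them can be sketched in a few lines of Python. This is a deliberately simplified illustration, not GNMR itself: uniform averaging stands in for the learned relation aggregation network described in the abstract, and all names and toy data below are hypothetical. The authors' actual implementation is in the repository linked above.

```python
def mean_vectors(vectors, dim):
    """Element-wise mean of a list of vectors; zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]

def propagate(embeddings, neighbors_by_behavior, dim):
    """One message-passing step over a multi-behavior user-item graph.

    embeddings: dict node -> list[float]
    neighbors_by_behavior: dict behavior -> dict node -> list of neighbor nodes
    Assumption: behaviors are combined by a plain average, whereas the paper
    describes a learned relation aggregation network.
    """
    updated = {}
    for node in embeddings:
        per_behavior = []
        for adjacency in neighbors_by_behavior.values():
            neighbor_vecs = [embeddings[n] for n in adjacency.get(node, [])]
            per_behavior.append(mean_vectors(neighbor_vecs, dim))
        updated[node] = mean_vectors(per_behavior, dim)
    return updated

# Toy data: one user, two items, two behavior types.
embeddings = {"u1": [0.0, 0.0], "i1": [1.0, 0.0], "i2": [0.0, 1.0]}
neighbors_by_behavior = {
    "browse": {"u1": ["i1", "i2"], "i1": ["u1"], "i2": ["u1"]},
    "purchase": {"u1": ["i2"], "i2": ["u1"]},
}
updated = propagate(embeddings, neighbors_by_behavior, dim=2)
print(updated["u1"])
```

Calling `propagate` repeatedly on its own output corresponds to the recursive embedding propagation mentioned in the abstract, with each call reaching one hop further in the interaction graph.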