Empowering A* Algorithm With Neuralized Variational Heuristics for Fastest Route Recommendation

IEEE TKDE 2023

Paper
无开源代码

0. Summary

0.1 本文解决的问题

把路线推荐算法当作路线规划问题，使用A*算法。SOTA模型使用神经网络估计g()和h()函数。

h()不应当被过度估计，否则会产生次优解。
g()应当足够精确。现有方法分别预测每一个边的时间，最后把所有时间相加。应当结合已通过路径的上下文和顺序性特征，预测整个路径的通过时间。

0.2 本文的主要贡献

用序列模型预测整个已通过路径的时间。
提出使用variational inference学习h()的分布，因此可以获得一个admissible的（高置信的）h()值，避免次优解。
用GAN生成类似g()估计器的路径表示，用于把g()中获得的最快路径信息传输给h()

1. Background

路径推荐任务：基于历史路径数据，给定source node 和 destination node，预测 most likely route。

1.1 Baselines

NASF: 使用A*和神经网络估计g()和h()

1.2 Problem Formulation

最佳路径 $p^*$ 由路网图 $G=(V, E)$ 和 query $q=[v_s,v_d]$ 预测得到：

p^*=\mathcal{F}(G,q)

A*算法:

g(): observable path / path-aware

h(): path-blind

f(\cdot)=g(\cdot)+h(\cdot)

2. Method

framework

2.1 g()估计器

2.1.1 Attribute Embedding Component

Topological features

使用 DeepWalk 进行 Edge embedding。

Temporal features

按照^[1]的方法创建时域图。使用 Node2Vec 进行 Node embedding。

Temporal graph

External features

weather, holiday: one-hot encoding
recent traffic condition: normalized average traffic time of road segments in the last 24 hours

最后把以上信息Concatenate，得到每一个边上的representation： $\{r_{e_i}\}_{i \in [1, \ldots , t]}$ .

2.1.2 Sequentiality Modeling Component

使用BiLSTM处理每一个path上的edge。BiLSTM在第 $i\in (1,...,t)$ 个edge上的输出是 $h_{fd}^i$ 和 $h_{bd}^i$ 。

2.1.3 Multi-Task Learning Component

使用最后一个（第t个）edge上的 $h_{op}=h_{\mathrm{fd}}^{(t)}\oplus h_{\mathrm{bd}}^{(t)}$ 作为总可见路径的信息，同时把 $\{h_{sp_i}=h_\mathrm{fd}^{(i)}\}_{i \in [1, \ldots , t-1]}$ 作为从出发点走过的i个边组成的子路径的信息。

使用两个Head分别对subpath和path做时间预测：

[\hat{t}_{sp_1},\ldots,\hat{t}_{sp_{t-1}}]=MLP_1\left([h_{sp_1},\ldots,h_{sp_{t-1}}]\right)

\hat{t}_{op}= MLP_2(h_{op})

使用MSE作为Loss：

\mathcal{L}_{sp}(\theta_g)=\frac1{t-1}\sum_{i=1}^{t-1}\|\hat{t}_{sp_i}-t_{[1:i]}\|_2^2

\mathcal{L}_{op}(\theta_g)=\|\hat{t}_{op}-t\|_2^2

2.2 h()估计器

2.2.1 Representation Learning of Heuristic Inputs

$g()$ 的输入是整个可见路径 $\{e_1, \ldots, e_t\}$ ，但 $h()$ 的输入只有当前节点 $v_c$ 和终点 $v_d$ 。但是两个估计器的目的是一样的：估算最短时间。

第一个idea是仿照[3]的方法，用当前节点和终点之间的shortest path估计最短时间。但是shortest和fastest有所不同，因此直接使用会引入噪声。

第二个idea是同时学习 $g()$ 和 $h()$ ，让 $h()$ 学习 $g()$ 提取最快路径的信息。

Fastest Path - GAN

首先，手动找到从 $v_c$ 到 $v_d$ 的 fastest path，然后用 $g()$ 编码作为 ground truth $h_{fp}$ 。

generator: $h_{od}=\mathcal{G}(r_{v_c,v_d})$

$r_{v_c,v_d}$ 如何得到？论文原文：

Specifically, we first use the representation of road segments in the g(⋅) estimator to initialize the representation of the two nodes $r_{v_c,v_d}$

没懂

discriminator: $\mathcal D$

$\mathcal G$ 生成的 $h_{od}$ 在多个下游任务上同时训练：最快路径长度 $d$ 、最快路径转弯数等。

这里的 $\mathcal G$ 和 $\mathcal D$ 都是MLP。相当于强行让 $\mathcal G$ 记住，对于特定的图，任意两个节点之间的最快路径。

Shortest Path - GRU

按照[3]的经验， K shortest path也可以用来估计h。从 $v_c$ 到 $v_d$ 的最短路径作为 GRU 的输入，得到 $h_{sp}$ , 用来预测从 $v_c$ 到 $v_d$ 的最快时间。

最后结果采用 $h_{fp}$ 和 $h_{sp}$ 的动态门控加权和。

2.2.2 Variational Inference Based Prediction

[5]和[19]分别证明了高斯分布可以对交通情况和交通速度建模。把交通情况记作 $u_{od}$ ，把交通速度记作 $v_{od}$ ，假设这两个量 $P(u_{od}|r_{od})$ 和 $P(v_{od}|u_{od})$ 遵循高斯分布。

P(u_{od}|r_{od}) = \mathcal {N}(\mu _{u}, diag(\sigma ^{2}_{u}))

P(v_{od}|u_{od}) = \mathcal {N}(\mu _{v}, \sigma ^{2}_{v})

又因为 $t_{od} = \frac{d}{v_{od}}$ ，所以可以估计从节点 $v_c$ 到节点 $v_d$ 的时间。

论文原文说 $t_{od}$ 遵循Inverse Gaussian distribution。应该是错了。应该遵循Reciprocal normal distribution

两个条件分布可以看成是VAE，因此用VAE建模训练。

给定最短时间 $t_h$ 的置信度理论上是 $\tau _{t_{h}} = \int ^{t_{h}}_{0}{P(t,\mu _{t}^{\prime },\lambda _{t}^{\prime })dt}$ ，但还是额外学习了一个置信度。

如何获得Decoder输出的分布？贝叶斯线性回归？

Reference

Li, Yaguang, et al. Multi-task representation learning for travel time estimation. Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. 2018. ↩

机器学习

#机器学习 #论文阅读 #Route Recommendations

Empowering A* Algorithm With Neuralized Variational Heuristics for Fastest Route Recommendation

https://blog.wintertee.top/机器学习/Empowering-A-Algorithm-With-Neuralized-Variational-Heuristics-for-Fastest-Route-Recommendation/

作者

WinterTee

发布于

2024年4月28日

更新于

2024年5月6日

许可协议

NeuroMLR: Robust & Reliable Route Recommendation on Road Networks 下一篇