Commit 9293d10

add code
12 files changed: +1529 −0 lines changed

README.md

Lines changed: 119 additions & 0 deletions
# Inductive Representation Learning on Temporal Graphs

## Introduction

The evolving nature of temporal dynamic graphs requires handling new nodes as well as capturing temporal patterns. The node embeddings, as functions of time, should represent both the static node features and the evolving topological structures.

We propose the temporal graph attention (TGAT) layer to efficiently aggregate temporal-topological neighborhood features and to learn the time-feature interactions. By stacking TGAT layers, the network treats node embeddings as functions of time and can inductively infer embeddings for both new and observed nodes as the graph evolves.

The proposed approach handles both the node classification and link prediction tasks, and can be naturally extended to include temporal edge features.

#### Paper link: [Inductive Representation Learning on Temporal Graphs](https://openreview.net/attachment?id=rJeW1yHYwH&name=original_pdf)
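
For intuition, the core building block is a functional encoding of time that plays the role of positional encoding. Below is a minimal sketch of such a module; the class name, frequency initialization, and `time_dim` parameter are assumptions for illustration, and the layer actually shipped in this commit may differ:

```{python}
import numpy as np
import torch
import torch.nn as nn

# Hedged sketch of a TGAT-style functional time encoding: timestamps are
# mapped to a vector of harmonics with learnable frequencies and phases.
class TimeEncode(nn.Module):
    def __init__(self, time_dim):
        super(TimeEncode, self).__init__()
        self.basis_freq = nn.Parameter(
            torch.from_numpy(1 / 10 ** np.linspace(0, 9, time_dim)).float())
        self.phase = nn.Parameter(torch.zeros(time_dim).float())

    def forward(self, ts):
        # ts: [batch, seq_len] -> harmonic features: [batch, seq_len, time_dim]
        batch_size, seq_len = ts.size(0), ts.size(1)
        map_ts = ts.view(batch_size, seq_len, 1) * self.basis_freq.view(1, 1, -1)
        map_ts = map_ts + self.phase.view(1, 1, -1)
        return torch.cos(map_ts)
```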
## Running the experiments

### Dataset and preprocessing

#### Download the public data

* [Reddit](http://snap.stanford.edu/jodie/reddit.csv)

* [Wikipedia](http://snap.stanford.edu/jodie/wikipedia.csv)

#### Preprocess the data
We use the dense `npy` format to save the features in binary form. If edge features or node features are absent, they are replaced by vectors of zeros.

```{bash}
python process.py
```
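
For reference, here is a hedged sketch of what this step amounts to for the JODIE-style CSVs above (`user, item, timestamp, state_label, features...`); the exact re-indexing and output paths used by `process.py` are assumptions here:

```{python}
import numpy as np
import pandas as pd

# Illustrative re-implementation of the preprocessing; the actual process.py
# may differ in details such as how item indices are offset into the shared
# node index space and where the outputs are written.
def preprocess(csv_path, data_name):
    u, i, ts, label, idx, feats = [], [], [], [], [], []
    with open(csv_path) as f:
        next(f)  # skip the header line
        for k, line in enumerate(f):
            e = line.strip().split(',')
            u.append(int(e[0]) + 1)      # shift so node indices start at 1
            i.append(int(e[1]) + 1)
            ts.append(float(e[2]))
            label.append(int(float(e[3])))
            idx.append(k + 1)            # edge indices start at 1 as well
            feats.append([float(x) for x in e[4:]])
    df = pd.DataFrame({'u': u, 'i': i, 'ts': ts, 'label': label, 'idx': idx})
    feat = np.array(feats)
    # row 0 holds the zero vector reserved for padding
    np.save('processed/ml_{}.npy'.format(data_name),
            np.vstack([np.zeros((1, feat.shape[1])), feat]))
    df.to_csv('processed/ml_{}.csv'.format(data_name), index=False)
```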
#### Use your own data

Put your data under the `processed` folder. The required input data includes `ml_${DATA_NAME}.csv`, `ml_${DATA_NAME}.npy` and `ml_${DATA_NAME}_node.npy`. They store the edge linkages, edge features and node features, respectively.

The `CSV` file has the following columns

```
u, i, ts, label, idx
```

which represent the source node index, target node index, timestamp, edge label and edge index, respectively.
`ml_${DATA_NAME}.npy` has shape [#temporal edges + 1, edge feature dimension]. Similarly, `ml_${DATA_NAME}_node.npy` has shape [#nodes + 1, node feature dimension].

All node indices start from `1`. Index `0` is reserved for `null` during padding operations, so the maximum node index equals the total number of nodes. Similarly, the maximum edge index equals the total number of temporal edges. The padding (null) embedding is a vector of zeros; see the sanity-check sketch below.
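A quick way to check a custom dataset against these conventions (a minimal sketch, assuming a placeholder dataset name `mydata`):

```{python}
import numpy as np
import pandas as pd

# "mydata" stands in for your ${DATA_NAME}
df = pd.read_csv('processed/ml_mydata.csv')
e_feat = np.load('processed/ml_mydata.npy')
n_feat = np.load('processed/ml_mydata_node.npy')

assert df['u'].min() >= 1 and df['idx'].min() >= 1               # index 0 is reserved for padding
assert e_feat.shape[0] == df['idx'].max() + 1                    # #temporal edges + 1 rows
assert n_feat.shape[0] == max(df['u'].max(), df['i'].max()) + 1  # #nodes + 1 rows
assert not e_feat[0].any() and not n_feat[0].any()               # row 0 is the zero vector
```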
### Requirements

* python >= 3.7

* Dependencies

```{bash}
pandas==0.24.2
torch==1.1.0
tqdm==4.41.1
numpy==1.16.4
scikit_learn==0.22.1
```
### Commands and configurations

#### Sample commands

* Learning the network using the link prediction task

```{bash}
# TGAT learning on the wikipedia data
python -u learn_edge.py -d wikipedia --bs 200 --uniform --n_degree 20 --agg_method attn --attn_mode prod --gpu 0 --n_head 2 --prefix hello_world

# TGAT learning on the reddit data
python -u learn_edge.py -d reddit --bs 200 --uniform --n_degree 20 --agg_method attn --attn_mode prod --gpu 0 --n_head 2 --prefix hello_world
```

* Learning the downstream task (node classification)

The node classification task reuses the network trained previously. Make sure the `prefix` is the same so that the checkpoint can be found under `saved_models`.

```{bash}
# on wikipedia
python -u learn_node.py -d wikipedia --bs 100 --uniform --n_degree 20 --agg_method attn --attn_mode prod --gpu 0 --n_head 2 --prefix hello_world

# on reddit
python -u learn_node.py -d reddit --bs 100 --uniform --n_degree 20 --agg_method attn --attn_mode prod --gpu 0 --n_head 2 --prefix hello_world
```
#### General flags

```{txt}
optional arguments:
  -h, --help            show this help message and exit
  -d DATA, --data DATA  data sources to use, try wikipedia or reddit
  --bs BS               batch size
  --prefix PREFIX       prefix to name the checkpoints
  --n_degree N_DEGREE   number of neighbors to sample
  --n_head N_HEAD       number of heads used in attention layer
  --n_epoch N_EPOCH     number of epochs
  --n_layer N_LAYER     number of network layers
  --lr LR               learning rate
  --drop_out DROP_OUT   dropout probability
  --gpu GPU             idx for the gpu to use
  --node_dim NODE_DIM   dimensions of the node embedding
  --time_dim TIME_DIM   dimensions of the time embedding
  --agg_method {attn,lstm,mean}
                        local aggregation method
  --attn_mode {prod,map}
                        use dot product attention or mapping based
  --time {time,pos,empty}
                        how to use time information
  --uniform             take uniform sampling from temporal neighbors
```

## Cite us

```
@inproceedings{tgat_iclr20,
  title={Inductive representation learning on temporal graphs},
  author={Da Xu and Chuanwei Ruan and Evren Korpeoglu and Sushant Kumar and Kannan Achan},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2020}
}
```

graph.py

Lines changed: 157 additions & 0 deletions
import numpy as np


class NeighborFinder:
    def __init__(self, adj_list, uniform=False):
        """
        Params
        ------
        adj_list: List[List[(node_idx, edge_idx, timestamp)]], temporal
            adjacency list; adj_list[i] holds the interactions of node i
        uniform: bool, if True sample temporal neighbors uniformly at random;
            otherwise keep the interactions preceding the cut time as-is

        The flattened arrays satisfy
        node_idx_l[off_set_l[i]:off_set_l[i + 1]] = neighbors of node i.
        """
        node_idx_l, node_ts_l, edge_idx_l, off_set_l = self.init_off_set(adj_list)
        self.node_idx_l = node_idx_l
        self.node_ts_l = node_ts_l
        self.edge_idx_l = edge_idx_l

        self.off_set_l = off_set_l

        self.uniform = uniform

    def init_off_set(self, adj_list):
        """Flatten the adjacency list into contiguous arrays plus offsets.

        Params
        ------
        adj_list: List[List[(node_idx, edge_idx, timestamp)]]
        """
        n_idx_l = []
        n_ts_l = []
        e_idx_l = []
        off_set_l = [0]
        for i in range(len(adj_list)):
            curr = adj_list[i]
            # sort by edge index; edge indices are assigned in chronological
            # order, so the timestamps end up sorted as well (required by the
            # binary search in find_before)
            curr = sorted(curr, key=lambda x: x[1])
            n_idx_l.extend([x[0] for x in curr])
            e_idx_l.extend([x[1] for x in curr])
            n_ts_l.extend([x[2] for x in curr])

            off_set_l.append(len(n_idx_l))
        n_idx_l = np.array(n_idx_l)
        n_ts_l = np.array(n_ts_l)
        e_idx_l = np.array(e_idx_l)
        off_set_l = np.array(off_set_l)

        assert(len(n_idx_l) == len(n_ts_l))
        assert(off_set_l[-1] == len(n_ts_l))

        return n_idx_l, n_ts_l, e_idx_l, off_set_l

    def find_before(self, src_idx, cut_time):
        """Return the neighbors of src_idx that interacted strictly before
        cut_time, as (node indices, edge indices, timestamps).

        Params
        ------
        src_idx: int
        cut_time: float
        """
        node_idx_l = self.node_idx_l
        node_ts_l = self.node_ts_l
        edge_idx_l = self.edge_idx_l
        off_set_l = self.off_set_l

        neighbors_idx = node_idx_l[off_set_l[src_idx]:off_set_l[src_idx + 1]]
        neighbors_ts = node_ts_l[off_set_l[src_idx]:off_set_l[src_idx + 1]]
        neighbors_e_idx = edge_idx_l[off_set_l[src_idx]:off_set_l[src_idx + 1]]

        if len(neighbors_idx) == 0 or len(neighbors_ts) == 0:
            return neighbors_idx, neighbors_e_idx, neighbors_ts

        # binary search for the first position whose timestamp is >= cut_time;
        # everything before that position happened strictly earlier
        left = 0
        right = len(neighbors_ts)
        while left < right:
            mid = (left + right) // 2
            if neighbors_ts[mid] < cut_time:
                left = mid + 1
            else:
                right = mid

        return neighbors_idx[:left], neighbors_e_idx[:left], neighbors_ts[:left]

    def get_temporal_neighbor(self, src_idx_l, cut_time_l, num_neighbors=20):
        """Sample num_neighbors temporal neighbors for each (node, cut time)
        pair; outputs are right-aligned and zero-padded.

        Params
        ------
        src_idx_l: List[int]
        cut_time_l: List[float]
        num_neighbors: int
        """
        assert(len(src_idx_l) == len(cut_time_l))

        out_ngh_node_batch = np.zeros((len(src_idx_l), num_neighbors)).astype(np.int32)
        out_ngh_t_batch = np.zeros((len(src_idx_l), num_neighbors)).astype(np.float32)
        out_ngh_eidx_batch = np.zeros((len(src_idx_l), num_neighbors)).astype(np.int32)

        for i, (src_idx, cut_time) in enumerate(zip(src_idx_l, cut_time_l)):
            ngh_idx, ngh_eidx, ngh_ts = self.find_before(src_idx, cut_time)

            if len(ngh_idx) > 0:
                if self.uniform:
                    # sample with replacement from all valid temporal neighbors
                    sampled_idx = np.random.randint(0, len(ngh_idx), num_neighbors)

                    out_ngh_node_batch[i, :] = ngh_idx[sampled_idx]
                    out_ngh_t_batch[i, :] = ngh_ts[sampled_idx]
                    out_ngh_eidx_batch[i, :] = ngh_eidx[sampled_idx]

                    # re-sort the sampled neighbors chronologically
                    pos = out_ngh_t_batch[i, :].argsort()
                    out_ngh_node_batch[i, :] = out_ngh_node_batch[i, :][pos]
                    out_ngh_t_batch[i, :] = out_ngh_t_batch[i, :][pos]
                    out_ngh_eidx_batch[i, :] = out_ngh_eidx_batch[i, :][pos]
                else:
                    # keep at most num_neighbors interactions before cut_time
                    ngh_ts = ngh_ts[:num_neighbors]
                    ngh_idx = ngh_idx[:num_neighbors]
                    ngh_eidx = ngh_eidx[:num_neighbors]

                    assert(len(ngh_idx) <= num_neighbors)
                    assert(len(ngh_ts) <= num_neighbors)
                    assert(len(ngh_eidx) <= num_neighbors)

                    # right-align so the zero padding sits at the front
                    out_ngh_node_batch[i, num_neighbors - len(ngh_idx):] = ngh_idx
                    out_ngh_t_batch[i, num_neighbors - len(ngh_ts):] = ngh_ts
                    out_ngh_eidx_batch[i, num_neighbors - len(ngh_eidx):] = ngh_eidx

        return out_ngh_node_batch, out_ngh_eidx_batch, out_ngh_t_batch

    def find_k_hop(self, k, src_idx_l, cut_time_l, num_neighbors=20):
        """Sample the k-hop temporal subgraph rooted at each source node.
        """
        x, y, z = self.get_temporal_neighbor(src_idx_l, cut_time_l, num_neighbors)
        node_records = [x]
        eidx_records = [y]
        t_records = [z]
        for _ in range(k - 1):
            # expand the frontier: every sampled neighbor becomes a new source
            ngh_node_est, ngh_t_est = node_records[-1], t_records[-1]  # [N, *([num_neighbors] * (k - 1))]
            orig_shape = ngh_node_est.shape
            ngh_node_est = ngh_node_est.flatten()
            ngh_t_est = ngh_t_est.flatten()
            out_ngh_node_batch, out_ngh_eidx_batch, out_ngh_t_batch = self.get_temporal_neighbor(ngh_node_est, ngh_t_est, num_neighbors)
            out_ngh_node_batch = out_ngh_node_batch.reshape(*orig_shape, num_neighbors)  # [N, *([num_neighbors] * k)]
            out_ngh_eidx_batch = out_ngh_eidx_batch.reshape(*orig_shape, num_neighbors)
            out_ngh_t_batch = out_ngh_t_batch.reshape(*orig_shape, num_neighbors)

            node_records.append(out_ngh_node_batch)
            eidx_records.append(out_ngh_eidx_batch)
            t_records.append(out_ngh_t_batch)
        return node_records, eidx_records, t_records
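
A minimal usage sketch (a hypothetical toy graph, not part of the commit; node and edge indices start at 1, and adj_list[i] holds (neighbor, edge_idx, timestamp) tuples):

if __name__ == '__main__':
    # illustrative smoke test for NeighborFinder
    adj_list = [[], [(2, 1, 1.0), (3, 2, 2.5)], [(1, 1, 1.0)], [(1, 2, 2.5)]]
    finder = NeighborFinder(adj_list, uniform=False)
    ngh, eidx, ts = finder.get_temporal_neighbor([1], [3.0], num_neighbors=5)
    print(ngh)  # right-aligned and zero-padded: [[0 0 0 2 3]]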
