- Notifications
You must be signed in to change notification settings - Fork 205
Add the AMR graph constrction and RGCN in example #578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
52d81ee
to dee086d
Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the comments.
return self.nodes[:].features | ||
| ||
@property | ||
def ntypes(self) -> List[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the graph is homogeneous, it should be an exception.
graph4nlp/pytorch/data/data.py Outdated
), "The number of nodes to be added should be greater than 0. (Got {})".format(node_num) | ||
| ||
if not self.is_hetero: | ||
assert ntypes is None, "The graph is homogeneous, ntypes should be None." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please raise exceptions
return EdgeView(self) | ||
| ||
@property | ||
def etypes(self) -> List[Tuple[str, str, str]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When it is homogeneous, self._etypes is not assigned.
graph4nlp/pytorch/data/data.py Outdated
return self._edge_attributes | ||
| ||
# Conversion utility functions | ||
def make_data_dict(self) -> Dict[Tuple[str, str, str], Tuple[torch.Tensor, torch.Tensor]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be a utility function.
graph4nlp/pytorch/data/dataset.py Outdated
edge_token = graph.edge_attributes[edge_idx]["token"] | ||
s.add(edge_token) | ||
except Exception as e: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please handle the catched exception
graph4nlp/pytorch/data/dataset.py Outdated
if "val" in self.__dict__: | ||
self.val = self.build_topology(self.val) | ||
# build_edge_vocab and save | ||
if self.init_edge_vocab: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be a new function similar to "build_vocab()". E.g., build_edge_vocab()
for_inference=False, | ||
reused_vocab_model=None, | ||
nlp_processor_args=None, | ||
init_edge_vocab=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
init_edge_vocab is not clear. It covers 1.build edge vocab, 2. an indicator to use heterogeneous graph.
TODO: discuss
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- build_edge_vocab: bool
- is_hetero: bool
for i in range(len(data_item_collect)): | ||
data_item_collect[i] = self.dataset._vectorize_one_dataitem( | ||
data_item_collect[i], self.vocab_model, use_ie=use_ie | ||
data_item_collect[i], self.vocab_model, use_ie=use_ie, edge_vocab=self.edge_vocab |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the api in dataset._vectorize_one_dataitem doesn't have the parameter: "edge_vocab", please check it.
_vectorize_one_dataitem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- move the edge vocab to the library code
- unify edge_vocab with vocab_model
"w2v_bert", | ||
"w2v_bert_bilstm", | ||
"w2v_bert_bigru", | ||
"w2v_amr", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why amr shoud be a general embedding strategy?
I don't think it should be regarded as a new general embedding strategy.
rnn_input_size = word_emb_size | ||
| ||
if "pos" in word_emb_type: | ||
self.word_emb_layers["pos"] = WordEmbedding(37, 50) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please don't use magic numbers.
dee086d
to 0d0400c
Compare 0d0400c
to 0dd8edb
Compare
Description
Checklist
Please feel free to remove inapplicable items for your PR.
or have been fixed to be compatible with this change
Changes