If resegment=False, the root.text should not contain only the first sentence #69

michnov · 2021-02-09T11:01:56Z

Calling tokenize_tag_parse_tree with resegment=False first runs the underlying UDPipe, which produces a sequence of tokens potentially grouped into sentence segments. The nested sequence is then flattened so that all tokens belong to the same segment. However, this was not reflected in the root.text attribute, which was always assigned a value from UDPipe by calling ufal.udpipe.Sentence.getText(). Apparently, instead of recomputing the return value on the fly, the getter returns a value precomputed during processing.

If we do not want the text to be resegmented, the value of root.text must stay the same as the input, i.e. the whole text.
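
A minimal sketch of the intended behavior, for illustration only. It does not call UDPipe or Udapi; `Segment` and `merge_segments` are hypothetical stand-ins for the segmenter output and the flattening step, not names from the actual code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for one sentence segment returned by the segmenter."""
    text: str
    tokens: List[str] = field(default_factory=list)


def merge_segments(original_text: str, segments: List[Segment]) -> Segment:
    """Flatten all tokens into a single segment and keep the whole input text.

    Taking segments[0].text instead would reproduce the reported bug:
    the merged segment would carry only the first sentence.
    """
    tokens = [tok for seg in segments for tok in seg.tokens]
    return Segment(text=original_text, tokens=tokens)


original = "Hello world. How are you?"
segments = [
    Segment("Hello world.", ["Hello", "world", "."]),
    Segment("How are you?", ["How", "are", "you", "?"]),
]
merged = merge_segments(original, segments)
```

Here `merged.text` equals the unmodified input string, while `merged.tokens` holds the tokens of both segments in order.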

martinpopel · 2021-02-09T11:04:43Z

Thanks.

bugfix: if resegment==False, the text attribute should contain the whole text, not just the first sentence after segmentation

ee34f2e

martinpopel merged commit 4bb1908 into master Feb 9, 2021

martinpopel deleted the no_resegment_text branch February 9, 2021 11:04
