Skip to content

Conversation

@zhxfl
Copy link
Member

@zhxfl zhxfl commented Feb 6, 2018

split long sentence, drop sentence close by -1

@zhxfl zhxfl requested review from kuke and pkuyym February 6, 2018 09:29
@zhxfl zhxfl added the DeepASR label Feb 6, 2018

assert feature_frame_num == label_frame_num

if self._split_sentence_threshold == -1 or self._split_perturb == -1 or self._split_sub_sentence_len == -1 or self._split_sentence_threshold >= feature_frame_num:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep one line under the 80-character limit, which is also suitable for comments

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix

@kuke
Copy link
Collaborator

kuke commented Feb 7, 2018

LGTM

label_desc_paths (list|tuple): Files containing the description of
samples' label data.
split_perturb(int): split long sentence' perturb sub-sentence length value.
split_sentence_threshold(int): sentence length large than
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sentence --> Sentence
large --> larger
operator --> operation

sentence length large than split_sentence_threshold trigger split operator. -->
Sentence whose length larger than the value will trigger split operation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

label_bin_paths (list|tuple): Files containing the binary label data.
label_desc_paths (list|tuple): Files containing the description of
samples' label data.
split_perturb(int): split long sentence' perturb sub-sentence length value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split_perturb(int) --> split_perturb (int)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split long sentence' perturb sub-sentence length value. Please refine this comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random perturb sub-sentence length when split long sentence

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about Maximum perturbation value for length of sub-sentence when splitting long sentence.?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

remain_frame_num = feature_frame_num
while True:
if remain_frame_num > self._split_sentence_threshold:
cur_frame_len = self._split_sub_sentence_len + random.randint(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems exceed 80 columns.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use self._rng instead using random.randint directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._rng.randint(0, self._split_perturb)

sample_info.feature_start,
sample_info.feature_size)

assert sample_info.feature_frame_num * sample_info.feature_dim * 4 == len(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems exceed 80 columns.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree


# drop long sentence
if self._drop_frame_len >= sample_data[0].shape[0]:
if self._drop_frame_len == -1 or self._drop_frame_len >= sample_data[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix

Copy link
Contributor

@pkuyym pkuyym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zhxfl zhxfl merged commit 1184109 into PaddlePaddle:develop Feb 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

3 participants