[WIP] Rose #750

andrealorenzon · 2020-08-21T13:09:05Z

Reference Issue

First PR implementing the Random Over-Sampling Examples (ROSE) method.

What does this implement/fix? Explain your changes.

I drafted the ROSE class, but some of the common unit tests still fail. I would appreciate help understanding what I missed.

Any other comments?

pep8speaks · 2020-08-21T13:09:08Z

Hello @andrealorenzon! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file imblearn/over_sampling/_rose.py:

Line 117:17: E123 closing bracket does not match indentation of opening bracket's line

Comment last updated at 2020-09-16 09:03:36 UTC

lgtm-com · 2020-08-21T13:36:51Z

This pull request introduces 4 alerts when merging 0a3307b into 0acd717 - view on LGTM.com

new alerts:

3 for Unused import
1 for Unused local variable

hayesall · 2020-08-21T14:17:42Z

I have still some error on common test units

It looks like most of the errors raised are linting errors. The pep8speaks/lgtm entries above should cover most of them. If you want to test things locally, running pylint on your files should help catch most of these!

The current version also looks like it's pointing to the user guide. It would be good to include a couple of examples / an overview in the user guide as well.
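To make the "Unused import" alerts above more concrete, here is a minimal sketch of the idea using only the standard-library `ast` module. It is a simplified illustration, not how LGTM, pylint, or flake8 actually implement the check: the function name `unused_imports` is hypothetical, and the sketch ignores `__all__`, star imports, and scoping.

```python
import ast


def unused_imports(source):
    """Return the imported names in ``source`` that are never referenced.

    Simplified illustration only: real linters also handle ``__all__``,
    star imports, re-exports, and nested scopes.
    """
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            # Covers plain uses and the base of attribute access (np.zeros).
            used.add(node.id)
    return sorted(imported - used)


example = (
    "import numpy as np\n"
    "import math\n"
    "from collections import Counter\n"
    "\n"
    "print(math.pi)\n"
)
print(unused_imports(example))
```

In this example `math` is used, so only `Counter` and `np` are reported. Deleting such imports is usually enough to clear this class of alert.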

andrealorenzon · 2020-08-22T07:14:25Z

Thank you. I'll fix linting, and add documentation in the user guide.
When I'm done, should I just make another PR?

hayesall · 2020-08-22T13:31:45Z

When I'm done, should I just make another PR?

I marked this as a draft while a couple of these other points get worked on. When you're ready for some feedback, mark it as "Ready for Review" and we'll iterate!

andrealorenzon · 2020-08-25T18:13:31Z

Is the PR automatically updated when I push new commits to my fork?

chkoar · 2020-08-25T18:39:55Z

All pushed commits to andrealorenzon:ROSE will update the current PR.

hayesall

There are still a few linting errors (see logs here). I forgot to mention flake8 earlier.

See the review comments otherwise.

hayesall · 2020-08-26T01:26:56Z

imblearn/over_sampling/_rose.py

+ .. [1] N. Lunardon, G. Menardi, N.Torelli, "ROSE: A Package for Binary
+ Imbalanced Learning," R Journal, 6(1), 2014.
+
+ .. [2] G Menardi, N. Torelli, "Training and assessing classification
+ rules with imbalanced data," Data Mining and Knowledge
+ Discovery, 28(1), pp.92-122, 2014.


These are great to include. References should be referenced from the main docstring for the class. A short summary of the method would also be good.

Here's an example from the BorderlineSMOTE class:

imbalanced-learn/imblearn/over_sampling/_smote.py

Lines 219 to 224 in 0acd717

class BorderlineSMOTE(BaseSMOTE):
    """Over-sampling using Borderline SMOTE.

    This algorithm is a variant of the original SMOTE algorithm proposed in
    [2]_. Borderline samples will be detected and used to generate new
    synthetic samples.
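Applied to the two references quoted in this review comment, the same pattern might look like the sketch below. The class name `ROSEDocstringExample` and the summary wording are placeholders rather than the code in this PR; the point is only that the main description cites `[1]_` and `[2]_`, and a numpydoc-style `References` section defines them.

```python
class ROSEDocstringExample:
    """Over-sampling using a smoothed bootstrap (ROSE).

    A short summary of the method goes here. New samples are generated in
    the neighbourhood of existing ones, following the approach described
    in [1]_ and [2]_.

    References
    ----------
    .. [1] N. Lunardon, G. Menardi, N. Torelli, "ROSE: A Package for Binary
       Imbalanced Learning," R Journal, 6(1), 2014.

    .. [2] G. Menardi, N. Torelli, "Training and assessing classification
       rules with imbalanced data," Data Mining and Knowledge
       Discovery, 28(1), pp.92-122, 2014.
    """


# The citations in the summary resolve to the entries defined below them.
print(ROSEDocstringExample.__doc__)
```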

andrealorenzon · Aug 26, 2020

I added a better description to the docstring.

I would also like to add more complete information on the maths to the docs, but I'm not familiar with Sphinx. Could you point me to some instructions or metadocumentation?

hayesall · Aug 26, 2020

Math typesetting should look familiar if you've seen LaTeX before. Here's a short guide from Sphinx's docs: https://www.sphinx-doc.org/en/1.0/ext/math.html. The syntax is based on reStructuredText, which feels similar to Markdown but has a powerful directive system.

Could you point me to some instructions or metadocumentation?

Nothing specific to imblearn. If you want to learn more, Sphinx's "Getting Started" guide is a good place to start. (I'd recommend it regardless: Sphinx is used for a huge number of projects, so the skill is extremely transferable.)

Our Makefile and conf.py in the docs/ directory are fairly standard. Building local documentation then looks like:

cd docs/
make html
xdg-open _build/html/index.html
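As an illustration of the math typesetting mentioned above, here is a hedged sketch of a docstring that uses the `.. math::` directive and the `:math:` role, attached to a small pure-Python function. It assumes the Gaussian-kernel bandwidth rule as I understand it from the ROSE references; the function name `rose_like_samples` and the `shrink` and `seed` parameters are hypothetical and are not taken from this PR.

```python
import math
import random
import statistics


def rose_like_samples(X, n_samples, shrink=1.0, seed=None):
    r"""Draw synthetic points around the rows of ``X`` (a single class).

    Each new point is a randomly chosen row plus scaled Gaussian noise:

    .. math::

        x^{*} = x_i + h \odot \varepsilon, \qquad
        h_q = c \left( \frac{4}{(d + 2)\, n} \right)^{1 / (d + 4)} \sigma_q

    where :math:`\varepsilon \sim N(0, I_d)`, :math:`n` is the number of
    rows, :math:`d` the number of features, :math:`\sigma_q` the sample
    standard deviation of feature :math:`q`, and :math:`c` is ``shrink``.
    This bandwidth rule is an assumption of the sketch.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Per-feature sample standard deviation (requires at least two rows).
    sigma = [statistics.stdev([row[q] for row in X]) for q in range(d)]
    factor = shrink * math.pow(4.0 / ((d + 2) * n), 1.0 / (d + 4))
    h = [factor * s for s in sigma]
    samples = []
    for _ in range(n_samples):
        base = rng.choice(X)  # pick an existing row uniformly at random
        samples.append([base[q] + h[q] * rng.gauss(0.0, 1.0) for q in range(d)])
    return samples


X_minority = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
new_points = rose_like_samples(X_minority, n_samples=5, seed=0)
print(len(new_points), len(new_points[0]))
```

The raw-string prefix (`r"""`) keeps the LaTeX backslashes intact. With `shrink=0.0` the noise vanishes and the function reduces to plain random over-sampling of the existing rows.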

andrealorenzon · Aug 27, 2020

OK. I may have some trouble with the plots, but I'll manage to add everything to the docs.

I have another issue: some checks are failing (see below). It's not clear to me what they are checking or what I need to fix.

imblearn/over_sampling/_rose.py

imblearn/over_sampling/tests/test_rose.py

lgtm-com · 2020-09-15T10:22:42Z

This pull request introduces 1 alert when merging 9ac2797 into 0acd717 - view on LGTM.com

new alerts:

1 for Unused import

lgtm-com · 2020-09-15T17:54:21Z

This pull request introduces 1 alert when merging c391ec3 into 0acd717 - view on LGTM.com

new alerts:

1 for Unused import

andrealorenzon · 2020-09-16T11:33:32Z

Going back to local development; I don't want to spam broken builds.

Andrea Lorenzon added 3 commits August 21, 2020 11:34

Created empty test units

53c7a8d

added ROSE empty class, modified __init__.py

183c03f

implemented ROSE, still some failed test

0a3307b

hayesall added the Package: over_sampling label Aug 21, 2020

PEP8 cleaning

886694f

hayesall marked this pull request as draft August 21, 2020 14:18

PEP8 linting

07731c4

andrealorenzon marked this pull request as ready for review August 23, 2020 20:59

andrealorenzon changed the title ~~Rose~~ [WIP] Rose Aug 23, 2020

hayesall reviewed Aug 26, 2020

Andrea Lorenzon added 7 commits August 26, 2020 12:29

fixed linting errors.

c0d7473

updated documentation and bibliography

013f7cc

cleaned ROSE test

2b34f47

added an exception for non binary datasets

8d0e99e

multiclass oversampling

8bbbd2e

removed non-binary exception

9ac2797

removed unused import

b41b06a

Andrea Lorenzon added 5 commits September 15, 2020 17:23

minor fixes

d2dd6f4

linting

b31a5c3

linting

d5ca24c

linting

b6e95aa

linting

6f7f8e1

removed explicit pandas dataframe management

c391ec3

Andrea Lorenzon added 2 commits September 16, 2020 09:34

added check_X_y() parsing

93ac868

removed check_X_y test

bdffda3

andrealorenzon closed this Sep 16, 2020
