Skip to content

Conversation

@riccardocappi
Copy link

@riccardocappi riccardocappi commented Oct 3, 2023

Describe your change:

I implemented from scratch a multinomial naive bayes classifier. The algorithm is trained and tested on the twenty_newsgroup dataset from sklearn to perform text classification

  • Add an algorithm?
  • Fix a bug or typo in an existing algorithm?
  • Documentation change?

Checklist:

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized.
  • I know that pull requests will not be merged if they fail the automated tests.
  • This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • All new Python files are placed inside an existing directory.
  • All filenames are in all lowercase characters with no spaces or dashes.
  • All functions and variable names follow Python naming conventions.
  • All function parameters and return values are annotated with Python type hints.
  • All functions have doctests that pass the automated testing.
  • All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
  • If this pull request resolves one or more open issues then the description above includes the issue number(s) with a closing keyword: "Fixes #ISSUE-NUMBER".
@algorithms-keeper algorithms-keeper bot added require descriptive names This PR needs descriptive function and/or variable names require tests Tests [doctest/unittest/pytest] are required require type hints https://docs.python.org/3/library/typing.html labels Oct 3, 2023
Copy link

@algorithms-keeper algorithms-keeper bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Click here to look at the relevant links ⬇️

🔗 Relevant Links

Repository:

Python:

Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.

algorithms-keeper commands and options

algorithms-keeper actions can be triggered by commenting on this PR:

  • @algorithms-keeper review to trigger the checks for only added pull request files
  • @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.

NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.

from sklearn.metrics import accuracy_score


def group_indices_by_target(targets):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: group_indices_by_target. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: targets



class MultinomialNBClassifier:
def __init__(self, alpha=1):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: alpha

self.priors = None
self.alpha = alpha

def _check_X(self, X):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function _check_X

Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: _check_X

Please provide return type hint for the function: _check_X. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide descriptive name for the parameter: X

Please provide type hint for the parameter: X

if not sparse.issparse(X):
raise ValueError("Matrix X must be an instance of scipy.sparse.csr_matrix")

def _check_X_y(self, X, y):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function _check_X_y

Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: _check_X_y

Please provide return type hint for the function: _check_X_y. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide descriptive name for the parameter: X

Please provide type hint for the parameter: X

Please provide descriptive name for the parameter: y

Please provide type hint for the parameter: y

raise ValueError(
"The expected shape for array y is (" + str(X.shape[0]) + ",), but got (" + str(len(y)) + ",)")

def fit(self, X, y):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit

Please provide return type hint for the function: fit. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide descriptive name for the parameter: X

Please provide type hint for the parameter: X

Please provide descriptive name for the parameter: y

Please provide type hint for the parameter: y

return np.array(y_pred)


def main():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main

Please provide return type hint for the function: main. If the function does not return a value, please provide the type hint as: def function() -> None:

def main():
newsgroups_train = fetch_20newsgroups(subset='train')
newsgroups_test = fetch_20newsgroups(subset='test')
X_train = newsgroups_train['data']

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: X_train

newsgroups_test = fetch_20newsgroups(subset='test')
X_train = newsgroups_train['data']
y_train = newsgroups_train['target']
X_test = newsgroups_test['data']

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: X_test

X_test = newsgroups_test['data']
y_test = newsgroups_test['target']
vectorizer = TfidfVectorizer(stop_words='english')
X_train = vectorizer.fit_transform(X_train)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: X_train

y_test = newsgroups_test['target']
vectorizer = TfidfVectorizer(stop_words='english')
X_train = vectorizer.fit_transform(X_train)
X_test = vectorizer.transform(X_test)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: X_test

@algorithms-keeper algorithms-keeper bot added the awaiting reviews This PR is ready to be reviewed label Oct 3, 2023
@algorithms-keeper algorithms-keeper bot added the tests are failing Do not merge until tests pass label Oct 3, 2023
@riccardocappi riccardocappi reopened this Oct 3, 2023
Copy link

@algorithms-keeper algorithms-keeper bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Click here to look at the relevant links ⬇️

🔗 Relevant Links

Repository:

Python:

Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.

algorithms-keeper commands and options

algorithms-keeper actions can be triggered by commenting on this PR:

  • @algorithms-keeper review to trigger the checks for only added pull request files
  • @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.

NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.



class MultinomialNBClassifier:
def __init__(self, alpha=1):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: alpha

self.priors = None
self.alpha = alpha

def fit(self, data: sparse.csr_matrix, y: ArrayLike) -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit

Please provide descriptive name for the parameter: y

return np.array(y_pred)


def main() -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main

Copy link

@algorithms-keeper algorithms-keeper bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Click here to look at the relevant links ⬇️

🔗 Relevant Links

Repository:

Python:

Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.

algorithms-keeper commands and options

algorithms-keeper actions can be triggered by commenting on this PR:

  • @algorithms-keeper review to trigger the checks for only added pull request files
  • @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.

NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.



class MultinomialNBClassifier:
def __init__(self, alpha=1):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: alpha

self.priors = None
self.alpha = alpha

def fit(self, data: sparse.csr_matrix, y: ArrayLike) -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit

Please provide descriptive name for the parameter: y

return np.array(y_pred)


def main() -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main

Copy link

@algorithms-keeper algorithms-keeper bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Click here to look at the relevant links ⬇️

🔗 Relevant Links

Repository:

Python:

Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.

algorithms-keeper commands and options

algorithms-keeper actions can be triggered by commenting on this PR:

  • @algorithms-keeper review to trigger the checks for only added pull request files
  • @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.

NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.



class MultinomialNBClassifier:
def __init__(self, alpha=1):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: alpha

self.priors = None
self.alpha = alpha

def fit(self, data: sparse.csr_matrix, y: ArrayLike) -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit

Please provide descriptive name for the parameter: y

return np.array(y_pred)


def main() -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main

Copy link

@algorithms-keeper algorithms-keeper bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Click here to look at the relevant links ⬇️

🔗 Relevant Links

Repository:

Python:

Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.

algorithms-keeper commands and options

algorithms-keeper actions can be triggered by commenting on this PR:

  • @algorithms-keeper review to trigger the checks for only added pull request files
  • @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.

NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.



class MultinomialNBClassifier:
def __init__(self, alpha=1):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: alpha

self.priors = None
self.alpha = alpha

def fit(self, data: sparse.csr_matrix, y: ArrayLike) -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit

Please provide descriptive name for the parameter: y

return np.array(y_pred)


def main() -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main

pre-commit-ci bot and others added 3 commits October 3, 2023 18:36
…o naive-bayes-text-classifier # Conflicts: #	machine_learning/multinomial_naive_bayes_classifier.py
@algorithms-keeper algorithms-keeper bot removed the require descriptive names This PR needs descriptive function and/or variable names label Oct 6, 2023
Copy link

@algorithms-keeper algorithms-keeper bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Click here to look at the relevant links ⬇️

🔗 Relevant Links

Repository:

Python:

Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.

algorithms-keeper commands and options

algorithms-keeper actions can be triggered by commenting on this PR:

  • @algorithms-keeper review to trigger the checks for only added pull request files
  • @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.

NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.



class MultinomialNBClassifier:
def __init__(self, alpha: int = 1):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:

self.priors = None
self.alpha = alpha

def fit(self, data: sparse.csr_matrix, targets: npt.ArrayLike) -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit

return np.array(y_pred)


def main() -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main

@algorithms-keeper algorithms-keeper bot removed the tests are failing Do not merge until tests pass label Oct 6, 2023
Copy link

@algorithms-keeper algorithms-keeper bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Click here to look at the relevant links ⬇️

🔗 Relevant Links

Repository:

Python:

Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.

algorithms-keeper commands and options

algorithms-keeper actions can be triggered by commenting on this PR:

  • @algorithms-keeper review to trigger the checks for only added pull request files
  • @algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.

NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.

from sklearn.metrics import accuracy_score


def group_indices_by_target(targets):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: group_indices_by_target. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: targets



class MultinomialNBClassifier:
def __init__(self, alpha=1):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: alpha

self.priors = None
self.alpha = alpha

def fit(self, data, targets):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit

Please provide return type hint for the function: fit. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: data

Please provide type hint for the parameter: targets

tot_features_count + self.alpha * n_features
)

def predict(self, data):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please provide return type hint for the function: predict. If the function does not return a value, please provide the type hint as: def function() -> None:

Please provide type hint for the parameter: data

return np.array(y_pred)


def main():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main

Please provide return type hint for the function: main. If the function does not return a value, please provide the type hint as: def function() -> None:

@algorithms-keeper algorithms-keeper bot removed require tests Tests [doctest/unittest/pytest] are required require type hints https://docs.python.org/3/library/typing.html labels Nov 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting reviews This PR is ready to be reviewed

1 participant