Multinomial naive bayes text classifier #9619
Automated review generated by algorithms-keeper. If there's any problem regarding this review, please open an issue about it.
algorithms-keeper commands and options
algorithms-keeper actions can be triggered by commenting on this PR:
@algorithms-keeper review to trigger the checks for only added pull request files
@algorithms-keeper review-all to trigger the checks for all the pull request files, including the modified files. As we cannot post review comments on lines not part of the diff, this command will post all the messages in one comment.
NOTE: Commands are in beta and so this feature is restricted only to a member or owner of the organization.
| from sklearn.metrics import accuracy_score | ||
| | ||
| | ||
| def group_indices_by_target(targets): |
Please provide return type hint for the function: group_indices_by_target. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: targets
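For illustration, one way the two requests above could be addressed is sketched here. The body of `group_indices_by_target` is not visible in this diff, so the implementation below is an assumption based only on the function name: it maps each label to the list of positions where that label occurs. The doctest is likewise hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from collections.abc import Hashable, Iterable


def group_indices_by_target(targets: Iterable[Hashable]) -> dict[Hashable, list[int]]:
    """
    Map each distinct target label to the positions where it occurs.

    >>> group_indices_by_target([1, 0, 1, 2])
    {1: [0, 2], 0: [1], 2: [3]}
    """
    grouped: defaultdict[Hashable, list[int]] = defaultdict(list)
    for index, target in enumerate(targets):
        # Labels are kept in first-seen order; indices stay ascending.
        grouped[target].append(index)
    return dict(grouped)
```

Returning a plain `dict` keeps the doctest output stable and avoids leaking the `defaultdict` behaviour to callers.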
| | ||
| | ||
| class MultinomialNBClassifier: | ||
| def __init__(self, alpha=1): |
Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: alpha
| self.priors = None | ||
| self.alpha = alpha | ||
| | ||
| def _check_X(self, X): |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function _check_X
Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: _check_X
Please provide return type hint for the function: _check_X. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide descriptive name for the parameter: X
Please provide type hint for the parameter: X
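A minimal sketch of how the validator might look once the naming, type hint, and doctest requests are applied. The name `check_data` and the reworded error message are assumptions, not the PR author's final code; it is written as a free function here so the example stays self-contained. Note that `sparse.issparse` accepts any SciPy sparse format, not only `csr_matrix`, so the message is phrased accordingly.

```python
from scipy import sparse


def check_data(data: sparse.csr_matrix) -> None:
    """
    Raise ValueError unless ``data`` is a SciPy sparse matrix.

    >>> check_data(sparse.csr_matrix([[1, 0], [0, 2]]))
    >>> check_data([[1, 0], [0, 2]])
    Traceback (most recent call last):
        ...
    ValueError: Matrix data must be a scipy.sparse matrix
    """
    # issparse is True for every sparse format (CSR, CSC, COO, ...).
    if not sparse.issparse(data):
        raise ValueError("Matrix data must be a scipy.sparse matrix")
```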
| if not sparse.issparse(X): | ||
| raise ValueError("Matrix X must be an instance of scipy.sparse.csr_matrix") | ||
| | ||
| def _check_X_y(self, X, y): |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function _check_X_y
Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: _check_X_y
Please provide return type hint for the function: _check_X_y. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide descriptive name for the parameter: X
Please provide type hint for the parameter: X
Please provide descriptive name for the parameter: y
Please provide type hint for the parameter: y
| raise ValueError( | ||
| "The expected shape for array y is (" + str(X.shape[0]) + ",), but got (" + str(len(y)) + ",)") | ||
| | ||
| def fit(self, X, y): |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit
Please provide return type hint for the function: fit. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide descriptive name for the parameter: X
Please provide type hint for the parameter: X
Please provide descriptive name for the parameter: y
Please provide type hint for the parameter: y
| return np.array(y_pred) | ||
| | ||
| | ||
| def main(): |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main
Please provide return type hint for the function: main. If the function does not return a value, please provide the type hint as: def function() -> None:
| def main(): | ||
| newsgroups_train = fetch_20newsgroups(subset='train') | ||
| newsgroups_test = fetch_20newsgroups(subset='test') | ||
| X_train = newsgroups_train['data'] |
Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: X_train
| newsgroups_test = fetch_20newsgroups(subset='test') | ||
| X_train = newsgroups_train['data'] | ||
| y_train = newsgroups_train['target'] | ||
| X_test = newsgroups_test['data'] |
Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: X_test
| X_test = newsgroups_test['data'] | ||
| y_test = newsgroups_test['target'] | ||
| vectorizer = TfidfVectorizer(stop_words='english') | ||
| X_train = vectorizer.fit_transform(X_train) |
Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: X_train
| y_test = newsgroups_test['target'] | ||
| vectorizer = TfidfVectorizer(stop_words='english') | ||
| X_train = vectorizer.fit_transform(X_train) | ||
| X_test = vectorizer.transform(X_test) |
Variable and function names should follow the snake_case naming convention. Please update the following name accordingly: X_test
for more information, see https://pre-commit.ci
…o naive-bayes-text-classifier # Conflicts: # machine_learning/multinomial_naive_bayes_classifier.py
| | ||
| | ||
| class MultinomialNBClassifier: | ||
| def __init__(self, alpha=1): |
Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: alpha
| self.priors = None | ||
| self.alpha = alpha | ||
| | ||
| def fit(self, data: sparse.csr_matrix, y: ArrayLike) -> None: |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit
Please provide descriptive name for the parameter: y
| return np.array(y_pred) | ||
| | ||
| | ||
| def main() -> None: |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main
| | ||
| | ||
| class MultinomialNBClassifier: | ||
| def __init__(self, alpha=1): |
Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: alpha
| self.priors = None | ||
| self.alpha = alpha | ||
| | ||
| def fit(self, data: sparse.csr_matrix, y: ArrayLike) -> None: |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit
Please provide descriptive name for the parameter: y
| return np.array(y_pred) | ||
| | ||
| | ||
| def main() -> None: |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main
for more information, see https://pre-commit.ci
…o naive-bayes-text-classifier
| | ||
| | ||
| class MultinomialNBClassifier: | ||
| def __init__(self, alpha=1): |
Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: alpha
| self.priors = None | ||
| self.alpha = alpha | ||
| | ||
| def fit(self, data: sparse.csr_matrix, y: ArrayLike) -> None: |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit
Please provide descriptive name for the parameter: y
| return np.array(y_pred) | ||
| | ||
| | ||
| def main() -> None: |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main
| | ||
| | ||
| class MultinomialNBClassifier: | ||
| def __init__(self, alpha=1): |
Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: alpha
| self.priors = None | ||
| self.alpha = alpha | ||
| | ||
| def fit(self, data: sparse.csr_matrix, y: ArrayLike) -> None: |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit
Please provide descriptive name for the parameter: y
| return np.array(y_pred) | ||
| | ||
| | ||
| def main() -> None: |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main
for more information, see https://pre-commit.ci
…o naive-bayes-text-classifier # Conflicts: # machine_learning/multinomial_naive_bayes_classifier.py
| | ||
| | ||
| class MultinomialNBClassifier: | ||
| def __init__(self, alpha: int = 1): |
Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:
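A sketch of a constructor that would satisfy this check. The `-> None` annotation is what the comment asks for; the `float` type for `alpha` and the annotation on `self.priors` are assumptions for illustration (the diff uses `alpha: int = 1`), chosen because additive smoothing is often given a fractional value.

```python
from __future__ import annotations

import numpy as np


class MultinomialNBClassifier:
    def __init__(self, alpha: float = 1.0) -> None:
        # Class priors are unknown until fit() has been called.
        self.priors: np.ndarray | None = None
        # Additive (Laplace) smoothing strength.
        self.alpha = alpha
```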
| self.priors = None | ||
| self.alpha = alpha | ||
| | ||
| def fit(self, data: sparse.csr_matrix, targets: npt.ArrayLike) -> None: |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit
| return np.array(y_pred) | ||
| | ||
| | ||
| def main() -> None: |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main
| from sklearn.metrics import accuracy_score | ||
| | ||
| | ||
| def group_indices_by_target(targets): |
Please provide return type hint for the function: group_indices_by_target. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: targets
| | ||
| | ||
| class MultinomialNBClassifier: | ||
| def __init__(self, alpha=1): |
Please provide return type hint for the function: __init__. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: alpha
| self.priors = None | ||
| self.alpha = alpha | ||
| | ||
| def fit(self, data, targets): |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function fit
Please provide return type hint for the function: fit. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: data
Please provide type hint for the parameter: targets
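To make the fitting step concrete, here is a hedged, standalone sketch of the training computation with type hints. Most of the `fit` body is not shown in this diff; the only visible fragment is the denominator `tot_features_count + self.alpha * n_features`, which matches standard additive (Laplace) smoothing, so the rest is reconstructed from that formula and is an assumption. The function name `smoothed_log_likelihoods` and its return layout are hypothetical.

```python
from __future__ import annotations

import numpy as np
import numpy.typing as npt
from scipy import sparse


def smoothed_log_likelihoods(
    data: sparse.csr_matrix, targets: npt.ArrayLike, alpha: float = 1.0
) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Return (classes, log_priors, feature_log_probs) for multinomial NB."""
    targets = np.asarray(targets)
    classes = np.unique(targets)
    n_samples, n_features = data.shape
    log_priors = np.empty(len(classes))
    feature_log_probs = np.empty((len(classes), n_features))
    for row, label in enumerate(classes):
        # Select the rows (documents) that belong to this class.
        class_rows = data[np.flatnonzero(targets == label)]
        log_priors[row] = np.log(class_rows.shape[0] / n_samples)
        # Per-feature totals for the class, flattened to a 1-D array.
        features_count = np.asarray(class_rows.sum(axis=0)).ravel()
        tot_features_count = features_count.sum()
        # Additive smoothing: (count + alpha) / (total + alpha * n_features).
        feature_log_probs[row] = np.log(features_count + alpha) - np.log(
            tot_features_count + alpha * n_features
        )
    return classes, log_priors, feature_log_probs
```

With `alpha > 0`, every feature gets a non-zero probability in every class, so a word unseen in training for some class does not force that class's likelihood to zero.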
| tot_features_count + self.alpha * n_features | ||
| ) | ||
| | ||
| def predict(self, data): |
Please provide return type hint for the function: predict. If the function does not return a value, please provide the type hint as: def function() -> None:
Please provide type hint for the parameter: data
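As an illustration of a type-hinted prediction step, the sketch below scores each sample against each class in log space and returns the highest-scoring label. It is a vectorized variant, not the PR's code: the diff only shows that `predict` ends with `return np.array(y_pred)`, which suggests a per-sample loop. The function name `predict_labels` and its parameters (class labels, log priors, and per-class feature log-probabilities) are assumptions.

```python
import numpy as np
from scipy import sparse


def predict_labels(
    data: sparse.csr_matrix,
    classes: np.ndarray,
    log_priors: np.ndarray,
    feature_log_probs: np.ndarray,
) -> np.ndarray:
    """Return the most likely class label for each row of ``data``."""
    # Joint log-likelihood, shape (n_samples, n_classes):
    # sum over features of count * log P(feature | class), plus log P(class).
    scores = np.asarray(data @ feature_log_probs.T) + log_priors
    return classes[np.argmax(scores, axis=1)]
```

Working with sums of logarithms instead of products of probabilities avoids numerical underflow on long documents.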
| return np.array(y_pred) | ||
| | ||
| | ||
| def main(): |
As there is no test file in this pull request nor any test function or class in the file machine_learning/multinomial_naive_bayes_classifier.py, please provide doctest for the function main
Please provide return type hint for the function: main. If the function does not return a value, please provide the type hint as: def function() -> None:
Describe your change:
I implemented a multinomial naive Bayes classifier from scratch. It is trained and tested on the 20 newsgroups dataset from scikit-learn (`fetch_20newsgroups`) to perform text classification.
Checklist: