@sudoskys sudoskys commented Mar 29, 2025

  • No more manual line-break management!
-detect("hello world", low_memory=False, use_strict_mode=True)
+detect("hello world", low_memory=False, config=LangDetectConfig(allow_fallback=False))
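The migration in the diff above can be supported by a small compatibility shim. The following is a minimal sketch, not the library's actual code: `Config` is a hypothetical stand-in for `LangDetectConfig`, `resolve_config` is an invented helper name, and the mapping `use_strict_mode=True` to `allow_fallback=False` is inferred only from the before/after lines of the diff.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class Config:
    """Hypothetical stand-in for LangDetectConfig (only one field sketched)."""
    allow_fallback: bool = True


def resolve_config(
    use_strict_mode: Optional[bool] = None,
    config: Optional[Config] = None,
) -> Config:
    """Translate the deprecated flag into a config object (assumed mapping)."""
    if use_strict_mode is not None:
        # Tell callers the old keyword still works but is on its way out.
        warnings.warn(
            "use_strict_mode is deprecated; pass config=... instead",
            DeprecationWarning,
            stacklevel=2,
        )
        if config is None:
            # Assumption: strict mode means "do not fall back".
            config = Config(allow_fallback=not use_strict_mode)
    # An explicit config always wins; otherwise use defaults.
    return config if config is not None else Config()
```

In this sketch an explicitly passed `config` takes precedence over the deprecated flag, so existing callers keep working while new callers are not affected by the legacy keyword.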
Normalize text input to improve detection accuracy, particularly for all-uppercase text. Converting all-uppercase text to lowercase prevents it from being misdetected as Japanese.
Reordered imports for better readability and alignment with PEP 8. This change enhances maintainability by ensuring consistent import order, making the codebase easier to navigate and understand. 🛠️
Enhanced text normalization by removing newline characters and lowercasing all-uppercase text to improve prediction accuracy. Added warnings for deprecated parameters and moved configuration into `LangDetectConfig`.
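The normalization described in these commits can be illustrated roughly as follows. This is a sketch under assumptions, not the merged implementation: `normalize_text` is a hypothetical name, and replacing newlines with a space (rather than deleting them) is a choice made here so that adjacent words do not run together.

```python
def normalize_text(text: str) -> str:
    """Sketch of input normalization before language detection."""
    # Replace line breaks with spaces so the model sees a single line.
    text = text.replace("\r\n", " ").replace("\n", " ")
    # str.isupper() is True only if there is at least one cased character
    # and every cased character is uppercase, so mixed-case text and
    # digits-only text are left untouched.
    if text.isupper():
        text = text.lower()
    return text
```

Lowercasing only when the whole string is uppercase keeps ordinary capitalization intact, which may matter for languages where case carries information.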
Removed the `test_newline` function from `tests/test_real_detection.py` as it was deemed unnecessary. This streamlines our test suite by eliminating redundant checks, ensuring more focused and efficient test execution.
@sudoskys sudoskys changed the title from "✨ feat(app): add input normalization to language detection" to "✨ feat(app): [Compatibility changes] add input normalization to language detection" Mar 29, 2025
Converted `_normalize_text` to a static method and refined logging messages for clarity. Text processing now explicitly handles newline characters and long inputs, in line with issue #14. 🛠️
Introduced `_preprocess_text` method to clean and validate text before detection. It removes newline characters and logs a warning if the text length exceeds 100 characters, which improves prediction accuracy and prevents errors.
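A preprocessing step of this kind might look like the sketch below. The function name `preprocess_text`, the logger name, and the warning wording are assumptions for illustration; only the newline removal and the 100-character threshold come from the commit message. The sketch warns without truncating, since the message does not mention truncation.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("langdetect_sketch")

MAX_LENGTH = 100  # threshold taken from the commit message


def preprocess_text(text: str) -> str:
    """Clean text and warn about long inputs before detection (sketch)."""
    if "\n" in text:
        # Line breaks are replaced so the input is a single line.
        text = text.replace("\n", " ")
    if len(text) > MAX_LENGTH:
        # Only warn; the text is returned unchanged in length.
        logger.warning(
            "Input is %d characters long (more than %d); "
            "detection may be less reliable.",
            len(text),
            MAX_LENGTH,
        )
    return text
```

Logging instead of raising keeps detection working for long inputs while still surfacing the condition to the caller.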
@sudoskys sudoskys linked an issue Mar 29, 2025 that may be closed by this pull request
@sudoskys sudoskys merged commit f4fc032 into main Mar 29, 2025
10 checks passed