lpla

Highlights

  • Pro

Organizations

@paracrawl @bitextor @macocu @multiscore @Grupo-Enercoop

Pinned

  1. bitextor/bitextor (Public)

    Bitextor generates translation memories from multilingual websites

    Python · 299 stars · 42 forks

  2. bitextor/bicleaner (Public)

    Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.

    Python · 159 stars · 22 forks

  3. bitextor/bicleaner-ai (Public)

    Bicleaner fork that uses neural networks

    Python · 40 stars · 4 forks

  4. bitextor/warc2text (Public)

    Extracts plain text, language identification and more metadata from WARC records

    C++ · 23 stars · 6 forks