mbanon
🥔

Organizations

@paracrawl @bitextor @macocu @OpenEuroLLM

Pinned

  1. hplt-project/data-analytics-tool Public

    HPLT Analytics

    JavaScript · 15 stars · 4 forks

  2. fastspell Public

    Targeted language identifier, based on FastText and Hunspell.

    Python · 38 stars · 5 forks

  3. bitextor/bicleaner Public

    Bicleaner is a parallel corpus classifier/cleaner that detects noisy sentence pairs in a parallel corpus.

    Python · 159 stars · 22 forks

  4. bitextor/bifixer Public

    Tool to fix bitexts and tag near-duplicates for removal.

    Python · 34 stars · 3 forks

  5. paracrawl/corset Public

    Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data.

    SCSS · 20 stars · 3 forks

  6. paracrawl/keops Public

    Tool for manual evaluation of parallel sentences.

    PHP · 15 stars · 5 forks