GitYCC

Organizations

@mtkresearch

Pinned

  1. mtkresearch/TASTE-SpokenLM Public

    A method that directly addresses the modality gap by aligning speech tokens with the corresponding text transcription during the tokenization stage.

    Python, 97 stars, 11 forks

  2. mtkresearch/generative-fusion-decoding Public

    Generative Fusion Decoding (GFD) is a novel framework for integrating Large Language Models (LLMs) into multi-modal text recognition systems like ASR and OCR, improving performance and efficiency b…

    Python, 86 stars, 11 forks

  3. mtkresearch/clairaudience Public

    Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU 2023)

    Python, 27 stars, 3 forks

  4. g2pW Public

    Mandarin Chinese grapheme-to-phoneme converter: converts Chinese text to Zhuyin (Bopomofo) or Pinyin (INTERSPEECH 2022)

    Python, 366 stars, 46 forks

  5. crnn-pytorch Public

    Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition using PyTorch

    Python, 294 stars, 75 forks

  6. mtkresearch/MR-Models Public

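    MediaTek Research is dedicated to research on foundation models. We put our research into practice in models suited to Traditional Chinese users and, where licensing permits, make those models available for academic research and industry use.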

    251 stars, 24 forks