asr

One can use https://github.com/s-yata/marisa-trie to save a lot of space for symbols.
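To make the space argument concrete, here is a minimal sketch of why a trie shrinks a symbol table whose entries share prefixes. It uses a plain nested-dict prefix trie, not MARISA itself (MARISA adds a much more compact succinct encoding on top of the same idea). The symbol list and the helper names `build_trie` and `count_edges` are hypothetical, chosen only for illustration.

```python
def build_trie(symbols):
    """Build a nested-dict prefix trie; the "" key marks the end of a symbol."""
    root = {}
    for sym in symbols:
        node = root
        for ch in sym:
            # Reuse the child for this character if a previous symbol created it.
            node = node.setdefault(ch, {})
        node[""] = True  # end-of-symbol marker
    return root


def count_edges(node):
    """Count character edges, i.e. the characters actually stored in the trie."""
    total = 0
    for ch, child in node.items():
        if ch == "":
            continue  # skip the end-of-symbol marker
        total += 1 + count_edges(child)
    return total


# Hypothetical symbol table entries with heavily shared prefixes.
symbols = ["speech", "speeches", "speaker", "speakers", "spoken"]
trie = build_trie(symbols)

flat_chars = sum(len(s) for s in symbols)  # characters in a flat list
trie_chars = count_edges(trie)             # characters after prefix sharing

print(f"flat: {flat_chars} chars, trie: {trie_chars} chars")
```

The saving grows with the amount of prefix overlap, so how much a real symbol table benefits depends on its contents.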

I am a newcomer to the audio field. I have some questions about using this project to generate audio embeddings for my multimodal model (text and audio).

I want to use Mockingjay. I ran `python preprocess_any.py --feature_type=mel` but got 80-dim features, so I simply changed num_mel in utility/audio.py from 80 to 160 (the README says this model needs 160-dim mel features). Is that right?



Here are 299 public repositories matching this topic...

wzpan / wukong-robot

tensorflow / lingvo

mravanelli / pytorch-kaldi

didi / delta

audier / DeepSpeechRecognition

freewym / espresso

srvk / eesen

pykaldi / pykaldi

mravanelli / SincNet

alphacep / vosk-api

Compress symbol table

snakers4 / open_stt

kaituoxu / Speech-Transformer

lium-lst / nmtpytorch

Picovoice / cheetah

gooofy / zamia-speech

hirofumi0810 / tensorflow_end2end_speech_recognition

hirofumi0810 / neural_sp

zw76859420 / ASR_Theory

jcsilva / docker-kaldi-gstreamer-server

athena-team / athena

robmsmt / KerasDeepSpeech

goodatlas / zeroth

alphacep / vosk-android-demo

andi611 / Self-Supervised-Speech-Pretraining-and-Representation-Learning

Questions about preprocessing custom dataset

Tutorial for application on custom dataset

belambert / asr-evaluation

Ailln / cn2an

speechio / chinese_text_normalization

alphacep / vosk-server

louiskirsch / speechT

wangkaisine / mrcp-plugin-with-freeswitch
