ngram

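Polyphonic characters are currently handled with pypinyin or g2pM, whose accuracy is limited. The idea is to build a polyphone prediction model based on BERT (or ERNIE). Put simply: suppose a language has 100 polyphonic characters, each with at most 3 pronunciations. Then 100 three-way classifiers (a simple fc layer each is enough) can be attached on top of BERT, and at prediction time the classifier for the character in question is looked up and used for classification.
Reference paper: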
tencent_polyphone.pdf

For data, the dataset provided by https://github.com/kakaobrain/g2pM can be used.
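
The per-character classifier idea described above can be sketched roughly as follows. This is only a minimal illustration in plain Python, not the actual PaddleSpeech design: the `PolyphoneHeads` class, the `READINGS` table, the tiny `HIDDEN_SIZE`, and the randomly initialised weights are all assumptions made for the example, and the `hidden` vector stands in for the encoder output (BERT or ERNIE) at the position of the polyphonic character.

```python
import random

HIDDEN_SIZE = 8    # hypothetical stand-in for the encoder hidden size
MAX_READINGS = 3   # "at most 3 pronunciations" from the description

# Hypothetical, illustrative table: polyphonic character -> candidate readings.
READINGS = {
    "行": ["xing2", "hang2"],
    "长": ["chang2", "zhang3"],
    "和": ["he2", "he4", "huo4"],
}


class PolyphoneHeads:
    """One small linear (fc-like) classifier per polyphonic character."""

    def __init__(self, readings, hidden_size=HIDDEN_SIZE, seed=0):
        rng = random.Random(seed)
        self.readings = readings
        # One MAX_READINGS x hidden_size weight matrix per character.
        # Untrained random weights; a real model would learn these.
        self.weights = {
            ch: [
                [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
                for _ in range(MAX_READINGS)
            ]
            for ch in readings
        }

    def predict(self, char, hidden):
        """Pick a reading for `char` given its contextual vector `hidden`."""
        candidates = self.readings[char]   # KeyError if char has no head
        rows = self.weights[char]
        # Score only the readings that actually exist for this character.
        logits = [
            sum(w * h for w, h in zip(row, hidden))
            for row in rows[: len(candidates)]
        ]
        best = max(range(len(logits)), key=logits.__getitem__)
        return candidates[best]


if __name__ == "__main__":
    heads = PolyphoneHeads(READINGS)
    fake_hidden = [0.5] * HIDDEN_SIZE      # placeholder for an encoder output
    print(heads.predict("行", fake_hidden))
```

Looking the head up by character keeps each classifier's output space small (only that character's readings), which is the point of using many small heads rather than one large classifier over every reading.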

Advanced: multi-task BERT
![image](https://user-images.githubusercontent.com/24568452

Here are 111 public repositories matching this topic...

PaddlePaddle / PaddleSpeech

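Polyphone prediction for the speech synthesis text frontend, based on BERT

Pause prediction for the speech synthesis text frontend, based on BERT

Reproduce a simple music_generation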

zhezhaoa / ngram2vec

lonePatient / albert_pytorch

lonePatient / daguan_2019_rank9

proycon / colibri-core

ChrisMuir / refinr

wrathematics / ngram

words / n-gram

ranelpadon / ngram-type

suggest-go / suggest

Support for Protocol Buffers

Unit tests [The FIRST Principal]

vickumar1981 / stringdistance

myazi / NLP

BitSpeech / SRILM

joshualoehr / ngram-language-model

zheng5yu9 / unsupervised_extract_detect_words

mkearney / chr

StarlangSoftware / NGram-Py

Aurelius84 / N-gram

tapos12 / N-gram-Language-model

gheyret / UyghurNgram

mideind / Icegrams

DanielJohnBenton / Ngrams.java

slowikj / seqR

dohliam / hawaiian-corpus

loginn / ngrams_graphs

naturalness / unnaturalcode

ardcore / ngram

boon-cpu / boonkov

fredriko / metacurate-lexicon

mochi-co / ngrams
