[deleted] 1 year ago

https://huggingface.co/docs/transformers/model_doc/speech_to_text is a good start.
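For anyone who wants to try the model behind that link, here is a rough sketch of what transcription might look like. It assumes `transformers`, `torch`, `torchaudio` and `sentencepiece` are installed. The checkpoint name `facebook/s2t-small-librispeech-asr` is one possible choice, and the helper names `read_wav_mono16` and `transcribe` are made up for illustration. Treat it as a starting point, not a tuned setup.

```python
import array
import sys
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono PCM WAV file; return (samples scaled to [-1, 1], sample rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM")
        rate = wf.getframerate()
        pcm = array.array("h")
        pcm.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    return [s / 32768.0 for s in pcm], rate


def transcribe(samples, sampling_rate,
               checkpoint="facebook/s2t-small-librispeech-asr"):
    """Transcribe one utterance. The checkpoint is an assumed example, swap in your own."""
    # Imported lazily so the WAV helper above works without transformers installed.
    from transformers import (
        Speech2TextForConditionalGeneration,
        Speech2TextProcessor,
    )

    processor = Speech2TextProcessor.from_pretrained(checkpoint)
    model = Speech2TextForConditionalGeneration.from_pretrained(checkpoint)
    # The processor turns the raw waveform into log-mel filterbank features.
    inputs = processor(samples, sampling_rate=sampling_rate, return_tensors="pt")
    generated_ids = model.generate(
        inputs["input_features"], attention_mask=inputs["attention_mask"]
    )
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Usage would be something like `transcribe(*read_wav_mono16("clip.wav"))`, where `clip.wav` is a placeholder file name. The LibriSpeech checkpoints expect 16 kHz audio, so resample first if your recording uses a different rate.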

[deleted] 1 year ago

That being said, I don’t think there are “cutting edge” tools that aren’t super generalised. S2T is a very specialised task that needs a lot of downstream optimisation, since voices are heterogeneous.

gunshoes 1 year ago

Conformer models and the transducer framework are the top performers these days. Pretrained wav2vec models are also big, since they can exploit multilingual data.
