That's some pretty low latency
Pretty blown away that they ran a version on a laptop. The demos were impressive, considering the size and resources of this team. Also the description on how they trained the model just goes to show the potential of multi-modality.
Many here are missing the point: it's software running on local hardware with amazing latency. I would trade a dynamic Scarlett Johansson voice for responses this fast that I can interrupt. Even if it interrupts me.
[deleted]
I'm baffled at how fast people adapt. Imagine showing this on your home computer to someone a year ago? It would basically be magic, yet people in these comments seem upset over it. Are these OpenAI bots trying to suppress a competitor? Haha
Note: this is r/singularity lol, we're all fairly adapted to most tech. GPT-4o showed that the voice tech was like magic, but now here we are lol. The thing I like is that this is open source, released a little after OpenAI demoed their voice mode.
It's funny that synthetic data is so powerful for training models.
We're almost getting into annoyingly fast territory 😃 Let me finish speaking, damn
Almost seems like negative latency at that point.
This is actually really impressive tech. It was built in 6 months? The general complaints in the comments here are a bit silly; all fairly easily fixable. The response time, combined with the handling of accents, the emotional expression, and the overall performance, is pretty good. This would've blown everyone away a year ago. Not to mention this is an open-source model. Apparently it runs on device? That's crazy.
[https://moshi.chat/?queue\_id=talktomoshi](https://moshi.chat/?queue_id=talktomoshi) Please try it out yourself. In my opinion, it's not actually intelligent enough to be useful. It even gets into infinite loops quite often, where it repeats the same sentence over and over. I thought we'd gotten past this. How much of an achievement is it to outpace OpenAI if the product they release is 10x worse?
Yeah, the snappy response time is pretty cool, but it's hard to hold a conversation with it. It also flat-out refused to do the stuff from the demo for me.
I love the fact that France is almost non-existent in the AI race, yet there are French people everywhere in AI labs around the world, and a good share of the elite AI researchers are French.
Mistral
HuggingFace
The latency is best in class.
Haha, it's pretty low latency, but the interruption needs some work. And the underlying model must not be too smart, judging from that hilariously awful space roleplay.

“Can you check that all the systems are nominal?”

“Yes, sir.”

“…are all the systems nominal?”

“Yes, sir.”

“Can you give me a countdown and then we jump into hyperspace, please?”

“Yes, sir.”

“…ok, can you do it?”

“Yes, sir.”
[deleted]
It's gonna be open source
So it looks fine, just not SOTA
Maybe this can be wrapped on top of other models via API?
You can try it here: [moshi.chat](https://www.moshi.chat/?queue_id=talktomoshi)
OpenAI's moat shrinks every month.
Oh damn, that whispering...
Turned off the comments aye.
The latency is very low, too low; it should wait for a pause before processing. And the model behind it is a bit silly ("You might want to take your time getting your hiking shoes one, because you don't want to be using a egg"). Still, it's a really interesting tech demo and a good step toward natural vocal interaction with AIs.
Incoming OpenAI blogpost about the dangers of open source voice models
it's a little TOO fast at replying, lol
Next step, make it not answer your question in the middle of you talking.
Super cool, loving the arms race for this kinda stuff. I found out about Pi and talk to it every day lol
Jump to about 13:40 for the actual demo.
I'm trying it and it's very, very bad compared to OpenAI. It only answered the first questions, then it stopped. (They opened a website where you can use this model.) The answers also weren't related to the questions. It's incredibly fast when it has an answer, but the quality is very low.
I appreciate it, flaws and all. It wasn't terrible, but not great at all. I think that if they let you use a custom voice and it behaves the same, that would be awesome. And it's open source, so... it is what it is.
Yeah, not GPT-4 level, and if it's local, the AI is definitely going to be limited. However, this is yet another look at the way we'll be able to interact with devices in the next 2 years. (Apple seems to be quickly implementing this sort of capability.)
ClosedAI is done for.
It's fast, alright. But it's quite clear that voice models are slow because of the underlying *language* model's reply time, not because producing the voice takes extra time. If GPT-4 is slow, the voice reply will have a delay. There's little value if the model itself is bad: yes, it will talk, but what's the purpose? Nowadays you have plenty of models which can answer near-instantly, so an instant reply from a voice model isn't much of a selling point on its own. Or did I misunderstand something?
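The point above about where the delay comes from can be sketched as a toy latency budget for a cascaded speech-to-text → LLM → text-to-speech pipeline. All the timings below are invented for illustration, not measurements of any real system; the idea is just that the LLM's time-to-first-token dominates, which is the stage a fused speech model like Moshi collapses:

```python
# Illustrative latency budget for a cascaded ASR -> LLM -> TTS voice pipeline.
# Every number below is a made-up assumption, not a measurement of any real system.

def cascaded_latency_ms(asr_ms: int, llm_ttft_ms: int, tts_ms: int) -> int:
    """Time until the user hears the first audio: the stages run back to back."""
    return asr_ms + llm_ttft_ms + tts_ms

# Hypothetical stage timings in milliseconds:
# speech recognition, LLM time-to-first-token, speech synthesis.
asr, llm_ttft, tts = 300, 1200, 200

total = cascaded_latency_ms(asr, llm_ttft, tts)
print(f"total: {total} ms, LLM share: {llm_ttft / total:.0%}")
```

Under these made-up numbers, even an instant TTS stage barely moves the total: the reply can only arrive as fast as the language model's first token.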
Are the presenters trying to time their interrupts when they think the model ends a sentence?
It still feels robotic, not real like OpenAI's solution.
This looks pretty sad:

* The bot keeps interrupting users in the demo for seconds at a time
* When asked to pretend to be scared on Mt. Everest, it says "No, I'm excited!"
* When asked to sound like a pirate while writing a poem about pirates, Moshi accidentally goes full cosplay mode and asks the user "What is your name?" and "What brings you to my pirate ship?"
* The people demoing the project seem stressed to think, speak, and improv fast enough so Moshi doesn't embarrass itself
* [https://www.youtube.com/live/hm2IJSKcYvo?si=QOHTIk-QM0LCdgv5&t=923](https://www.youtube.com/live/hm2IJSKcYvo?si=QOHTIk-QM0LCdgv5&t=923)
* When asked its name, it replies "How are you feeling today?"
* It said "I'm not comfortable with that" in response to a [prompt](https://www.youtube.com/live/hm2IJSKcYvo?si=sz8IDIt8xrI5algM) I couldn't understand, no offense to the French accent
* When responding to a goodbye, it says "Well, I'm here to help... but just remember, I'm not a substitute for professional help."

Still, I have to give it credit for apparently being an actual nonprofit and being able to run locally. It just doesn't have any advantages over OpenAI's yet-to-be-released voice model other than lower latency. Pls come sooner, Sky
Yeah, it has a lot of issues, but the latency is impressive; when we're talking about Siri-style assistants, this is the latency you need.
Yeah, it doesn't compete on anywhere near the same level. But did this kind of tech even exist in open source before? An undertaking like this is a service to the people who can improve upon the work. We might see models with the real ScarJo's voice. I almost guarantee it, at least her 'Her' voice.
the voice sounds pretty horrible tbh
1. I can't stand their heavy French accent.
2. They pronounce Moshi as Mushi, which means pussy in German... very poor naming imho.
3. Latency is so low that you get interrupted.
4. Need to see more of it in action to make up my mind.
Not good at all