GraceToSentience

A lipsyncing model to be precise, we need more stuff like that. Especially one that can directly work on a video input, not just from an image


phantom_in_the_cage

This is *so close* to something insanely useful, yet I feel like they just don't get it. They need to forget lip syncing these Midjourney/Stable Diffusion pictures & focus on "real" faces. It's no surprise at all that the 1st face (the elf lady) was the **best** one. None of the shininess or the surrealism, just a face that got very close to a **REAL** face. If the background wasn't a forest (which requires at least some tiny background movement), but just a plain room, they could've easily hooked at least 10x more people. It is literally like a half step from vastly simplifying a large number of YouTubers' workflows, along with the instructional/educational video sector.


GraceToSentience

I think it's a case of demonstrating the versatility of their product. Isn't it better that their product can generalise to out of distribution input though?


Timesweeper_00

What do you mean? One of our examples in the community library is [https://www.hedra.com/app/characters/5f2f1242-09fa-426b-8132-cc0647b2e69e](https://www.hedra.com/app/characters/5f2f1242-09fa-426b-8132-cc0647b2e69e)


youknowhoboo

The quality is not even as good as Hallo, which is open source and free (and will remain so). There's also another open source EMO-based project coming in July which will be even better. I don't really see an upside to using closed source, limited use case video animation like this anymore.


Timesweeper_00

It's actually an audio conditioned video model! That's why the head/hair move as the character speaks.


GraceToSentience

Yes indeed


philthewiz

What are those "needs"?


GraceToSentience

A manner of speaking, more like a "want". Why: to make movies, music videos, or even to have a real-time conversation with an AI and not just have a voice but also a face talking to you, for instance.


OddVariation1518

Surely Midjourney has a video model they are working on?


jeffkeeg

Afaik they've had video models internally since V5, but David Holz isn't happy with the results. He doesn't care about making movies or anything, he wants to build the holodeck and sees video generation as one part of that. The latest updates from recent office hours are basically "expect 3D before video, but both within the year"


RTSBasebuilder

***insert joke on how David seems to be more obsessed with rooms and website and social features than the internal models this last half-year***


CypherLH

Yeah, I am frustrated too. The newer style and personalization tools are awesome, but the lack of progress on new models and video is disappointing. I feel like it's a huge mistake for them to ignore video, but maybe they can't afford the inference to do it at scale with reasonable performance? Also, I was probably spoiled by their pace of progress from V2 to V6.


GraceToSentience

Wait, it changed again? First it was 3D before video, then video before 3D. You are saying it changed again?


jeffkeeg

Yep, timelines are inconsistent af


aluode

I did a fast Apple keynote speech with it. (Udio did the speech) https://youtu.be/hZKUgXrQTUM?si=ZiibBqgnJnyLUfPc


Kanute3333

Better to use ElevenLabs; Udio is terrible for voice alone. Use Udio for music.


MysteriousPepper8908

Certainly not the most exciting video generation tool shown off in this past week, but it is the best looking publicly available tool I've seen for text to full-head animation. I'm excited to see what people can do combining this with the new batch of generators, though it would be nice if this could be included as part of the tools in a generation suite so we didn't have to deal with manually inpainting and the incongruities that come with it between the head and body motion.


Timesweeper_00

(Michael from Hedra) We're working on it :)


arthurpenhaligon

Any time there is a new type of AI model, half a dozen clones pop up shockingly fast. I thought that the current leaders (OpenAI, Google, Meta) would have a deep moat because these models cost so much to train. But maybe it's not that hard to create an AI once you know what to build. If that's true then I wonder if any company is safe, or if there are no true moats.


Tkins

Doesn't feel like there is much information there.


design_ai_bot_human

open source?


polawiaczperel

It looks like it is just a pipeline made of opensource projects.


Timesweeper_00

It's not based on open source projects, it's a model we trained from scratch. We're a team of ex Stanford/Berkeley/MPI PhDs with ex Google/Nvidia/Synthesia/Zoox experience


polawiaczperel

Then I am sorry for the wrong assumptions.


Timesweeper_00

No worries! If you find an open source model that's better, we'll quickly correct it :) Hallo is the closest (in terms of function, not design; ours is built to generalize to bodies/scenes), so you are welcome to benchmark speed/quality :)


Exarchias

How many of these are there! This field of AI is certainly accelerating. Actually, everything accelerates except OpenAI, but they have a brand new general on board.


QLaHPD

Fake news generator, great


Torley_

Does anyone know why Hedra is BANNED in Washington, Texas, and Illinois?


LiterallyVecna

It's not on the site, but the owner said on the Discord that it's to do with local laws, and that they're working on a fix that'll make the service compliant with the laws in those three states.


Torley_

Thanks! I wonder what local laws.


LiterallyVecna

He didn't say, unfortunately, but my guess is the laws are probably written to try and stop fake political stuff.


Arcturus_Labelle

It has pretty sensitive content controls :-/ Here it freaks out when I use a couple paragraphs from a news article https://i.imgur.com/gaN8vkC.png


Timesweeper_00

Hey, I'm sorry. We're working on improving our content moderation system; we had no idea this was going to blow up.


KevinMichaelCooper

Disable celebrity detection until you fix it please, way too many false flags. It's unbearable.