Microsoft's VASA-1 - A New AI Model That Turns Photos Into 'Talking Faces'


Well-known member
  • May 20, 2009
    A new AI research paper from Microsoft promises a future where you can upload a photo and a sample of your voice to create a live, animated talking head of your own face.

    VASA-1 takes in a single portrait photo and an audio file and converts them into a hyper-realistic talking face video, complete with lip sync, realistic facial features and head movement.

    The model is currently only a research preview and not available for anyone outside of the Microsoft Research team to try, but the demo videos look impressive.

    Similar lip sync and head movement technology is already available from Runway and Nvidia, but this seems to be of much higher quality and realism, with fewer mouth artifacts. This approach to audio-driven animation is also similar to the recent VLOGGER AI model from Google Research.


    Well-known member
  • Jan 8, 2023
    This is the death of the truth! (If this gets released to the public, that is. I really hope it doesn't.)

    What advantage comes from this technology? Nothing.
    Who asked for this creepy stuff? No one.
    Will this be used for the good of humanity? NOOOOOO!
    So, what will this be used for? Scams, porn, deepfakes, etc.

    Nothing good will come out of this. Hopefully this is just a research project meant to milk investors' money.



    Well-known member
  • Feb 6, 2023
    Ursa Major
    That indeed belongs to the magical realm. It seems that Harry Potter's magical realm will come true in the near future. Wow.