    Zeroscope V2 XL (txt2video) - v1.0

    Stop! These models are not for txt2img inference!

    Don't put them in your stable-diffusion-webui/models directory and expect to make images!

    So what are these?

    These are new ModelScope-based models for txt2video, optimized to produce 16:9 video compositions. They've been trained on 9,923 video clips and 29,769 tagged frames at 24 fps, 1024x576 resolution.

    Note that these are the bigger brothers to the https://civarchive.com/models/96454/zeroscope-v2-576w-txt2video models. The XL models use 15.3GB of VRAM when rendering 30 fps at 1024x576.
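    As a rough rule of thumb, you can pick a variant from your VRAM budget. This is a hypothetical helper (not part of the extension or the models), using only the 15.3GB figure quoted above:

    ```python
    # Hypothetical helper: suggest a Zeroscope variant from available VRAM.
    # The XL model needs ~15.3 GB at 1024x576; the 576w model is lighter.
    def pick_model(vram_gb: float) -> str:
        """Return the suggested model repo name for a given VRAM budget."""
        return "zeroscope_v2_XL" if vram_gb >= 15.3 else "zeroscope_v2_576w"

    print(pick_model(24.0))  # a 24 GB card can run the XL model
    print(pick_model(8.0))   # smaller cards should stick to 576w
    ```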

    Where do they go?

    Drop them in the \stable-diffusion-webui\models\ModelScope\t2v folder

    It's imperative that you rename text2video_pytorch_model.pt to use the .pth extension after downloading.

    The files must be named open_clip_pytorch_model.bin and text2video_pytorch_model.pth.
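    A minimal shell sketch of the steps above. The base path is relative and hypothetical (adjust it to your own install), and the touch lines merely stand in for the two real downloads:

    ```shell
    # Create the folder the txt2video extension expects.
    mkdir -p stable-diffusion-webui/models/ModelScope/t2v

    # Stand-ins for the downloaded files (replace with the real downloads).
    touch stable-diffusion-webui/models/ModelScope/t2v/open_clip_pytorch_model.bin
    touch stable-diffusion-webui/models/ModelScope/t2v/text2video_pytorch_model.pt

    # Rename the .pt file to the .pth extension the extension looks for.
    mv stable-diffusion-webui/models/ModelScope/t2v/text2video_pytorch_model.pt \
       stable-diffusion-webui/models/ModelScope/t2v/text2video_pytorch_model.pth
    ```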

    Who made them? Original Source?

    https://huggingface.co/cerspense/zeroscope_v2_XL

    What else do I need?

    These models are specifically for use with the txt2video Auto1111 WebUI Extension

    Comments (11)

    Enju · Jun 25, 2023 · 2 reactions

    15.3 gb vram :cry:

    Eluna · Jun 25, 2023 · 2 reactions

    You know you tagged this one "text2video" but the previous one "txt2video"? They show up as separate when searching by tag.

    This brings up the issue that there's no associative tagging on the site, so a lot of related content isn't discoverable via similar tags. For instance, when someone writes the tag "Full Metal Alchemist" but someone else only uses "FMA," they are separate in tag searches.

    ... I should really post this on the Ideas, I know, but I also don't know how to elaborate this - if it hasn't been before. Summary of all this is: Synonym tags should have a way of being linked so everything is properly grouped together, or something to the same effect.

    theally (Author) · Jun 25, 2023

    Good catch - fixed, and noted! That's perfect for submission to Ideas.

    Eluna · Jun 26, 2023 · 1 reaction

    @theally Got'cha and got it up. Found the formal terms "Tag Aliases" and "Tag Implications." I apologize in advance if this is implemented, for the extra work in upkeep.

    louimposteur · Jun 25, 2023 · 1 reaction

    Hi, thanks a lot for sharing!! I was wondering if this needs any other installation besides the two files here. I'm trying to use it with the A1111 extension, and I realize there is also a config file and a VQGAN encoder for ModelScope; are those the same? Also, do you recommend using the small version first? I'm also noticing that this needs xformers running.

    theally (Author) · Jun 25, 2023 · 1 reaction

    You can use ModelScope's VQGAN/config - I'm not even sure they're required, but perhaps they are. If you have at least 15GB of VRAM there's no need to use the small model - go straight for the big leagues! And I don't use xformers - Torch 2.0, SDP attention gang over here.

    louimposteur · Jun 25, 2023

    Cool, I'm gonna try that, thanks!!

    louimposteur · Jun 26, 2023 · 1 reaction

    it works like a charm!!!!!!!

    restoface · Jun 26, 2023 · 1 reaction

    Tip of the day! For every model you need to edit the config. The more VRAM the better, and stay within the frame size given for each model.

    halr9000 · Jun 25, 2023

    Curious if this would work widescreen 16:9? Or portrait only?

    happygo · Aug 17, 2023 · 1 reaction

    Tested this out and I'm only getting watermarked Shutterstock videos playing. Under models, only ModelScope is shown. I had to create the two folders ModelScope/t2v in the models dir, but the extension shows models/text2video, and the extension is in the extensions folder. Looking at all the JSON and .py files, nothing is calling for the ModelScope directory, so idk.