• PenisDuckCuck9001@lemmynsfw.com
    2 months ago

    That should work in koboldcpp. I’m running mine on an older GPU than that, but with more VRAM. Use the gpulayers parameter to control how many of the model's layers get offloaded to the GPU, which in turn determines how much VRAM gets used.
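
A rough sketch of how you might pick a starting value for gpulayers, in case it helps. It assumes every layer takes about the same share of the model file and that you want to keep some VRAM free for context and buffers; both are simplifications, not how koboldcpp itself measures anything. The function names, the `reserve_gb` default, the `koboldcpp.py` launch path, and `model.gguf` are made up for illustration; only the `--model` and `--gpulayers` flags are koboldcpp's.

```python
def estimate_gpu_layers(model_size_gb, total_layers, vram_gb, reserve_gb=1.5):
    """Guess how many layers fit in VRAM.

    Assumption: layers are equally sized, so each one costs
    model_size_gb / total_layers. reserve_gb is a hypothetical safety
    margin for context, scratch buffers, and the desktop.
    """
    if model_size_gb <= 0 or total_layers <= 0:
        raise ValueError("model_size_gb and total_layers must be positive")
    per_layer_gb = model_size_gb / total_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    # Never ask for more layers than the model actually has.
    return min(total_layers, int(usable_gb // per_layer_gb))


def build_command(model_path, gpu_layers):
    """Build a koboldcpp launch command as an argument list."""
    return [
        "python", "koboldcpp.py",  # assumed launch path; adjust to your install
        "--model", model_path,
        "--gpulayers", str(gpu_layers),
    ]


if __name__ == "__main__":
    # Example numbers only: an 8 GB model file with 32 layers on a 6 GB card.
    layers = estimate_gpu_layers(model_size_gb=8.0, total_layers=32, vram_gb=6.0)
    print(" ".join(build_command("model.gguf", layers)))
```

Treat the result as a starting point: if loading fails or VRAM fills up, lower the number; if there is headroom left, raise it.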