Trying something new: I’m going to pin this thread as a place for beginners to ask what may or may not be stupid questions, to encourage both the asking and the answering.

Depending on activity level, I’ll either make a new one once in a while or just leave this one up forever as a place to learn and ask.

When asking a question, try to make it clear what your current knowledge level is and where you may have gaps. That should help people give more useful, concise answers!

  • doodlebob@lemmy.world
    10 months ago

    I have two 3090 Turbo GPUs and it seems like oobabooga doesn’t split the load between the two cards when I try to run TheBloke/dolphin-2.7-mixtral-8x7b-AWQ.

    Does anyone know how to make text generation webui use both cards? Do I need an NVLink bridge between the two cards?

    • noneabove1182@sh.itjust.worksOPM
      10 months ago

      You shouldn’t need NVLink. I’m wondering if it’s something to do with AWQ, since I know that exllamav2 and llama.cpp both support splitting in oobabooga.

      • doodlebob@lemmy.world
        10 months ago

        I think you’re right. Saw a post on Reddit basically mentioning the same things I’m seeing.

        It looks like AutoAWQ supports it, but it might be an issue with how oobabooga implements it or something…
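
For the loaders mentioned above that do support splitting (exllamav2 and llama.cpp), the split is usually given as a comma-separated list with one number per card. Here is a rough sketch of working out such a list for two 24 GB cards like the 3090s in the question. The helper name `gpu_split` and the 2 GB of headroom per card are made up for illustration, and the assumption that the web UI takes this string through a `--gpu-split` style option should be checked against `--help` for your version.

```python
def gpu_split(vram_gb, reserve_gb=2.0):
    """Build a comma-separated per-GPU VRAM budget string (in GB).

    vram_gb    -- total VRAM of each card, in GB, in device order
    reserve_gb -- headroom left free on every card (an assumed value,
                  tune it for your context length and desktop usage)
    """
    budgets = []
    for index, total in enumerate(vram_gb):
        usable = total - reserve_gb
        if usable <= 0:
            raise ValueError(f"GPU {index} has no VRAM left after the reserve")
        # ":g" drops a trailing ".0", so 22.0 is written as "22"
        budgets.append(f"{usable:g}")
    return ",".join(budgets)


if __name__ == "__main__":
    # Two RTX 3090 cards, 24 GB each, keeping 2 GB free on both.
    print(gpu_split([24, 24], reserve_gb=2))
```

If the first card also drives a display, it may help to reserve a bit more on that one by passing a smaller number for it by hand.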