I know it is possible to split layers between Nvidia GPUs with cuBLAS. But with AMD and ROCm it is far more difficult, and maybe not implemented yet in any project?

  • j4k3@lemmy.world · 1 year ago

    I haven’t, but I can say that when I was researching hardware a month ago, I came across a telemetry source from a Stable Diffusion add-on. In that data there was clear evidence of cloud instances running banks of 7900 XTX cards. The 7000 series are supposedly the only Radeon cards supported by HIP. I didn’t see any other Radeon cards with the same signs of data-center use. Even in this instance, all the cards could be on separate containers, or whatever the correct term is for a cloud instance. I expect someone would connect them all for doing model training. Not really helpful, I know.

    Honestly, the code base is not that hard to parse, especially if you have a capable LLM running and let it explain snippets and small functions.

  • micheal65536@lemmy.micheal65536.duckdns.org · 1 year ago

    AMD GPU support appears to be included in GGML. I don’t see any reason why you wouldn’t be able to split across multiple GPUs, since the splitting is handled within GGML itself and isn’t tied to any particular library/driver/backend.
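
    As a sketch of what that looks like in practice: llama.cpp (which is built on GGML) exposes layer splitting through the `--tensor-split` and `-ngl` flags, and those flags work the same regardless of whether the binary was built against cuBLAS or ROCm/hipBLAS. The model filename below is just a placeholder, and this assumes you have already produced a ROCm-enabled build.

    ```shell
    # Placeholder model path; substitute your own GGUF file.
    # -ngl 99         offload (up to) all layers to the GPUs
    # --tensor-split  proportion of the model assigned to each GPU,
    #                 e.g. 1,1 splits roughly evenly across two cards
    ./main -m models/your-model.gguf \
      -ngl 99 \
      --tensor-split 1,1 \
      -p "Hello"
    ```

    The same split mechanism is used for the CUDA build, which supports the comment above: the per-GPU layer assignment lives in the shared backend code rather than in any vendor-specific library.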