I know it is possible to split layers between Nvidia GPUs with cuBLAS. But with AMD and ROCm it seems far more difficult. Has it been implemented yet in any project?
AMD GPU support appears to be included in GGML. I don't see any reason why you wouldn't be able to split across multiple GPUs, since the splitting is handled within GGML itself and isn't tied to any particular library, driver, or backend.
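To illustrate why layer splitting doesn't need to depend on the backend, here is a rough sketch of the idea in Python. This is not GGML's actual code: the function name `split_layers` and the `proportions` parameter are made up for illustration. The assumption is simply that each device gets a contiguous block of layers sized in proportion to some weight (for example, its share of VRAM).

```python
from itertools import accumulate


def split_layers(n_layers, proportions):
    """Assign contiguous layer ranges to devices, sized by relative weight.

    Hypothetical helper for illustration only; not part of GGML.
    `proportions` holds one non-negative weight per device.
    """
    total = sum(proportions)
    if n_layers < 0 or total <= 0:
        raise ValueError("need n_layers >= 0 and a positive total weight")

    # Cumulative weights give the upper layer boundary for each device.
    cumulative = accumulate(proportions)
    bounds = [0] + [round(n_layers * c / total) for c in cumulative]

    # Device i owns layers bounds[i] .. bounds[i + 1] - 1.
    return [range(bounds[i], bounds[i + 1]) for i in range(len(proportions))]


if __name__ == "__main__":
    # Two devices, the first with three times the weight of the second.
    for device, layers in enumerate(split_layers(40, [3, 1])):
        print(f"device {device}: layers {layers.start}..{layers.stop - 1}")
```

Nothing here refers to CUDA or ROCm. The assignment is just bookkeeping, and whichever backend is compiled in would then run the layers it was given.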