Just a stranger trying things.

  • 5 Posts
  • 166 Comments
Joined 1 year ago
Cake day: July 16th, 2023



  • I’m not sure I see the issue, to be honest. Development happens in the open, and the architecture is fairly flexible and designed specifically to be robust to rug pulls, so that less trust is required in the model.

    Also, whenever these discussions happen, I can’t shake the feeling that they are also meant to imply that Mastodon is somehow better. I am not a fan of that, as if there could only be one good social network. The internet is better with multiple services, multiple of many things. That’s how you get cooperation, compatibility and development for the better.






  • The Hobbyist@lemmy.zip to Selfhosted@lemmy.world: Self-hosting LLMs
    22 days ago

    I run Mistral-Nemo (12B) and Mistral-Small (22B) on my GPU and they are pretty good. As others have said, GPU memory is one of the most limiting factors. 8B models are decent, 15-25B models are good and 70B+ models are excellent (based solely on my own experience). Go for q4_K models, as they will run many times faster than higher-bit quantizations with little quality degradation. They typically come in S (Small), M (Medium) and L (Large) variants; take the largest that fits in your GPU memory. If you go below q4, you may see more severe and noticeable quality degradation.
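
    To make the "largest that fits" rule concrete, here is a rough back-of-the-envelope sketch. The bits-per-weight figures and the fixed overhead for context/KV cache are my own approximations for illustration, not exact values for any particular model file, and the function names are made up.

```python
# Approximate effective bits per weight for a few quantization levels.
# These numbers are assumptions for illustration; real files vary.
APPROX_BITS_PER_WEIGHT = {
    "q4_K_S": 4.6,
    "q4_K_M": 4.9,
    "q5_K_M": 5.7,
    "q8_0": 8.5,
}


def estimate_vram_gb(params_billion, bits_per_weight, overhead_gb=1.5):
    """Estimate GPU memory: weight size plus a flat allowance for context."""
    # billions of parameters * bits / 8 gives gigabytes of weights
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


def largest_quant_that_fits(params_billion, vram_gb, overhead_gb=1.5):
    """Return the highest-bit quantization whose estimate fits, or None."""
    fitting = [
        (bits, name)
        for name, bits in APPROX_BITS_PER_WEIGHT.items()
        if estimate_vram_gb(params_billion, bits, overhead_gb) <= vram_gb
    ]
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for size in (8, 12, 22):
        print(size, "B on 12 GB:", largest_quant_that_fits(size, 12))
```

    Treat the result as a first guess only: longer context windows need more memory than the flat overhead assumed here.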

    If you need to serve only one user at a time, ollama + Open WebUI works great. If you need multiple users at the same time, check out vLLM.

    Edit: I’m simplifying a lot, but hopefully this is simple and actionable as a starting point. I’ve also seen great stuff from Gemma2-27B
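
    For the single-user case, a minimal sketch of talking to a local ollama server from Python, using only the standard library. It assumes ollama is running on its default port (11434) and that a model tagged `mistral-nemo` has already been pulled; the helper names are made up for this example.

```python
import json
import urllib.request


def build_generate_request(prompt, model="mistral-nemo",
                           host="http://localhost:11434"):
    """Build a POST request for ollama's /api/generate endpoint."""
    # stream=False asks for one JSON object instead of a stream of chunks
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]
```

    Calling `generate("Why is the sky blue?")` only works while the server is up; Open WebUI talks to the same server, so this is mostly useful for scripting.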

    Edit2: added links

    Edit3: IMO a decent GPU in terms of bang for buck is the RTX 3060 with 12GB. It may be available on the used market for a decent price and offers a good amount of VRAM and GPU performance for the cost. I would like to recommend AMD GPUs, as they offer much more GPU memory for their price, but they are not all equally well supported by ROCm and I’m not sure about compatibility with these tools, so perhaps others can chime in.

    Edit4: you can also use Open WebUI with VS Code through the continue.dev extension, so that you get a Copilot-type LLM in your editor.




  • I understand your position. There is a learning curve to containers, but I can assure you that learning the basics will open up a whole new world of possibilities and make everything much easier for you. The vast majority of people run containers because they make services less brittle: each service has its own tailored environment and doesn’t depend on the host’s libraries and packages. Containers also improve security, because services can’t easily escape their boundaries, which makes their potential vulnerabilities less of an issue than when running the same services on bare metal.

    I started on Synology too. There is a website called Marius Hosting which focuses on tutorials for containers on Synology, but over the last few years his instructions have been updated to focus on spinning up containers manually rather than through the UI, which makes them more intimidating than they need to be for beginners… I’ll link it here just as a reference. I’ll see whether the Wayback Machine has the easier way and report back if I find something.

    Edit: yes, here is an original tutorial for Jellyfin (this method still works for me and is still how I use Docker these days): https://web.archive.org/web/20210305002024/https://mariushosting.com/how-to-install-jellyfin-on-your-synology-nas/
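
    For comparison, this is roughly what "spinning up a container manually" amounts to. It is not the tutorial's method, just a sketch that assembles a `docker run` command for the official `jellyfin/jellyfin` image; the `/volume1/...` paths are placeholders I made up, so adjust them to your own shares.

```python
import shlex


def jellyfin_run_command(config_dir, media_dir, port=8096):
    """Assemble a `docker run` command line for a Jellyfin container."""
    return [
        "docker", "run", "-d",
        "--name", "jellyfin",
        "--restart", "unless-stopped",
        "-p", f"{port}:8096",          # host port -> Jellyfin web UI port
        "-v", f"{config_dir}:/config",  # persistent settings on the host
        "-v", f"{media_dir}:/media",    # your media library
        "jellyfin/jellyfin",
    ]


# Hypothetical paths, for illustration only.
cmd = jellyfin_run_command("/volume1/docker/jellyfin", "/volume1/media")
print(shlex.join(cmd))
```

    Everything the service needs lives inside the image, and only the two mounted folders and one port touch the host, which is the isolation described above.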


  • To answer your question more specifically: most people set up the Pi with Docker, using services which have a front end accessible in the browser. They simply navigate to the front end of the service they want to use and administer it from there. For instance, Portainer to manage their Docker containers, Pi-hole to manage DNS-level ad blocking, or Jellyfin for their media, which serves as both the website to consume the media and an administrator dashboard.

    Edit: this complements something like Tailscale, which lets you access these services away from home. They work in conjunction.






  • I don’t care which is better. But I can share certain unique features which make me personally choose GrapheneOS over all other options I know of:

    • it is possible to relock the bootloader
    • you can disable the internet permission
    • the location service is independent of Google services, even if you install them
    • you can use multiple profiles and pipe notifications from one profile to another
    • you control native app debugging (and it’s off by default)
    • you have storage scopes (as well as contact scopes)
    • you get all the latest security patches, and really fast
    • and more…

  • I don’t know if you do this already, but in case you don’t: GOS offers a convenient way to separate apps from each other, since some may depend on Google services and others don’t. The approach which works best for me is to use the main profile for all my personal apps which do not require Google services. For the few apps that do, I create a dedicated profile, in which I install Google services and the apps which need them. You can pipe notifications from that profile to the main one in the profile settings, so that you get the notifications of those apps even when not in the dedicated profile.

    Additionally, if you do not need those notifications, you can prevent the profile from running in the background when not in use.

    As others have said, do not touch services and apps you do not recognize. Be very careful, as things could go south. Do not touch things unless you know what you are doing.




  • Memento is a movie about a guy who tries to find his wife’s murderer but has a condition where he only remembers the last few minutes, so he works with post-its, photos and tattoos to piece things together. Great movie!

    Predestination is about a time-traveling cop trying to prevent a terrorist attack.

    I’m leaving out the best, thought-provoking part, but I think you will find it and appreciate it when you watch both movies.