Someone posted a blog here a little while ago. I wrote up a big response only to find that OP deleted the post. I figured I might as well post my response here since it took me 45m to type out on my phone 🫠
What an interesting list. Some of these suggestions are good while others are not. I think we can reorder things a bit and make this more reasonable.
Jenkins
Jenkins is terrible! It should have been killed off a decade ago. Seriously, just don’t use Jenkins. There are much better offerings now.
Source control and CI/CD
The current trend is to rely on your source control provider for CI/CD. You may or may not have a choice in this space, so let’s name some big ones: GitHub, GitLab, Azure DevOps (ADO), Bitbucket, Gitea/Forgejo. They all act as a git server and all offer automation. Learn whichever your company uses. If you get to choose… GitHub is great! GitLab is also good but the automations will be focused on bash and tend to get messy IMO. ADO is truly a Microsoft product with many nonsensical choices. I find it frustrating to use. I haven’t done CI/CD with Bitbucket. If you want a FOSS option, check out Forgejo (a fork of Gitea). I have not used either yet, though it looks nice and I really want to.
Containers
Docker is a fine choice. I really like some alternative tools like buildah, podman, etc. but nearly every piece of documentation out there is based on docker. The choice is yours here but docker will probably give you the simplest experience.
Kubernetes is an amazing runtime environment! IMO it should be used as a standard interface for running resources in a public cloud. However, this is a huge jump and you’ll want to learn at least a dozen good tools here. This one is a multi-year practice but absolutely worthwhile. A quick and very incomplete list of tools: k9s, k3d, helm, kustomize (better than helm in most cases), flux, Argo CD (better than flux), istio. Seriously these are just the basics.
Infrastructure Management
While ansible is good, I would be looking to retire it at this point. A big possible exception is if you are running your own hardware and don’t have a great interface for alternative tools. If somebody just gives you a VM to use, then ya use ansible.
Terraform is great but don’t use it. OpenTofu is a foss fork and people should honestly just use this instead. But both tools have some limitations and oddities. People seem to love using terragrunt as well to make this easier to use.
If you’re using k8s, there’s also the OpenTofu controller. I haven’t personally used it, but people I 100% trust in this space absolutely love it.
Observability
Firstly I like the numeronym instead: o11y.
Don’t use nagios. It’s old and there are better alternatives.
Elasticsearch is ok but I don’t really like it. Everything is stored as a document and just… eh, there are better options.
Prometheus is quite good.
Here’s the biggest mistake that people make today: not using OpenTelemetry as the core of their o11y solution. It’s the 2nd biggest CNCF project (right behind k8s) and it’s a fantastic tool. It lets you collect telemetry data and build data pipelines to whatever storage backends you want. That includes Prometheus and Elasticsearch, and you can switch to many other options with only tiny configuration changes.
ChatGPT
This entire post looks 100% like a copy/paste from ChatGPT. AI is a cool tool but OP, you should learn to use it a little better. Tell it to not use so much fluff text or such a rigid structure. Make edits afterwards. And most important of all, make sure it’s actually providing good info.
I’m curious what you like more about GitHub Actions over GitLab’s CI/CD. I’m a bit new to the space, but the GitHub Actions approach seems to define pre-built steps using docker containers? Seems like something that would work well until it doesn’t, but is the element you like that you can more easily define repeatable deployment steps?
personally i have a very big preference for the “actions” model: actions are “real” code, not bash - so there’s actual variable passing and communication with the CI tool (eg an action can tell the CI system to set a secret value, and the CI tool then hides that value from console output in the future). you can do these things with bash as well, but bash isn’t really a great language - i’d trust node far more to, for example, generate a hash or a JWT etc, and actions then have access to the entirety of npm for libraries
these kinds of things (not only this) make it far superior to a bash-based system IMO
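to make the "action tells the CI system to hide a value" idea concrete, here's a minimal toy sketch. the ::add-mask:: prefix is the real workflow command that github's runner recognises on stdout, but ToyRunner, handleLine and runStep are hypothetical names, the token is made up, and the masking logic is a simplification and not how the actual runner is implemented:

```typescript
// Toy model of a CI runner's log handling. NOT GitHub's implementation;
// ToyRunner, handleLine and runStep are hypothetical names for illustration.
class ToyRunner {
  private secrets: string[] = [];
  readonly log: string[] = [];

  // Called for every line a step writes to stdout.
  handleLine(line: string): void {
    const prefix = "::add-mask::";
    if (line.indexOf(prefix) === 0) {
      // The step asked the runner to treat this value as a secret.
      const secret = line.slice(prefix.length);
      if (secret.length > 0) {
        this.secrets.push(secret);
      }
      return; // the command itself is never echoed to the log
    }
    // Replace every registered secret before the line reaches the log.
    let safe = line;
    for (const secret of this.secrets) {
      safe = safe.split(secret).join("***");
    }
    this.log.push(safe);
  }
}

// A step that creates a token at runtime and registers it as a secret.
function runStep(emit: (line: string) => void): void {
  const token = "tok_12345"; // stand-in for a freshly generated JWT or hash
  emit("::add-mask::" + token);
  emit("requesting deploy with " + token);
}

const runner = new ToyRunner();
runStep((line) => runner.handleLine(line));
// runner.log now holds a single entry: "requesting deploy with ***"
```

the point is only the direction of the communication: the step decides at runtime what counts as a secret, and the runner applies it to everything logged afterwards.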
Looking at the custom actions, it seems like it’s still all bash behind the scenes? Or am I missing something?
The part I’m getting lost on is how does node tell a system what dependencies to install. Or is the part you like that actions abstracts those steps?
the various github-supplied actions are a good example:
https://github.com/actions/setup-python
the action.yml here is the metadata for an action. you invoke this action by simply referencing actions/setup-python (perhaps with a sha or tag or something to pin it) and the runner clones and runs it. that makes custom actions simply code with no build process etc necessary, which is very nice: you don’t have to bootstrap your build process with a build process (eg docker build)
see the “runs” section there - it invokes node20 to run the action, and specifies some code to run to cleanup
in the src here we can see what i was mentioning as well about having bi-directional comms with the CI system
https://github.com/actions/setup-python/blob/main/src/setup-python.ts
lines 54 & 55 we have core.getInput and core.getMultilineInput, 59 we have core.warning (so logs are formatted and filterable etc - these messages also show up in the build summary; similarly line 100 we have core.startGroup), 154 is core.setFailed so you get proper failure reasons rather than “the last line of stderr”
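for a feel of what the input half of that looks like without the library, here's a simplified sketch. it assumes the documented convention that the runner exposes each input as an environment variable named INPUT_ plus the upper-cased input name; readInput, readMultilineInput and the Env type are hypothetical names, and this is not the @actions/core source:

```typescript
// Simplified sketch of input handling; hypothetical helpers, not @actions/core.
// The environment is passed in explicitly so the functions stay pure.
type Env = Record<string, string | undefined>;

// Inputs arrive as environment variables: "python-version" -> INPUT_PYTHON-VERSION.
// Spaces in the input name become underscores.
function readInput(env: Env, name: string, required = false): string {
  const key = "INPUT_" + name.replace(/ /g, "_").toUpperCase();
  const value = (env[key] || "").trim();
  if (required && value === "") {
    throw new Error("Input required and not supplied: " + name);
  }
  return value;
}

// A multi-line input is one string; split it into non-empty, trimmed lines.
function readMultilineInput(env: Env, name: string): string[] {
  return readInput(env, name)
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "");
}
```

in a real action you'd pass process.env (or just use the core library), but the shape is the same: named, typed-ish inputs with a clear error when a required one is missing.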
now, none of this is specific to running in a real language - in fact i believe all this information is communicated over stdout or stderr - but the ease of simply creating a repo with code in it, and having that as a reusable CI step without worrying about docker container hosting, and having that able to have semantic meaning for its inputs and outputs (and proper reliable escaping of the special CI communication prefixes) is something that kinda doesn’t exist outside of this style of system
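the escaping part is worth a small sketch, since it's exactly what tends to go wrong in hand-rolled bash. this assumes the ::name key=value::message line format and the percent-style escapes that github's workflow commands use; formatCommand, escapeData and escapeProperty here are illustrative helpers written for this comment, not the toolkit's exported API:

```typescript
// Sketch of formatting a workflow command line for stdout.
// Illustrative helpers only; the real logic lives inside the actions toolkit.
type Props = Record<string, string>;

// Message text: escape % first, then line breaks, so the command stays on one line.
function escapeData(s: string): string {
  return s.replace(/%/g, "%25").replace(/\r/g, "%0D").replace(/\n/g, "%0A");
}

// Property values additionally escape the separators used in the header.
function escapeProperty(s: string): string {
  return escapeData(s).replace(/:/g, "%3A").replace(/,/g, "%2C");
}

function formatCommand(name: string, props: Props, message: string): string {
  const pairs = Object.keys(props).map(
    (key) => key + "=" + escapeProperty(props[key] ?? "")
  );
  const head = pairs.length > 0 ? name + " " + pairs.join(",") : name;
  return "::" + head + "::" + escapeData(message);
}

// A message containing a newline cannot smuggle in a second command:
const line = formatCommand("warning", {}, "oops\n::add-mask::x");
// line is "::warning::oops%0A::add-mask::x" (a single line of output)
```

doing this by hand with echo in bash is possible, but every script has to remember the escaping; with a library it happens once, in one place.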