> Thanks for your comment and your experience. I agree that at a large scale it would be silly to receive individual emails for error messages.
At small to medium scale, having a mailing list for the dev team that gets emailed when issues come up can be quite handy. It can't be your whole process: someone still needs to take responsibility for actually fixing problems. And you might need to aggressively rate limit it when errors happen. But for the occasional email it can work quite well. It's much easier than building a dashboard.
E.g. "[ops] Monthly backup process FAILED", "[ops] Warning: prod4 at 95% RAM usage"
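
For what it's worth, the rate limiting part doesn't need much code. Here's a rough sketch of what I mean, not a production implementation: it sends at most one email per subject per interval and counts what it suppressed in between. The class name, the `send` callable, the `dev-team@example.com` address and the one-hour default are all made up for illustration.

```python
import time
from email.message import EmailMessage


class RateLimitedAlerter:
    """Send at most one alert email per subject within `min_interval` seconds.

    `send` is any callable that accepts an EmailMessage (hypothetical hook;
    in practice it might wrap smtplib.SMTP(...).send_message).
    """

    def __init__(self, send, list_address, min_interval=3600.0,
                 clock=time.monotonic):
        self._send = send
        self._to = list_address
        self._min_interval = min_interval
        self._clock = clock          # injectable so the limiter is testable
        self._last_sent = {}         # subject -> time of last email
        self._suppressed = {}        # subject -> alerts dropped since then

    def alert(self, subject, body="No further details."):
        """Return True if an email was sent, False if it was suppressed."""
        now = self._clock()
        last = self._last_sent.get(subject)
        if last is not None and now - last < self._min_interval:
            # Too soon after the previous email for this subject: count it.
            self._suppressed[subject] = self._suppressed.get(subject, 0) + 1
            return False

        skipped = self._suppressed.pop(subject, 0)
        if skipped:
            body += f"\n({skipped} similar alerts suppressed since the last email.)"

        msg = EmailMessage()
        msg["To"] = self._to
        msg["Subject"] = f"[ops] {subject}"
        msg.set_content(body)
        self._send(msg)
        self._last_sent[subject] = now
        return True
```

You'd call something like `alerter.alert("Monthly backup process FAILED", details)` from the backup script. Keying on the subject line is a simplification I chose here; if your subjects contain varying values (like the RAM percentage), you'd probably want to key on an alert type instead. You'd also need to set a `From` header before handing the message to a real SMTP server.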