It's trivially capped and rolled.
It's centralized.
It provides a common logging interface across platforms.
It's extremely simple (the script that pipes syslog output directly into Redis is probably just a couple lines long).
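To give a sense of scale for that last point, here is a rough sketch of what such a script might look like. The function name `pipe_to_redis` and the list key `syslog` are made up for illustration, and `client` is assumed to be anything with a redis-py style `rpush(key, value)` method.

```python
def pipe_to_redis(lines, client, key="syslog"):
    """Push each non-empty line onto a Redis list; return how many were pushed."""
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if line:
            client.rpush(key, line)  # RPUSH appends to the tail of the list
            count += 1
    return count

# Hypothetical usage, assuming the third-party redis-py package is installed
# and syslog output is piped into this script on stdin:
#
#   import sys, redis
#   pipe_to_redis(sys.stdin, redis.Redis())
```

Note this has no retry or buffering if Redis is unreachable, which is the part that tends to grow beyond a couple of lines.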
I have to believe that the people who really seem to like syslog have never worked in organizations that had to deploy things like Splunk or (worse) LogLogic and ArcSight just to make sense of the giant morass of useless text gunk they generate.
Have you noticed how none of the cool kids postprocess http log files anymore?
It is not less or more centralized than syslog configured with centralization (which is trivial to set up).
How is this more common than syslog across platforms (unless you include Windows in "across platforms")?
It is not simpler than syslog either, since writing to syslog is just a matter of using the right Python logging backend.
Analysing tons of data from syslog is a pain, but I don't see how any solution can avoid enforcing a format/structure on your logs at some point in the stack. How is this fundamentally different from post-processing HTTP logs?
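For reference, a minimal sketch of that backend using the standard library's `logging.handlers.SysLogHandler`. The helper name `make_syslog_logger` and the format string are my own choices, not anything mandated by syslog.

```python
import logging
import logging.handlers

def make_syslog_logger(name, address=("localhost", 514)):
    """Return a logger that sends records to a syslog daemon over UDP.

    On many Linux systems the local daemon listens on the "/dev/log"
    socket instead; pass address="/dev/log" in that case.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=address)
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger

# Hypothetical usage:
#   log = make_syslog_logger("myapp")
#   log.info("worker started")
```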
What you really want to do (and what everyone does btw) is to push your logs to a central syslog-server and stream them into redis or whatever analytics solution from there.
I think the key point here is that all the above mentioned implementations have significant adoption and are in a sense "battle-tested". For example, what if your background worker has failed and log events are piling up in the Redis list you are using as a queue? Do you have monitors in place to detect that situation, and at what value do your alarms go off? Projects like this have a way of taking a lot more time than originally thought, often at the expense of your core development time. I personally don't like spending the time writing and maintaining code for a project that isn't aligned with the problem I'm trying to solve, so I avoid it whenever possible.
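To make the monitoring question concrete, here is a sketch of the kind of check you would end up writing and maintaining. The function name and both thresholds are hypothetical, and `client` is assumed to expose a redis-py style `llen(key)` method.

```python
def check_queue_depth(client, key, warn_at=1000, crit_at=10000):
    """Classify the backlog in a Redis list used as a log queue.

    The thresholds are placeholders; sensible values depend on your
    event rate and on how much memory you can afford to lose to backlog.
    """
    depth = client.llen(key)  # LLEN returns the current list length
    if depth >= crit_at:
        status = "critical"
    elif depth >= warn_at:
        status = "warning"
    else:
        status = "ok"
    return status, depth
```

Something still has to run this on a schedule and page someone, which is exactly the sort of work that eats into core development time.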
On the flip side, if you are setting out to build a really robust logging system on top of Redis, and that's something of value to your organization, then more power to you!
tail -n 1000
* Freeform text is a terrible way to track system events.
* Periodically rotated flat files are not a great way to store log information.
* Goofy little UDP messages are not a good way to convey system events
* The syslog PRI field dates back to when we exchanged messages with UUCP.
I could keep going, but since you're just going to reply with "lolwut umad?", I'll leave it at that.
And I'll add that it tends to ship in a horrible default configuration, with events scattered randomly over multiple files, no safeguards against filling up the disk and no safeguards to ensure the stupid daemon is actually running.
However...
Freeform text is a terrible way to track system events.
Nothing stops you from logging structured text.
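As one possible illustration, assuming JSON as the structure, a formatter like the following emits one JSON object per log line. The class name `JsonFormatter` and the `fields` attribute are conventions invented for this sketch, not part of the standard library.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        # Extra key/value pairs passed as extra={"fields": {...}}
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)

# Hypothetical usage with any handler, including a syslog one:
#   handler.setFormatter(JsonFormatter())
#   log.info("login ok", extra={"fields": {"user": "alice"}})
```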
Periodically rotated flat files are not a great way to store log information.
Modern syslog daemons will write to pretty much anything you want.
Goofy little UDP messages are not a good way to convey system events
Modern syslog daemons offer tcp transport. Some even try to offer some delivery guarantees (disk-backed spool), although personally I wouldn't rely on that for truly critical stuff.
The syslog PRI field dates back to when we exchanged messages with UUCP.
Thanks, I always wondered where those were from...
And, well, you forgot a couple bullets:
* syslog() is available everywhere, out of the box
* It's trivial to move from file-based logging to syslog
* We have mature syslog-daemons that dispatch events pretty reliably
* Unless you're Facebook you probably don't need anything more fancy.
So, I'd say syslog gets the job done quite well, as long as you don't mistake it for a message queue.
Out of pure curiosity, what do you see as the tool most likely to displace syslog in the future? Is there any alternative available that fixes most of these problems without rolling your own from pieces and parts?
Pretty much any other time I've had to work with it, I've wished for something better, so I, for one, am very happy at the thought of people starting to "go out of their way to avoid syslog".
The easy solution is just, y'know, write the log to a file and scp it back to some central place every so often. But then you have to either (a) keep track of how much of a file you've copied, which is a pain; or (b) only grab files that you're no longer actively writing to (as determined by naming scheme or something), but that introduces some latency, depending on how often you rotate.
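For what it's worth, option (a) can be sketched roughly like this: keep a byte offset in a small side file and only ship what was appended since. All three function names and the side-file convention are made up for illustration, and the actual copy step (scp or otherwise) is left out.

```python
import os

def load_offset(offset_path):
    """Return the last saved byte offset, or 0 if none was saved yet."""
    try:
        with open(offset_path) as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0

def read_since(log_path, offset):
    """Return (new_bytes, new_offset) for data appended after `offset`."""
    if os.path.getsize(log_path) < offset:
        offset = 0  # file shrank, so assume it was rotated or truncated
    with open(log_path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    return chunk, offset + len(chunk)

def save_offset(offset_path, offset):
    """Persist the offset; call this only after the copy has succeeded."""
    with open(offset_path, "w") as f:
        f.write(str(offset))
```

Even this small version has to guess at rotation, and it can still miss data written to the old file just before it was rotated, which is the "pain" part.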
With the details, it would be a much more interesting post.
logstash takes logs from various inputs (syslog, files, Redis, HTTP), filters/normalizes the formats into JSON, and outputs to various destinations (ElasticSearch, Redis, MongoDB, Graylog2). There is a WebUI with search and graphs. It's designed to scale out and run on multiple machines.
[1] http://logstash.net/ [2] https://github.com/logstash/logstash