Another Redis case: Centralized logging

(sunilarora.org)

26 points · theonlyroot · 14y ago · 42 comments

Would like to know what logging strategy people have been following in a distributed environment.

andrewvc · 14y ago · 6 in thread

My only question: how is this better than syslog?

tptacek · 14y ago

It's queryable.

It's trivially capped and rolled.

It's centralized.

It provides a common logging interface across platforms.

It's extremely simple (the script that pipes syslog output directly into Redis is probably just a couple lines long).

I have to believe that the people who really seem to like syslog have never worked in organizations that had to deploy things like Splunk or (worse) LogLogic and ArcSight just to make sense of the giant morass of useless text gunk they generate.

Have you noticed how none of the cool kids postprocess http log files anymore?
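For a sense of scale, the "couple lines" script mentioned above might look roughly like the sketch below. It is an illustration, not the article's code: the function name `pipe_to_list` and the key `logs` are made up here, and `client` is assumed to be anything with an `rpush(key, value)` method, such as a redis-py `redis.Redis()` instance.

```python
import sys


def pipe_to_list(lines, client, key="logs"):
    """Append each non-empty input line to a Redis list; return how many were pushed.

    `client` is assumed to expose rpush(key, value), as redis-py's Redis does.
    `key` is a hypothetical list name chosen for this sketch.
    """
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if line:  # skip blank lines
            client.rpush(key, line)
            count += 1
    return count


# Hypothetical usage, assuming redis-py is installed and a server is running locally:
#   import redis
#   pipe_to_list(sys.stdin, redis.Redis())
# fed from something like: tail -F /var/log/syslog | python pipe_to_redis.py
```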

cdavid · 14y ago

I don't really understand your list in view of the article. How is this solution more queryable than syslog? They record events into Redis without any schema (just a string), so I fail to see the improvement there. They put it back into a log file anyway.

It is not less or more centralized than syslog configured with centralization (which is trivial to set up).

How is this more common than syslog across platforms (unless you include Windows in "across platforms")?

It is not simpler than syslog either, since writing to syslog is just a matter of using the right python logging backend.

Analysing tons of data from syslog is a pain, but I don't see how any solution avoids enforcing a format/structure on your logs at some point in the stack. How is this fundamentally different from post-processing HTTP logs?
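The "right python logging backend" point can be sketched with the standard library's `logging.handlers.SysLogHandler`. This is a minimal illustration: the helper name `make_syslog_logger`, the format string, and the default host and port are assumptions for the example, not anything taken from the article.

```python
import logging
import logging.handlers


def make_syslog_logger(name, host="127.0.0.1", port=514):
    """Return a logger that forwards records to a syslog daemon.

    Host and port are assumed values; SysLogHandler sends over UDP by default.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=(host, port))
    # Prefix the logger name so the daemon can filter on it.
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


# Hypothetical usage:
#   log = make_syslog_logger("app")
#   log.warning("disk almost full")
```

The handler prepends the syslog PRI value computed from facility and severity. Passing `socktype=socket.SOCK_STREAM` to `SysLogHandler` switches it to TCP, provided the daemon is configured to accept TCP.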

moe · 14y ago

I think you're missing the forest for the trees.

What you really want to do (and what everyone does btw) is to push your logs to a central syslog-server and stream them into redis or whatever analytics solution from there.

theonlyroot (OP) · 14y ago

I agree, it is no different from syslog if all you want to do is dump all logs into one file. But one advantage of this approach in our case was that we wanted to show the last 1000 critical logs on a web interface, and with the logs stored in Redis, that was pretty easy to do. And as Redis was already part of our stack, it was very easy to hook it up for this specific task.
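The "last 1000 critical logs" view maps onto a capped Redis list: LPUSH the new entry, LTRIM to the cap, LRANGE to read. The sketch below is an assumption about how that could be done, not the article's implementation. The key name `critical_logs` and both function names are made up, and `InMemoryList` is a small stand-in so the example runs without a server; a redis-py client exposes `lpush`, `ltrim` and `lrange` with the same argument order.

```python
class InMemoryList:
    """Stand-in for a Redis client (illustrative only).

    Mimics LPUSH, LTRIM and LRANGE for non-negative indices.
    """

    def __init__(self):
        self.data = {}

    def lpush(self, key, *values):
        items = self.data.setdefault(key, [])
        for value in values:
            items.insert(0, value)  # newest entry goes to the head
        return len(items)

    def ltrim(self, key, start, end):
        # Redis ranges are inclusive at both ends.
        self.data[key] = self.data.get(key, [])[start:end + 1]

    def lrange(self, key, start, end):
        return self.data.get(key, [])[start:end + 1]


def record_critical(client, message, key="critical_logs", cap=1000):
    """Push a message and trim the list so it never exceeds `cap` entries."""
    client.lpush(key, message)
    client.ltrim(key, 0, cap - 1)


def latest_critical(client, key="critical_logs", count=1000):
    """Return up to `count` messages, newest first."""
    return client.lrange(key, 0, count - 1)
```

With a real redis-py client the returned values are bytes unless the client is created with `decode_responses=True`.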

trentonstrong · 14y ago

Syslog implementations like syslog-ng support both TCP and UDP relaying of all log data on a machine to a centralized Syslog server, and can even bypass storing those logs to the source machine's disk at all. Syslog-ng also supports inserting that data directly into MySQL, and there are various other backends (like Splunk, though I know it's commercial) that can accept the TCP and UDP streams and index them in all sorts of fancy ways.

I think the key point here is that all the above mentioned implementations have significant adoption and are in a sense "battle-tested". For example, what if your background worker has failed and log events are piling up in the Redis list you are using as queue? Do you have monitors in place to detect that situation, and at what value do your alarms go off? Projects like this have a way of taking a lot more time than originally thought, often at the expense of your core development time. I personally don't like spending the time writing and maintaining code for a project that isn't aligned with the problem I'm trying to solve, so I avoid it whenever possible.

On the flip side, if you are setting out to build a really robust logging system on top of Redis, and that's something of value to your organization, then more power to you!
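The monitoring question raised above (events piling up when the worker dies) can be approximated by polling the list length with LLEN. This is a rough sketch only: the key name `log_queue`, the function name, and both thresholds are placeholder assumptions that would need tuning against real traffic, and `client` is assumed to expose `llen(key)` as redis-py does.

```python
def backlog_status(client, key="log_queue", warn_at=10_000, crit_at=100_000):
    """Classify the queue depth of a Redis list used as a log queue.

    Thresholds are arbitrary placeholders, not recommendations.
    Returns a (status, depth) tuple where status is "ok", "warning" or "critical".
    """
    depth = client.llen(key)
    if depth >= crit_at:
        return "critical", depth
    if depth >= warn_at:
        return "warning", depth
    return "ok", depth


# Hypothetical usage from a cron job or health check:
#   import redis
#   status, depth = backlog_status(redis.Redis())
#   if status != "ok":
#       print(f"log queue backlog {status}: {depth} entries")
```

A depth check like this only catches a stalled consumer; it says nothing about a producer that has silently stopped logging.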

tedjdziuba · 14y ago

Ahem.

tail -n 1000

tedjdziuba · 14y ago · 5 in thread

Holy balls, talk about going out of your way to avoid syslog.

tptacek · 14y ago

Syslog is a pile of shit, Ted. It's a relic. You clearly happen to love that relic, and I think you should find a way to place it just-so in a nicely lit alcove in your apartment. The rest of us should move on from it. I don't think less of you for admiring it. I have useless old things on display in my house too.

* Freeform text is a terrible way to track system events.

* Periodically rotated flat files are not a great way to store log information.

* Goofy little UDP messages are not a good way to convey system events.

* The syslog PRI field dates back to when we exchanged messages with UUCP.

I could keep going, but since you're just going to reply with "lolwut umad?", I'll leave it at that.

moe · 14y ago

Yes, syslog is a pile of shit. It's a relic.

And I'll add it tends to ship in a horrible default configuration with events scattered randomly over multiple files, no safe-guards against filling up the disk and no safeguards to ensure the stupid daemon is actually running.

However...

> Freeform text is a terrible way to track system events.

Nothing stops you from logging structured text.

> Periodically rotated flat files are not a great way to store log information.

Modern syslog daemons will write to pretty much anything you want.

> Goofy little UDP messages are not a good way to convey system events

Modern syslog daemons offer tcp transport. Some even try to offer some delivery guarantees (disk-backed spool), although personally I wouldn't rely on that for truly critical stuff.

> The syslog PRI field dates back to when we exchanged messages with UUCP.

Thanks, I always wondered where those were from...

And, well, you forgot a couple bullets:

* syslog() is available everywhere, out of the box

* It's trivial to move from file-based logging to syslog

* We have mature syslog-daemons that dispatch events pretty reliably

* Unless you're facebook you probably don't need anything more fancy.

So, I'd say syslog gets the job done quite well, as long as you don't mistake it for a message queue.
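On "nothing stops you from logging structured text": one way to do that in Python is a formatter that emits one JSON object per record, which can then be sent through syslog or any other handler. The sketch below is only an illustration; the class name `JsonFormatter` and the convention of passing extra fields under a `fields` key are assumptions made for this example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        # Assumed convention: callers pass extra={"fields": {...}} for structured data.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


# Hypothetical usage:
#   handler = logging.StreamHandler()
#   handler.setFormatter(JsonFormatter())
#   log = logging.getLogger("app")
#   log.addHandler(handler)
#   log.warning("disk usage high", extra={"fields": {"mount": "/var", "pct": 91}})
```

Whatever consumes the stream can then parse each line as JSON instead of applying regular expressions to freeform text.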

Andys · 14y ago

There's syslog the protocol, syslog the API, and syslog the daemon. There are syslog daemon and protocol replacements that are much better, but retain the API for wide-spread compatibility.

zbailey · 14y ago

We use syslog where I work, and I've always felt the same way, but never heard any suggestions for better options with as wide adoption, support, and background as syslog.

Out of pure curiosity, what do you see as the tool most likely to displace syslog in the future? Is there any alternative available that fixes most of these problems without rolling your own from pieces and parts?

m0nastic · 14y ago

The only time in my life I've ever been envious of syslog was when I had to build an aggregator/event correlator for a bunch of telco equipment that only talked TL1.

Pretty much any other time I've had to work with it, I've wished for something better, so I, for one, am very happy at the thought of people starting to "go out of their way to avoid syslog".

mkelly · 14y ago · 3 in thread

Depends on how much you care about latency, right?

The easy solution is just, y'know, write the log to a file and scp it back to some central place every so often. But then you have to either (a) keep track of how much of a file you've copied, which is a pain; or (b) only grab files that you're no longer actively writing to (as determined by naming scheme or something), but that introduces some latency, depending on how often you rotate.
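Option (a), tracking how much of the file has already been copied, could be sketched as follows. It stores a byte offset in a side file and returns only what was appended since the last call. The function name and the offset-file scheme are assumptions for illustration, and shipping the returned chunk (scp or otherwise) is left out.

```python
import os


def read_new_bytes(log_path, offset_path):
    """Return bytes appended to log_path since the previous call.

    The last-read position is persisted in offset_path (a hypothetical side file).
    """
    try:
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)
    except FileNotFoundError:
        offset = 0  # first run: start from the beginning

    # If the file shrank, assume it was rotated or truncated and start over.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as f:
        f.seek(offset)
        chunk = f.read()

    with open(offset_path, "w") as f:
        f.write(str(offset + len(chunk)))
    return chunk
```

This illustrates why the bookkeeping is "a pain": a size comparison alone misses a rotation where the new file has already grown past the old offset, and a chunk may end in the middle of a line.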

StavrosK · 14y ago

I prefer sending the logs to some other computer over UDP.

tptacek · 14y ago

Why do you want to send logs over UDP?

I get that you were making a snarky allusion to syslog, but what part of the syslog UDP protocol do you feel beats Redis' TCP protocol?

rarrrrrr · 14y ago

This only works if you can tolerate loss of some log items.

ajays · 14y ago

The article was fairly devoid of the necessary details. One of the main ones: how many events are we talking about? 10 events/sec? 100/sec? 10,000/sec? And what is the size of these events? How many event emitters are connecting to the Redis server?

With the details, it would be a much more interesting post.

antoncohen · 14y ago

Take a look at logstash [1], it's on GitHub [2]. I think it could replace or integrate with RedisLogHandler.

logstash takes logs from various inputs (syslog, files, Redis, HTTP), filters/normalizes the formats into JSON, and outputs to various destinations (ElasticSearch, Redis, MongoDB, Graylog2). There is a WebUI with search and graphs. It's designed to scale out and run on multiple machines.

[1] http://logstash.net/

[2] https://github.com/logstash/logstash

aedocw · 14y ago

A similar concept, though using MongoDB with a capped collection, is http://graylog2.org.

rbucker · 14y ago

I'm working on a universal subscriber; for now, it does a nice job with Redis: http://sub-watcher.com. It forwards messages back to Redis and to syslog, and it has several filtering options.
