To continue a discussion:
- How does your engineering team track new "debt" after releasing code? (if at all, and why not)
- Do you pay anyone for centralized logging, or wish you didn't? Are you making it useful?
- Do you feel like your company is good at managing access when hiring / firing people?
Otherwise, thanks for any feedback; I enjoy writing these!

- Technical debt of custom-coded solutions is a known issue across our organisation. The new strategy is to move to market solutions, thereby outsourcing the risk to organisations with (hopefully) better code management than we have. For my corner, we don't have technical debt measured accurately enough for my liking.
- Yes, we pay for and use centralised logging. We've actually been through two solutions, and are now moving to a third due to various factors (cost, integrations, speed, out-of-the-box metrics). Integration into the centralised logging system is part of our Request for Tender marking criteria.
- Relatively good at disabling access after someone leaves. We integrate as much as possible into a central repository. It's just the outliers that tend to outlast someone's time in the organisation. Critical systems are absolutely shut down within 24 hours of a leaver departing (usually immediately if they're a bad leaver).
Currently, I mainly use a separate "secrets.yml" file that gets deployed via Ansible and is stored there encrypted using Ansible Vault with a strong password. Is that a reasonable approach? What is your opinion about storing secrets in environment variables? It seems that some people advise this over storing them in files, but I have seen some cases where environment variables can be exposed to the web client as well.
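For reference, the workflow described above looks roughly like this (the file path and variable names are my own illustrations, not from the post):

```yaml
# group_vars/all/secrets.yml -- illustrative names only.
# Encrypt at rest with:  ansible-vault encrypt group_vars/all/secrets.yml
# Deploy with:           ansible-playbook site.yml --ask-vault-pass
db_password: "example-only"
smtp_api_key: "example-only"
```

Ansible decrypts the file at playbook run time, so the plaintext never needs to live in the repository.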
The big win is simply keeping secrets out of source code, out of a general engineer's copy/paste buffer, and out of errors that flow to a logging platform with single-factor access. Your likelihood of a short-term incident decreases dramatically, especially if those secrets have well-segmented access (i.e., not a single AWS key with `AdministratorAccess` everywhere).
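As a sketch of what "well-segmented" can mean in IAM terms: instead of `AdministratorAccess`, a key might carry a policy scoped to a single bucket (the bucket name below is made up):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::example-app-uploads/*"
    }
  ]
}
```

If that key leaks, the blast radius is one bucket's objects rather than the whole account.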
- poorly, really.
- for network and security stuff, absolutely: Splunk is the bee's knees. For apps, each team tends to run its own mix (Graylog2/ELK/custom). I have pushed for more security-type events from apps into Splunk for correlation, but it just costs too damn much.
- depends on the region. I find US / UK do okay, but the more emerging/growth markets where we have employees, the worse it gets.
Do you mean "that _at_ least respect" instead?
I ask only because the two have different meanings.
Are there hosted installs of Elasticsearch/Logstash/Kibana? Is ELK even what I want?
Every time I start looking at centralized logging stuff it seems like a rabbit hole of problems we're too small to be worrying about, stuff that's not shipping features on my app.
CloudWatch works fine too, and it comes integrated with AWS services out of the box. It can be more annoying to get your logs into it than into ELK (the latter seems more popular overall). Its alerting and AWS CLI integration are pretty slick, though.
You should also go turn on CloudTrail right now. It lets you automatically log side-effectful API calls. It is not a replacement for a centralized logging pipeline, but it's great high-signal data to put into one.
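Turning CloudTrail on is a couple of CLI calls. This is a configuration sketch: the trail and bucket names are illustrative, and the bucket needs the usual CloudTrail bucket policy attached first.

```shell
# Create a trail that delivers API-call logs to S3, then start it:
aws cloudtrail create-trail --name org-trail --s3-bucket-name example-trail-logs
aws cloudtrail start-logging --name org-trail
```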
I appreciate that your complaint (totally valid!) was "this is a rabbit hole", and I just gave you two options, and that might not help your perception that it's a rabbit hole. If you find yourself paralyzed by choice, either choice is much better than deferring the choice! Just pick one. Heck, if you can't pick, let me help: pick AWS hosted Elasticsearch.
A lot of people (also in the security space) like Splunk. I find it annoying to deploy (I've heard rsyslog-in-front-of-forwarders described as the canonical deployment method for just ingesting syslog more than once, because reasons) and overpriced. YMMV.
Disclaimer: shameless plug! You're not the only one with your hair on fire. One of the first things we're doing for Latacora customers is setting up a centralized logging pipeline.
I think it's really important to internalize the idea that there is no Platonic ideal of a logging solution. It's a fundamentally frustrating manifestation of entropy that you're going to wrestle with, but it's a really necessary goal to work towards long term. Sort of a "the first step is admitting powerlessness" kind of deal.
The trick to Cloudwatch is --- like most AWS services --- never using the web UI.
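In that spirit, here are CLI equivalents for the two most common UI tasks (the log-group name is illustrative; `aws logs tail` needs AWS CLI v2):

```shell
# Tail a log group live:
aws logs tail /aws/lambda/example-fn --follow

# Search for errors in a log group:
aws logs filter-log-events \
  --log-group-name /aws/lambda/example-fn \
  --filter-pattern "ERROR"
```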
I've been using Loggly for my personal machines (~8, mostly cloud VPSes). On the plus side, it's free at my scale, and the analysis and reporting tools are nice at least in theory. On the minus side, I can't get my logs past 7 days archived to S3 without paying $150/month, which I really want since my main use-case is longer-term analysis and forensics.
I'm planning to switch to Papertrail, which for the princely sum of $7/mo will give me a simpler UI and a year's archiving to S3.
Loggly and Papertrail both use the same deployment strategy (you hook them up to syslog and/or your app's logging package), and I had Loggly up and running and providing useful feedback in solidly under four hours.
The killer feature for me is searching structured (JSON) logs. Just use the Logstash/Graylog library in the language of your choice and send the logs to Loggly, and you quickly have a logging system where you can zoom in on the logs coming from different subsystems of your codebase or produced by a specific user.
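If you're not using one of those libraries, emitting one JSON object per log line is small enough to sketch with the Python standard library (the field names here are my own choice, not a Loggly requirement):

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge structured context attached to the record, if any.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)


def make_logger(name):
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Anything attached as `record.context` (e.g. a user ID or subsystem name) becomes a top-level searchable field downstream.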
Disclaimer: I work at Sumo Logic. I would recommend https://www.sumologic.com. On top of grep-like searches, you can do analytical searches (SQL on text data).
* Sentry: https://sentry.io/welcome/
* Logentries: https://logentries.com/
* Loggly: https://www.loggly.com/
* Opbeat: https://opbeat.com/
* Papertrail: https://papertrailapp.com/
Sentry is open source and there is even an official up-to-date docker image: https://hub.docker.com/_/sentry/
Loggly published an "Ultimate Guide to Logging": https://www.loggly.com/ultimate-guide/
We also have streaming log parsers to connect your data. That whole thing about "creating new alerts in minutes" is trivial in our platform, since everything is based on SQL.
Unlike Splunk or ELK, our solution is based on in-memory streams so you don't have to wait for data to be indexed to fire off alerts on anomalous activity. Feel free to message me to find out more or simply download the product from http://www.striim.com/
See https://logentries.com/ for an example
> The discovery of a root cause is an important milestone that dictates the emotional environment an incident will take place in, and whether it becomes unhealthy or not.
> A grey cloud will hover over a team until a guiding root cause is discovered. This can make people bad to one another. I work very hard to avoid this toxicity with teams. I remember close calls when massive blame, panic, and resignations felt like they were just one tough conversation away.
[1] https://medium.com/starting-up-security/red-teams-6faa8d95f6...
One piece of advice that I'd give out with such cases is to listen to your Spidey Sense. A lot of organizations will say, after the fact, "well... something didn't seem right with Bob...". If you sense something isn't right, prepare to secure evidence and analyze it. Don't put IT assets back into circulation if there's doubt, and don't sit on it.
- Yes, centralized logging is the biggest thing. What you put into it matters; queryability matters; but nothing matters as much as having that centralized logging pipeline to begin with. Once you have that, you can start adding other relevant metadata, like host config states, API calls, et cetera.
- Giving employees a budget to buy the device they want is probably a better idea than BYOD. Strong password policies still matter. If it's BYOD, you probably still want to bring the device into policy. That can include physical rules (only do work on the VPN or from the office) and software ones (you can use any device you want, but it has to be running our osqueryd or whatever). Unfortunately, visibility becomes a double-edged sword: there are good legal and ethical reasons for not wanting to see everything on an employee's laptop. (Overall, I think BYOD is a bad idea for most companies.)
- 2FA is pretty cool. It doesn't just solve the usual "bad/compromised password" model -- it also typically makes it a lot harder for employees to mismanage their credentials (e.g. re-use the same SSH keys and have their personal box be compromised). For some reason, having that around seems to remind developers that you can make users re-authenticate for important/unusual actions -- you don't just have to count on the ambient authority of a session cookie.
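For a sense of how little magic sits behind the most common second factor: here is a minimal TOTP sketch per RFC 6238 (HMAC-SHA1 over a 30-second time counter). This is for illustration; a real deployment should use a maintained library.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, at=None, step=30, digits=6):
    """Compute an RFC 6238 TOTP code from a base32-encoded shared secret."""
    key = base64.b32decode(secret_b32, casefold=True)
    # The moving factor is the number of `step`-second intervals since the epoch.
    counter = int((at if at is not None else time.time()) // step)
    msg = struct.pack(">Q", counter)
    digest = hmac.new(key, msg, hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226, section 5.3).
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % (10 ** digits)).zfill(digits)
```

Both sides share only the secret; the server recomputes the code for the current window and compares.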
- We'd all like to imagine that we're going to be attacked by space alien 0day ninjas. Realistically, the main vector is an employee (rogue or confused deputy). Trainings are boring and don't work. Signature-based detection gets outdated pretty quickly. I've done a little work on faster analysis tools -- I'm hoping we get a lot better at unobtrusively protecting people from even spearphishing in the next few years. (The tools we're building at Latacora are ready to beat a lot of attacker tactics right now, but I think we have an arms race ahead of us. Boring domain generation algorithms still aren't detected by most organizations, so there's not a lot of evolutionary pressure.)
- I have no idea if we'll get better at quantifying metrics for debt and security risk. I did a little bit of research into this, and it's a wide open field. You can get decent high-level reports with a "DEFCON number", but most of these models are not sophisticated in the sense you'd expect actuarial tables to be. And that's what they should be! It's revenue-at-risk! Step one here is fortunately getting all of that data into that centralized logging pipeline, and security professionals seem to mostly agree that's what you do first, so hopefully we get better here.
> This can either mean one of a few things: These environments don’t exist at all, there aren’t many of them, or they don’t see incidents that would warrant involving IR folks like myself.
What are these secrets stores? Do they exist?
Sometimes, it's as simple as a shared password store (I've used one powered by GPG, for example). This is better than YOLO password policy, but not by much: humans still see individual keys.
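A GPG-powered shared store like the one described is often just `pass` (password-store). Roughly, assuming illustrative key IDs and entry names:

```shell
# Initialise a store encrypted to the team's GPG keys:
pass init alice@example.com bob@example.com

# Add an entry, then retrieve it (decrypts with your GPG key):
pass insert services/db-admin
pass show services/db-admin
```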
If you want to be really fancy, you authenticate the human and then decide what they get to do, in a centralized fashion. This is often tricky to do, because you either don't have the funds to do that if you're small, or you have too many services to interact with if you're big. (Many organizations get pretty close -- I'm told that the DoD pretty much authenticates everything with smart cards, for example.)
Sometimes, it means a more automated system where software authenticates instead of a human, and it gets e.g. a certificate. Usually it's still the same certificate every time, though; so the main difference is just whether a human or a machine is authenticating.
Sometimes, it means an HSM (hardware security module). These are secure physical devices that perform cryptographic operations for you, so that the key stays on the device.
I fail to see how it is secured. (Though I can understand that it is less bad than a YOLO policy.)
> Many organizations get pretty close -- I'm told that the DoD pretty much authenticates everything with smart cards, for example.
I've been at a place with RSA SecurID (smart card and OTP) + an Active Directory account as SSO authentication for everything (use one or both for 2FA). It was nice and well done.
I thought banks seem to have solved a lot of that.