Is there any particular reason why Oracle DBAs are less likely to believe this? Perhaps it's because most of them grew up in legacy UNIX environments rather than Linux.
The second is explaining virtual vs. resident set size.
EDIT: Thank you for all the helpful responses!
http://webcache.googleusercontent.com/search?q=cache:http://...
It represents a fundamental misunderstanding of how modern OSes work. That misunderstanding is not the problem; modern OSes are complex pieces of software, and most people shouldn't have to understand them. OSes should just work. The problem comes in when people who don't understand how they work get the itch to "improve" their system.
It would be interesting if someone more knowledgeable than I am were to do a write-up explaining memory usage in OS X, Windows, and Linux; it would be an awesome resource to share with curious tinkerers who may be slightly misguided in their understanding of the inner workings of their computers.
… [periodically] 'installd' begins using 100% of all 4 CPU cores, my fan goes full speed, and the whole computer gets very hot. It seems to happen before Software Update checks for updates. I usually go to the terminal and kill the 'installd' process, which brings the fan speed and heat back to normal within a minute.
I wonder if this guy is also one that says you need to reinstall from scratch every few months to keep things working?
Not only that, but in some cases it is flat-out wrong:
No, disk caching only borrows the ram that applications don't currently want. It will not use swap. If applications want more memory, they just take it back from the disk cache. They will not start swapping.
Try again. You can tune this to some extent with /proc/sys/vm/swappiness, but Linux is loath to abandon buffer cache and will often choose to swap out old pages instead.
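For reference, that knob can be inspected and tuned like this (the value 10 below is just an illustrative choice; the usual default is 60, and higher values make the kernel more willing to swap application pages to preserve page cache):

```shell
# Inspect the current swappiness setting (0-100 on older kernels,
# 0-200 since 5.8; higher = more willing to swap out anonymous pages).
cat /proc/sys/vm/swappiness

# To lower it for the current boot (requires root; example value):
#   sysctl vm.swappiness=10
# To persist it across reboots, add this line to /etc/sysctl.conf:
#   vm.swappiness = 10
```

Note this only biases the reclaim decision; as the parent comment says, it won't fully stop the kernel from preferring cache over old anonymous pages.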
I have learned this the hard way. For example, on a database machine (where > 80% of the memory is allocated to the DB's buffer pool) try to take a consistent filesystem snapshot of the db's data directory and then rsync it to another machine. The rsync process will read a ton of data, and Linux will dutifully (and needlessly) try to jam this into the already full buffer cache. Instead of ejecting the current contents of the buffer cache, Linux will madly start swapping out database pages trying to preserve buffer cache.
Some versions of rsync support direct I/O on reads to avoid this, but those builds aren't mainstream or readily available on Linux. You can also use iflag=direct with dd to get around the problem.
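A minimal sketch of the iflag=direct trick: O_DIRECT reads bypass the page cache entirely, so a one-shot bulk copy doesn't evict anything. Whether O_DIRECT works at all depends on the filesystem, and block sizes must respect its alignment rules, so the demo below probes support on a scratch file rather than assuming it:

```shell
# Demo: read a file with O_DIRECT so the data bypasses the page cache.
# (tmpfs, for example, does not support O_DIRECT, hence the fallback)
tmp=$(mktemp)
dd if=/dev/zero of="$tmp" bs=4096 count=256 2>/dev/null   # 1 MiB scratch file
dd if="$tmp" iflag=direct of=/dev/null bs=4096 2>/dev/null \
  && echo "direct read ok" \
  || echo "filesystem does not support O_DIRECT"
rm -f "$tmp"
```

In the snapshot scenario above you would point the first dd at the snapshot file and pipe it over the network (e.g. through ssh), keeping the database's buffer-pool pages resident while the copy streams past the cache.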
There are very good reasons that Linux (and most other modern operating systems) makes aggressive use of page caches and buffers. For the vast majority of applications dropping these caches is going to reduce performance considerably (disk is really really slow) and most applications for which this isn't true are probably using O_DIRECT anyway.
The arguments in favor of page caching are: (a) disks have very high latency, (b) disks have relatively low bandwidth, (c) for hot data, RAM is cheaper than disk IO in both dollars and watts [1], and (d) it's basically free, because the memory would otherwise sit unused.
The arguments against page caching are: (a) occasionally the kernel will make poor choices and do something sub-optimal and (b) high numbers in 'free' make me feel better.
Too many inexperienced operators (or those experienced on other OSes) mistake disadvantage (b) for disadvantage (a) and decide to drop caches with a cron job.
[1] Old but good: ftp://ftp.research.microsoft.com/pub/tr/tr-97-33.pdf
The cache dropping is actually useful when you are doing benchmarking...
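That legitimate benchmarking use looks something like this (requires root; it's the same knob that is pointless in a cron job, but here it guarantees each run starts from a cold cache):

```shell
# Start a disk benchmark from a cold cache (run as root).
sync                                # flush dirty pages first; they can't be dropped
echo 3 > /proc/sys/vm/drop_caches   # 1 = page cache, 2 = dentries/inodes, 3 = both
# ...now run the benchmark, e.g. time a cold read:
#   time cat /some/test/file > /dev/null
```

Dropping caches is non-destructive (clean pages are simply discarded and re-read on demand), which is exactly why it helps benchmarks and hurts steady-state performance.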
Linux just looks like it ate your RAM. Firefox straight up does eat it.
gnome-system-monitor has a top-like process monitor as well as graphs, and measures memory properly (including a discount for shared mappings); smem works in the console; it doesn't have a terminal interface like top, but it can be combined with watch.
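For the console route, a quick sketch (assuming the smem package is installed; smem reports proportional set size, so shared pages are split fairly across the processes that map them):

```shell
# Per-process PSS table with a totals row and human-readable units.
if command -v smem >/dev/null; then
  smem -t -k
else
  echo "smem not installed"
fi
# For a top-style live view, combine it with watch:
#   watch -n 5 'smem -t -k'
```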
Question for the crowd: the example on the site says that in reality 869MB of RAM are used. I'm comparing this with the values on my VPS and would like to know whether that figure is the sum of some column in top. It looks pretty close to the sum of the SHR column. Does that make sense? Thanks in advance.
And you can't just subtract the shared memory numbers, because different sets of pages are shared between different sets of processes, and top doesn't give enough information to figure out what's actually happening where.
Running the pmap tool on all PIDs and summing the Pss numbers is perhaps the closest you can get to the actual memory use.
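A rough sketch of that: the per-mapping Pss values live in /proc/&lt;pid&gt;/smaps, so summing them across all processes approximates total usage without the double counting you get from summing RSS (run as root to see every process; entries you can't read are silently skipped):

```shell
# Sum Pss across all processes (kB). Pss charges each shared page 1/N
# to each of the N processes mapping it, so the total isn't inflated
# the way a sum over the RSS or SHR columns of top would be.
cat /proc/[0-9]*/smaps 2>/dev/null |
  awk '/^Pss:/ { kb += $2 } END { printf "total Pss: %d kB\n", kb }'
```

The smem tool mentioned upthread automates essentially this summation, and `pmap -X <pid>` shows the same Pss column for a single process.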