The future is now! Instead of you sending money to Amazon and them sending you disks, they keep the disks and you send them the money anyway.
Progress! :-D
Amazon's even more expensive. Storing one petabyte on Amazon, at the slowest, cheapest Glacier tier, will cost you $10,000/month, so at about 10 months you've paid a hundred grand. At thirty-six months, you've paid three hundred sixty grand. Plus, you have no hardware to show for it, while Amazon sells that broken, outdated junk to a reseller to recoup some of their investment. Had you hosted it yourself, you could have sold all that equipment off to refurbishers and resellers and recovered at least a portion of your investment. So yeah, it is pretty terrible to choose Amazon.
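For anyone who wants to plug in their own numbers, here is a minimal sketch of that arithmetic. It assumes a flat monthly price (the $10,000/month figure quoted above) with no retrieval or transfer fees; the function name is just for illustration.

```python
# Figure quoted in the comment above: 1 PB at the cheapest tier, per month.
MONTHLY_COST_USD = 10_000


def cumulative_cost(months: int, monthly_cost: int = MONTHLY_COST_USD) -> int:
    """Total paid after `months`, assuming a flat monthly price and no other fees."""
    return months * monthly_cost


for months in (10, 36):
    print(f"{months} months: ${cumulative_cost(months):,}")
```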
This is something that has been puzzling me. Many years ago I purchased 4x 2TB 5900 RPM drives for a 4-bay ReadyNAS (cost about $300 for drives plus ReadyNAS). They have been spinning nonstop for ~4 years [1] and I haven't had to replace a single one. There hasn't even been an increase in errors to signal that a drive is going.
Yet I've worked on a SAN that cost hundreds of thousands of dollars, and we had to replace a disk about every month.
Granted, the disks in SANs probably spin faster (thus faster data access and lower MTBF), but that high failure rate seems rather suspicious to me.
In the Blekko cluster we have just under 10,000 drives. We have two 20-drive 'boxes' (40 drives) from Western Digital. As drives fail, we pull replacements from the 'new/refurbished' box and put the dead ones in the outgoing box. When we get up to 20, we RMA them in bulk: 20 go out, 20 more come in. That becomes the new 'new/refurbished' box.
It really isn't SAN vs. non-SAN; it is all statistics.
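To put rough numbers on "it is all statistics", here is a back-of-the-envelope sketch. The 3% annualized failure rate (AFR) is purely an assumption for illustration, not a measured figure for any of the drives mentioned here, and the model treats failures as independent with a constant rate.

```python
ASSUMED_AFR = 0.03  # hypothetical annualized failure rate; substitute your own data


def expected_failures_per_month(drive_count: int, afr: float = ASSUMED_AFR) -> float:
    """Average number of drive failures per month across a fleet."""
    return drive_count * afr / 12


def prob_no_failures(drive_count: int, years: float, afr: float = ASSUMED_AFR) -> float:
    """Chance that no drive fails over `years`, assuming independent failures."""
    return (1 - afr) ** (drive_count * years)


print("10,000-drive cluster, failures/month:", expected_failures_per_month(10_000))
print("4-drive NAS, P(no failure in 4 years):", round(prob_no_failures(4, 4), 3))
```

With the same per-drive failure rate, a four-drive box can easily go years without a swap while a large fleet replaces drives constantly.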
That said, if you're running your ReadyNAS with RAID 10 (mirrored drives in a RAID 0 config), you may find some unpleasantness when a drive does fail. Statistically, with 5900 RPM desktop SATA drives you have a 1/10 chance of not being able to re-silver the mirror. That gets a bit painful.
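One common way to arrive at a figure of that order is the unrecoverable read error (URE) rate: re-silvering a mirror means reading the entire surviving disk without a single bad read. The sketch below assumes a 2 TB drive (the size mentioned earlier in the thread) and a URE rate of 1 per 10^14 bits, a value often quoted on desktop SATA spec sheets. Both are assumptions, and this may not be how the 1/10 above was derived.

```python
import math

ASSUMED_URE_PER_BIT = 1e-14  # assumption: typical desktop SATA spec-sheet value


def prob_rebuild_hits_ure(disk_bytes: float, ure_per_bit: float = ASSUMED_URE_PER_BIT) -> float:
    """Probability of at least one unrecoverable read error while reading a whole disk.

    Computes 1 - (1 - p)^bits using log1p/expm1 to stay accurate for tiny p.
    """
    bits = disk_bytes * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))


two_tb = 2e12  # 2 TB in bytes (decimal terabytes)
print(f"P(URE during rebuild of a 2 TB mirror): {prob_rebuild_hits_ure(two_tb):.3f}")
```

Under these assumptions the result is the same order of magnitude as the 1/10 quoted above, and it grows with drive size.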
It's his right to make up ridiculous licenses. As with the sisterware license (you can use the software if you send me a pic of your sister, if you have one), people shouldn't take it seriously and should avoid code licensed like that. That Crockford doesn't get this means he is either trolling or clueless.
In some ways the bundleware served like a SMART warning, causing people to back up and migrate their projects.
I found myself in the odd position of hoping SF would come back up so I could finish what I was doing, when I'd normally welcome the news that they had shut down.
Lots of Google searching for what's wrong with SourceForge revealed nothing (other than complaints about their packaged installers/crapware). Now I have the answer to the question I was really asking.
I just need to wait and see if the developers of the packages I need will somehow migrate to GitHub so I can get the source...
What's happened to the mailing list archives for all these projects? Are they just down the toilet now? And of course the source. This is bloody terrible!
I knew EMC storage was utter shit when, upon attempting to create a new RAID group, I realized that the configuration tool's default was to stripe across drives within a shelf, not to create stripes that span shelves.
Worse, to create the more fault-tolerant, shelf-spanning RAID volumes, one must manually add drives, one by one to the array, in a process that involves about 44 (slight hyperbole) clicks per disk.
And then there was the fact that the configuration tool was Windows-only.
Yeah, screw those guys.
Having said that, EMC is stupidly overpriced bloat.
It was a 20-something disk RAID 10 [1], arranged so that every mirrored pair of disks spanned different enclosures, in order to mitigate the failure of any one shelf — that is, interleaving mirrors across controllers and shelves, exactly as you suggest I should have done — and further, such that any one shelf failing only affects the mirrors that had disks on that shelf.
EMC's software wanted to allocate the drives from two shelves, with an unequal number of drives per shelf. It was just grabbing the next however many disks, linearly.
So, no, they weren't trying to balance the mirror across enclosures or controllers. They just weren't thinking.
[1] By "RAID 10", I mean "striped mirrors" — that is, create a bunch of mirrors that span shelves and then stripe across them — not "mirrored stripes" which is what you appear to be suggesting, with "it is better to have two disks on the same controller/chassis in raid 0, then raid 1".
A striped mirror is recommended in everything I've ever read on the subject, because it puts the redundancy at the lowest level of the array's geometry.
Using a mirrored stripe, on the other hand, means that when one disk fails, any other disks striped with it, still presumably perfectly functional, can't be used; the controller must instead read from and write to the mirror. If a disk in that mirror subsequently fails, you've lost data — and remember that when striping, the chance of failure is multiplied by the number of disks in the stripe.
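The difference is easy to quantify with a rough sketch. It assumes two-way mirrors, a mirrored stripe built from exactly two stripes, one disk already failed, and a second failure that is equally likely to hit any surviving disk. The function names and the 20-disk example are mine, chosen to roughly match the "20-something disk" array above.

```python
def p_loss_striped_mirrors(n_disks: int) -> float:
    """RAID 10 (stripe over mirrors): only the failed disk's mirror partner is fatal."""
    _check(n_disks)
    return 1 / (n_disks - 1)


def p_loss_mirrored_stripes(n_disks: int) -> float:
    """RAID 0+1 (mirror of two stripes): the failed disk takes its whole stripe
    offline, so a failure of any disk in the other stripe loses data."""
    _check(n_disks)
    return (n_disks // 2) / (n_disks - 1)


def _check(n_disks: int) -> None:
    if n_disks < 2 or n_disks % 2:
        raise ValueError("need an even number of disks, at least 2")


n = 20  # hypothetical array size
print(f"striped mirrors : {p_loss_striped_mirrors(n):.3f}")
print(f"mirrored stripes: {p_loss_mirrored_stripes(n):.3f}")
```

With 20 disks, the second failure is fatal about 1 time in 19 for striped mirrors versus 10 times in 19 for mirrored stripes.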
EDIT: Footnote.
[Disclosure: I work for SourceForge]
My condolences.
As usual, one's respect for BigCompany is inversely correlated to one's use of their products.
I'm having the same issue with the number of hard links, which on Linux ext4 systems is limited to 65000.
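If you want to see how close a given file is to that ceiling, the link count is available from `os.stat`. A minimal sketch, taking the 65000 figure above as the limit; the helper name is made up, and the demo only touches a temporary directory.

```python
import os
import tempfile

EXT4_MAX_LINKS = 65000  # per-file hard link limit quoted above


def links_remaining(path: str, limit: int = EXT4_MAX_LINKS) -> int:
    """How many more hard links `path` can take before reaching `limit`."""
    return limit - os.stat(path).st_nlink


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "data")
    with open(original, "w") as fh:
        fh.write("x")
    os.link(original, os.path.join(tmp, "data.link"))  # second name for the same inode
    print("link count:", os.stat(original).st_nlink)
    print("remaining :", links_remaining(original))
```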
Offworld, Roy Batty reference.
(continues reading)