But I do this from my own Pinboard account, which has
superpowers. Specifically, it lets me see private bookmarks
on all accounts. And since I already see everything, I don't
notice when everybody else starts to see everything, too.
This is not okay.
I can't think of a way to store retrievable data, like a bookmark, on his servers without leaving some way for him to access the data if he wanted to.
1) It's trivial for him to inadvertently see something deeply personal to someone just by browsing the 'recent' list or doing a search.
UPDATE: I overstated this one - Maciej let me know by email that he can only access private data on the search / recent page if he intentionally masquerades as a user. He can only inadvertently see private data when viewing individual user pages.
2) If his account's ever compromised (let's hope he's not reusing that password elsewhere!) then someone else gets that ability as well, accessible from any browser anywhere.
It's one thing when you have to ssh into a server somewhere and do a SQL query to access someone's private information. It's another thing to set up your admin account so you're casually exposed to it.
I like Pinboard's service too, but this isn't remotely cool.
The choice, really, is "do I look at my users' personal stuff and have a good idea that my service is working for them" vs. "do I not look at my users' stuff, and have much less of an idea if the service is working for them."
(yes, yes, good automated tests are the right way to solve the problem, but good automated tests are difficult, especially when a broad range of behaviors could be 'correct')
In my own case, there are two places where I'm erring on the "don't look at the user's data, even if it would allow you to provide better service" side. First? I don't mount a user's disk. It's a block device, as far as I am concerned.
As you might imagine, this makes backups way, way more difficult. Moves, too. It's possible that if I were to ignore this rule, I'd have enough spare disk bandwidth to do at least half-assed backups (I'm not doing any backups of customer data right now, which is pretty scary for me, because I'm in a business where if you lose the customer's data, you lose the customer in the worst way.)
Now, the right way to solve this problem is some kind of 'snapshot over the network' like ZFS has, or some other mechanism that only transfers the changes in block devices. Unfortunately, Linux has no such tools. (yes, yes, ZFS on Linux is a possibility. So is a NAS. Rsync or something rsync-like won't work because the problem is not network bandwidth but disk bandwidth. I mean, it's a problem that can be solved a better way, but solving it the better way is more work.)
Next, from a technical support perspective? I could provide dramatically better service if I logged all the serial consoles. Dramatically better service. But last time I asked, people seemed uncomfortable with it, so I don't. (the thing is, it's only easy to log all consoles or log none of the consoles; I'd have to spend time and effort building a 'log consoles except for users that opt out' mechanism, which is the right thing to do here, but due to time constraints that hasn't been done.) (to be clear, when a customer asks for help, or even if something happens with a domain and I'm not sure it's working, I look at the serial console as is. My job would be nigh impossible without serial consoles. Also, by default I log the serial consoles on dedicated servers, though unlike Xen, it is easy for me to opt people out of that. Conserver is the tool I use there.)
Personally, I think I made the right choice when it comes to block devices (I have had good luck with my RAIDs, and treating disks as block devices means my customers can use whatever freaky filesystem they like), but I think I made the wrong choice on the serial console; there really isn't that much personal stuff that comes up on the serial console, and it gives me a lot of clues.
I guess what I'm saying is that from a testing perspective? Being able to see the user's bookmarks has a lot of advantages. It's a valid choice, and he's sharing that choice with the customers, which is the right thing to do.
(the interesting thing here, of course, was that if his account couldn't see the users' bookmarks, while usually that would have given him much less diagnostic information, it would have given him more diagnostic information this time.)
Instead of automated test suites, I use checklists before deploying major changes,
performing a series of actions (like creating and editing bookmarks)
to make sure everything works as expected.
Somehow I always shiver when I read stuff like this. Some manual smoke testing is never bad, but I really can't understand how you can feel comfortable about not having automated test suites on (larger) projects. Stuff like this is so easy to catch in a unit or integration test.
You then need to follow two rules. First, write tests for all new features. Second, add a failing test that exposes the bug before you fix any bug.
By doing these two things you will end up testing two very important areas of your code — new code (which will always be buggiest) and code that has recently proven to be buggy.
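To make the second rule concrete, here is a minimal sketch in Python of what a regression test for a privacy bug like this one could look like. Everything in it is hypothetical (the `Bookmark` class, the `visible_to` function, the example URLs); it is not Pinboard's code. The point is only that the test views the data as an ordinary, non-owner user rather than as an account that can already see everything.

```python
from dataclasses import dataclass


@dataclass
class Bookmark:
    # Hypothetical model, for illustration only.
    url: str
    private: bool


def visible_to(bookmarks, viewer_is_owner):
    """Return the bookmarks a viewer may see; only the owner sees private ones."""
    return [b for b in bookmarks if viewer_is_owner or not b.private]


def test_private_bookmark_hidden_from_other_users():
    # Written first, so it fails while the bug exists and passes after the fix.
    bookmarks = [
        Bookmark("https://example.com/public", private=False),
        Bookmark("https://example.com/secret", private=True),
    ]
    seen = visible_to(bookmarks, viewer_is_owner=False)
    assert [b.url for b in seen] == ["https://example.com/public"]


def test_owner_still_sees_private_bookmark():
    # Guards against "fixing" the leak by hiding private bookmarks from everyone.
    bookmarks = [Bookmark("https://example.com/secret", private=True)]
    seen = visible_to(bookmarks, viewer_is_owner=True)
    assert [b.url for b in seen] == ["https://example.com/secret"]
```

Both functions follow the pytest naming convention, so a runner like pytest should pick them up as-is.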
http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...
In a nutshell, you start by writing a lot of high level integration tests, which will serve you as some guarantee you don't screw up. Then you start refactoring, writing fine-grained unit tests along the way. Once you settle with code structure, you might want to start removing/rewriting test cases you wrote initially in case they seem redundant.
[1] https://www.destroyallsoftware.com/screencasts/catalog/untes... and further parts 2-4. This screencast is done mostly in ruby.
Unit and integration tests are great, but you can't rely on them for everything. There is always a temptation to trust the tests instead of the real world, and that's where you get in trouble. The caption for this photo might as well have read 'all tests are green' http://25.media.tumblr.com/tumblr_m7v0yrMrJ21rzupqxo1_500.jp...
But as a developer, unit and integration tests are what give me trust in what I ship. I wouldn't feel comfortable shipping something without having a proper test suite, let alone making a change to such a codebase.
And no, you cannot save 'y' to an integer field, so your rant is misplaced.
mysql> create table demo(`int` tinyint(1), `bin` binary(1));
mysql> insert into demo(`int`, `bin`) values ('y', 'y');
mysql> select * from demo;
+------+------+
| int  | bin  |
+------+------+
|    0 | y    |
+------+------+
private bit not null default 0
If you use something like tinyint and have a range of values that are acceptable, bake this into the database with a check constraint: myfield tinyint not null default 0 check (myfield between 0 and 3)
Regardless of how comprehensive your unit tests are, you should prevent as much bad data as you can at the DB layer.
The idea being that if the field is likely to be initialized to a default, you want the safer default. (i.e., in that case things would be created as accidentally private, rather than as accidentally public)
Kudos to the author for the transparency. I have never used Pinboard, but that blog post actually makes me more likely to use it rather than less.
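A rough sketch of both suggestions together (a check constraint plus a fail-safe default), using SQLite through Python's `sqlite3` module as a stand-in so it runs anywhere. The `bookmarks` table and its columns are made up for illustration, and SQLite's typing rules differ from MySQL's, so treat this as a demonstration of the idea rather than of MySQL's behavior. As far as I know, MySQL only enforces CHECK constraints from 8.0.16 onward; earlier versions parse them and ignore them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE bookmarks (
        url     TEXT NOT NULL,
        -- Default to private (1), so a forgotten value fails closed.
        private INTEGER NOT NULL DEFAULT 1 CHECK (private IN (0, 1))
    )
    """
)

# No value supplied: the row falls back to the safer default.
conn.execute("INSERT INTO bookmarks (url) VALUES ('https://example.com/a')")

# A junk value such as 'y' is rejected instead of being silently coerced.
try:
    conn.execute(
        "INSERT INTO bookmarks (url, private) VALUES ('https://example.com/b', 'y')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("SELECT url, private FROM bookmarks").fetchall()
print(rejected, rows)
```

With the constraint in place, the bad write fails loudly at insert time instead of turning into a row that later queries misread.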
What? Why does MySQL return the row with 'y' in it (in a 'binary' column) when you ask for "where row = 0"? wtf is going on here?
That seems to be the real culprit here. I assume it's MySQL's 'helpful' conversion routines again, but I'm just assuming, I don't understand the details.
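My understanding of the details, hedged: when MySQL compares a string with a number, it converts the string to a number first, and a string with no leading digits (like 'y') converts to 0. So 'y' = 0 comes out true and the row matches. The Python below is only an approximation of that coercion written for illustration; `loosely_to_number` and `loose_equals` are made-up names, and the real server rules have more cases than this.

```python
import re

# Leading numeric prefix: optional sign, digits and/or decimal point, optional exponent.
_NUMERIC_PREFIX = re.compile(r"^\s*[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def loosely_to_number(value):
    """Approximate a lenient string-to-number cast: leading numeric prefix, else 0."""
    if isinstance(value, (int, float)):
        return float(value)
    match = _NUMERIC_PREFIX.match(value)
    return float(match.group(0)) if match else 0.0


def loose_equals(column_value, literal):
    """Compare a stored string with a numeric literal after coercing both to numbers."""
    return loosely_to_number(column_value) == loosely_to_number(literal)


print(loose_equals("y", 0))  # 'y' has no numeric prefix, so it is treated as 0
print(loose_equals("1", 0))  # '1' becomes 1, which is not equal to 0
```

Under that rule, any non-numeric flag value stored in the column would be indistinguishable from 0 in a numeric comparison, which fits what the thread describes.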
Keep in mind that I don't have state secrets bookmarked in Pinboard. I'm just an anti-social type of user as their tag line notes and I appreciate how Maciej runs his business.