undefined | Better HN

0 pointsjgeraert9y ago0 comments

Seeing this as well in proprietary software sold by one of the biggest erp software vendors... And it's still not fixed and likely won't be fixed anytime soon.

0 comments

11 comments · 2 top-level

moron4hire9y ago· 9 in thread

There is a perception amongst lower programmers that FKs are only a thing you enable in production, that they are an impediment to rapid iteration.

I mean, it's completely wrong, I know that. But try being the new guy telling a team of 10 people that.

There is also a perverse corruption of "if it ain't broke, don't fix it" that goes on. If you can spend 100 hours manually validating every relationship in your application code, that's 100 hours you can put into your estimate and you know you can complete. If you only have a cursory understanding of SQL, then "learn more about FKs and implement them across the DB" seems like a big, unknowable blob of time that is impossible to estimate. It doesn't matter that it might only be 5 hours of work, at least we can be certain about 100 hours and bill the client for it.

Finally, there are a small set of things that are fundamentally wrong about all modern RDBMS implementations. For example: it's nonsensical to have a foreign key that isn't indexed. You always want an index on foreign keys, there is never a scenario where you don't want them. But while primary keys are indexed by default, foreign keys are not. And those sorts of things give the anti-RDBMS crowd enough of a foot-hold to argue for continued ignorance.

mattbroekhuis9y ago

I have found out the hard way that foreign keys are indexed by default in some implementations including MySQL innodb. Not in oracle.

My experience has been the opposite of yours -- I've heard of disabling foreign keys in production only for performance reasons, but not the other way around.

I like them turned on all the time as well. Valuable safety rail

aidos9y ago

Curious, this doesn't fit my mental model: it's nonsensical to have a foreign key that isn't indexed

I guess you normally want indexes on parent - child tree relationships (book -> chapter), but you don't want them all the time. What about when you have an 'article' with a 'status and a 'category' and you only ever find all articles by combination of category and status? In that case you'd be maintaining 3 indexes, category, status and status_category, but the only one you'd need would be status_category.

Indexes are a tool to to allow for optimising lookups while foreign keys are a tool to allow you to keep your data consistent.

moron4hire9y ago

I don't see how your example doesn't require an index, specifically because, as you said, indexes are a tool to allow for optimizing lookups.

I imagine that maybe you're suggesting you first query the DB for the ID of the combined status_category, and then query the article by that ID. That's not a good idea, for a couple of reasons. First, you're making two round trips to the DB when you could, with no additional effort (just effort in a different place) be doing one. Second, you've introduced a data race condition. If someone deletes that status_category after you've queried for it but before you've queried the articles, you aren't going to get the results you want.

It would be better to do a join across articles to status_category to status and category, then query based on the status and category values you want. Without an index on the FKs between status_category and status and category, a relatively small table can have a big impact on query performance.

Finally, while I know your example is arbitrary, it's a little hard to argue against a design that is probably wrong. I doubt the suggested schema for articles and categories is a good one. If I argue "you should never have to arbitrarily subtract '1' from a result just to get the results you want", it would not be a good counterargument to say, "yes, but sometimes you want to add 2 and 2 and get 5, so then you need to subtract 1". The problem isn't where you see it.

FKs aren't just a consistency tool. Consistency and referential integrity are features that results from having an FK, but the FK is a signal that data can be searched in a certain way.

3 more replies

jrochkind19y ago

I think in general most rdbms require indexes to properly enforce foreign key constraints in a reasonable way.

1 more reply

jrochkind19y ago

The number and kind of bad ideas a bad development team can have is virtually limitless.

scriptkiddy9y ago

> There is a perception amongst lower programmers that FKs are only a thing you enable in production, that they are an impediment to rapid iteration.

Really? I've been working as a web programmer for about 5 years now. I use the term Web Programmer because I feel like the term Web Developer carries with it the connotation that you only work with Javascript.

My specialty is definitely with Python and more specifically with Django. Django's ORM is the first ORM I ever used. Maybe it's because I self taught, but I never had the inclination that FKs were an impediment to rapid iteration. In fact, quite the opposite. FKs are a fantastic way to enforce relationship constraints between tables/objects/models/whatever you want to call them.

Whenever I start on a new project, the first thing I do is start defining my data structure. In Django, this mostly involves using the ORM layer to define Model Classes. I usually define the core sets of models necessary for the application. As an example, if I were building a simple blog, I'd start by overriding Django's built in user model so I have the flexibility to add columns or place constraints on existing columns. Then I would define the post model which includes an `author` and a `category` FK column. Then I would define the `Category` model. I generate a migration script and run it. Every time I need to make a change to the Schema, I simply add or change whatever I need to and generate a new migration script alongside the old one. These migrations are dependent upon previous migrations and have version numbers. If I need to, I can roll back to a previous schema. Django makes this all very simple and actually separates the migrations out by which app the model definitions reside in. This means that I can make changes to multiple model definitions and only apply the changes to the Db for one of those model classes. It's very flexible.

So, after my long-winded explanation above, I don't see how people could find FKs to be restrictive when there are so many tools like Django's ORM and migration system that makes altering your schema so simple.

TimJYoung9y ago

Re: FKs being indexed. This isn't always the case. Our embedded/workgroup-level database engine, ElevateDB, always uses a system-generated index for all PK, FK, and UNIQUE constraints. They're needed for constraint enforcement, anyways, and offer good "default" performance improvements by ensuring that the query optimizer can flip (most) inner joins in order to improve nested-loop join performance.

dspillett9y ago

> it's nonsensical to have a foreign key that isn't indexed. You always want an index on foreign keys

Incorrect, though cases where you don't want the index are rarer than those where you do especially when thinking about simple examples.

> there is never a scenario where you don't want them.

If the parent entities never have their primary key values changed and are never deleted, then the database engine itself will never use the index to enforce the key. If you never need to join from the parent entities to the child entities then your queries are unlikely to make use of it either.

The extra index takes space (maybe a fair amount of space if the key is wide and/or you have a [bad] design with a wide clustering key), potentially space in your in-memory page pool, and processing time & IO during inserts, updates and deletes. If you are unlikely to need the index then why take that hit?

> But while primary keys are indexed by default, foreign keys are not.

Primary keys need to be indexed to avoid a full table scan on every insert (or update of a key value) to that table or any table that refers to the key via a foreign key constraint. Foreign keys will only cause a scan with no index present if a key value is changed (or a row deleted) in the parent table. As some entities shouldn't be deleted and primary key values should be immutable, it follows that this sort of situation can happen.

It would be possible to make the index optional by other means then requiring you to declare it if you want it, but SQL's modelling language tends towards declaring what you want not what you don't want so it fits better with the syntax to have you add the index if you want it rather than deleting it if you don't or having something in the syntax like "ADD CONSTRAINT fk_key_name ON (<field(s)>) REFERENCES <target>(<field(s)>) WITHOUT INDEX".

> And those sorts of things give the anti-RDBMS crowd enough of a foot-hold to argue for continued ignorance.

This isn't purely a relational problem. noSQL data stores have indexes too, and not having an index on a referencing key that might be needed to check on referential integrity (looking for orphan records) can be a problem there too. We shouldn't change the behaviour of relational stores because some people who use noSQL don't understand good data modelling. Many people using noSQL do understand good data modelling of course, but some use noSQL because they don't want to try understand SQL rather than because it is the wrong tool for the job at hand, and those people probably don't understand noSQL either but get away with not doing so in the short term).

moron4hire9y ago

> If the parent entities never have their primary key values changed and are never deleted, then the database engine itself will never use the index to enforce the key.

Enforcing referential integrity is not the only thing we can do with foreign keys. Once you have an FK, you're going to want to query against it. To do that without data races requires a join. The join can be optimized better if the FK is indexed.

Come up with an example of an FK you never want to join on and then maybe we talk about not wanting an FK indexed. I've never seen anyone make a convincing argument that this is even as much as a 1% case. It should be so incredibly, almost inconceivably rare to not want an FK indexed that yes, I'm saying it should go against the traditional grain and just be the default. Traditions are not sacred.

2 more replies

mannykannot9y ago

The relational model provides a pretty general and unified way to represent information, together with a cohesive and powerful set of primitives for working with it, yet many systems architects insist on hiding it behind an ad-hoc interface that looks like a throwback to pre-relational days.

j / k navigate · click thread line to collapse

0 comments

11 comments · 2 top-level

moron4hire9y ago· 9 in thread

There is a perception amongst lower programmers that FKs are only a thing you enable in production, that they are an impediment to rapid iteration.

I mean, it's completely wrong, I know that. But try being the new guy telling a team of 10 people that.

mattbroekhuis9y ago

I have found out the hard way that foreign keys are indexed by default in some implementations including MySQL innodb. Not in oracle.

My experience has been the opposite of yours -- I've heard of disabling foreign keys in production only for performance reasons, but not the other way around.

I like them turned on all the time as well. Valuable safety rail

aidos9y ago

Curious, this doesn't fit my mental model: it's nonsensical to have a foreign key that isn't indexed

Indexes are a tool to to allow for optimising lookups while foreign keys are a tool to allow you to keep your data consistent.

moron4hire9y ago

I don't see how your example doesn't require an index, specifically because, as you said, indexes are a tool to allow for optimizing lookups.

FKs aren't just a consistency tool. Consistency and referential integrity are features that results from having an FK, but the FK is a signal that data can be searched in a certain way.

3 more replies

jrochkind19y ago

I think in general most rdbms require indexes to properly enforce foreign key constraints in a reasonable way.

1 more reply

jrochkind19y ago

The number and kind of bad ideas a bad development team can have is virtually limitless.

scriptkiddy9y ago

> There is a perception amongst lower programmers that FKs are only a thing you enable in production, that they are an impediment to rapid iteration.

TimJYoung9y ago

dspillett9y ago

> it's nonsensical to have a foreign key that isn't indexed. You always want an index on foreign keys

Incorrect, though cases where you don't want the index are rarer than those where you do especially when thinking about simple examples.

> there is never a scenario where you don't want them.

> But while primary keys are indexed by default, foreign keys are not.

> And those sorts of things give the anti-RDBMS crowd enough of a foot-hold to argue for continued ignorance.

moron4hire9y ago

> If the parent entities never have their primary key values changed and are never deleted, then the database engine itself will never use the index to enforce the key.

2 more replies

mannykannot9y ago

j / k navigate · click thread line to collapse