
fnord123 9y ago

> The second annoyance I have is the push for schemaless. Schemaless does not exist. There is always a schema, except in a schemaless data store, the schema has been moved to app code and then my earlier comment applies.

This is a great point. It's like types for programming languages. They always exist; it's just a matter of whether you have the compiler managing it or whether you need to keep it in your head when you're hacking.

> I think the lack of understanding of SQL and bad ORM experiences (Hibernate WTF) is what led people to think SQL/RDBMSs were the problem when in fact they were not.

Again, as with types in programming languages, I think the ability to iterate quickly, without having to predict how the data will be used in the future, leads to 'schemaless' approaches, which can (maybe) be kicked out the door more quickly than ponderous, schema-laden tables.


14 comments · 4 top-level

mindcrime 9y ago · 8 in thread

> I think the lack of understanding of SQL and bad ORM experiences (Hibernate WTF)

I don't understand this comment. I've never had a bad experience with Hibernate. If anything, Hibernate has saved me tons of time over the years.

hibikir 9y ago

I, on the other hand, have spent an inordinate amount of time, including waking up in the middle of the night for incident response, because of Hibernate misuse, which is so easy that I wouldn't touch Hibernate with a ten-foot pole.

There is one case where Hibernate can save you a significant amount of time vs. plain SQL, and that's arbitrary update statements. But the moment there are two joins in there, predicting Hibernate performance is not necessarily easy, and differentiating a query that will run well from one that will kill your system at scale is far harder than in plain SQL. In practice, the only way to use Hibernate responsibly is to examine the likely queries that it will generate.

My least favorite example was a situation where manual SQL gave me a very fast query that took a couple of milliseconds to run on a large database, but Hibernate's answer was not even just one query. It was an accidentally quadratic monstrosity, when all the original programmer wanted was one row: Hibernate just wasn't smart enough to figure out which row it wanted in the DB, so it grabbed half the database. After I saw the beautiful queries that Hibernate had created, I had to give the finger to the existing data layers and do my own thing, which I am told remains to this day.

The way Hibernate is built only makes sense if it can hide the database behind it. In reality, it doesn't, so I not only have to make sure everyone knows SQL well, but they also have to learn HQL and Hibernate's most common pitfalls. All to save a little bit of time on the rare cases where I might have very dynamic queries that somehow Hibernate won't screw up.

Java EE was a pretty unfortunate thing, but under duress, I'd much rather use EJB3 and straight SQL over the Spring + Hibernate stack that many of us had to deal with between 2000 and 2010.

matwood 9y ago

I'd be curious about the size of the systems you have used with Hibernate. Do you know off the top of your head what will cause Hibernate to issue a flush? How about the order of operations in a batch? I've read the entire Hibernate book multiple times while trying to squeeze out speed, and finally decided enough was enough.

The problem with Hibernate is I have to learn all of these Hibernate quirks to maybe get decent speed out of a system, when I could have just written SQL and been done. This isn't even including debugging when Hibernate decides to issue N+1s.
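For readers unfamiliar with the "N+1" pattern mentioned above, here is a minimal sketch of it in Python, using the standard-library `sqlite3` module rather than Hibernate. The `author`/`book` tables, the `run` helper and the `queries` log are all hypothetical names made up for illustration; the point is only that a lazy per-row lookup issues one query per parent row, while a hand-written join issues one query in total.

```python
import sqlite3

# Hypothetical two-table schema, held in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES author(id),
        title TEXT NOT NULL
    );
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO book VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1'), (4, 3, 'C1');
""")

queries = []  # log of every SELECT issued through run()


def run(sql, params=()):
    """Execute a query and record it so the round trips can be counted."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    """One query for the authors, then one more query per author (N+1)."""
    result = {}
    for author_id, name in run("SELECT id, name FROM author ORDER BY id"):
        rows = run(
            "SELECT title FROM book WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result


def titles_single_join():
    """The same data fetched with a single hand-written join."""
    result = {}
    rows = run(
        "SELECT a.name, b.title FROM author a "
        "JOIN book b ON b.author_id = a.id ORDER BY a.id, b.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result
```

With three authors, `titles_n_plus_one()` adds four entries to `queries` and `titles_single_join()` adds one, and both return the same mapping. The gap grows linearly with the number of parent rows.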

mindcrime 9y ago

> I'd be curious about the size of the systems you have used with Hibernate.

I wouldn't even know what metric to use to quantify that. Number of tables? On the order of hundreds, probably low hundreds... definitely not thousands.

It may just be that I've been lucky in that the data models I've worked with have been things that map well to using Hibernate. I will say that I haven't worked with a lot of data models where you need more than 3 or 4 joins in any given query. I know some people write far more complex queries than that, and maybe those are the kinds of scenarios that Hibernate doesn't handle well?

> The problem with Hibernate is I have to learn all of these Hibernate quirks to maybe get decent speed out of a system, when I could have just written SQL and been done

Do you hand-roll your own caching mechanism when just writing plain SQL?


cygned 9y ago

I had to use Hibernate in a project. We were loading an entity from the database, someone invoked a setter and the set value was updated in the database automatically. Totally blew my mind.

Never touched it again since then (not doing much Java anyway).
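The behaviour described above (a setter call ending up as an UPDATE) comes from what ORMs usually call automatic dirty checking: the session remembers the state of each loaded entity and writes out any differences when it flushes. The following is a toy sketch of that idea in Python with `sqlite3`, not Hibernate's actual implementation; `ToySession`, `User` and the `users` table are hypothetical names invented for illustration.

```python
import sqlite3
from dataclasses import dataclass, asdict


@dataclass
class User:
    id: int
    name: str


class ToySession:
    """Toy unit of work: remembers a snapshot of every entity it loads."""

    def __init__(self, conn):
        self.conn = conn
        self._tracked = []  # list of (entity, snapshot-at-load-time) pairs

    def load(self, user_id):
        row = self.conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        user = User(*row)
        self._tracked.append((user, asdict(user)))
        return user

    def flush(self):
        """Write back every tracked entity whose state differs from its snapshot."""
        written = 0
        for i, (entity, snapshot) in enumerate(self._tracked):
            current = asdict(entity)
            if current != snapshot:
                self.conn.execute(
                    "UPDATE users SET name = ? WHERE id = ?",
                    (entity.name, entity.id),
                )
                self._tracked[i] = (entity, current)  # refresh the snapshot
                written += 1
        return written


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users VALUES (1, 'old name')")

session = ToySession(conn)
user = session.load(1)
user.name = "new name"  # no SQL is issued here...
session.flush()         # ...the UPDATE happens at flush time
```

In this sketch nobody calls an explicit "save": the assignment alone is enough for the next flush to issue the UPDATE, which is the surprise the comment describes.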

cakes 9y ago

This mirrors my experience as well. I don't (currently) know what will cause Hibernate to flush, but I certainly do know that I had to know this exactly and had to do some really painful debugging at points. The worst case I worked on was just early prototyping of an app that used Hibernate, and we had so many problems with performance and Hibernate-isms that the whole thing seemed destined to fail, even though our use case was exactly what ORMs like Hibernate are for. Yet we still had to control when changes were flushed, because otherwise things would get wonky really quickly.

dunham 9y ago

Yeah, profiling helps a little. It taught me that queries flush (and you can easily go n^2 from that if you're not careful).

More recently, we've had n+1 issues (putting a printing breakpoint in EntityType.loadByUniqueKey helped me track down where). My hack to get around that was to do a bulk Criteria query up front to pre-load the session cache with the entities that were being fetched one at a time.

There were a lot of quirks that I've had to learn over the last decade, and I'm still discovering more.
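The pre-loading workaround described above can be sketched roughly as follows. This is Python with `sqlite3`, not the Hibernate Criteria API, and `CachedLoader`, its methods and the `author` table are hypothetical names; it only illustrates the general idea of filling an identity cache with one bulk query so that later one-at-a-time lookups never reach the database.

```python
import sqlite3


class CachedLoader:
    """Toy identity cache in front of a hypothetical author table."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}    # author id -> name
        self.queries = 0   # number of SELECTs actually sent to the database

    def get(self, author_id):
        """Fetch one author, hitting the database only on a cache miss."""
        if author_id not in self.cache:
            self.queries += 1
            row = self.conn.execute(
                "SELECT name FROM author WHERE id = ?", (author_id,)
            ).fetchone()
            self.cache[author_id] = row[0] if row else None
        return self.cache[author_id]

    def preload(self, author_ids):
        """Fill the cache for many ids with a single IN (...) query."""
        ids = sorted(set(author_ids) - set(self.cache))
        if not ids:
            return
        placeholders = ", ".join("?" for _ in ids)
        self.queries += 1
        rows = self.conn.execute(
            f"SELECT id, name FROM author WHERE id IN ({placeholders})", ids
        )
        for author_id, name in rows:
            self.cache[author_id] = name


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO author VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")])

loader = CachedLoader(conn)
loader.preload([1, 2, 3])                    # one bulk query up front
names = [loader.get(i) for i in (1, 2, 3)]   # served from the cache
```

Without the `preload` call, the same three `get` calls would each issue their own query.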

gerbilly 9y ago

Speed? I found Hibernate is great for CRUD.

Like where you have dozens and dozens of domain objects that you want to persist, and you don't want to spend all your time writing DAO code with SQL in it that is all the same but different.

Enterprise applications are rife with those.

For speed, write some handwritten SQL to handle those few cases (like batch imports, maybe).

Also I found a handy rule:

Only create an association between entities when the entity makes no sense without the other.

For all other cases split them up and perhaps create aggregates by hand.


singham 9y ago

I have used NHibernate in a project at a company. Beyond simple queries, though, we hit a performance issue: for a complex query, it took about 10 seconds to load the data to the frontend. So we had to rewrite the "code logic" (the NHibernate part) as a "custom SQL query" (NHibernate allows you to write custom queries) and then write custom mappers to map the results back to our application. The reason is that NHibernate does its processing object by object and in memory.

kbenson 9y ago · 2 in thread

> It's like types for programming languages. They always exist; it's just a matter of whether you have the compiler managing it or whether you need to keep it in your head when you're hacking.

While true to a degree, it's worth noting that some languages also reduce the number of types considerably at the same time, and maybe even provide some other conveniences when dealing with types (e.g. automatic interpretation as numeric when used in a numeric operation).

You still have to keep types in your head, but the reduction of types generally makes this non-problematic. Although, as your code grows and you define your own types (either formally through an included system or as ad-hoc data structures), you start needing to keep more types in your head, and this starts eating away at any benefit you had from not having to define them in the first place.
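As a rough illustration of the trade-off described above, the sketch below uses Python (not Perl, and not any language named in the thread) to contrast an ad-hoc data structure, whose "type" lives only in the programmer's head, with a formally defined one that checks itself at construction. The names `total_adhoc` and `Order` are hypothetical.

```python
from dataclasses import dataclass


def total_adhoc(order):
    """Ad-hoc structure: the caller must remember that qty and price are numbers."""
    return order["qty"] * order["price"]


@dataclass(frozen=True)
class Order:
    """Formally defined type: wrong field types are rejected when it is built."""
    qty: int
    price: float

    def __post_init__(self):
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if not isinstance(self.qty, int) or isinstance(self.qty, bool):
            raise TypeError("qty must be an int")
        if not isinstance(self.price, (int, float)) or isinstance(self.price, bool):
            raise TypeError("price must be a number")

    def total(self):
        return self.qty * self.price


# A string slipped in where a number was expected: no error, just a wrong answer.
print(total_adhoc({"qty": "3", "price": 2}))  # prints 33 (string repetition)

# The defined type refuses the same mistake at the point of construction.
try:
    Order(qty="3", price=2)
except TypeError as exc:
    print("rejected:", exc)
```

The ad-hoc version is quicker to write, but the mistake surfaces (if at all) far from where it was made; the checked version moves the cost up front.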

dllthomas 9y ago

You and the parent are not using quite the same definition of "types", I think.

kbenson 9y ago

I think we are, if you look at it from a high level. C has some well-known types most people are familiar with. Perl has a few types too: scalar, array and hash (and a few more lesser-known or lesser-used ones). This makes Perl very easy to use initially, because you can just treat strings as strings and numbers as numbers, and Perl will mostly do what you want based on the operators involved. You create complex data structures by just creating them (ad hoc) or by actually defining a class, which actually gives you a new type.

As you create any non-trivial program, you'll likely create some types, whether in C or Perl (as classes). What was initially a benefit in Perl, since you didn't have many types to keep track of, becomes a liability, as there are more types but still no formal way to ensure they are being used in the correct locations and as the expected arguments without runtime checking (and manual checking at that, unless you use a good module to handle a lot of that for you).

That some languages keep types fairly low level (C, with types relating fairly closely to machine representation) and some are high level (Perl, with each type being a higher-level container) is really irrelevant. In the end, they are just labels that codify behavior (size, acceptable methods of use).

In SQL and NoSQL systems, you have the same trade-offs. Do I define everything up-front so I know (presumably) when there's a problem, or do I create the structure ad-hoc as needed and enforce the structure through the function of the application that creates the data?
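A minimal sketch of the two options in that question, using Python's standard-library `sqlite3` for the "define everything up front" side and a plain list of dicts as a stand-in for a schemaless store. The `account` table, its columns and the `validate` function are hypothetical; the rules are the same in both halves, and only the place where they are enforced differs.

```python
import sqlite3

# Option 1: the schema lives in the database and rejects bad rows itself.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL,
        age INTEGER NOT NULL CHECK (typeof(age) = 'integer' AND age >= 0)
    )
""")


def insert_sql(email, age):
    conn.execute("INSERT INTO account (email, age) VALUES (?, ?)", (email, age))


# Option 2: the store accepts anything, so the schema lives in application code.
documents = []  # stand-in for a schemaless document collection


def validate(doc):
    """The same rules as the table definition, restated in the application."""
    if not isinstance(doc.get("email"), str):
        raise ValueError("email must be a string")
    age = doc.get("age")
    if not isinstance(age, int) or isinstance(age, bool) or age < 0:
        raise ValueError("age must be a non-negative integer")


def insert_doc(doc, check=True):
    if check:
        validate(doc)
    documents.append(doc)


insert_sql("a@example.com", 30)
insert_doc({"email": "a@example.com", "age": 30})
```

Calling `insert_sql("b@example.com", -1)` raises `sqlite3.IntegrityError` no matter which code path issued it, whereas `insert_doc({"email": "b@example.com", "age": -1}, check=False)` stores the bad record silently: any writer that skips `validate` bypasses the schema.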


rwj 9y ago

This should be emphasised more. Schemaless is good for productivity during prototyping (like dynamically typed languages), but long term a well-defined schema will help maintenance (like statically typed languages).

exclusiv 9y ago

Sometimes it really is schemaless, although usually only schemaless for certain fields. Imagine a case where users can create custom fields or store arbitrary data in some fields.
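One common way to get "schemaless for certain fields" is to keep the fixed columns under a normal schema and put the user-defined part in a single free-form column. The sketch below does this in Python with `sqlite3` and JSON text; the `contact` table and the helper names are hypothetical, chosen only to illustrate the idea.

```python
import json
import sqlite3


def open_db():
    """Create an in-memory database with fixed columns plus one free-form column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE contact (
            id INTEGER PRIMARY KEY,
            email TEXT NOT NULL UNIQUE,
            custom_fields TEXT NOT NULL DEFAULT '{}'
        )
    """)
    return conn


def add_contact(conn, email, custom_fields=None):
    """Store the fixed field as a column and the user-defined fields as JSON text."""
    conn.execute(
        "INSERT INTO contact (email, custom_fields) VALUES (?, ?)",
        (email, json.dumps(custom_fields or {})),
    )


def get_custom_fields(conn, email):
    """Return the user-defined fields as a dict, or None if there is no such contact."""
    row = conn.execute(
        "SELECT custom_fields FROM contact WHERE email = ?", (email,)
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = open_db()
add_contact(conn, "a@example.com", {"shoe_size": 44, "nickname": "Al"})
add_contact(conn, "b@example.com")  # no custom fields at all
```

Here the database still enforces that every contact has a unique, non-null email, while the shape of `custom_fields` is entirely up to the user.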
