Advanced SQL window function quiz

(windowfunctions.com)

163 points · andy_boot · 8y ago · 79 comments

thom · 8y ago

There are some big flaws in window functions I'd love to see fixed. The main one being the very limited ways in which you can refer to the 'current' row in window and filter expressions. The second being... you're almost always better off just doing all this work in code, in a programming language with dependable performance characteristics.

nickpeterson · 8y ago

I don't know, I think most performance gained from pulling data out and processing it is generally an illusion. Windowing functions can have terrible performance, but that's pretty easy to determine and mitigate (indexing, multiple queries in a procedure). Pulling data down and 'doing something' with it starts getting into the world of round trips, serialization, concurrency models...

Plus there is the whole, "My report wants to use the same logic as this web form", oh it's in Java? Guess we'll just pull the data out, process it, populate a temp table, report it, clear those values....

If you have business data in SQL, just write the logic in SQL.

Amezarak · 8y ago

I generally haven't had a problem with window function performance - I was mostly using MSSQL.

That said, the single biggest performance drain I've seen is people not using the database, and instead pulling data out to process in code. I don't think I've ever seen a case where it was faster to process data outside of the database - though there have also been plenty of times when the pertinent SQL had to be written "smarter." Think in sets, not iteratively! There's a terrifying number of loops in a lot of legacy SQL statements.

iteriteratedone · 8y ago

The point is you don't have business data yet; you need to create it ... using business logic. All the data is in SQL, but for some reason you don't store value x (probably an aggregate of sorts), and the SQL environment doesn't have a good generalized syntax for logic.

I think the problem is maintenance rather than performance. When you really want performance (in data reads) you'll be precomputing and storing the values. Then your report can filter and be indexed properly.

I don't think any tech solution will give you both fast reads and fast writes; you need to pick one.

Your server probably has a lot of downtime ... use it: make a cron job and a separate reporting table. Reporting on raw data does not scale; window functions do not scale.

Using window functions is equivalent to using a better CPU. Maybe the environment you work in is faster, but you have not factored the logic to make the best use of the resources.

You can take a crap approach in Java and port it to SQL. But the approach is still crap, and you should create a separate reporting process or at least use Hadoop.

derefr · 8y ago

Rather than taking the data out of the database, how about plugging your programming language of choice into the database?

For Postgres, for example, you can write stored procedures using any of these: https://www.postgresql.org/docs/9.1/static/external-pl.html

emmelaich · 8y ago

I agree, too little is done in SQL.

One good reason to avoid it and stored procedures is for parallelism. But you need a lot of data to make that worthwhile. Especially in these days of 3+ TB RAM machines.

electrum · 8y ago

Access to the current row was improved in the SQL:2011 standard. See section "8.2 Navigation within a window" in this paper: https://sigmodrecord.org/publications/sigmodRecord/1203/pdfs...

thom · 8y ago

This is great news, although I'm not sure it gives me the thing I _most_ want, which is just to be able to refer to the 'current' row within a filter's where clause.

atwebb · 8y ago

Not always the case, especially with columnstore indexes and batch mode. Even traditionally, rownum/lead/lag/ordering functions are extremely handy alongside aggregates. I can't actually think of many result-set/batch scenarios that would be better replicated in code, and that's before counting the cost of the extra delivery (every row for a sum) and possible miscalculations. Add in the ability to index views and some other things for specific scenarios. Then again, I'm mostly MSSQL and lack experience with other offerings, or knowledge of where they might fall short.

Also, the site is blocked for me currently...

nickpeterson · 8y ago

I think often conversations about this are really people talking past each other. If you work in the enterprise, and spend tons of time dealing with 'business logic' and bespoke systems, using SQL seems preferable because many of your headaches come from systems being out of sync, or reports showing different values. Databases represent a way to guard against a lot of problems that tend to occur in that environment. It's only when you branch into other realms of development that it starts to feel really backwards and archaic.

tathougies · 8y ago

SQL has dependable performance characteristics... What is not dependable about how window functions are computed?

thom · 8y ago

Where clause pushdown through views, or worse, CTEs. Views with window functions querying views with window functions. Queries with multiple windows. I didn't just randomly make the statement: modern query planners are an absolute arse to predict, whereas with external code you can predict performance 100% of the time. The moment you're doing anything really 'sequency' instead of just calculating some dumb running totals, window functions can become a nightmare, and I just made the statement so people don't automatically think there's a friction-free story for doing general sequence processing in SQL. Not sure why this has offended people so much.

barrkel · 8y ago

If you have a small amount of data, yes; but if you have any significant amount of data, you are losing multiple orders of magnitude of performance when you pull all that data out of the database.

thom · 8y ago

I lose orders of magnitude of dev time every time Postgres can't work out how to push a where clause down into a view, tbh. In our case, doing processing on sequences at insert time is much more pleasant, performant and featureful than trying to do it after the fact with window functions.

MechanicalTwerk · 8y ago

https://github.com/postgres/postgres/commit/0a459cec96d3856f...

andy_boot (OP) · 8y ago

Author here - I'm scrolling through the logs and it is interesting to see that on the first question lots of people try to write "select name, total_weight" despite there only being a weight column, which you have to use an aggregate on to get the total weight.

I guess I need to rethink my design a little.

koyote · 8y ago

I did that and to be honest I am not quite sure why.

I think it's due to lack of reading the question properly and impatience.

But it also might be due to how programmers sometimes think/work. Often, when I am trying to solve a problem I like writing something (anything) down in broad strokes first so that I can then go back in and 'fix' each smaller part of the larger problem.

meritt · 8y ago

The vast majority of developers don't know SQL beyond simple CRUD queries. The surge in popularity of NoSQL over the past decade has made SQL become a lost art. Even when using relational databases, ORMs often alleviate the need for understanding the principles.

Thank you for making this.

Q6 has a flaw. You request "Order by: ranking, name" but the accepted solution sorts by rank only (Alfie should come before Ashes)

danielbarla · 8y ago

I suspect there's a need to keep the design simple, but the source table schema should probably be more prominent. I suspect most people try to guess the names of the columns before they ever click "Show table schema".

Nice quiz, by the way, and I like the topic. In my experience, too few devs / DBAs make good use of window functions.

Karlozkiller · 8y ago

I thought the underlying schema was pretty obvious. However I made lots of mistakes due to trying it out from my phone, and it didn't cooperate as well as one would've liked.

But I think the design was good. I also noticed the option to look at the schema, if needed.

johannes_ne · 8y ago

It took me a while to see the 'show schema' link. Maybe this should always be shown.

jabagawee · 8y ago

Quick bit of feedback: question 10 refers to the "WINDOW command", which I think should be called the "WINDOW clause" to help clueless folks (i.e. me) Google for it.
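For readers searching for the same thing, here is a minimal sketch of what a WINDOW clause looks like. It runs the SQL through Python's built-in sqlite3 module (assuming a bundled SQLite of 3.25 or newer) rather than the quiz's Postgres, and the table contents, the `breed` column and the `by_breed` window name are invented for illustration.

```python
import sqlite3

# Hypothetical stand-in for the quiz's cats table; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cats (name TEXT, breed TEXT, weight INTEGER)")
conn.executemany(
    "INSERT INTO cats VALUES (?, ?, ?)",
    [("Alfie", "Siamese", 5), ("Ashes", "Persian", 4),
     ("Misty", "Siamese", 6), ("Oscar", "Persian", 7)],
)

# The WINDOW clause names a window definition once so that several
# window functions in the same SELECT can refer to it.
named_window_rows = conn.execute(
    """
    SELECT name,
           sum(weight)  OVER by_breed AS breed_running_weight,
           row_number() OVER by_breed AS position_in_breed
    FROM cats
    WINDOW by_breed AS (PARTITION BY breed ORDER BY weight)
    ORDER BY breed, weight
    """
).fetchall()

for row in named_window_rows:
    print(row)
```

The named window avoids repeating the same PARTITION BY / ORDER BY text in every OVER.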

andy_boot (OP) · 8y ago

Good idea. I'll fix that.

oneweekwonder · 8y ago

You had me on the plural for your table names. Have not seen it in a while. Personally believe it is a preference thing.

andy_boot (OP) · 8y ago

Despite being the author I think you are correct. It should be table: 'cat' not 'cats'.

gravypod · 8y ago

Why are they recalculating the sum of the weights in the Q0 query?

Their answer:

   select age, sum(weight) as total_weight
   from cats
   group by age
   having sum(weight) > 12;

Which is equivalent to:

   select age, sum(weight) as total_weight
   from cats
   group by age
   having total_weight > 12;

atarashi · 8y ago

The HAVING clause is evaluated before the SELECT so you have to specify the full clause `having sum(weight) > 12` and not just the alias. Most SQL engines are probably smart enough to not actually recalculate everything.

cema · 8y ago

Yes, SQL engines tend to be much smarter than the SQL language as such.

trollied · 8y ago

Most RDBMSes won't let you use a column alias in the same scope as it is defined (pretty sure the SQL standard itself doesn't specify that it should be allowed). You can get around this by using a CTE.

Also, optimisers aren't stupid, the actual sum() function will not be recalculated.
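A minimal sketch of the CTE workaround mentioned here, next to the "long way" with the aggregate repeated in HAVING. It uses Python's built-in sqlite3 module only as a convenient stand-in: SQLite may be more lenient about aliases than Postgres, so this shows the workaround rather than the error. The rows and the `totals` CTE name are invented.

```python
import sqlite3

# Hypothetical cats table with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cats (name TEXT, age INTEGER, weight INTEGER)")
conn.executemany(
    "INSERT INTO cats VALUES (?, ?, ?)",
    [("Alfie", 2, 5), ("Ashes", 2, 9), ("Misty", 3, 6),
     ("Oscar", 3, 5), ("Smokey", 4, 13)],
)

# The long way: repeat the aggregate expression in HAVING.
repeated = conn.execute(
    """
    SELECT age, sum(weight) AS total_weight
    FROM cats
    GROUP BY age
    HAVING sum(weight) > 12
    ORDER BY age
    """
).fetchall()

# The workaround: define the alias inside a CTE, then filter on it
# from the outer query, where the alias is an ordinary column.
via_cte = conn.execute(
    """
    WITH totals AS (
        SELECT age, sum(weight) AS total_weight
        FROM cats
        GROUP BY age
    )
    SELECT age, total_weight
    FROM totals
    WHERE total_weight > 12
    ORDER BY age
    """
).fetchall()

print(repeated)
print(via_cte)
```

Both forms return the same rows; the CTE just moves the filter to a scope where the alias already exists.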

andy_boot (OP) · 8y ago

Author here. As others have said, this SQL engine makes me write the SQL out the long way. For some reason the shorter version (which I wrote first) is rejected by my Postgres client.

meritt · 8y ago

https://www.postgresql.org/message-id/7608.1259177709%40sss....

Answer from core PostgreSQL dev, Tom Lane

sourcesmith · 8y ago

SQL is a declarative language, not a procedural language.

drraid0 · 8y ago

Meanwhile I'm stuck with MySQL 5.4 without foreign keys, cuz, you know, performance. (The main db for a billion dollar security company)

barrkel · 8y ago

MySQL foreign keys add locks to referenced rows, and can have surprising effects on concurrent queries, particularly if you aren't running at read committed or lower isolation level.

There's also this: https://bugs.mysql.com/bug.php?id=48652 - still open after all these years (and yes I know there are much longer open bugs).

Alupis · 8y ago

InnoDB storage engine is available in 5.4, you don't have to use only MyISAM[1].

However, depending on your application, you may be enforcing these constraints elsewhere and not need them at the engine level, even if it's a nice-to-have.

[1] http://download.nust.na/pub6/mysql/doc/refman/5.4/en/storage...

zimpenfish · 8y ago

I know a database that's using InnoDB tables but also has no foreign keys and, indeed, JOINs are forbidden.

(And yes, the workflow does include actions that involve multiple tables in one go.)

dwanvik · 8y ago

Check out the new MySQL 8.0 release candidate. It has windowing functions now.

https://dev.mysql.com/doc/refman/8.0/en/window-functions.htm...

pzxc · 8y ago

I really really like this creation of yours, and I've already learned a couple things from it.

However, you really really need to be careful about SQL injection. I can see that you tried to lock it down as much as possible, as far as I can tell the account the queries are running under only has SELECT permission (no update/delete/etc). However I was still able to get some data you probably don't want me to have...

This query returns a list of all tables in your database:

SELECT c.relname||'' FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace

And this query returns a list of users and their privileges:

SELECT usename||'', usecreatedb||'', usesuper||'', usecatupd||'' FROM pg_user

I guess your name is Andy based on your HN username, but who is Rusty? :) And that's just what I've been able to get in 5 minutes of trying, let alone a determined attacker.

andy_boot (OP) · 8y ago

Noted. I am currently running on the assumption that there is nothing interesting in that PG (there isn't). But yes, I would like to lock that user down further; it really shouldn't be able to access those things.

andy_boot (OP) · 8y ago

It is now more locked down (at least "pg_catalog, public, information_schema" are no longer available). Thanks.

postila · 8y ago

||'' is another way to do ::text ? :-)

masto · 8y ago

My feedback after answering the first question:

1. Q0's success message says "Group by and Having are commonly used aggregate functions". This terminology is incorrect. SUM() is an aggregate function. GROUP BY and HAVING are clauses.

2. Without an ORDER BY clause, there's no guarantee of the order of the result, but the quiz appears to require the output rows to be in the same (in the case of Q0, undefined) order as the expected rows. While this is unlikely to spontaneously change, it's like depending on the order of an unordered hash map. As long as you're teaching people things, it's good to instill the habit of always ordering their results.

masto · 8y ago

Q1: It's not apparent from the way the schema is shown whether "name" is a unique key. The suggested SQL will produce incorrect results in the case of a duplicate name, as the window framing defaults to RANGE UNBOUNDED PRECEDING, so following peer rows are included in the aggregation.
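A minimal sketch of the peer-row effect described here, run through Python's built-in sqlite3 module (assuming SQLite 3.25 or newer, which uses the same default frame). The rows and the `id` tiebreaker column are invented; whether the quiz table has such a column is not known. An explicit ROWS frame with a unique ordering key is shown as one possible way to get a strict per-row running total.

```python
import sqlite3

# Hypothetical cats table with a duplicate name; id is an invented tiebreaker.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cats (id INTEGER PRIMARY KEY, name TEXT, weight INTEGER)")
conn.executemany(
    "INSERT INTO cats VALUES (?, ?, ?)",
    [(1, "Alfie", 5), (2, "Alfie", 3), (3, "Bob", 2)],
)

# Default frame (RANGE UNBOUNDED PRECEDING): rows that tie on the ORDER BY
# key are peers, so both Alfies see the sum over both Alfies.
default_frame = conn.execute(
    """
    SELECT id, name, sum(weight) OVER (ORDER BY name) AS running_total
    FROM cats
    ORDER BY id
    """
).fetchall()

# Explicit ROWS frame plus a unique tiebreaker: each row only sees the
# rows physically before it in the window ordering.
rows_frame = conn.execute(
    """
    SELECT id, name,
           sum(weight) OVER (
               ORDER BY name, id
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total
    FROM cats
    ORDER BY id
    """
).fetchall()

print(default_frame)
print(rows_frame)
```

With the default frame the first Alfie already shows the total for both Alfies; with the ROWS frame the total grows one row at a time.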

andy_boot (OP) · 8y ago

True. Well observed. How would you resolve this?

andy_boot (OP) · 8y ago

1) Yes. I'll re-word that part.

2) I debated with myself on that. On one hand it is clearly better to ORDER BY. On the other hand I'm trying to teach people the window functions and I don't want them typing in more SQL than they have to.

karmakaze · 8y ago

I've used a few of these and it's great to practice using more!

Q7 includes output:

  Misty   5.7  0.2
  Oscar   6.1  0.4
  Smokey  6.1  0.0

How would one show 0.4 for both Oscar and Smokey?

http://www.windowfunctions.com/questions/7

dancek · 8y ago

I'm not sure how you'd do this with windowing functions, but at least you can use common table expressions (WITH) to achieve what you're asking for.

The site is down right now, so I had to test using my own db instance. Column and table names might not match.

  WITH
      distinct_weights AS (
        SELECT DISTINCT weight
        FROM cats),
      weights AS (
        SELECT
          weight,
          lag(weight)
          OVER (
            ORDER BY weight ) AS lag
        FROM distinct_weights)
  SELECT
    cats.name,
    cats.weight,
    cats.weight - weights.lag
  FROM cats, weights
  WHERE cats.weight = weights.weight
  ORDER BY weight;

yields

  Any     5.5   null
  Misty   5.7   0.2
  Oscar   6.1   0.4
  Smokey  6.1   0.4

karmakaze · 8y ago

I see, it's joining with the lagged weight. Same without CTE:

  SELECT c.name, c.weight, c.weight - lw.lower_weight AS weight_to_lose
  FROM cats c
  JOIN (SELECT weight, lag(weight, 1, weight) over (order by weight) 
        lower_weight FROM cats 
        GROUP BY weight
        ORDER BY weight) lw ON c.weight = lw.weight
  ORDER BY weight

picardo · 8y ago

I'm stuck on the first question. I'm not sure why this is wrong. Can anyone help me?

> select name, sum(weight) OVER (PARTITION BY name) as running_total_weight from cats order by name

http://www.windowfunctions.com/questions/1

papercrane · 8y ago

It's asking for a running total of all weights, so you shouldn't partition by name. Use an ORDER BY instead of PARTITION BY.
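A minimal sketch of the difference, run through Python's built-in sqlite3 module (assuming SQLite 3.25 or newer) with invented rows: PARTITION BY name puts every uniquely named cat in its own partition, so the "total" is just that cat's weight, while ORDER BY name accumulates across rows.

```python
import sqlite3

# Hypothetical cats table with invented rows and unique names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cats (name TEXT, weight INTEGER)")
conn.executemany(
    "INSERT INTO cats VALUES (?, ?)",
    [("Alfie", 5), ("Ashes", 4), ("Misty", 6)],
)

# PARTITION BY name: one partition per cat, so nothing accumulates.
partitioned = conn.execute(
    """
    SELECT name, sum(weight) OVER (PARTITION BY name) AS running_total_weight
    FROM cats
    ORDER BY name
    """
).fetchall()

# ORDER BY name: a single partition, summed up to the current row.
ordered = conn.execute(
    """
    SELECT name, sum(weight) OVER (ORDER BY name) AS running_total_weight
    FROM cats
    ORDER BY name
    """
).fetchall()

print(partitioned)
print(ordered)
```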

picardo · 8y ago

Ah, I get it. Thanks!

zer · 8y ago

Q6: Ranking has an error, or I misunderstood. It says "Order by: ranking, name" but in truth one should not order by name!

Anyway, overall very nice!

andy_boot (OP) · 8y ago

You are correct. I'll fix that.

emmelaich · 8y ago

See also Markus Winand's blog and http://use-the-index-luke.com/sql/partial-results/window-fun...

and Joe Celko https://www.red-gate.com/simple-talk/sql/t-sql-programming/w...

nablaone · 8y ago

We've got a course that covers window functions, not sure if I can post the URL here. grrr.

nothrabannosir · 8y ago

Lovely website, I'm finally learning about window functions as I do this. Thanks a bunch.

One question / bug report? Q7 rejects this:

select name, weight, weight - (min(weight) over (order by weight asc rows between 1 preceding and current row) ) as weight_to_lose from cats

is that much worse than lag() or is it just non-idiomatic?

(edit: and Q10 asks about "window", but it was quite hard to google. if you could provide another one of those links to the relevant docs, that could help a bunch :D)
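A minimal sketch comparing the min() framing from this comment with lag(), run through Python's built-in sqlite3 module (assuming SQLite 3.25 or newer). It assumes lag() is given the current weight as its default, as in karmakaze's query above; the integer weights are invented, and `name` is added to the window ordering only to make ties deterministic.

```python
import sqlite3

# Hypothetical cats table; weights are invented integers with one tie.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cats (name TEXT, weight INTEGER)")
conn.executemany(
    "INSERT INTO cats VALUES (?, ?)",
    [("Any", 55), ("Misty", 57), ("Oscar", 61), ("Smokey", 61)],
)

# With ascending order, the minimum over "1 preceding .. current row" is
# always the preceding row's weight (or the current one on the first row),
# which is what lag(weight, 1, weight) returns as well.
comparison = conn.execute(
    """
    SELECT name,
           weight - min(weight) OVER (
               ORDER BY weight, name
               ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
           ) AS via_min,
           weight - lag(weight, 1, weight) OVER (
               ORDER BY weight, name
           ) AS via_lag
    FROM cats
    ORDER BY weight, name
    """
).fetchall()

for row in comparison:
    print(row)
```

Under those assumptions the two columns agree on every row; without the default argument, lag() would give NULL on the first row where the min() version gives 0.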

andy_boot (OP) · 8y ago

That query looks fine to me, I'll tweak the code.

Yep I need to fix up Q10 a bit.

Twisell · 8y ago

Great job!

And here is a small suggestion: you might want to change the name of the output column "unique" in Q3, because:

  *expression* as unique -- works and is the recommended answer
  *expression* unique -- fails because unique is a reserved word
  *expression* "unique" -- works because of the double quotes

Using a reserved word that needs quoting in some contexts is generally discouraged, and here it can keep someone stuck for reasons unrelated to the problem at hand.

PS: this also applies to "position" in Q4; maybe that's intentional?

andy_boot (OP) · 8y ago

Good points. I'll update Q3 and look at Q4; it was not intentional to use keywords.

einhverfr · 8y ago

I guess window functions are advanced features, but I found it a little bit basic. Some things you could do to make it more advanced:

    * Add questions about range mode
    * Have questions which discuss default behavior regarding ordering and partitioning
    * Have questions which rely on things like RANGE 3 PRECEDING or other constructs

dwanvik · 8y ago

Last time I checked, PG didn't support RANGE <value> PRECEDING, only RANGE UNBOUNDED PRECEDING. The new MySQL 8.0 windowing functions do, though.

Eremotherium · 8y ago

Thank you for doing this.

Sidenote: either HN killed your backend or I did with a recursive CTE. If so: sorry :)

andy_boot (OP) · 8y ago

No problem, just restarted it.

It's a brand new project so I may need to do some hardening.

rattray · 8y ago

This is cool! How many questions long is it?

rattray · 8y ago

(it's 10 questions long)

postila · 8y ago

select version();

PostgreSQL 9.3.20 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4, 64-bi

I wonder why such an old version is being used. It will no longer be supported by the community pretty soon.

postila · 8y ago

I mean, in more recent Postgres versions, more things related to window functions are available – like CUBE, GROUPING SETS, etc. 9.3 was released back in 2013.

dbkaplun · 8y ago

Source:

https://github.com/bootandy/window_funcs

tofflos · 8y ago

Nice. I rarely use window functions and find them a bit arcane. This was really accessible.
