Unfortunately, every time I read something about minifiers I get the feeling that people are optimizing the wrong problem.
If you gzip data over the wire it's already compressed, so minifying your stuff will only help you a little.
The problem is on the client side. You can compress what you like, but if the browser starts dropping frames because it has to compile/handle a ton of JavaScript and CSS then minifying doesn't help the end user.
For small files you might be mostly correct, but for larger ones min+compress can produce much better gains than compression alone.
IIRC the algorithm used employs a rolling compression window, and can only match strings of tokens whose distance apart is smaller than that window. IIRC the window is at most 32KBytes, which as far as I know is also the usual gzip/zlib default. Even at the maximum, that isn't going to cover many large files. Minifying increases the effective range of the compression window: each match is shorter, but you will find more matches, and usually this balances out in a way that benefits the compression result.
It isn't quite that simple in reality, as there is Huffman encoding and other tricks in the mix. This means that even for inputs smaller than the compression window you may see some benefit, as minifying can reduce the input data's alphabet significantly.
Ignoring the "why it helps", it is easy to show that it does help in a great many real cases:
ds@s2:/tmp$ wget --quiet https://code.jquery.com/jquery-3.2.0.min.js
ds@s2:/tmp$ wget --quiet https://code.jquery.com/jquery-3.2.0.js
ds@s2:/tmp$ gzip jquery-3.2.0.min.js
ds@s2:/tmp$ gzip jquery-3.2.0.js
ds@s2:/tmp$ ls -l j*
-rw-r--r-- 1 ds ds 79201 Mar 16 21:30 jquery-3.2.0.js.gz
-rw-r--r-- 1 ds ds 30023 Mar 16 21:30 jquery-3.2.0.min.js.gz
In this example the result of min+comp is less than 40% the size of the result from compression alone.
For completeness, minifying alone achieves less than compression alone:
-rw-r--r-- 1 ds ds 267686 Mar 16 21:30 jquery-3.2.0.js
-rw-r--r-- 1 ds ds 79201 Mar 16 21:30 jquery-3.2.0.js.gz
-rw-r--r-- 1 ds ds 86596 Mar 16 21:30 jquery-3.2.0.min.js
-rw-r--r-- 1 ds ds 30023 Mar 16 21:30 jquery-3.2.0.min.js.gz
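To make the comparison in the listing explicit, here is a small sketch that turns the four byte counts into ratios. The numbers are copied from the ls output; the dictionary keys and the ratio helper are just names made up for this illustration.

```python
# Byte counts copied from the ls -l listing above.
sizes = {
    "raw": 267686,     # jquery-3.2.0.js
    "gz": 79201,       # jquery-3.2.0.js.gz
    "min": 86596,      # jquery-3.2.0.min.js
    "min_gz": 30023,   # jquery-3.2.0.min.js.gz
}


def ratio(part: str, whole: str) -> float:
    """Size of `part` as a fraction of `whole` (hypothetical helper)."""
    return sizes[part] / sizes[whole]


# Print each result as a percentage of the file it is compared against.
for part, whole in [("gz", "raw"), ("min", "raw"), ("min_gz", "raw"), ("min_gz", "gz")]:
    print(f"{part:>6} / {whole:<3} = {ratio(part, whole):.1%}")
```

The last line of output is the "less than 40%" figure: min+gzip relative to gzip alone.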
One further factor is the CPU time consumed on the client decompressing and parsing the content, but this is likely to be insignificant compared to the network or local IO time. If a device's CPU is under-powered enough that this is significant, then it is unlikely to be able to run the decompressed code with useful performance.
Also, purifycss and uncss are your friends to cut stuff down, to reduce the final load for the user.
True, the difference between 10KB compressed and 7KB minified+compressed is negligible for your visitors, but it still takes 30% off of your traffic bill.
We can build a "prod" version of the app and the minifying process will drop all the debugging code as well as any unused or uncalled functions from the output.
I would have thought that working out which functions of a dynamic language can be safely removed would require parsing/AST analysis beyond that found in the typical minifier.
It was a script embedded in other people's pages (and yes, it delivered substantial functionality, it was not just a tracker), so minification saved a lot of traffic/money for the company.
-) Less work for decompression
-) Less total length means lexing+parsing will be a bit quicker
-) shorter class names will also mean a lower memory consumption because of shorter strings, and ideally fewer allocations if some pooling or smart allocator is used
But those points can probably be completely ignored, since JS is a way way bigger factor.
Depending on the scale, shaving a few kB here and there can amount to significant savings in the long run.
Say you have a document structured like [boring data] [secret data] [boring data]. I don't know if any existing compressor lets you do this, but the gzip file format (really the 'deflate' format used inside it) allows you to encode this (schematically) as follows:
[compressed boring data] || [uncompressed secret data] || [compressed boring data]
where each || is i) a chunk boundary (the Huffman compression stage is done per-chunk, so this avoids leaks at that level), and ii) a point where the encoder forgets its history - ie, you simply ban the encoder from referencing across the || symbols.
If you wanted, you could even allow references between different "boring" chunks (since the decoder state never needs resetting), just as long as you make sure not to reference any of the secret data chunks.
Edit to add: Also, if the "boring" parts are static, you can pre-compress just those chunks and splice them together, potentially saving you from having to fully recompress an "almost static" document just because it has some dynamic content.
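A rough sketch of the [compressed boring] || [uncompressed secret] || [compressed boring] layout, using Python's zlib. This is only one way it might be done, not something an existing tool is known to offer: the function names stored_block and compress_with_secret are made up for illustration, the stored block is built by hand from the deflate format, and raw deflate (no gzip/zlib wrapper) is used because the compressor object never sees the secret bytes and so could not compute a correct trailer checksum.

```python
import struct
import zlib


def stored_block(data: bytes) -> bytes:
    """Encode `data` as one non-final deflate 'stored' (uncompressed) block."""
    if len(data) > 0xFFFF:
        raise ValueError("a stored block holds at most 65535 bytes")
    # Header byte 0x00: BFINAL=0, BTYPE=00 (stored), then padding to a byte
    # boundary. LEN and its one's complement NLEN follow, little-endian.
    return b"\x00" + struct.pack("<HH", len(data), len(data) ^ 0xFFFF) + data


def compress_with_secret(boring_a: bytes, secret: bytes, boring_b: bytes) -> bytes:
    """Compress the boring parts, but emit the secret as literal bytes."""
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)  # wbits=-15: raw deflate
    # Z_FULL_FLUSH ends the current block on a byte boundary and makes the
    # encoder forget its history, so later matches cannot reach back past it.
    out = comp.compress(boring_a) + comp.flush(zlib.Z_FULL_FLUSH)
    # The secret goes in by hand; the encoder never sees it, so it can
    # neither match against it nor fold it into a Huffman table.
    out += stored_block(secret)
    out += comp.compress(boring_b) + comp.flush(zlib.Z_FINISH)
    return out


if __name__ == "__main__":
    a, s, b = b"boring header " * 50, b"secret-token-12345", b"boring footer " * 50
    blob = compress_with_secret(a, s, b)
    # An ordinary decoder reads the result as a single deflate stream.
    assert zlib.decompress(blob, wbits=-15) == a + s + b
```

Because the encoder history is reset at the boundary, this simple version does not let the second boring chunk reference the first; allowing that would need an encoder that tracks which window positions are off limits.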
Debatable with HTTP/2. Furthermore, separate files are easier to cache: if one of them doesn't change, it doesn't have to be loaded again. That's my experience with bundles, especially when one uses asynchronous module definition instead of babel, webpack and co.
Remy: I'd suggest posting a CV and linking to it from this post. I looked and couldn't find one anywhere on your site; you'll get a lot more qualified interest if people can find out more about you than just a few blog posts.
Not like it would lead to anywhere either, you know. :P
Two years of teaching university, MSc in computing, love of automation, combinatorial optimization without any significant amount of deep math skill, circuit design, gate array stuff, low-level CPU optimization stuff, hardware counters, microcontrollers, SQLite, Julia language, C, shell, a bunch more programming languages, basic CSS and HTML, Mustache templates, Linux. No JavaScript. The whole of luisant.ca is done by hand and from scratch, so you can see what I can do with that. Also statistics.
I'm uncommonly silly and not at all serious, while still standing up for what I believe is right, even if everyone disagrees with me. I do integrity though and while I can stand my ground, I will yield to good evidence any day.
I wear goofy clothes to work, usually very colourful stuff. Don't expect conformity.
And I have no love greater than the one for the cause of nature conservation, and all research that goes with that.
That's about me in a nutshell.
Told you nothing will come of it. :P
1. I am not interested in jobs that focus on JavaScript, but I'd never say "No JavaScript".
2. When people ask for your CV, post a link (or if you're concerned about privacy, solicit requests via PM/email).
3. Your integrity, silliness and attire are mostly immaterial to the job hunt. It's most appropriate for folks to make this kind of assessment during an in-person interview.
4. What is all this "nothing will come of it" business? If you think you can't get hired as a result of an HN post, I think you're sorely mistaken. Capitalize on these fifteen minutes. You've created some original, interesting content. The traffic you're getting to your site won't last, so strike while the iron is hot.
Just post your damn CV.
Seriously, if you want a job, publish your CV on your website already.
Well, fuck my life. Here goes, posted on the site too.
This required way more courage than anything else I have done in the last few months, even after I censored it to death.
Thanks for the good words, guys. I know you are right. Even though I hate to agree with you, you are right. I hate to do this, but you are right.
So it is done.
To be clear: I don't mean philosophical reasons. I personally love letting javascript deal with the 'cascading' part and I don't have a problem with the idea of having styling embedded in the final page.
What I'm curious about is if this has any kind of negative impact on performance, bandwidth, etc. Because the CSS is loaded on the component level, and because Webpack 2 does tree shaking, the page will be guaranteed to only contain CSS for the components that are on the page. And if I'd 'lazy-load' parts of the app, I'd get that benefit for my CSS as well with no extra effort.
On the other hand, any benefits of having a compiled (and hopefully cached) bundle.css are offset by the need for an extra request for the css file, as well as the very likely situation that there'll be a bunch of unused css in that bundle.
Am I missing some drawback to the above-mentioned approach?
What will be very interesting in the coming years (as the work gets done around it) is "full css" optimization. That is, when you know you have all of the styles for the whole page available to the minifier. If the minifier knows that no other CSS is being loaded, it can do a lot more work to remove and merge rulesets. In the case of styles bundled with Webpack, common CSS could be reduced even further, after tree shaking has taken place.
But this is probably quite a ways off.
In a past life a website I worked on had a huge browser paint performance and content flash issue that was eventually cleared out by moving all the styles out of <style> tags in the DOM.
A more interesting problem to solve, I think, is that of optimising CSS rules for browser rendering.
Things like:
* Rearrange rules within the file to put similar rules within the sliding window.
* Rearrange rules so that tail of the last declaration of one rule and the start of the next selector create the longest possible common substring.
* Rearrange the order of declarations within the rules to maximize the length of common substrings that span two declarations, ie ": 2em;\nbackground-color: rgb("
Any sponsors? No? Didn't think so. Not even you, big G? Aww...
For now some minifiers do sort the values, which helps.
I'm engaging in this ugly probably unsightly (but helpfully quick and maintainable) practice in a project with a short developmental cycle right now, and I've yet to have any issues outside of temporarily forgetting that I have already globally defined a specific style or enclosed a style I thought I'd left global.
(It's a corp. annual report -- that my team got tasked with as a favour -- so it has some repeating styles, and others isolated between pages)
I used to have a doc with actual numbers, but I've since lost it. If I dig it up, I'll link it here.
I learned this one the hard way a few months ago. We ran into a flexbox bug in one browser which we worked around by adding some-rule: 0.0000001px instead of 0px. However, our minifier collapsed that using scientific notation, which triggered a rendering issue in a different browser due to the out-of-spec CSS. The whole adventure left me feeling like I'd travelled back in time.
https://www.w3.org/TR/css3-values/#numbers
Which browser had problems with it?
[0] http://tosche.net/2013/10/font-size-in-the-metric-system_e.h...
[1] although non-japanese typographers like Otl Aicher have recommended its use it doesn't seem to have had much success outside Japan
This has tripped me up many a time when I've created CSS colour strings (mostly for <canvas> use) by concatenating Strings and Numbers in JavaScript. When a Number gets small enough, it ends up in scientific notation, and the CSS parser rejects it.
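The comment above is about JavaScript, but the same trap exists when building CSS strings in Python, which is used here only to keep the sketch short: str() switches to exponent notation for small floats. One possible workaround is to format with a fixed number of decimals and trim the zeros. The helper name css_number and the default of six decimal places are assumptions for this illustration.

```python
def css_number(value: float, places: int = 6) -> str:
    """Format a number for CSS without ever using exponent notation."""
    text = f"{value:.{places}f}"  # fixed-point, e.g. '0.500000'
    if "." in text:
        # Drop trailing zeros, then a dangling decimal point.
        text = text.rstrip("0").rstrip(".")
    # Values that round to zero can come out as '-0'; normalise them.
    return "0" if text in ("-0", "") else text


def rgba(r: int, g: int, b: int, alpha: float) -> str:
    """Build an rgba() colour string with a safely formatted alpha."""
    return f"rgba({r}, {g}, {b}, {css_number(alpha)})"


if __name__ == "__main__":
    print(str(1e-7))               # naive conversion uses exponent notation
    print(rgba(255, 0, 0, 1e-7))   # fixed-point formatting avoids it
```

Values smaller than the chosen precision collapse to 0, which is usually what you want for an alpha channel but may not be for other properties.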
It's very interesting, however, that no one minifier is a consistent winner in these test cases, and that running CSS through multiple minifiers is actually, potentially, not all that crazy. (The very debatable real value in doing that notwithstanding.)
Have you seen my post on the Remynifier, where I do exactly that?
I've mentioned it before, but it's really not a great idea to use multiple minifiers. Minifier bugs can get nasty, and using multiple minifiers exponentially increases the likelihood that you'll encounter some weird or broken behavior. Make sure to test thoroughly.
It didn't risk creating bugs by rewriting to new units (especially poorly supported units like q) and had the best results overall.
I'd like it to be a wee bit more aggressive on the rounding but other than that it seemed a clear winner.
1. Server: check the Accept-Encoding header for gzip or brotli support
2. Server: compress to either a brotli or a gzip file, or fall back to the raw file
3. Server: send data to client
4. Client: receive (and decompress, if not raw)
5. Client: parse (big) resource
Also, compression through uglifying/minifying improves parse speed, which is really helpful on (old) mobile devices. Adding compression through gzip or brotli introduces additional overhead, because the decompression step happens in memory and stalls the processing of the file.
No, you don't need to waste CPU cycles on compression for each connection. You can store the .css.gz on the filesystem along with the .css and have the webserver pick up the appropriate file based on Accept-Encoding.
That way you can precompress with the slowest compression options.
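A minimal sketch of the "pick the precompressed file" step, assuming the server keeps file.css, file.css.gz and file.css.br side by side. The function names, the preference order (brotli before gzip) and the on_disk set standing in for a filesystem check are all assumptions for illustration; real webservers have their own options for this.

```python
from typing import Optional, Set, Tuple


def accepted_encodings(header: str) -> Set[str]:
    """Parse an Accept-Encoding value, dropping codings marked q=0."""
    accepted = set()
    for item in header.split(","):
        name, _, params = item.strip().partition(";")
        name = name.strip().lower()
        if not name:
            continue
        q = 1.0
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat an unparsable weight as "not acceptable"
        if q > 0:
            accepted.add(name)
    return accepted


def pick_variant(path: str, header: str, on_disk: Set[str]) -> Tuple[str, Optional[str]]:
    """Return (file to serve, Content-Encoding value or None for the raw file)."""
    accepted = accepted_encodings(header)
    # Assumed preference order: brotli first, then gzip.
    for coding, suffix in (("br", ".br"), ("gzip", ".gz")):
        candidate = path + suffix
        if coding in accepted and candidate in on_disk:
            return candidate, coding
    return path, None


if __name__ == "__main__":
    files = {"site.css", "site.css.gz", "site.css.br"}
    print(pick_variant("site.css", "gzip, deflate, br", files))
    print(pick_variant("site.css", "identity", files))
```

Since the compressed files are produced once at build time, the slowest and best compression settings cost nothing per request.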
Even if the difference is minimal, it could mean the difference between, say, three TCP packets and four, which adds up for users on high-latency connections.
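As a back-of-the-envelope check on the packet argument, the number of TCP segments is roughly the payload size divided by the maximum segment size, rounded up. The 1460-byte MSS below is an assumption (a common value on Ethernet with IPv4), and the sketch ignores HTTP headers, TLS framing and the congestion window.

```python
import math

ASSUMED_MSS = 1460  # payload bytes per TCP segment; an assumption, not a constant


def segments_needed(payload_bytes: int, mss: int = ASSUMED_MSS) -> int:
    """Rough count of TCP segments needed to carry `payload_bytes`."""
    return math.ceil(payload_bytes / mss)


if __name__ == "__main__":
    # Two hypothetical response sizes that differ by about 300 bytes.
    for size in (4300, 4600):
        print(size, "bytes ->", segments_needed(size), "segments")
```

Under that assumption, a response only has to shrink past a multiple of the MSS to save a whole segment, which is why a few hundred bytes can matter.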
https://chrome.google.com/webstore/detail/javascript-and-css...
Wow. #1 on HN. Wow.
I'd usually hang around a bit more, but I'm really tired. I posted this past my midnight. 00:51 now, and I'm fading fast.
Thanks for all the love, everyone. I'll come over tomorrow (12 hours from now, or so) to answer any questions or to pick up any corrections.
There used to be a bug with flex-wrap: wrap; where an element would wrap to the next line even though it should have fit. You could fix it by using width: 24.999999%; instead of width: 25%; so it would still be 25% on screen but no longer wrap to the next line. So watch out for this when a minifier rounds your values.
Also didn't know one could use counters already. Browser support is great. I thought it was still under approval.
Amazing stuff, thanks
But scientific notation? C'mon
My pleasure! All from scratch, designed for nothing but readability.
I am building textbooks on the same code, so the blog is a great stress and user test.