At the time, the recommendations only updated once a day, but an active user would have to dynamically load that content repeatedly, and at the same time, the recs were getting updated for users who hadn't visited that day. By switching to static, we could generate a new static site for you every time you watched something (which could change your recommendations), and increase reliability at the same time, so it would have been a much better customer experience. Unfortunately we couldn't get enough of the frontend engineers to buy into the idea to get it off the ground, and also they were already well along the path to having a data pipeline fast enough to update recs in real time.
(I can't find any official reference to this though, but another user has referenced this some time ago: https://news.ycombinator.com/item?id=8060200)
I guess there's also a question of whether you are caching data from the backend that powers a front-end app, or actually caching the full front end itself.
It's the usual boomerang cycle of discovery and adoption.
Both types of sites have their benefits and it's a balancing act to use the right tool for the job. It's getting this right that comes with experience and an understanding of the current pitfalls of each. It's the rough edges that push people in the other direction and without the experience of the pitfalls of each, it's inevitable that people start predicting that one solves all the problems facing the other.
I fully expect the usual over-reliance on the wrong type of tech for the sake of it being the current hotness, and then an overcorrection in the other direction the moment we have a new generation of developers.
Then everybody got tired of waiting for their sites to rebuild every time they changed something and switched to WordPress, which wasn't static. Suddenly your changes showed up right away! Hooray! Then everybody got tired of WordPress falling over under any load stronger than a stiff breeze, so suddenly static site generators were in fashion again.
If you think of approaches to building a content management system as a continuum, with purely static at one end and purely dynamic at the other, you can see the entire history of the segment as a series of oscillations along that continuum. Each approach has drawbacks, but the drawbacks of the approach you aren't using always seem minor while those of the approach you are using seem painful, so the market just bounces back and forth between them ad infinitum as people rush to discover if the grass on the other side of the fence is really as green as it looks.
Back then, this really had some advantages: updates were rare but views comparatively frequent, while databases were either not that performant or quite expensive. This way, you could serve everything from cache (remember Squid?), and, compared to a dynamic site, it was really quick, even in admin mode. Given modern machines and the large amounts of memory they come with, it's quite ironic to see this come back, while we saw the triumph of the LAMP stack on comparatively modest machines. Nevertheless, if you have only a few updates and lots of views, it's a good step towards green computing. (Save some mountaintops! [1])
On the other hand, there is some "magical" limit regarding flexibility and complexity, beyond which things soon tend towards unmaintainable code. So the judgement is left to you, depending on the purpose.
Edit: [1] "Mountaintop mining" at Google-images: https://www.google.com/search?tbm=isch&hl=en&q=mountaintop+m...
Back in 2006, when I worked for Yahoo!, they had a CMS / template management system called Jake that statically generated templates for the PHP-based frontend servers to evaluate at request time. The idea was that you put as much of your logic as possible into the template generation layer, leaving the request-time logic to handle the stuff that changed request by request.
Now, that all sounds quite reasonable, but the two layers were written in different languages. The pre-template-generation logic was written as inline Perl (plus a little custom syntax, because why not), while the dynamic frontend logic was written in PHP. Perl was frequently used to generate chunks of PHP code to be executed by the frontend servers, and sometimes this PHP code wrote chunks of inline JavaScript. To say that debugging said JS was fun would be an understatement.
Well, that's a bit of an exaggeration. When I left Yahoo Europe in 2005, there was still Perl all over the place both in Europe and the US at least. I managed the Yahoo Europe billing system, and that was mostly Perl on the backend, for example.
[small world, btw., courtesy of some minor profile-stalking: I interviewed with Ed about a position a few years back; your service looks interesting - I have a client that might be interested]
That said, all code-generation tools - straight up generators, compilers, whatever - are a mindbending experience. Keeping track of whether a variable is available at generation time or runtime is trickier than I initially expected.
http://sourceforge.net/projects/rthree/
Looking at the source now, I think we might have been trying to use every pattern in the GoF book in one project.
And many particularly personal properties such as mail, or the billing system (my area), would take extra precautions about what information could be made available on other properties (e.g. what info from mail could be shown on the homepage) even if the user was logged in, to prevent leaking information that shouldn't leak. This would lead to extra logins that I'm sure seemed unnecessary, and logins where the user probably thought they were already logged in, but where non-personal information was keyed to the browser rather than their user id.
I'm sure there are bugs and unintentional quirks too - the system was crazy complex already in 2005, but I really hope they've stuck to their guns when it comes to how carefully they treated personal data back then.
There's a spectrum of evaluation strategies from eager to lazy. Programs may mix various flavors.
Lazy evaluation combined with memoization is a sexy and elegant approach to some problems, as are dynamic web sites. On the other hand, whenever possible to do so, it's hard to beat the speed and simplicity of handing over a precomputed answer like "42".
A (very loose) analogy:
Static generation of HTML is roughly like using Lisp macros to evaluate some things before runtime (at "expand time" or "compile time").
The resulting transformed code could be a simple literal like "<html> ... </html>". Or the code might need further evaluation at runtime -- which is roughly like the precomputed HTML containing JavaScript to do things at runtime.
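To make the analogy a bit more concrete, here is a minimal Python sketch of the two ends of that spectrum. The `PAGES` dict, the `render` function and the page names are made up for illustration; nothing here comes from a real site generator.

```python
from functools import lru_cache

# Hypothetical content store; a real site would read files or a database.
PAGES = {"index": "Hello", "about": "About us"}


def render(slug):
    """The 'expensive' step: turn content into HTML."""
    return "<html><body>{}</body></html>".format(PAGES[slug])


# Eager ("static"): every answer is computed before any request arrives.
PREBUILT = {slug: render(slug) for slug in PAGES}


def serve_static(slug):
    # Just hands over the precomputed answer.
    return PREBUILT[slug]


# Lazy + memoized ("dynamic with a cache"): computed on first request, reused after.
@lru_cache(maxsize=None)
def serve_lazy(slug):
    return render(slug)
```

Both functions return the same HTML; the only difference is when the work happens and who pays for the first request.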
template(data) -> html
I'm a fan of the inverse of this. The template and the output (html) are the same thing; I don't separate them. Instead I just update the html when something changes. The function is: update(html, data)
Where update is a function you write that modifies the html in place. The upside to this is that generating new content is cheap, you only ever update the things that need updating.
The downside is that your html output and updating function are tightly coupled; you can't change the structure of your html without also changing your update function.
I think that's ok though. Template changes are infrequent enough to warrant the increased cost of fixing everything when you do change them.
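A rough Python sketch of the difference between the two functions, in case it helps. The comment-pair slot markers (`<!--slot:name-->`) are a convention invented for this example, not something the approach requires; any way of locating the region to patch would do.

```python
import re
from html import escape


def template(data):
    """template(data) -> html: rebuild the whole page from the data."""
    return "<h1>{}</h1><p>{}</p>".format(escape(data["title"]), escape(data["body"]))


def update(page, data):
    """update(html, data): patch only the slots named in data, leave the rest alone."""
    for name, value in data.items():
        pattern = re.compile(
            r"(<!--slot:{0}-->).*?(<!--/slot:{0}-->)".format(re.escape(name)),
            re.DOTALL,
        )
        # A function replacement avoids backslash surprises in the new value.
        page = pattern.sub(lambda m, v=value: m.group(1) + escape(v) + m.group(2), page)
    return page
```

The coupling mentioned above shows up directly: `update` only works as long as the page still contains the markers it looks for.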
You're outsourcing half your site to third parties, and basically letting them do whatever the hell they like with it. Disqus comments? Better hope the people behind that system don't decide to outlaw comments about the thing your website is about. Javascript embedded shop system? Good, so long as you don't need to modify the look very much and don't mind all your data being hosted in a different part of the world (like, the US for people in other regions).
And if they decide that all your data needs to be shared with the NSA or some other government organisation, then tough luck. If they get hacked... well, tough luck again.
Without hosting such systems yourself, you're relying on a lot of third parties to be transparent, honest and respectful of your privacy (and that of your visitors). It's basically like a return to the days of free hosting and services like Bravenet.
But that doesn't really apply to a lot of things. Comments for example, do you really trust a third party more with those? Because if your site is in a grey area, then it's very possible their terms/country/whatever might require them to ban discussion of the topic. Self hosted means your rules, not a large corporation's.
Besides, any middleman is the weakest part of the chain if someone wanted to shut down a site or significantly cripple it without going through a court case. You may like controversy, but a large company would rather see the back of anyone that might potentially hurt its public image. We already see issues where internet mobs go after hosting companies and providers based on something someone said on Twitter. Every third party service is yet another potential target for them, and one that could buckle even more easily than the hosting company (especially if you're not paying for their services).
In addition to page load issues, they also more or less completely solve the Slashdot effect (aka the Reddit Hug Of Death, these days). A competently-configured Nginx server on a 512 MB VPS, serving static-only content, will handle ridiculous amounts of traffic without flinching.
Ever since a front-page mention on BoingBoing took down my feature film BloodSpell's site immediately after release in 2007, avoiding server load on a site has been high-priority for anything I'm launching that is likely to have bursty traffic.
It's nice to see usable tools for managing larger sites with a static generator developing and becoming popular.
I'm very much in the Nginx/static camp, but it would be useful to know how bad the spikes can get.
It was very interesting to see and was quite a lot of traffic for sure.
Bear in mind that the screenshot he posted is for concurrent visitors too.
[0] http://www.damninteresting.com/
[1] https://www.reddit.com/r/todayilearned/comments/3d3vct/til_a...
Reddit depends wildly on the size and activity of the subreddit you're featured on. Imgur shows view stats, so you can get some idea from that.
For scale on whether nginx can handle that sort of load, I've had tiny VPSes sitting there happily handling 200 SIMULTANEOUS users serving static files, which translates to between 500,000 and 3 million uniques a day, maybe more depending on your site design.
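For anyone wondering how a concurrency figure might turn into a daily number, one back-of-the-envelope conversion is sketched below in Python. It assumes the load is sustained evenly for 24 hours and that an average visit length is known; both are simplifications, and the visit lengths used afterwards are guesses rather than measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def visits_per_day(concurrent_users, avg_visit_seconds):
    """Rough daily visits if the given concurrency were sustained all day."""
    return concurrent_users * SECONDS_PER_DAY / avg_visit_seconds
```

With 200 concurrent users, `visits_per_day(200, 30)` gives 576,000 and `visits_per_day(200, 6)` gives 2,880,000. Real traffic is bursty and visits are not all unique visitors, so treat these as order-of-magnitude figures only.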
My company is handcuffed to a legacy custom CMS that's still using Rails 2.3.8. We have no need for it, as we have a front-end guy who would be perfectly comfortable using git, as he has me to ask whenever there's a problem. When you use the database to store content, you lose a great number of useful properties. I had to build a custom tool to search through the entire database to find encoding errors. And I had to keep re-adjusting it every time I found some hiding somewhere. It was annoying and painful. Content is code, not data, it needs to be managed like code.
Content is certainly not code; it does not get compiled, it contains no logic. At best content is a property of an object with its structure defined in/as code.
Assuming that the content you're publishing is primarily writing, then yes, that makes sense. But most companies on the web are not primarily blogs or news. Hell, even the way news is going, a lot of these Snowcrash-type (or what was that famous NYT article called?) articles are too complicated for a template-driven workflow.
The company I work for is a marketing company, it primarily uses the Internet for e-commerce. Our marketing team gives their ideas to our front-end guy, including mock-ups and copy, and he goes ahead and makes all the HTML/CSS.
I think most companies on the web wouldn't be able to tell the difference between a static site and a CMS, because non-technical people won't even touch an admin back-end anymore. There's just too much to screw up to leave in the hands of a non-professional.
For selfish reasons I admit.
I'd love an app that gave me that ease of hosting and creation and generated a static site from there (hook it up to an S3 bucket or something). I'd pay for sure.
Currently I'm using Nginx w/ SSIs for lightly dynamic sites. Works well enough and is very very simple.
Keep reading, it's just the optional hosting that's paid, the software itself is open source.
The one static generator I wish there was (unless there is one and I just haven't found it) is one that would take a tree of code files and display it kinda like github does, in a browsable file browser hierarchy with syntax highlighting when drilling down to individual files.
Kind of like a precompiled static file browser. There are several dynamic file browsers around, but they all require server-side code (PHP usually) to do the directory listing and so on. I think it should be possible to precompute all the directory display pages with symlinks to the individual files, and do the highlighting in those in JS/CSS.
I might end up writing this at some point, as it's definitely an itch I'd like scratching, unless it does exist already and somebody kindly points me to it.
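The precomputed directory pages could be as simple as the Python sketch below, which writes an `index.html` into every directory of a tree. It is only the listing half of the idea: it links straight to the raw files rather than using symlinks, and syntax highlighting is left to whatever client-side JS/CSS you prefer. The function name and page markup are invented for this example.

```python
import html
import os
from pathlib import Path
from urllib.parse import quote

PAGE = "<html><body><h1>{title}</h1>\n<ul>\n{items}\n</ul></body></html>\n"


def write_indexes(root):
    """Write an index.html into every directory under root; return how many were written."""
    root = Path(root)
    written = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # also fixes the order in which os.walk descends
        entries = [d + "/" for d in dirnames]
        # Skip our own output so re-running the script stays idempotent.
        entries += sorted(f for f in filenames if f != "index.html")
        items = "\n".join(
            '<li><a href="{}">{}</a></li>'.format(quote(e), html.escape(e))
            for e in entries
        )
        title = html.escape(os.path.relpath(dirpath, root))
        Path(dirpath, "index.html").write_text(
            PAGE.format(title=title, items=items), encoding="utf-8"
        )
        written += 1
    return written
```

Any plain file server can then serve the tree, since a request for a directory falls through to its `index.html` on most default configurations.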
https://en.wikipedia.org/wiki/LXR_Cross_Referencer
And I think there is one for GNU GLOBAL. The point of these is usually the cross references, not necessarily making it pretty though.
When you look at page speed, not much of the slowdown comes from servers delivering pages; most of it comes from browsers having to digest HTML/CSS/IMG/JS.
I guess it all depends on what you are trying to achieve. Thanks as always for the article, interesting to see someone taking this with both hands.
What local static site generation, with HTML, JS and CSS pushed to a remote server, solves is the giant problem of insecure code.
> more than 70% of today’s WordPress installations are vulnerable to known exploits (and WordPress powers more than 23% of the web).
I did some consulting for a company that was being destroyed by WordPress installations being hacked. When they first started offering WordPress as a blog option (blogs were not their main offering) the threat profile was different. Fast forward six years and they were getting hacked daily.
I recommended that they move their WordPress blogs to Flywheel, and have Flywheel manage WordPress for them. It worked and they were able to focus on their main offering, the real reason customers were paying them.
Services like Flywheel are one answer to the problem of insecure CMS code. The other, and better in my opinion for a lot of sites, is to run the CMS locally, keep the database local, and push the rendered code to the server.
I think both can be to blame, although you're right that JS load is becoming more and more of an issue. I used to work on a site that used Joomla and literally hundreds of separate database queries were executed for each page request. That is going to put a huge strain on server-delivery time, and it's unsurprising that this site couldn't handle much of a load before falling over.
Static pages are really so much faster than any dynamically-generated site I've seen. Still, I agree about JS - it's possibly time for a 'back to basics' approach on client-side scripting akin to the static page revolution that has occurred on the server-side. At the very least, I think the 'many scripts, many sources' issue needs to be resolved.
And it's not just legacy code, new applications in 2015 are still using those mostly server side technologies without too much done on the front-end.
I've been following their repo for a while, hopefully by next year they'll be far along for it to be usable.
Ghost has an excellent editor and it would be awesome to have a static site built in any way you like that links straight to your blog posts via API calls.
They have a trello card that mentions it https://trello.com/c/QEdjRlgK/67-open-public-api-via-oauth-a...
and are working on it as we speak :) https://github.com/TryGhost/Ghost/issues/4004
I know there's a wordpress API as well but I find wordpress too bloated IMO.
The PittsburghToday site is representative of the idea that a static web site is only static in the technical sense of the back-end content serving. The front-end is still dynamic since the data for the charts is being obtained from Google Docs and the Twitter feed from Twitter, etc.
I always felt like the odd man out, so I am glad to see strong interest in static web sites nowadays.
However you go about generating the HTML that a visitor sees, you still have to contend with how to create tooling that satisfies content authors. My position has always been "use whatever tools you want - I'll figure out an automated scheme to convert it". I think that with static site generators it is easier to have that separation of authoring and publishing.
My most recent talk was two weeks ago at PyCon Japan. Following are my slides for that talk, in case that's useful for anyone who wants to get a better understanding of SSG history and advantages/disadvantages: http://justinmayer.com/talks/
Also happy to answer any questions here, of course. (^_^)
I've since started using http://gohugo.io
It's lightning fast with 1000s of pages, and quite easy to pick up.
What I like about Metalsmith is you build your own workflow. A basic installation does nothing but copy files from the source directory to the destination directory.
I'm working on a beginner's guide to Metalsmith right now, in fact!
There definitely appears to be a lot of interest in this space because you get the best of all worlds. Static site generators definitely seem like the way to go for all but actual web applications.
https://wordpress.org/plugins/simply-static/
It's still a work in progress (it launched a little over a month ago) but the feedback so far has been positive.
http://pl.atyp.us/wordpress/index.php/2013/04/using-wordpres...
The older posts on my site were all converted this way, though the newer ones are done directly in Pelican.
https://developmentseed.org/blog/new-healthcare-gov-is-open-...
https://developmentseed.org/blog/2012/june/25/prose-a-conten...
I have not used prose.io, but the company that built it, Development Seed, is strong. You might have heard of one of their other projects: Mapbox.
I have no idea if prose.io is being currently maintained. Development Seed was in the middle of transitioning to doing Mapbox full time, and Healthcare.gov was their "one last" consulting project. By all accounts their work on the static part of the site was great, but was totally overshadowed by the failure of the enrollment feature.
This is great because it allows engineers to maintain a simple static website and content editors can use a web based GUI that is kind of like WordPress (though CloudCannon has a long way to go on the usability of its web GUI). We used CloudCannon at Hillary for America for a small project. It ended up not being the right tool for the job, but I definitely think there is a use case for CloudCannon and the team behind it is super open to feedback and iteration.
Another workflow could be to use WordPress locally in a VM to edit, then generate a static site you upload through git or whatever to a bare minimum server.
Any plugin that fit the bill? I remember searching but never found something perfect.
It's trivial to put in a cache level that generates and stores static html in front of WP, Drupal, etc. So you get both worlds: the tools that dynamic CMSs give you and the performance of a static site.
I think it took me 5 minutes to install Varnish on a WP server I have. Varnish delivers these pages straight from RAM. My page load performance is fairly absurd. If that's too technically daunting or your webhost doesn't support Varnish, totalcache is also good. Boost for Drupal is good too.
I think the main problem is that it requires an extra step when you're running Apache (modifying .htaccess) to become truly static.
Our primary motivation was being able to build a static website as easily as using WordPress - so we figured, why not turn WordPress into a static site generator?
As notacoward mentioned, there are a few gotchas, so we decided to launch it as a SaaS in order to abstract those things away and make it fully hosted on a CDN out of the box. We still give you full access to the WP install though, so no vendor lock-in or shackles for the customer.
Happy to give anyone a demo if you're interested. You can contact me at mathias AT dotsqua.re
edit: apparently it's mentioned in the TFA
We figured that inputs to the search were from a static range, i.e. these players, those games, that league, this type of incident (foul, goal, celebration, etc).
Then we pre-calculated all possible combinations and fired them through what we called a "cache cannon".
It was highly parallelizable, simple to store on disk (we stored JavaScript files whose names were the form inputs), and worked extremely well.
Even for something like a search engine, unless you're doing full text search over a very wide corpus, you can look at pre-populating a cache and that cache actually being stored on your web servers and being directly addressable.
The design above allowed that search engine to work over the weekend peak of 2 million users. That's where it shone... we just did not have to worry about the thundering herd with a pre-populated cache.
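For the curious, the general shape of such a "cache cannon" might look like the Python sketch below. Everything specific in it is an assumption made for illustration: the field names and values, the `search` stand-in, the file-naming scheme, and the use of JSON files where the description above stored JavaScript files.

```python
import itertools
import json
from pathlib import Path

# Hypothetical input ranges; the real ones were players, games, leagues, incident types.
FIELDS = {
    "league": ["premier", "championship"],
    "incident": ["foul", "goal"],
}


def search(params):
    """Stand-in for the expensive backend query (an assumption for this sketch)."""
    return {"query": params, "results": []}


def cache_key(params):
    # Deterministic file name built from the form inputs, in sorted field order.
    return "_".join("{}-{}".format(k, params[k]) for k in sorted(params)) + ".json"


def fire_cache_cannon(out_dir):
    """Precompute every combination of inputs and store each result on disk."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    names = sorted(FIELDS)
    written = []
    for combo in itertools.product(*(FIELDS[n] for n in names)):
        params = dict(zip(names, combo))
        path = out / cache_key(params)
        path.write_text(json.dumps(search(params)), encoding="utf-8")
        written.append(path.name)
    return written
```

Each combination is independent of the others, which is what makes the job easy to split across workers; this version just runs serially. The front end can then build the same file name from the form inputs and fetch the result directly from the web server.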
If I can do everything with JS on the visitor's browser, why not host some shell HTML on S3 and never worry about a server? Maybe hit AWS Lambda if need be for one specific thing? Dunno. The age of the do-everything server seems to be coming to a close.
Sounds like a great idea to overcome the need to obsess about connection multiplexing.
The reason many people (including me) have gone to static generators is not just the static part but the generator part as well. It's not just static but preprocessed, no further operations necessary except to deliver dead bits. That captures all of the caching, security, and other advantages in a way that hybrid approaches tend not to.
> If the JS running on the client has to fetch many such resources and stitch them together, then it's functionally equivalent to a standard dynamic website and shares many of its drawbacks.
Whilst you're talking about extremes (e.g. rendering a page using Javascript), it seems to me that the most abundant use for AJAX-style approaches on static sites is when there are a few "value added" parts which rely on a DB. For example, a static blog with Disqus comments.
As other commenters have noted, the ability to push bits and pieces of functionality into JS has tipped the balance in many cases, e.g. from "if you want comments, it'll all have to be done in PHP" to "there's no point rendering this on demand, we can do the dynamic bits in JS".
Heck, once CDNs and caching get involved, I wonder how many requests will even hit your HTTP process at all.
I want a simple site generator. I don't want markdown, I don't want a fancy templating engine. I want some simple templating system that takes in normal HTML and generates pages from simple templates I define. I want to shove in some arbitrary HTML and have it spit out a site using some base templates.
To the best of my knowledge, that doesn't exist. It would be perfect for someone like me who wants to keep a website updated, but doesn't always want to run PHP on the server for something as simple as that.
I implemented a shoddy version of it on my own, but it's far from ideal. I'm pretty astounded there's not a well thought out version of it out there, considering how useful it seems it would be.
I built a very simple admin-interface with it,
input: https://github.com/lms-io/scormfu-admin/blob/master/source/i...
output: https://github.com/lms-io/scormfu-admin/blob/master/build/in...
It may not be along the lines of what you're hoping for (it may not be simple enough), but I've found Stasis[1] to be a powerful tool for static site generation.
You can write plain old HTML pages and/or fragments, with your desired level of genericity, and then run them through a set of transformations[2] to fill in content, set attributes (e.g. classes, styles, whatever), un/wrap elements, and so on.
https://github.com/skx/templer
In the past I wrote all my HTML pages by hand. Then I started just writing the "body"-area, and concatenating a header, footer, and the body together to generate output. After that I started to make more changes.
In the end it grew, as these things do, but at the core templer allows you to put "stuff" into a "layout", and generates the output. It might suit you, or it might not. But there are many similar tools out there.
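The "stuff into a layout" core is small enough to sketch in a few lines of Python. To be clear, this is not how templer itself works; it is only a guess at the simplest possible version of the idea, with the layout string and the `build_site` function invented for the example.

```python
from pathlib import Path

LAYOUT = (
    "<html><head><title>{title}</title></head><body>\n"
    "{header}\n{body}\n{footer}\n"
    "</body></html>\n"
)


def build_site(src_dir, out_dir, header, footer):
    """Wrap every *.html fragment in src_dir with the shared layout."""
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    built = []
    for fragment in sorted(src.glob("*.html")):
        page = LAYOUT.format(
            title=fragment.stem,  # file name without extension doubles as the title
            header=header,
            body=fragment.read_text(encoding="utf-8"),
            footer=footer,
        )
        (out / fragment.name).write_text(page, encoding="utf-8")
        built.append(fragment.name)
    return built
```

The fragments are plain HTML with no markdown or template syntax of their own, which is roughly what the request upthread was asking for.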
This was one of the original "static site generators". And it's still used today, for example by the Smithsonian for some of their museum sites.
Static site generation is one of those things where I think Not Invented Here is a legitimate point of view. Your exact use case is never going to match the exact use case of other people, so you should roll your own rather than trying to customize an existing solution.
Plus writing a SSG is super fun, it's like the most 1960's thing you can do (Input Docs->Processing->Output Docs)
People can hate on it all day, but that is exactly what it does. There is a reason it was such a huge tool 15 years ago.
It was an inspiration to what became WordPress
Understanding of the command line, delayed WYSIWYG feedback loop, FTPing/synchronizing files. Some generators don't even tackle "pretty, easy to install templates a la Wordpress" either.
All surmountable, but I haven't seen an "all in one, easy to use" fix yet.
I'd love if I could recommend a single client-side app for people to use that did it all. Something like Coda but tailored for beginners?
IMO, they are the next big thing if they are contextualised in a microservice structure, which is why I built Monera - http://github.com/Ideabile/monera
What do you think?
The article is frustratingly biased in this regard. Static sites should just play to their strengths, otherwise you probably want a CMS that will act like a static site when it needs to.
Hard to deny that the additional functionality most people end up wanting beyond a simple online journal brings additional security risks, whatever the framework.
However, quote > "The static version is more than six times as fast on average!"
This must be an engineering problem, especially on easily cached content. Serving static websites does require computation, but the current tools are very well made and optimized for it, which is not the case with most CMSs.
Once in a while someone manages to get CDN hosting just right, but it's really rare, and it's not something you can simply automate with a dynamic site (like we can for static sites with netlify). Typically the result is identical to the Smashing Magazine site, often a lot worse. Smashing does a good job of caching at their origin datacenter, but their HTML doesn't get cached at edge nodes. Many other sites do a far worse job of caching at their origin.
It might be true that to some degree it's an engineering issue, but if it's one that hits 95%+ of all sites built with a dynamic approach and can be completely eliminated with a static approach, then obviously it might be better to shift the balance and default to doing things statically instead of reaching for Wordpress/Rails/Drupal/whatever for each new site...
If your dynamic site loads slower than a static site, you are probably doing needless database round-trips, redirects, synchronized writes, or html rendering.
The list goes on and on, as a whole ecosystem of purely browser-based add-ons to websites is emerging. Apart from that, modern web apps built with Ember.js, AngularJS or React are often deployed entirely as static websites and served directly from a CDN with a pure API back end that’s shared between the website’s UI and the mobile client."
--
I'm not sure I understand. It doesn't seem to me that a fully single-page, AJAX web site is truly "static". If much of the utility and content must be paged in via client-side JS calls, that too will contribute to load time and the same problems that are attributed to dynamic document generation. It may be all asynchronous and fancy, but from a UX point of view, the content isn't there until the data's retrieved. How's this any different than arguing for a grid of IFRAMEs?
After all, if your page is a minimal HTML DOM harness for a bunch of JS, can one really be said to have "loaded" the page simply in virtue of having loaded the stub HTML?
Or is this argument based mainly on either the implicit premises that (1) not all the functionality and components are used at once? or (2) that much of any given site's functionality can be off-loaded to third-party components (e.g. Disqus) which can all be loaded in parallel from different network sources?
It's open source. Design and content is edited collaboratively and it deploys a static site.
Are there any other CMS systems designed to deploy static sites?
Best Regards
I can run a static page off an Apache (or any wwwserver) instance. Just chuck files in /var/www/ where you want them.
Now where it gets interesting is I use Node-RED to generate the pages; content and all. I want headers? It's a variable. I want ads? It's another variable Google provides. I want chat? Easy (I can do it with Node-RED or 3rd party). I can bridge that webchat with my or someone else's IRC room.
Now, I can script it so the pages are updated from the Node-RED server to the webserver. They can easily sit on the same box, as Node-RED takes few resources.
And the kicker is that I could get that done in an hour or so. Check out Node-RED. It really is that amazing.
Static site generator doesn't mean there's no backend. A website is called 'dynamic' when its operation depends on communicating with a server. The JS logic delivered to the client can range from animation to async http requests.
The distinction between "static site generator" and Webpack / Gulp is very gray. It all depends on what you want to do with your client-side JS logic.
Now I tend to think that the best way would be to use a random CMS (like Wordpress) and mirror its output as a static site. That way, the dynamic part the end user uses for creating content could be behind some secure login-wall, and the public site would just be static.
As for comments, I guess they just won't be supported.
That way you can use whichever framework/stack/templating/database you're already familiar and productive with, and in the end you're just deploying a static build folder from localhost.
I started doing this when it came down to hacking Jekyll to implement something that's trivial to do in a microframework, so I went with microframework + page caching. I do the build and deploy with a gulp task that I'd like to generalize into a gulp module.
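In case the workflow isn't obvious, the build step boils down to something like the Python sketch below: call each route handler once and write the response into a build folder. The route table and handlers are hypothetical stand-ins for whatever microframework you use, and this is a guess at the general shape, not the actual gulp task described above.

```python
from pathlib import Path


def home():
    return "<h1>Home</h1>"


def about():
    return "<h1>About</h1>"


# Hypothetical route table; a real microframework would own this mapping.
ROUTES = {"/": home, "/about/": about}


def freeze(routes, build_dir):
    """Call every handler once and write its output under build_dir."""
    build = Path(build_dir)
    written = []
    for url, handler in sorted(routes.items()):
        # "/" becomes build/index.html, "/about/" becomes build/about/index.html.
        target = build / url.strip("/") / "index.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(handler(), encoding="utf-8")
        written.append(target.relative_to(build).as_posix())
    return written
```

The resulting folder can be deployed to any static host, while development still happens against the live app on localhost.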
Often all that is needed is -r --no-parent
------
I host tens of more or less static sites (contact forms being the most dynamic elements) which are generated on the spot from one PHP (Laravel) installation.
Anyone know the best way to cache the html/css statically to serve?
I think I'll write an article "pure functions with memoization are the next big thing!" Except they are not "new" because they've been around for decades. The only difference is the Web 2.0 uberkids haven't "discovered" the concepts yet.
If your content doesn't change frequently and/or the costs of regenerating the static content is minimized for you, great.
At what point do we see static sites take a fair share of the top-X-trafficked sites? Top 100? 1000? 1,000,000?
This is probably great for a small corp's info site... but then the client asks for a contact form or members/admin secured area, and there we go down the rabbit hole again.
Honestly, most media sites could (and probably should) be static. Think of Time, or Cracked, or CNN: a lot of content, which could be regenerated once and viewed by millions of people per regeneration. Comments could be grafted in with JavaScript (which would suit me just fine, since I don't read such sites for the comments anyway).
> This is probably great for a small corp's info site... but then the client asks for a contact form or members/admin secured area, and there we go down the rabbit hole again.
It's not an all-or-nothing thing; a web server can serve both static and dynamic content, after all.
Some have more than 10k pages, search functionality, internationalization and large content teams behind them. Expect some interesting case studies :)
Usability is also an issue. Wordpress is a far more friendly environment for them to make changes or create a new post than creating a text file with specific formatting and running a script.
As I am more of a Python guy, I wonder if there is a similar (as in not primarily for blogging) generator for Python?