Opinion | Oct. 23, 2020

Pushing Nuxt.js Static Generation One Step Further

How to stop shipping things that you only need at build time to further lighten your bundle size.

Back in the day, $ nuxt generate was already a thing, but it only addressed the SEO issue, not the performance one. Indeed, after the first page load, later page navigations ran through the asyncData hook again. Our application then behaved like a Single Page Application, querying our APIs. In our quest for full static, this led some of us to rely on "hacky" modules like the awesome Nuxt Payload Extractor by DreaMinder. Thankfully, the Nuxt.js team was aware of that, and the issue got solved with the Nuxt.js 2.13 release. Static generation is now available to everyone.

So let's throw target: "static" into our nuxt.config.js file and call it a day?

Sure! But developers always want more, and we are still seeking new ways to optimize our websites. For example, people are talking about removing the client-side JavaScript from the built website. That's what Netlify did while revamping theirs, and it should become available in VitePress. Indeed, while we all love building our applications with powerful JavaScript frameworks like Vue.js or React, sometimes the final website does not need them much, or at all, so why ship them? That's a concept I'm quite thrilled about! But I have one issue with it: it makes having our applications behave like SPAs quite impossible, or at least the two do not sound like they go well together to me. This led me to this question:

What could we do to keep our websites' SPA capabilities, while reducing their bundle size?

The answer I came up with was to move all the data loading and parsing from the website to a dedicated data layer.

A Data Layer Approach

Data is one of the core pieces of a static website: whenever it changes, we want to rebuild our website to reflect it. Data can also be responsible for a great part of our JavaScript bundle. For example, if we want to use content from our favorite headless CMS, that's 10 to 20 kB of additional gzipped SDK to add to our bundle, just to load that content. Fetching is only one part of dealing with data though, parsing is the second: using an image CDN kit? Add another 20ish kB to our bundle. Still relying on moment.js? Enjoy another 20 to 70 kB of JavaScript while migrating to something else.

Here we learned two things regarding our application data:

  • it has loaders: those are packages responsible for fetching data from a data source, e.g. an SDK to get data from a CMS;
  • it has parsers: those are packages in charge of mutating the data, e.g. a date library.

The numbers I mentioned above for loaders and parsers might look insignificant, but they quickly add up to hundreds of kilobytes. Why is that an issue though? We are generating our websites statically, so we are good, no? No, we are not. Indeed, while our application is made static at build time, Nuxt.js will still load all our loaders and parsers in case it has to fall back to SPA. This means each of our users still has to download them. With the data layer approach, we aim to avoid that.

To understand the issue better let's have an example. If we only look at those loaders and parsers, our application could look something like this:

A regular Nuxt.js application with loaders and parsers being part of the bundle

If we add up the payload of the libraries used there, our bundle is already over 70 kB gzipped (we are not here to discuss library choices, this is just an example). To mitigate that with the data layer approach, we want to do two things:

  1. extract all the loaders from the application to perform loads elsewhere;
  2. extract all the parsers from the application to parse data elsewhere, so our application only cares about displaying parsed values.

So, what is "elsewhere"?

"Elsewhere" is the data layer itself. It's some kind of buffer between our client application and the different services we are using. The data layer takes care of shipping a tailor-made payload for every asyncData or fetch hook in our application. With it, our application only has to deal with ready-to-use payloads, saving our users from downloading those unnecessary packages. Taking our previous example, it could look like this:

A Nuxt.js application with a data layer, loaders and parsers being extracted from the bundle

You can notice here that our componentB does not query the data layer directly. Instead, it receives parsed data from its parent. Indeed, if componentB only takes care of displaying the publication date of an article, this data could already have been parsed on our dedicated data layer, which is consumed by page in this example. This could save us from loading some kilobytes of date library. However, I'm not saying that we should move all our parsers to the data layer. While it makes sense to parse rich text and code snippets to HTML there, it might make less sense for dates. If we want to display a localized date to the client, or have a lot of different date formats on a given page, it might not be convenient to do the parsing at the data layer level:

For parsers there is a right balance to find between convenience and performance.
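To make that trade-off a bit more concrete, here is a minimal sketch of the case where date parsing stays on the client: the data layer ships a raw ISO 8601 string and the browser formats it with the built-in toLocaleDateString method, so no date library is needed. The formatPublishedDate helper and the payload shape are assumptions made up for this illustration.

```javascript
// Hypothetical payload as the data layer could ship it: the date stays raw (ISO 8601).
const payload = { title: "Hello world", published_date: "2020-10-23T00:00:00.000Z" };

// Hypothetical client-side parser relying only on built-in Intl support,
// so no date library ends up in the bundle.
function formatPublishedDate(isoDate, locale) {
  return new Date(isoDate).toLocaleDateString(locale, {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC" // keep the output independent of the visitor's time zone
  });
}

// The same payload can be displayed in the visitor's own locale.
console.log(formatPublishedDate(payload.published_date, "en-US"));
console.log(formatPublishedDate(payload.published_date, "fr-FR"));
```

If a page only ever shows one fixed date format, moving this formatting to the data layer is likely the simpler option.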

So far we addressed asyncData and fetch, as they are quite similar. There is one case we have not discussed yet: "runtime data". What if we have a search page proudly powered by Algolia? In the same fashion, we can extract that search method into our data layer, perform the needed parsing there, then send to the client only what it needs to display.
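As a rough sketch of that "runtime data" case, a search handler living in the data layer could look like the following. The searchClient object is a stand-in for whatever search SDK is used (it is not Algolia's actual API), and the hit fields are made up for the example.

```javascript
// Hypothetical data layer handler for "runtime data" such as a search page.
// `searchClient` stands in for a search SDK; only a `search(query)` method
// resolving to `{ hits: [...] }` is assumed here.
async function searchHandler(searchClient, query) {
  const { hits } = await searchClient.search(query);

  // Send to the client only what it needs to display.
  return hits.map((hit) => ({
    title: hit.title,
    url: `/blog/${hit.slug}`
  }));
}

// Stub client used to try the handler without any real service.
const stubClient = {
  async search(query) {
    const posts = [
      { title: "Static Nuxt.js", slug: "static-nuxt", body: "...", internalScore: 0.9 },
      { title: "Hello Vue.js", slug: "hello-vue", body: "...", internalScore: 0.4 }
    ];
    const needle = query.toLowerCase();

    return { hits: posts.filter((post) => post.title.toLowerCase().includes(needle)) };
  }
};

searchHandler(stubClient, "nuxt").then((results) => console.log(results));
```

The client then only receives titles and URLs, while the SDK and the unused fields stay on the data layer side.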

We saw that building our application with the data layer approach will save most of our users from a rather significant JavaScript bundle. Moreover, building things with this structure also comes with other benefits:

  • It's easier to maintain and test: when working on our data layer we only think of what data a given page or component needs to display.
  • It can also help reduce the data payload, as we can cherry-pick there what data is sent to our application, a bit like what we do with GraphQL.
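The cherry-picking part can be sketched with a tiny helper. Both the pick function and the shape of rawPost are assumptions for illustration, not something a specific CMS returns.

```javascript
// Hypothetical helper: keep only the listed keys of an object.
function pick(source, keys) {
  return Object.fromEntries(
    keys.filter((key) => key in source).map((key) => [key, source[key]])
  );
}

// Hypothetical raw CMS response, carrying more than the page displays.
const rawPost = {
  id: "abc123",
  title: "Hello world",
  published_date: "2020-10-23",
  body: "Some content",
  revisions: [1, 2, 3],
  editor_notes: "Do not publish before Friday"
};

// Only these fields would be sent to the application.
const payload = pick(rawPost, ["title", "published_date", "body"]);

console.log(payload);
```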

There are a few downsides though, one of them being that in order to preserve our SPA capabilities in production, we then need our data layer to be available there too. This may sound hard to achieve, but as we will see, it can be set up in different ways...

Implementing a Data Layer

Enough theory. Let's have a more concrete look at how we can create this data layer within an existing Nuxt.js application. Just a small disclaimer before we start: this is not a comprehensive tutorial but rather general ideas about how to do it. The actual implementation is up to you.

Grouping Loaders and Parsers

The first thing we need to do is to group our loaders and parsers. To do so we need to look at our parsers. As explained above we need to define which ones are worth moving to the data layer, which ones are not. Basically we are asking ourselves: "Does it make sense for the client to be able to parse that data itself or not?" If the answer is "no" we can consider the parser as a good candidate to be moved to the data layer.

Parsers identified can then be moved to our existing asyncData or fetch hooks. This way we are already separating them from the client runtime and grouping them with their related loaders. Here is a small example to help us visualize, considering the following blog post page:

<template>
  <div class="__page__blog-post">
    <header>
      <h1>{{ title }}</h1>
      <small>{{ timestampToHuman(published_date) }}</small>
    </header>
    <article v-html="markdownToHTML(body)" />
  </div>
</template>

<script>
export default {
  async asyncData(context) {
    const { title, published_date, body } = await getDataFromCMS(context.params.slug);

    return { title, published_date, body };
  }
}
</script>

On this page we have two parsers: timestampToHuman that takes care of formatting a timestamp to a human-readable date, and markdownToHTML that transforms a Markdown input to raw HTML. With our example we want to move them to our asyncData method along with their related loader (getDataFromCMS here):

<template>
  <div class="__page__blog-post">
    <header>
      <h1>{{ title }}</h1>
      <small>{{ published_date }}</small>
    </header>
    <article v-html="body" />
  </div>
</template>

<script>
export default {
  async asyncData(context) {
    const { title, published_date, body } = await getDataFromCMS(context.params.slug);

    return {
      title,
      published_date: timestampToHuman(published_date),
      body: markdownToHTML(body)
    };
  }
}
</script>

And here we go, our loaders and parsers are now grouped together. We can achieve the same thing with fetch hooks as well as potential methods fetching "runtime data".
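For the fetch hook, the same grouping could look roughly like the sketch below. It assumes the fetch hook from Nuxt.js 2.12+, which assigns results to this instead of returning them. The three helpers are simplistic stand-ins for the loader and parsers of the example above, not real packages.

```javascript
// Hypothetical stand-ins for the loader and parsers used in the example above.
const getDataFromCMS = async (slug) => ({
  title: `Post ${slug}`,
  published_date: 1603411200000,
  body: "Hello"
});
const timestampToHuman = (timestamp) => new Date(timestamp).toISOString().slice(0, 10);
const markdownToHTML = (markdown) => `<p>${markdown}</p>`;

// Component options as they could appear in a page or component
// (where this object would be the default export).
const blogPost = {
  data() {
    return { title: "", published_date: "", body: "" };
  },
  // The loader and its parsers are grouped inside the hook, the template
  // only displays the resulting values.
  async fetch() {
    const { title, published_date, body } = await getDataFromCMS(this.$route.params.slug);

    this.title = title;
    this.published_date = timestampToHuman(published_date);
    this.body = markdownToHTML(body);
  }
};
```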

Packaging Our Methods

At this stage you may have noticed that our refactored methods already look quite like endpoint handlers. That's the point! Now we need to extract them to a dedicated data layer package. It can take the shape of a directory inside our Nuxt.js application. Keeping the same example, we can create ~/datalayer/pages/blog/_slug.js:

import getDataFromCMS from "getDataFromCMS";
import timestampToHuman from "timestampToHuman";
import markdownToHTML from "markdownToHTML";

export const handler = async (context) => {
  const { title, published_date, body } = await getDataFromCMS(context.params.slug);

  return {
    title,
    published_date: timestampToHuman(published_date),
    body: markdownToHTML(body)
  };
};

Notice that we are importing our loaders and parsers in it because we are outside our Vue.js application. That's where we actually "extract" them from the client application, the whole point of this approach!

Once we are done doing that for all our methods, we should end up with a beautiful data layer package, gracefully decoupled from our client application. Now we need to consume it! But not through a simple import, as that would ruin all the efforts we made so far. We need to use it through a clever mix of the two following methods:

The webpack Chunks Way

This is probably the most convenient way to consume our data layer in our application: webpack chunks. Nuxt.js makes use of webpack under the hood to bundle our Vue.js applications. So what are webpack chunks? They are a means to achieve code splitting by creating separate bundles, called chunks. These allow our application to lazy load parts of our data layer only when it has to fall back to SPA mode.

Feels like something really obscure? In fact, it's not as complicated as it sounds to perform. Assuming we exported our handlers inside the ~/datalayer folder of our application, like in our previous example, we could then update our asyncData hook like this:

export default {
  async asyncData(context) {
    const { handler } = await import("~/datalayer/pages/blog/_slug");

    return await handler(context);
  },
};

The import keyword will tell webpack to create a dedicated chunk for our imported handler. This chunk can be named and tuned using magic comments if you want to but will work as expected out of the box. And that's it! The job is done. Our data layer will only get loaded when actually running through the asyncData method, saving most of our users from downloading it. Awesome!
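As an illustration of those magic comments, the chunk can be given an explicit name with webpackChunkName. The chunk name below is an arbitrary choice, and the ~ alias only resolves inside a Nuxt.js build, so this sketch is not meant to be executed outside of one.

```javascript
// Page component options (this object would be the default export of the page).
const blogPostPage = {
  async asyncData(context) {
    // `webpackChunkName` is a webpack magic comment; the name is an arbitrary example.
    const { handler } = await import(
      /* webpackChunkName: "datalayer-blog-slug" */ "~/datalayer/pages/blog/_slug"
    );

    return handler(context);
  }
};
```

Naming chunks mostly helps when inspecting the build output, as it makes data layer chunks easier to spot.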

Please note that here we are forwarding context to our handler function, which can be helpful for dynamic routes. However, if you use loaders or parsers that were injected inside the Nuxt.js context, please bear in mind that those would still end up in the global payload. With the data layer approach, our loaders and parsers have to live on the data layer side only. Nuxt Content users, I don't know how to help you with it~

The API Way

The second way we can access our data layer is through a dedicated API. This can be a little bit more complicated to achieve as we need something that is available when developing, at build time, and in production. Lambda functions can be an answer, or a simple Express server we run at build time. Fun fact, that's what Nuxt.js was doing when generating its documentation at some point. Debbie then came with @nuxt/content and everything went brrr~ 💚
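To give an idea of what such an API could look like, here is a minimal sketch of a Node.js-style request listener exposing data layer handlers. The route map, the createDataLayerListener name, and the URL layout are all assumptions. A real setup would need more careful routing and error handling.

```javascript
// Hypothetical handler map: one entry per data layer endpoint.
const handlers = {
  "/datalayer/pages/blog": async ({ params }) => ({ title: `Post ${params.slug}` })
};

// Builds a request listener `(req, res)` that can be handed to Node.js'
// `http.createServer`.
function createDataLayerListener(routes) {
  return async (req, res) => {
    const { pathname } = new URL(req.url, "http://localhost");
    const lastSlash = pathname.lastIndexOf("/");
    const handler = routes[pathname.slice(0, lastSlash)];

    if (!handler) {
      res.statusCode = 404;
      res.end(JSON.stringify({ error: "Not found" }));
      return;
    }

    // Mimic the part of the Nuxt.js context that handlers rely on.
    const payload = await handler({ params: { slug: pathname.slice(lastSlash + 1) } });

    res.statusCode = 200;
    res.setHeader("Content-Type", "application/json");
    res.end(JSON.stringify(payload));
  };
}

const listener = createDataLayerListener(handlers);
```

Passing listener to http.createServer and listening on a local port would then serve /datalayer/pages/blog/some-slug during development or at build time.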

When our API is properly set up, the last thing we need to do is to update our asyncData and fetch hooks, as well as potential runtime data methods. Those now need to perform simple network calls to our brand new data layer API. With the asyncData hook it can look like the following:

export default {
  async asyncData(context) {
    const response = await fetch(`https://example.com/datalayer/pages/blog/${context.params.slug}`);

    return await response.json();
  },
};

The API way may sound quite overcomplicated at first glance. However, because the client never has to load our handlers, it also comes with the benefit of hiding our API tokens. This may be something you were looking for, at least for some parts of your data layer.
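Here is a small sketch of that token-hiding benefit, assuming a handler that runs on the server only (a serverless function, for example). The CMS_API_TOKEN variable, the CMS URL, and the fetchImpl parameter are hypothetical names picked for the example.

```javascript
// Server side only: the token is read from the environment and never reaches
// the client bundle. `fetchImpl` can be swapped for testing (assumption).
async function blogPostHandler(context, fetchImpl = fetch) {
  const response = await fetchImpl(`https://cms.example.com/posts/${context.params.slug}`, {
    headers: { Authorization: `Bearer ${process.env.CMS_API_TOKEN}` }
  });

  if (!response.ok) {
    throw new Error(`CMS responded with status ${response.status}`);
  }

  const { title, body } = await response.json();

  // Only display-ready fields are returned, the token stays on the server.
  return { title, body };
}
```

The client keeps calling the data layer URL as shown above and never learns which token, or even which CMS, sits behind it.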

That's it! We added a new layer to our application and learned how to consume it, either by lazy loading its code or through a remote API. This layer manages all the data for our application and gives us a way to extract libraries from the client payload, allowing us to lighten it further. We achieved the data layer approach.

How It's Working on lihbr.com

Last week I released the code of my blog (this very site) on GitHub. It uses this data layer approach, which allowed me to greatly reduce its client JavaScript bundle. Thanks to it, it only ships 39 kB of gzipped libraries, which could easily be brought down to 20ish kB by dropping Sentry. The total bundle is around 160 kB.

To achieve those numbers, it relies entirely on the API way we discussed above. This came with the challenge of finding a way to make its data layer available in development, at build time, and in production. To meet it, I leveraged one of my favorite Nuxt.js features: Generate Routes Payload (Sébastien, if you are reading this, please don't drop it in Nuxt.js 3). This feature allows us to define all our routes with their related payload (asyncData output) at once, which I prefer anyway over the new crawler (as I am definitely a control freak). Please note though that this works for me because I am not relying on fetch hooks. Thanks to this feature, I was able to serve my data layer in the following ways:

  • In development ($ nuxt): my data layer is served through generate.routes.payload thanks to a module of mine that makes payloads available in development.
  • At build time ($ nuxt generate): my data layer is served through generate.routes.payload, this is the intended usage of this feature.
  • In production: my data layer is served through Netlify Functions, which are only used for form submissions and previewing content.

And that's how the magic happened for me~ (ノ◕ヮ◕)ノ*:・゚✧
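For readers unfamiliar with Generate Routes Payload, here is a generic sketch of the feature in Nuxt.js 2. It is not the actual lihbr.com setup: getAllPosts, the route layout, and the fallback URL are placeholders.

```javascript
// Hypothetical list of posts, e.g. produced by the data layer at build time.
const getAllPosts = async () => [
  { slug: "hello-world", title: "Hello world" },
  { slug: "static-nuxt", title: "Static Nuxt.js" }
];

// Relevant part of a nuxt.config.js: each generated route comes with the
// payload its page will receive.
const config = {
  target: "static",
  generate: {
    async routes() {
      const posts = await getAllPosts();

      return posts.map((post) => ({
        route: `/blog/${post.slug}`,
        payload: { title: post.title }
      }));
    }
  }
};

// Page side: use the payload when it is provided, otherwise fall back to a
// data layer API (placeholder URL).
const blogPostPage = {
  async asyncData({ params, payload }) {
    if (payload) {
      return payload;
    }

    const response = await fetch(`https://example.com/datalayer/pages/blog/${params.slug}`);

    return response.json();
  }
};
```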

So, Is the Data Layer Right for You?

Maybe not. Let's be frank, this approach is still quite an extreme optimization to make to your application. That said, after analyzing some Nuxt.js websites, I'm definitely convinced that this approach can make a lot of sense. In some cases I noticed that it can reduce the client payload by up to 50%! So... that's something you can consider if you really need it!

On another note, I think the future holds nice things for achieving the data layer approach. Nuxt.js functions could provide an easy way to make our data layer accessible through the API way in every environment. As for webpack chunks, perhaps Nuxt.js could automagically extract asyncData and fetch hooks into dedicated chunks. This would allow everyone to benefit from it, mostly for free, and achieving the data layer approach would then only be about moving parsers to those hooks.

Finally, I'm not expecting anyone to switch to this data layer approach, but I wanted to share the way I built my website. I hope it was interesting for you to think about and that it made some sense. Please reach out to me on Twitter if you want to talk more about it!

Keep trying new things while coding, thanks for reading~

Edit (11/21/2020): Hey! Since I released this article, I've been invited by Tim Benniks to talk about the data layer approach in a video format! Check it out on YouTube~
