Migrating to React land: Gatsby

By GIXnews

I am an engineer who loves docs. Well, OK, I don’t love all docs, but I believe docs are a crucial yet often neglected element of a great developer experience. I work on the developer experience team for Cloudflare Workers, focusing on several components of Workers, particularly the docs that we recently migrated to Gatsby.

📣 We’ve moved the Cloudflare Workers docs to @gatsbyjs

The new documentation is…

🏃‍♀️ faster
⭐️ more accessible
🎁 a perfect foundation for the redesign later this year
🏗️ open-source

shout out to @exvuma for this incredible work 💪🙌 https://t.co/k3huvCvash pic.twitter.com/MBWxVtlrin

— Cloudflare Developers (@CloudflareDev) March 4, 2020

I learned a lot through porting our documentation site to Gatsby. In this post, I share some of the lessons that could’ve saved my former self several headaches. This will hopefully help others considering a move to Gatsby or another static site generator.

Why Gatsby?

Prior to our migration to Gatsby, we used Hugo for our developer documentation. There are a lot of positives about working with Hugo – fast build times, fast load times – that made building a simple static site a great use case for Hugo. Things started to turn sour when we started making our docs more interactive and expanding the content being generated.

Going from writing JSX with TypeScript back to string-based templating languages is difficult. Trying to perform complicated tasks, like generating a sidebar, cost me – a developer who knows nothing about liquid code or Go templating (though with Golang experience) – several tears not even to implement but to just understand what was happening.

Here is the code to template an item in the sidebar in Hugo:

<!-- templates -->
{{ define "section-tree-nav" }}
{{ $currentNode := .currentnode }}
{{ with .sect }}
 {{ if not .Params.Hidden }}
  {{ if .IsSection }}
    {{safeHTML .Params.head}}
    <li data-nav-id="{{.URL}}" class="dd-item
        {{ if .IsAncestor $currentNode }}parent{{ end }}
        {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}
        {{ if .Params.alwaysopen}}parent{{ end }}
        {{ if .Params.alwaysopen}}always-open{{ end }}
        ">
      <a href="{{ .RelPermalink}}">
        <span>{{safeHTML .Params.Pre}}{{.Title}}{{safeHTML .Params.Post}}</span>
 
        {{ if .Params.new }}
          <span class="new-badge">NEW</span>
        {{ end }}
 
        {{ $numberOfPages := (add (len .Pages) (len .Sections)) }}
        {{ if ne $numberOfPages 0 }}
 
          {{ if or (.IsAncestor $currentNode) (.Params.alwaysopen)  }}
            <i class="triangle-up"></i>
          {{ else }}
            <i class="triangle-down"></i>
          {{ end }}
 
        {{ end }}
      </a>
      {{ if ne $numberOfPages 0 }}
        <ul>
          {{ .Scratch.Set "pages" .Pages }}
          {{ if .Sections}}
          {{ .Scratch.Set "pages" (.Pages | union .Sections) }}
          {{ end }}
          {{ $pages := (.Scratch.Get "pages") }}
 
        {{ if eq .Site.Params.ordersectionsby "title" }}
          {{ range $pages.ByTitle }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ else }}
          {{ range $pages.ByWeight }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ end }}
        </ul>
      {{ end }}
    </li>
  {{ else }}
    {{ if not .Params.Hidden }}
      <li data-nav-id="{{.URL}}" class="dd-item
     {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}
      ">
        <a href="{{.RelPermalink}}">
        <span>{{safeHTML .Params.Pre}}{{.Title}}{{safeHTML .Params.Post}}</span>
        {{ if .Params.new }}
          <span class="new-badge">NEW</span>
        {{ end }}
 
        </a></li>
     {{ end }}
  {{ end }}
 {{ end }}
{{ end }}
{{ end }}

Whoa. I may be exceptionally oblivious, but I had to squint at the snippet above for an hour before I realized this was the code for a sidebar item (the li element was the eventual giveaway, but took some parsing to discover where the logic actually started).

(Disclaimer: I am in no way a pro at Hugo, and in any situation there are always several ways to code a solution; thus I am in no way claiming this was the only way to write the template, nor am I chastising the author of the code. I am just displaying the differences in pieces of code I came across.)

Now, here is what the TSX (I will get into the JS later in the article) for the Gatsby project using the exact same styling would look like:

 <li data-nav-id={pathToServe} className={'dd-item ' + ddClass}>
   <Link className="" to={pathToServe} title="Docs Home" activeClassName="active">
     {title || 'No title'}
     {numberOfPages ? <Triangle isAncestor={isAncestor} alwaysopen={showChildren} /> : ''}
     {showNew ? <span className="new-badge">NEW</span> : ''}
   </Link>
   {showChildren ? (
     <ul>
       {' '}
       {myChildren.map((child: mdx) => {
         return (
           <SidebarLi
             frontmatter={child.frontmatter}
             fields={child.fields}
             depth={depth + 1}
             key={child.frontmatter.title}
           />
         )
       })}
     </ul>
   ) : (
     ''
   )}
 </li>

This code is clean and compact because Gatsby is a static content generation tool based on React. It’s loved for a myriad of reasons, but my honest main reason to migrate to it was to make the Hugo code above much less ugly.

For our purposes, less ugly was important because we had dreams of redesigning our docs to be interactive with support for multiple coding languages and other features.

For example, the template gallery would be a place to go to for how-to recipes and examples. The templates themselves would live in a template registry service and turn into static pages via an API.

We wanted the docs to not be constrained by Go templating. The Hugo docs admit their templates aren’t the best for complicated logic:

Go Templates provide an extremely simple template language that adheres to the belief that only the most basic of logic belongs in the template or view layer.

Gatsby and React enable the more complex logic we were looking for. After our team built workers.cloudflare.com and Built with Workers on Gatsby, I figured this was my shot to really give Gatsby a try on our Workers developer docs.

Decision to Migrate over Starting from Scratch

I’m normally not a fan of fixing things that aren’t broken. Though I didn’t like working with Hugo, loved working in React, and had every reason to switch, I was timid about being the one in charge of switching from Hugo. I was scared. I hated looking at the liquid code of Go templates. I didn’t want to have to port all the existing templates to React without truly understanding what I might be missing.

There comes a point with tech debt, though, where you have to tackle the part you are most scared of.

The easiest solution would, of course, be to throw the Hugo code away. Start from scratch. A clean slate. But this means taking something that was not broken and breaking it. The styling, SEO, tagging, and analytics of the site took small iterations over the course of a few years to get right, and I didn’t want to be the one to break them. Instead of throwing away all the styling and logic tied in for search, SEO, etc., our plan was to maintain as much of the current design and logic as possible while converting it to React piece-by-piece, component-by-component.

Also, there were existing developer docs still using Hugo on Cloudflare by other teams (e.g. Access, Argo Tunnel, etc.). I wanted a team at Cloudflare to be able to import their existing markdown files with frontmatter into the Gatsby repo and preserve the existing design.

I wanted to migrate instead of teleport to Gatsby.

How-to: Hugo to Gatsby

In this blog post, I go through some but not all of the steps of how I ported to Gatsby from Hugo for our complex doc site. The few examples here help to convey the issues that caused the most pain.

Let’s start with getting the markdown files to turn into HTML pages.

Markdown

One goal was to keep all the existing markdown and frontmatter we had set up in Hugo as similar as possible. The reasoning for this was to not break existing content and also maintain the version history of each doc.

Gatsby is built on top of GraphQL. All the data and almost all content for Gatsby is put into GraphQL during startup, usually via a plugin; Gatsby then queries for this data when it actually creates pages. This is quite different from Hugo’s much more abstract model of putting all your content in a folder named content and then letting Hugo figure out which template to apply based on the logic in the template.

MDX is a sophisticated tool that parses markdown into Gatsby so it can later be represented as HTML (it actually can do much more than that, but I won’t get into it here). I started with Gatsby’s MDX plugin to create nodes from my markdown files. Here is the code to set up the plugin to get all the markdown files (files ending in .md and .mdx) I had in the src/content folder into GraphQL:

gatsby-config.js

const path = require('path')
 
module.exports = {
 plugins: [
   {
     resolve: `gatsby-source-filesystem`,
     options: {
       name: `mdx-pages`,
       path: `${__dirname}/src/content`,
       ignore: [`**/CONTRIBUTING*`, '/styles/**'],
     },
   },
   {
     resolve: `gatsby-plugin-mdx`,
     options: {
       extensions: [`.mdx`, `.md`],
     },
   }, 
]}

Now that Gatsby knows about these files as nodes, we can create pages for them. In gatsby-node.js, I tell Gatsby to grab these MDX pages and use a template markdownTemplate.tsx to create pages for them:

const path = require(`path`)
const { createFilePath } = require(`gatsby-source-filesystem`)
exports.createPages = async ({ actions, graphql, reporter }) => {
 const { createPage } = actions
 
 const markdownTemplate = path.resolve(`src/templates/markdownTemplate.tsx`)
 
 const result = await graphql(`
   {
     allMdx(limit: 1000) {
       edges {
         node {
           id
           fields {
             pathToServe
             parent
           }
           frontmatter {
             alwaysopen
             weight
           }
           fileAbsolutePath
         }
       }
     }
   }
 `)
 // Handle errors
 if (result.errors) {
   reporter.panicOnBuild(`Error while running GraphQL query.`)
   return
 }
 result.data.allMdx.edges.forEach(({ node }) => {
   createPage({
     path: node.fields.pathToServe,
     component: markdownTemplate,
     // Values passed via context are available as variables in the template's page query
     context: {
       id: node.id, // matches the `$id` variable used by markdownTemplate.tsx
       parent: node.fields.parent,
       weight: node.frontmatter.weight,
     },
   })
 })
}
exports.onCreateNode = ({ node, getNode, actions }) => {
 const { createNodeField } = actions
 // Ensures we are processing only markdown files
 if (node.internal.type === 'Mdx') {
   // Use `createFilePath` to turn markdown files in our `content` directory into `/workers/`pathToServe
   const originalPath = node.fileAbsolutePath.replace(
     node.fileAbsolutePath.match(/.*content/)[0],
     ''
   )
   let pathToServe = createFilePath({
     node,
     getNode,
     basePath: 'content/',
   })
   let parentDir = path.dirname(pathToServe)
   if (pathToServe.includes('index')) {
     pathToServe = parentDir
     parentDir = path.dirname(parentDir) // "/" dirname will = "/"
   }
   pathToServe = pathToServe.replace(/\/+$/, '/') // normalize any run of trailing slashes to a single slash
   // Creates new query'able field with name of 'pathToServe', 'parent'..
   // for allMdx edge nodes
   createNodeField({
     node,
     name: 'pathToServe',
     value: `/workers${pathToServe}`,
   })
   createNodeField({
     node,
     name: 'parent',
     value: parentDir,
   })
   createNodeField({
     node,
     name: 'filePath',
     value: originalPath,
   })
 }
}

Now every time Gatsby runs, it calls onCreateNode for each node it creates. If the node is MDX, the function attaches the extra node fields (filePath, parent and pathToServe) to it. Later, createPages creates a page for each MDX node using the markdownTemplate.tsx component, so that the component can render the appropriate information for that markdown file.
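To make the path handling above easier to follow, here is a minimal standalone sketch of the same idea. It is not the code from the docs repo and it does not call Gatsby’s createFilePath; the derivePaths name, its prefix parameter, and the input shape (a file path relative to the content folder) are assumptions made for illustration.

```typescript
// Hypothetical standalone helper, not part of Gatsby or the docs repo.
// It derives the served path and the parent directory from a content-relative
// file path using plain string operations, so it runs without Gatsby.
interface DerivedPaths {
  pathToServe: string // e.g. "/workers/reference/apis/"
  parent: string // e.g. "/reference"
}

function derivePaths(relativePath: string, prefix = '/workers'): DerivedPaths {
  // Drop the markdown extension and split into non-empty segments.
  const segments = relativePath
    .replace(/\.mdx?$/, '')
    .split('/')
    .filter(segment => segment.length > 0)

  // An index file represents its directory, so drop the trailing "index".
  if (segments[segments.length - 1] === 'index') {
    segments.pop()
  }

  // Always end the served path with exactly one slash.
  const pathToServe = segments.length > 0 ? `/${segments.join('/')}/` : '/'
  // The parent is everything but the last segment; top-level pages get "/".
  const parent = segments.length > 1 ? `/${segments.slice(0, -1).join('/')}` : '/'

  return { pathToServe: prefix + pathToServe, parent }
}
```

Under these assumptions, derivePaths('/reference/apis/index.md') returns a pathToServe of /workers/reference/apis/ and a parent of /reference, while a top-level file such as /about.mdx gets the parent /.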

The barebone component for a page that renders a React component from the MDX node looks like this:

markdownTemplate.tsx

import React from "react"
import { graphql } from "gatsby"
import { MDXRenderer } from "gatsby-plugin-mdx"
 
export default function PageTemplate({ data: { mdx } }) {
 return (
   <div>
     <h1>{mdx.frontmatter.title}</h1>
     <MDXRenderer>{mdx.body}</MDXRenderer>
   </div>
 )
}
 
export const pageQuery = graphql`
 query BlogPostQuery($id: String) {
   mdx(id: { eq: $id }) {
     id
     body
     frontmatter {
       title
     }
   }
 }
`

A Complex Component: Sidebar

Now let’s get into where I wasted the most time, but learned hard lessons upfront: turning the Hugo template into a React component. At the beginning of this article, I showed that scary sidebar.

To set up the li element, the Hugo logic looks like this:

{{ define "section-tree-nav" }}
{{ $currentNode := .currentnode }}
{{ with .sect }}
 {{ if not .Params.Hidden }}
  {{ if .IsSection }}
    {{safeHTML .Params.head}}
    <li data-nav-id="{{.URL}}" class="dd-item
        {{ if .IsAncestor $currentNode }}parent{{ end }}
        {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}
        {{ if .Params.alwaysopen}}parent{{ end }}
        {{ if .Params.alwaysopen}}always-open{{ end }}
        ">

I see that the code is defining some section-tree-nav component-like thing and taking in some currentNode. To be honest, I still don’t know exactly what the variables .sect, IsSection, Params.head, Params.Hidden mean. Although I can take a wild guess, they’re not that important for understanding what the logic is doing. The logic is setting the classes on the li element which is all I really care about: parent, always-open and active.

When focusing on those three classes, we can port them to React in a much more readable way by defining a variable string ddClass:

 let ddClass = ''
 let isAncestor = numberOfPages > 0
 if (isAncestor) {
   ddClass += ' parent'
 }
 if (frontmatter.alwaysopen) {
   ddClass += ' parent alwaysOpen'
 }
 return (
   <Location>
     {({ location }) => {
       const currentPathActive = location.pathname === pathToServe
       if (currentPathActive) {
         ddClass += ' active'
       }
       return (
         <li data-nav-id={pathToServe} className={'dd-item ' + ddClass}>

There are actually a few nice things about the Hugo code, I admit. Using the Location component in React was probably less intuitive than Hugo’s ability to access currentNode to get the active page. Also, IsAncestor is predefined in Hugo as “Whether the current page is an ancestor of the given page.” For me, though, having to track down the definitions of the predefined variables was frustrating, and I appreciate the local explicitness of the React definition, but I admit I’m a bit jaded.
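The same class logic can also be pulled out into a pure function, which keeps it out of the render callback and makes it easy to test. This is only a sketch: the sidebarClasses name and the SidebarItemState shape are hypothetical, and unlike the snippet above it adds the parent class at most once.

```typescript
// Hypothetical input shape; the field names mirror the snippet above, but this
// helper is an illustration rather than code from the docs repo.
interface SidebarItemState {
  numberOfPages: number // how many visible children the item has
  alwaysopen?: boolean // frontmatter flag
  pathToServe: string // path this sidebar item links to
  currentPath: string // e.g. location.pathname
}

function sidebarClasses(item: SidebarItemState): string {
  const classes = ['dd-item']
  // An item with children, or one that is always open, is styled as a parent.
  if (item.numberOfPages > 0 || item.alwaysopen) {
    classes.push('parent')
  }
  if (item.alwaysopen) {
    classes.push('alwaysOpen')
  }
  // The item is active when it links to the page currently being viewed.
  if (item.currentPath === item.pathToServe) {
    classes.push('active')
  }
  return classes.join(' ')
}
```

The result can be passed straight to className on the li element, with currentPath taken from whatever supplies the current location.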

Children

The most complex part of the sidebar is getting the children. This is the story that really got me to appreciate GraphQL.

Here’s getting the children for the sidebar in Hugo:

    {{ $numberOfPages := (add (len .Pages) (len .Sections)) }}
        {{ if ne $numberOfPages 0 }}
 
          {{ if or (.IsAncestor $currentNode) (.Params.alwaysopen)  }}
            <i class="triangle-up"></i>
          {{ else }}
            <i class="triangle-down"></i>
          {{ end }}
 
        {{ end }}
      </a>
      {{ if ne $numberOfPages 0 }}
        <ul>
          {{ .Scratch.Set "pages" .Pages }}
          {{ if .Sections}}
          {{ .Scratch.Set "pages" (.Pages | union .Sections) }}
          {{ end }}
          {{ $pages := (.Scratch.Get "pages") }}
 
        {{ if eq .Site.Params.ordersectionsby "title" }}
          {{ range $pages.ByTitle }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ else }}
          {{ range $pages.ByWeight }}
            {{ if and .Params.hidden (not $.showhidden) }}
            {{ else }}
            {{ template "section-tree-nav" dict "sect" . "currentnode" $currentNode }}
            {{ end }}
          {{ end }}
        {{ end }}
        </ul>
      {{ end }}
    </li>
  {{ else }}
    {{ if not .Params.Hidden }}
      <li data-nav-id="{{.URL}}" class="dd-item
     {{ if eq .UniqueID $currentNode.UniqueID}}active{{ end }}
      ">
        <a href="{{.RelPermalink}}">
        <span>{{safeHTML .Params.Pre}}{{.Title}}{{safeHTML .Params.Post}}</span>
        {{ if .Params.new }}
          <span class="new-badge">NEW</span>
        {{ end }}
 
        </a></li>
     {{ end }}
  {{ end }}
 {{ end }}
{{ end }}
{{ end }}

This is just the first layer of children. No grandbabies, sorry. And I won’t even get into all that is going on there exactly. When I started porting this over, I realized a lot of that logic was not even being used.

In React, we grab all the markdown pages and see which have parents that match the current page:

 const topLevelMarkdown: markdownRemarkEdge[] = useStaticQuery(
   graphql`
     {
       allMdx(limit: 1000) {
         edges {
           node {
             frontmatter {
               title
               alwaysopen
               hidden
               showNew
               weight
             }
             fileAbsolutePath
             fields {
               pathToServe
               parent
               filePath
             }
           }
         }
       }
     }
   `
 ).allMdx.edges
 const myChildren: mdx[] = topLevelMarkdown
   .filter(
     edge =>
       fields.pathToServe === '/workers' + edge.node.fields.parent &&
       fields.pathToServe !== edge.node.fields.pathToServe
   )
   .map(child => child.node)
   .filter(child => !child.frontmatter.hidden)
   .sort(sortByWeight)
 const numberOfPages = myChildren.length
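
The filter above relies on a sortByWeight helper that isn’t shown. Here is a minimal sketch of how the child lookup and sorting could work on plain data, outside of Gatsby. The MdxNode shape is a simplified stand-in, childrenOf and normalize are hypothetical names, and the choices to sort pages without a weight last and to ignore trailing slashes when comparing paths are assumptions.

```typescript
// Simplified stand-in for the MDX node shape queried above (assumed).
interface MdxNode {
  frontmatter: { title: string; hidden?: boolean; weight?: number }
  fields: { pathToServe: string; parent: string }
}

// Hypothetical comparator: lower weight first, pages without a weight last.
function sortByWeight(a: MdxNode, b: MdxNode): number {
  const weightA = a.frontmatter.weight
  const weightB = b.frontmatter.weight
  if (weightA === undefined && weightB === undefined) return 0
  if (weightA === undefined) return 1
  if (weightB === undefined) return -1
  return weightA - weightB
}

// Strip trailing slashes so "/workers/reference/" and "/workers/reference" match.
function normalize(p: string): string {
  return p.replace(/\/+$/, '')
}

// Returns the visible direct children of `current`, ordered by weight.
function childrenOf(current: MdxNode, all: MdxNode[], prefix = '/workers'): MdxNode[] {
  return all
    .filter(
      node =>
        normalize(current.fields.pathToServe) === normalize(prefix + node.fields.parent) &&
        current.fields.pathToServe !== node.fields.pathToServe
    )
    .filter(node => !node.frontmatter.hidden)
    .sort(sortByWeight)
}
```

Only direct children are returned; deeper levels appear because each rendered child repeats the same lookup for itself.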

And then we render the children, so the full JSX becomes:

<li data-nav-id={pathToServe} className={'dd-item ' + ddClass}>
   <Link
     to={pathToServe}
     title="Docs Home"
     activeClassName="active"
   >
     {title || 'No title'}
     {numberOfPages ? (
       <Triangle isAncestor={isAncestor} alwaysopen={showChildren} />
     ) : (
       ''
     )}
     {showNew ? <span className="new-badge">NEW</span> : ''}
   </Link>
   {showChildren ? (
     <ul>
       {' '}
       {myChildren.map((child: mdx) => {
         return (
           <SidebarLi
             frontmatter={child.frontmatter}
             fields={child.fields}
             depth={depth + 1}
             key={child.frontmatter.title}
           />
         )
       })}
     </ul>
   ) : (
     ''
   )}
 </li>

OK, now that we have a component, and we have Gatsby creating the pages off the markdown, I can go back to my PageTemplate component and render the sidebar:

import Sidebar from './Sidebar'
export default function PageTemplate({ data: { mdx } }) {
 return (
   <div>
     <Sidebar />
     <h1>{mdx.frontmatter.title}</h1>
     <MDXRenderer>{mdx.body}</MDXRenderer>
   </div>
 )
}

I don’t have to pass any props to Sidebar because the GraphQL static query in Sidebar.tsx gets all the data about all the pages that I need. I don’t even maintain state because Location is used to determine which path is active. Gatsby generates pages using the above component for each page that’s a markdown MDX node.

Wrapping up

This was just the beginning of the full migration to Gatsby. I repeated the process above for turning templates, partials, and other HTML component-like parts in Hugo into React, which was actually pretty fun, though turning vanilla JS that once manipulated the DOM into React would probably have been a nightmare if I weren’t somewhat comfortable working in React.

Main lessons learned:

  • Being careful about breaking things and being scared to break things are two very different things. Being careful is good; being scared is bad. If I were to do this migration again, I would use the Hugo templates as a reference but not as a source of truth. Testing is what staging environments are for. Don’t sacrifice writing things the right way to comply with the old way.
  • When doing a migration like this on a static site, get just a few pages working before moving the content over to avoid intermediate PRs from breaking. It seems obvious, but with the large amounts of content we had, a lot of things broke when porting over content. Get everything polished with each type of page before moving all your content over.
  • When doing a migration like this, it’s OK to compromise some features of the old design until you determine whether to add them back in; just make sure to test this with real users first. For example, I made the mistake of assuming others wouldn’t mind being without anchor tags. (Note: Hugo templates create anchor tags for headers automatically, whereas in Gatsby you have to use MDX to customize markdown components.) Test this on a single, popular page with real users first to see if it matters before giving it up.
  • Even for those with a React background, the ramp-up with GraphQL and setting up Gatsby isn’t as simple as it seems at first. But once you’re set up, it’s pretty dang nice.

Overall, the process of moving to Gatsby was well worth the effort. As we implement a redesign in React, it’s much easier to apply the designs in this cleaner code base. Also, though Hugo was already very performant with a nice SEO score, in Gatsby we are able to improve both performance and SEO thanks to the framework’s flexibility.

Lastly, working with the Gatsby team was awesome and they even give free T-shirts for your first PR!

Source: Cloudflare