Daniel Worsnup

How to Build a Simple Markdown Plugin for Your Gatsby Site

September 29, 2019

The majority of software development boils down to automating tasks and processes that would otherwise consume valuable time, require manual effort, and be prone to accidental errors. Whenever you find yourself repeating some task or process, there are a few questions you should immediately start asking yourself:

  1. Can this be automated?
  2. Is it worth the time to automate this?
  3. Is it worth the financial investment (if any) to automate this?

Most of the time, the answer to all of the above will be a resounding yes. There’s rarely a reason to waste time doing something that a computer can do for you much more quickly and with a much lower risk of error. Let’s consider a few examples of common automations that many of us rely on today:

  • JavaScript transpilers, such as Babel and TypeScript. These enable us to write modern JavaScript code with bleeding edge language features and maintain confidence that our code will run properly in a broad range of browsers and browser versions.
  • Development tools, such as IDEs, IntelliSense, bundlers, debuggers, linters, and formatters. This category speaks for itself.
  • Automated testing, which enables us to deploy our changes with confidence that core user flows have not been impacted.
  • Continuous integration and delivery, which enable us to build, test, and deploy code to production at the push of a single button.

Because we’ve become so accustomed to the value of automations such as these, it can be hard to imagine our lives without them. But this is how it used to be! Let’s try not to take these things for granted, and instead remind ourselves that we are fortunate to develop software in today’s technological ecosystem.

With all of this in mind, it’s important to note that there are legitimately good reasons for not building an automation, such as when:

  • You have higher priority work that requires more immediate attention, such as critical bug fixes or an imminent feature work deadline.
  • You aren’t sure yet if the task or process will be needed long-term and need more time to verify that it will be.

Let me know in the comments or on Twitter of other examples of powerful automations that we tend to take for granted, or other good reasons for postponing an automation!

The remainder of this post focuses on a simple automation I built for my Gatsby blog. We’ll take a look at why I built it, what it does, and how it works!

Cross-Posting Blogs

If you are a blogger then you are more than familiar with the struggle for visibility. You want people to read your work (why else would you have created it?), but there are a lot of other bloggers out there, some of whom have the advantage of being sponsored by well-known, respected organizations. Though our motives may differ, we all share a common desire to grow a readership, and so we are forced to figure out how to survive in a competitive market.

Making your content accessible in more places than just your personal website is an easy way to increase your visibility, and you can do so by cross-posting to other blogging platforms, such as DEV and Medium. These platforms let you categorize and tag your content and will automatically point interested readers in your direction. On Medium you can even get paid!

Note: When cross-posting, be sure to use canonical URLs to point back to the original content on your website so that search engines don’t penalize you for duplicate content.

If your personal blog is built with Gatsby’s blog starter (as mine is), you can cross-post to DEV extremely easily. Both platforms are powered by Markdown (for post content) and Front Matter (for post metadata), and although there are a few adjustments necessary, it’s about as close to 1:1 as it gets. One notable difference, however, is the syntax each platform uses for third-party video embeds.

Embedded Videos

Adding videos to your content can be extremely useful for conveying information that is hard to capture in another form, such as text or images. Here’s a somewhat meta video demonstrating how it looks in one of my blog posts:

Out of the box, Gatsby’s blog starter does not support video embeds, but it’s easy to add support by installing the gatsby-remark-embed-video plugin. With this plugin, you can embed videos into your posts using the following syntax:

# An Awesome Video

Check out this awesome video:

`youtube: 12345abcde`

This will embed the Youtube video with ID 12345abcde. On DEV, however, embedding the same Youtube video is done like this:

# An Awesome Video

Check out this awesome video:

{% youtube 12345abcde %}

This is because DEV’s third-party embed syntax is based on the Liquid templating language, which DEV also supports in its Markdown.

A Few Solutions

As with any problem, there are multiple approaches we could take to solve this issue. Two main ideas came to my mind:

  1. Write all video embeds using gatsby-remark-embed-video syntax. Before cross-posting, go through and update all video embeds to use DEV syntax. These updates could be made manually, but it would be better to automate this with a Regex find/replace, which would mitigate the risk of errors.
  2. Write all video embeds using DEV syntax and figure out how to support this syntax in a Gatsby blog.
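For comparison, the find/replace from option 1 could be sketched roughly as follows. This is only an illustration: toDevEmbeds and the pattern are hypothetical names introduced here, and the pattern assumes that each embed sits on its own line and uses a bare video ID rather than a full URL.

```javascript
// Hypothetical helper (not part of any plugin): rewrites gatsby-remark-embed-video
// style Youtube embeds into DEV's Liquid tag syntax before cross-posting.
// Assumes one embed per line, written with a bare video ID.
const GATSBY_YOUTUBE_EMBED = /^`youtube:\s*([\w-]+)`$/gm

function toDevEmbeds(markdown) {
  // $1 is the captured video ID
  return markdown.replace(GATSBY_YOUTUBE_EMBED, '{% youtube $1 %}')
}

const post = [
  '# An Awesome Video',
  '',
  'Check out this awesome video:',
  '',
  '`youtube: 12345abcde`',
].join('\n')

console.log(toDevEmbeds(post))
```

Even automated, this remains an extra step that has to be remembered every time a post is cross-posted.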

Option 2 is better for a few reasons:

  1. The embed syntax becomes consistent across both platforms.
  2. No extra update step is needed when preparing a blog post for cross-posting, which both saves time and prevents errors in the future.
  3. I get to learn how to write a Gatsby markdown plugin!

This brings us to the meat of the post: building a custom plugin that lets us embed Youtube videos using DEV’s embed syntax. Before diving into the implementation, let’s first briefly look at how Gatsby works with your markdown source files.

Gatsby and Markdown

Thanks to Gatsby’s flexible plugin architecture, populating a blog from Markdown source files is a breeze. For a detailed tutorial on how to do this, check out Creating a Blog with Gatsby. There are a few core plugins involved, and the remainder of this post assumes that these plugins are installed and configured:

  • gatsby-source-filesystem, which loads the Markdown source files from disk
  • gatsby-transformer-remark, which parses those files and compiles them to HTML

By default, gatsby-transformer-remark’s HTML output isn’t much more than a 1:1 representation of the Markdown input, for example:

  • An h1 for each #
  • An li for each 1. or * in a list
  • A wrapping ol/ul around each set of lis

For most types of blog post content this is exactly what we want, but there are situations where the compiled HTML needs to be either 1) more sophisticated or 2) changed completely. Our Youtube video embed case is an example of the latter, but let’s briefly take a look at an example of the former! Consider the following Markdown, which renders an image with some alt text:

![I'm the alt text](my-amazing-image.png)

By default, gatsby-transformer-remark will produce the following HTML output for this Markdown input:

<p>
  <img src="my-amazing-image.png" alt="I'm the alt text">
</p>

Whilst this output is completely functional, it isn’t optimized for the modern web. Instead of producing a simple img tag with a single src attribute, it would be much better to produce a fully responsive image, complete with srcset and sizes that will ensure the best experience for a broad range of devices:

<p>
  <img
    srcset="my-amazing-image-320w.jpg 320w,
            my-amazing-image-480w.jpg 480w,
            my-amazing-image-800w.jpg 800w"
    sizes="(max-width: 320px) 280px,
            (max-width: 480px) 440px,
            800px"
    src="my-amazing-image-800w.jpg"
    alt="I'm the alt text">
</p>

Responsive images are far superior to simple images.

Instead of the developer deciding at implementation time which image to load, responsive images enable the browser to decide at run time.

We want to hand off this decision to the browser because the browser is in the best position to make it! The browser has the most information about the user’s browsing context and can take into account any or all of the following factors:

  • Screen size
  • Device orientation
  • Pixel density
  • Current network conditions
  • The user’s current data-saver preferences

Ultimately, the user benefits. However, getting Gatsby to render a functional responsive image requires not only an extra transformation step when compiling Markdown to HTML, but also an image processing step that produces all of the required image sizes.

How can we do this? With more plugins!

Plugins Within Plugins

Customizing the behavior of gatsby-transformer-remark requires hooking into its internals. Luckily for us, the gatsby-transformer-remark plugin itself can be customized with plugins! For example, we can easily solve the responsive image problem by leveraging the great gatsby-remark-images plugin. In addition to providing the srcset and sizes attributes and resizing the original image, it also renders an elastic container to prevent layout jumps and supports the “blur up” placeholder loading effect. Amazing!
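As a rough sketch, the nesting might look like this in gatsby-config.js. The maxWidth value is an arbitrary example, the other plugins a real blog needs (such as gatsby-source-filesystem) are omitted for brevity, and the config constant exists only to keep the snippet self-contained.

```javascript
// gatsby-config.js (sketch): gatsby-remark-images is listed as a sub-plugin of
// gatsby-transformer-remark, not as a top-level Gatsby plugin.
const config = {
  plugins: [
    // gatsby-remark-images relies on gatsby-plugin-sharp to resize images
    'gatsby-plugin-sharp',
    {
      resolve: 'gatsby-transformer-remark',
      options: {
        plugins: [
          {
            resolve: 'gatsby-remark-images',
            // Example value only: the maximum rendered width of content images, in pixels
            options: { maxWidth: 800 },
          },
        ],
      },
    },
  ],
}

module.exports = config
```

The custom plugin built later in this post is registered in the same inner plugins array.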

With all of our responsive image needs not only met but exceeded, we can return our focus to the Youtube video embed problem.

How a gatsby-transformer-remark Plugin Works

Before jumping into the code for our custom plugin, we need to know a bit more about how plugins for gatsby-transformer-remark work.

Plugins written for gatsby-transformer-remark define additional transformations that should be applied to your Markdown before it gets compiled to the final HTML that is rendered on your Gatsby site.

Thankfully, we don’t have to apply these transformations to raw Markdown source strings, which would be messy and inefficient.

Abstract Syntax Trees

gatsby-transformer-remark does the heavy lifting of parsing the raw Markdown source strings into Abstract Syntax Trees, or ASTs. If you aren’t familiar with the concept of an AST, don’t be intimidated! It’s just a fancy name for a simple idea:

An Abstract Syntax Tree is a tree representation of a source code string. Each node in the tree represents a construct from the source code.

An AST begins as a 1:1 reflection of the source code string from which it was built, meaning that it could be traversed and compiled back into the original string if needed. Sometimes it’s useful to operate on an unaltered AST. For example, our good friend ESLint examines your source code’s unaltered AST (rather than the source code itself) for issues. Other times it’s useful to have ASTs undergo mutating transformations to produce new ASTs, which are no longer equivalent to the original source. For example, many compilers will automatically optimize code by identifying and rewriting parts of the code’s AST that can be made to run more efficiently.

Our plugin is an example of the second scenario. We want to transform our Markdown ASTs in such a way that instances of the Youtube video embed string are replaced with embed HTML for the specified video.

Markdown ASTs

Internally, gatsby-transformer-remark uses the remark processor to build ASTs that comply with the MDAST spec (short for Markdown Abstract Syntax Tree). Among other things, this spec defines the various node types that can exist in a Markdown AST, such as image, text and inlineCode. Consider the following Markdown:

I'm a paragraph containing `inline code`!

The resulting MDAST tree is as follows (with some irrelevant metadata removed for brevity):

{
  "type": "root",
  "children": [
    {
      "type": "paragraph",
      "children": [
        {
          "type": "text",
          "value": "I'm a paragraph containing "
        },
        {
          "type": "inlineCode",
          "value": "inline code"
        },
        {
          "type": "text",
          "value": "!"
        }
      ]
    }
  ]
}

Notice how the nodes in the AST map directly to the constructs in the Markdown source: text -> inlineCode -> text, nested together under a paragraph.
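To make the mapping concrete, here is a minimal sketch that compiles the tree above back into its Markdown source. The toMarkdown function is a hypothetical helper that handles only the node types in this example; it is not how remark itself serializes trees.

```javascript
// Hypothetical serializer covering only the node types used in the example tree.
function toMarkdown(node) {
  switch (node.type) {
    case 'root':
      // Top-level blocks are separated by a blank line
      return node.children.map(toMarkdown).join('\n\n')
    case 'paragraph':
      return node.children.map(toMarkdown).join('')
    case 'text':
      return node.value
    case 'inlineCode':
      return '`' + node.value + '`'
    default:
      throw new Error(`Unsupported node type: ${node.type}`)
  }
}

const tree = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        { type: 'text', value: "I'm a paragraph containing " },
        { type: 'inlineCode', value: 'inline code' },
        { type: 'text', value: '!' },
      ],
    },
  ],
}

console.log(toMarkdown(tree))
// I'm a paragraph containing `inline code`!
```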

Writing a transformer plugin for gatsby-transformer-remark boils down to traversing Markdown ASTs (such as the one above) and making changes to relevant nodes. Our Youtube video embed plugin simply needs to do the following:

  1. Traverse the AST looking for nodes of type text
  2. If the node’s value matches the DEV video embed syntax, transform it!
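The traversal in step 1 can be sketched with a few lines of plain recursion. The walk function below is a hypothetical stand-in written for illustration, not a library API, and it runs against a hand-built copy of the tree shown earlier.

```javascript
// Hypothetical helper: depth-first walk that calls `visitor` once for every
// node whose type matches `type`.
function walk(node, type, visitor) {
  if (node.type === type) visitor(node)
  // Leaf nodes (text, inlineCode) have no children array
  for (const child of node.children || []) {
    walk(child, type, visitor)
  }
}

const tree = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        { type: 'text', value: "I'm a paragraph containing " },
        { type: 'inlineCode', value: 'inline code' },
        { type: 'text', value: '!' },
      ],
    },
  ],
}

const textValues = []
walk(tree, 'text', (node) => textValues.push(node.value))

console.log(textValues)
// [ "I'm a paragraph containing ", '!' ]
```

The inlineCode node is skipped because only text nodes were requested.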

Now that we have an idea of how a gatsby-transformer-remark plugin works and what ours needs to do, let’s jump into the implementation!

Building the Plugin

The Gatsby docs do a great job of explaining how to create custom plugins. For simplicity, the plugin we build here will only support Youtube video embeds, but a fun open source project would be a plugin that supports all of DEV’s third party embed tags (they have a lot!) and possibly even the Liquid templating language. You heard it here first!

We’ll create our embed plugin as a local plugin, meaning that it is scoped to a specific Gatsby project and lives in the project’s repository under the plugins directory of the project root. Create a directory for the Youtube video embed plugin:

cd path/to/gatsby/project
mkdir plugins # If it doesn't exist already
cd plugins
mkdir youtube-video-embed

The only files needed to create a plugin are package.json and index.js. Create these in the plugin directory:

cd youtube-video-embed
npm init # You can accept all of the default values
touch index.js

A plugin for gatsby-transformer-remark is simply a function that receives a Markdown AST as a parameter and alters it. Once configured, gatsby-transformer-remark will invoke this function once for each Markdown node, and there is one Markdown node for each Markdown source file that gatsby-source-filesystem loads from our project.

We’ll implement our plugin function in index.js. Let’s open it for editing and add the following code:

module.exports = ({ markdownAST }) => {
  console.log('video embed!', JSON.stringify(markdownAST))
}

Notice how markdownAST can be conveniently destructured from the first parameter to the plugin function. To make sure things are working, we’re currently just logging the AST of each Markdown node to the build console.

Configuring the Plugin

Next, we need to configure our Gatsby project to run the new plugin, which we accomplish by listing the plugin in gatsby-config.js. Since this is a plugin for gatsby-transformer-remark—not Gatsby itself—we list it under gatsby-transformer-remark’s own plugin list:

module.exports = {
  /* ... */
  plugins: [
    /* ... */
    {
      resolve: 'gatsby-transformer-remark',
      options: {
        plugins: [
          'youtube-video-embed'
        ],
      },
    },
    /* ... */
  ]
}

Note: Our plugin will soon render an iframe that loads the embedded video. Because of this, if your project also relies on the gatsby-remark-responsive-iframe plugin, you have to list our plugin first:

plugins: [
  'youtube-video-embed',
  'gatsby-remark-responsive-iframe'
]

With the configuration change in place, you should be able to run your Gatsby site (gatsby develop, or the equivalent npm script in your project) and see the AST of each Markdown node logged to the build console. If so, things are working! Now let’s make the plugin do something useful.

Traversing the AST

As mentioned earlier, we need to search markdownAST for nodes of type text so that we can transform them. We could write our own loops and recursion to do this, but instead let’s have the unist-util-visit library do it for us:

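npm i unist-util-visit@2

Note: The version is pinned to 2.x because later major versions of unist-util-visit are published as ES modules only and cannot be loaded with require, which the code below uses.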

This library exposes a visit function, which allows us to traverse a Markdown AST by specifying the following:

  1. The type of node we want to visit (text), and
  2. A function to be called once for each node of the specified type

In index.js, import the library and call visit:

const visit = require('unist-util-visit')

module.exports = ({ markdownAST }) => {
  visit(markdownAST, 'text', (node) => {
    // We're at a text node!
  })
}

The next step is to check the value of each visited text node to see if it matches DEV’s Youtube video embed syntax. Recall that this syntax is {% youtube 12345abcde %}, where 12345abcde is the ID of the Youtube video to embed.

Let’s define a simple regular expression that matches the syntax and use it to check node.value. Each time we find a match, we log the video ID (which we can get from a match group) to the console:

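const visit = require('unist-util-visit')

// Matches a text node whose entire value is a DEV Youtube embed tag.
// The capture group is the video ID; [\w-] also allows hyphens, which Youtube IDs can contain.
const YOUTUBE_REGEX = /^{% youtube ([\w-]+) %}$/

module.exports = ({ markdownAST }) => {
  visit(markdownAST, 'text', (node) => {
    const match = YOUTUBE_REGEX.exec(node.value)
    if (match) {
      // match[1] holds the captured video ID
      console.log('Found one! The ID is', match[1])
    }
  })
}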

Assuming that you have added some Youtube video embeds to your Markdown files, you should see the video IDs logged to the console when you run this:

Found one! The ID is zcjuXR8obvI
Found one! The ID is Q2CNno4JuJM
Found one! The ID is CpYLXl0Rm74
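If you would rather check the pattern without running a full Gatsby build, a quick sketch in plain Node does the job. Note the assumptions: getVideoId is a hypothetical helper, the braces are escaped for clarity, and the ID group is written as [\w-]+ so that IDs containing hyphens also match.

```javascript
// Anchored with ^ and $ so the tag must be the entire text of the node.
// [\w-]+ accepts letters, digits, underscores, and hyphens in the video ID.
const EMBED_REGEX = /^\{% youtube ([\w-]+) %\}$/

// Hypothetical helper: returns the video ID, or null if the text is not an embed tag.
function getVideoId(text) {
  const match = EMBED_REGEX.exec(text)
  return match ? match[1] : null
}

console.log(getVideoId('{% youtube 12345abcde %}'))       // 12345abcde
console.log(getVideoId('{% youtube a-b_c %}'))            // a-b_c
console.log(getVideoId('Watch {% youtube 12345abcde %}')) // null
console.log(getVideoId('{% vimeo 12345abcde %}'))         // null
```

Because of the anchors, a tag that appears in the middle of a sentence is left alone.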

We’re so close! We have identified the text nodes we care about, and all that remains is to transform them.

Transforming the AST

Remember earlier when we discussed mutating ASTs? That’s exactly what we’re going to do! The MDAST spec defines an html node type that represents raw HTML within a Markdown source file. Let’s change the type of each video embed node to html and change the value to an HTML string that defines an iframe pointing to the embedded Youtube video:

// visit and YOUTUBE_REGEX are defined as in the previous snippet
module.exports = ({ markdownAST }) => {
  visit(markdownAST, 'text', (node) => {
    const match = YOUTUBE_REGEX.exec(node.value)
    if (match) {
      const videoId = match[1]

      // Mutate the node in place: the value of an html node is emitted as raw HTML
      node.type = 'html'
      node.value = `
        <iframe
          type="text/html"
          width="640"
          height="360"
          frameborder="0"
          src="https://www.youtube.com/embed/${videoId}"
        ></iframe>
      `
    }
  })
}

And that’s it! If you rebuild your Gatsby site and open it in the browser, you will see embedded Youtube videos powered by the DEV embed syntax.
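To see the mutation in isolation, the same transformation can be run against a hand-built AST in plain Node. This is a sketch under a few assumptions: walk is a hypothetical stand-in for unist-util-visit, embedYoutubeVideos is a name chosen for illustration, and the iframe markup is shortened.

```javascript
const YOUTUBE_REGEX = /^\{% youtube ([\w-]+) %\}$/

// Hypothetical stand-in for unist-util-visit: depth-first walk over nodes of one type.
function walk(node, type, visitor) {
  if (node.type === type) visitor(node)
  for (const child of node.children || []) {
    walk(child, type, visitor)
  }
}

// Same idea as the plugin: turn matching text nodes into html nodes, in place.
function embedYoutubeVideos({ markdownAST }) {
  walk(markdownAST, 'text', (node) => {
    const match = YOUTUBE_REGEX.exec(node.value)
    if (match) {
      node.type = 'html'
      node.value = `<iframe src="https://www.youtube.com/embed/${match[1]}"></iframe>`
    }
  })
  return markdownAST
}

// Hand-built AST for a post with one ordinary paragraph and one embed paragraph
const ast = {
  type: 'root',
  children: [
    { type: 'paragraph', children: [{ type: 'text', value: 'Check out this awesome video:' }] },
    { type: 'paragraph', children: [{ type: 'text', value: '{% youtube 12345abcde %}' }] },
  ],
}

embedYoutubeVideos({ markdownAST: ast })
console.log(ast.children[0].children[0].type) // text
console.log(ast.children[1].children[0].type) // html
```

Only the node holding the embed tag changes; the ordinary paragraph is untouched.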

Possible Improvements

The final plugin implementation above is intentionally minimal in order to be as digestible as possible, but there are many improvements that could be made when integrating this into your project. Here are a few thoughts:

  • The HTML string only passes a few parameters to the Youtube iframe player, but there are a variety of parameters you can use to configure the embedded video to your liking.
  • Whilst this implementation works fine for only supporting Youtube video embeds, it doesn’t scale well in its current state. To expand support to other video providers (such as Vimeo or Twitch) or other embed types (such as code snippets, music, tweets, etc.), we would want to build a more generic system in which provider-specific details and behavior are abstracted. To see an example, check out the source code for gatsby-remark-embed-video, which I mentioned earlier in this post.
  • This implementation will transform text nodes anywhere in the Markdown AST, but we probably only want the transformation to apply to top-level paragraphs that only contain a single text node.
  • This implementation does not allow different video embed instances to be customized. For example, we may want some videos to autoplay and others to loop infinitely. In order to enable this, we’d want to extend the embed syntax to allow different parameters to be specified: {% youtube 12345abcde loop=true autoplay=false %}, for example.
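As one possible starting point for the last idea, the extended syntax could be parsed along these lines. Everything here is a hypothetical sketch: the key=value format, the parseEmbed and buildEmbedUrl helpers, and the mapping of true/false onto the 1/0 values that the Youtube player parameters use.

```javascript
// Matches {% youtube ID key=value key=value ... %} with zero or more options.
const EMBED_WITH_OPTIONS = /^\{% youtube ([\w-]+)((?: \w+=\w+)*) %\}$/

// Hypothetical helper: returns { videoId, options } or null if the text is not an embed tag.
function parseEmbed(text) {
  const match = EMBED_WITH_OPTIONS.exec(text)
  if (!match) return null

  const [, videoId, rawOptions] = match
  const options = {}
  for (const pair of rawOptions.trim().split(' ').filter(Boolean)) {
    const [key, value] = pair.split('=')
    options[key] = value
  }
  return { videoId, options }
}

// Hypothetical helper: builds the iframe src, passing the options as query parameters.
function buildEmbedUrl({ videoId, options }) {
  const params = new URLSearchParams()
  for (const [key, value] of Object.entries(options)) {
    // Translate true/false into 1/0; pass any other value through unchanged
    params.set(key, value === 'true' ? '1' : value === 'false' ? '0' : value)
  }
  const query = params.toString()
  return `https://www.youtube.com/embed/${videoId}${query ? `?${query}` : ''}`
}

const embed = parseEmbed('{% youtube 12345abcde loop=true autoplay=false %}')
console.log(buildEmbedUrl(embed))
// https://www.youtube.com/embed/12345abcde?loop=1&autoplay=0
```

Some player parameters have extra requirements (looping a single video, for instance, also involves the playlist parameter), so check the player documentation before relying on any particular option.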

Beyond Gatsby Plugins

One of my favorite things about being a programmer is having the ability to make my life and others’ lives easier by automating processes with code. Building automations not only saves precious time, but also prevents the silly errors often caused by manual grunt work on repetitive tasks. There are few better feelings than experiencing the fruits of these efforts and then fondly thanking the you-of-the-past for anticipating their value and setting aside time to build them.

The simple Markdown transformer plugin we built is just a small example of this. With the plugin in place, we can embed Youtube videos in our Gatsby blogs and cross-post to DEV without thinking twice about whether the video embeds will work properly.

Tell me in the comments or on Twitter about an automation that you have written for yourself or have shared with the world!

Happy coding!

Thanks for reading!

I’ve been planning on starting a tech blog for several years now, which is a feeling that I’m sure some of you can relate to. Only recently have I started taking it seriously, and so far the reception has been very positive. I’d like to thank everyone for reading, liking, commenting, and (re)tweeting, etc. Let me know what I can do to help you on your journey!


Written by Daniel Worsnup, a Software Engineer.