Create a Dynamic Sitemap with Next.js

JW

Jonathan Wong / March 02, 2020

3 min read––– views

To improve your Search Engine Optimization (SEO), you might need to add a sitemap or robots.txt file to your Next.js site.

A sitemap defines the relationship between pages of your site. Search engines utilize this file to more accurately index your site. You can also provide additional information such as last updated time, how frequently the page changes, and more.

A robots.txt file tells search engines which pages or files the crawler can or can't request from your site.

Static Sitemaps#

If your site does not update frequently, you might currently have a static sitemap. This is a basic .xml file defining the content of your site. Here's a simple example:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>https://leerob.io</loc>
    </url>
    <url>
        <loc>https://leerob.io/blog</loc>
    </url>
</urlset>

As your site scales, you will probably want to create your sitemap dynamically.

Dynamic Sitemaps#

If your site frequently changes, you should dynamically create a sitemap. Let's first look at an example where your site content is file-based (e.g., contained inside the /pages directory).

First, let's add globby so we can fetch a list of routes.

$ yarn add --dev globby

Next, we can create a Node script at scripts/generate-sitemap.js. This file will dynamically build a sitemap based on your /pages directory.

const fs = require('fs');

const globby = require('globby');
const prettier = require('prettier');

(async () => {
  const prettierConfig = await prettier.resolveConfig('./.prettierrc.js');

  // Ignore Next.js specific files (e.g., _app.js) and API routes.
  const pages = await globby([
    'pages/**/*{.js,.mdx}',
    '!pages/_*.js',
    '!pages/api'
  ]);
  const sitemap = `
        <?xml version="1.0" encoding="UTF-8"?>
        <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
            ${pages
              .map((page) => {
                const path = page
                  .replace('pages', '')
                  .replace('.js', '')
                  .replace('.mdx', '');
                const route = path === '/index' ? '' : path;

                return `
                        <url>
                            <loc>${`https://yoursitehere.com${route}`}</loc>
                        </url>
                    `;
              })
              .join('')}
        </urlset>
    `;

  // If you're not using Prettier, you can remove this.
  const formatted = prettier.format(sitemap, {
    ...prettierConfig,
    parser: 'html'
  });

  fs.writeFileSync('public/sitemap.xml', formatted);
})();

Finally, override your next.config.js to run this script as part of the build process. Your generated file gets created at public/sitemap.xml which then gets moved to the root of your site (requires Next v9.0+).

module.exports = {
  webpack: (config, { isServer }) => {
    if (isServer) {
      require('./scripts/generate-sitemap');
    }

    return config;
  }
};

External Content#

If you have externally hosted data (e.g., a CMS), you'll need to make an API request before you can create your sitemap. This implementation will vary depending on your data source, but the idea is the same. To demonstrate, I've created an example using placeholder data.

First, create a new file at pages/sitemap.xml.js.

import React from 'react';

class Sitemap extends React.Component {
  static async getInitialProps({ res }) {
    const request = await fetch(EXTERNAL_DATA_URL);
    const posts = await request.json();

    res.setHeader('Content-Type', 'text/xml');
    res.write(createSitemap(posts));
    res.end();
  }
}

export default Sitemap;

When the route /sitemap.xml is initially loaded, we will fetch posts from an external data source and then write an XML file as the response.

import React from 'react';

const EXTERNAL_DATA_URL = 'https://jsonplaceholder.typicode.com/posts';

const createSitemap = (posts) => `<?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        ${posts
          .map(({ id }) => {
            return `
                    <url>
                        <loc>${`${EXTERNAL_DATA_URL}/${id}`}</loc>
                    </url>
                `;
          })
          .join('')}
    </urlset>
    `;

class Sitemap extends React.Component {
  static async getInitialProps({ res }) {
    const request = await fetch(EXTERNAL_DATA_URL);
    const posts = await request.json();

    res.setHeader('Content-Type', 'text/xml');
    res.write(createSitemap(posts));
    res.end();
  }
}

export default Sitemap;

Here's a condensed example of the output.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>https://jsonplaceholder.typicode.com/posts/1</loc>
    </url>
    <url>
        <loc>https://jsonplaceholder.typicode.com/posts/2</loc>
    </url>
</urlset>

Robots.Txt#

Finally, we can create a static file at public/robots.txt to define which files can be crawled and where the sitemap is located.

User-agent: *
Sitemap: https://leerob.io/sitemap.xml
Discuss on TwitterEdit on GitHub
Spotify album cover

/tools/photos