Introduction
I wanted a programmatic way to generate and extract a 'Table of Contents' HTML snippet from existing markdown text for my Next.js blogging website, www.notionworkspaces.com [1].
These are the benefits of this approach:
- You don't have to have a Table of Contents section in all your markdown files or HTML files
- You can cut out/extract (or keep) the table of contents after generating it in the content HTML using cheerio
In this tutorial, I'll teach you a programmatic way to generate and extract a 'Table of Contents' HTML snippet from existing markdown text in Next.js.
TL;DR: working code snippet here
Original Snippet before modification
I already had a function that converted markdown text to HTML text using the remark [2] library.
export async function getPostData(id) {
  const fullPath = path.join(postsDirectory, `${id}.md`);
  const fileContents = fs.readFileSync(fullPath, 'utf8');

  // Use gray-matter to parse the post metadata section
  const matterResult = matter(fileContents);

  // Use remark to convert markdown into HTML string
  const processedContent = await remark()
    .use(html)
    .process(matterResult.content);
  const contentHtml = processedContent.toString();

  // Combine the data with the id and contentHtml
  return {
    id,
    contentHtml,
    ...matterResult.data,
  };
}
The above snippet was taken from the official Next.js getting started tutorial [3].
However, it didn't do exactly what I wanted: it didn't generate a table of contents based on the structure of the markdown data.
I googled around and found an existing library called remark-toc [4], but it didn't do exactly what I wanted.
It required a few conditions that I didn't want to entertain.
I later stumbled upon the rehype [5] library, a more recent take on processing HTML (and markdown) in Next.js.
The Working Code Snippet
This is the final code I use to generate and extract the table of contents from my markdown content.
export async function getPostData(id) {
  const fullPath = path.join(postsDirectory, `${id}.md`);
  const fileContents = fs.readFileSync(fullPath, 'utf8');

  // Use gray-matter to parse the post metadata section
  const matterResult = matter(fileContents);

  const file = await unified()
    .use(remarkParse)
    .use(remarkRehype)
    .use(rehypeSlug)
    .use(rehypeDocument)
    .use(rehypeFormat)
    .use(rehypeTOC)
    .use(rehypeStringify)
    .process(matterResult.content);

  // Extract TOC dynamically
  const $ = cheerio.load(String(file));
  const contentTOC = $("nav.toc").html();
  $("nav.toc").remove();
  const contentHtml = $.html();

  // Combine the data with the id and contentHtml
  return {
    id,
    contentHtml,
    contentTOC,
    ...matterResult.data,
  };
}
The import requirements
I used the following imports to get it all working seamlessly:
import { unified } from 'unified'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import rehypeDocument from 'rehype-document'
import rehypeFormat from 'rehype-format'
import rehypeStringify from 'rehype-stringify'
import rehypeSlug from 'rehype-slug'
import rehypeTOC from "@jsdevtools/rehype-toc";
import * as cheerio from 'cheerio';
I used cheerio [6] to build a DOM tree from the HTML text so that I could extract the TOC element with the selector nav.toc and use it as a Table of Contents snippet in my React components.
This is a screenshot of how I used this piece of code on www.notionworkspaces.com.
The dynamic table of contents is the left-hand section in the above screenshot.
I hope you found this useful!