al3xandru/html2md

html2md

html2md is a Python script that converts HTML (complete or fragments) into Markdown.

html2md was inspired by Aaron Swartz's html2text and adds support for missing elements that are common in HTML pages, without compromising the Markdown format.

Usage

    html2md.py [-h] [-a] [-f] [--fenced_code {github,php}] [-e ENCODING] [infile]

    Transform HTML file to Markdown

    positional arguments:
      infile

    optional arguments:
      -h, --help            show this help message and exit
      -a, --attrs           Enable element attributes in the output (custom
                            Markdown extension)
      -f, --footnotes       Enabled footnote processing (custom Markdown
                            extension)
      --fenced_code {github,php}, --fencedcode {github,php}, --fenced {github,php}
                            Enabled fenced code output
      -e ENCODING, --encoding ENCODING
                            Provide an encoding for reading the input
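For readers who want to see how these flags fit together, here is a minimal sketch of an `argparse` parser that accepts the same command line. It is reconstructed from the help text only and is not taken from html2md.py itself; the function name `build_parser` and the sample file name `page.html` are made up for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented html2md.py options."""
    parser = argparse.ArgumentParser(
        prog="html2md.py",
        description="Transform HTML file to Markdown",
    )
    # Optional positional argument: the HTML file to read.
    parser.add_argument("infile", nargs="?")
    parser.add_argument("-a", "--attrs", action="store_true",
                        help="Enable element attributes in the output")
    parser.add_argument("-f", "--footnotes", action="store_true",
                        help="Enable footnote processing")
    # Three spellings of the same option; argparse stores it as `fenced_code`.
    parser.add_argument("--fenced_code", "--fencedcode", "--fenced",
                        choices=["github", "php"],
                        help="Enable fenced code output")
    parser.add_argument("-e", "--encoding",
                        help="Provide an encoding for reading the input")
    return parser


# Parse a sample command line instead of sys.argv.
args = build_parser().parse_args(["-f", "--fenced", "github", "page.html"])
print(args.footnotes, args.fenced_code, args.infile)
```

All three spellings of the fenced code flag end up in the same `fenced_code` attribute, because `argparse` derives the destination name from the first long option.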

Using it from your code:

    import html2md
    print(html2md.html2md("<p>Getting rid of HTML with html2md. Yey!</p>"))

You can pass in different options:

  • footnotes: True|False (default False) convert footnotes
  • fenced_code: default|github|php (default: default) convert code snippets into fenced code
  • attrs: convert HTML attributes. This is a custom extension and should not be used.
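As a rough illustration of how these options might be combined, the sketch below validates them before forwarding them to the converter. It rests on two assumptions that this README does not confirm: that `html2md.html2md` takes the options as keyword arguments, and that `attrs` defaults to False. The helpers `build_options` and `convert` are hypothetical and not part of html2md.

```python
# Values documented for the fenced_code option.
FENCED_CODE_STYLES = ("default", "github", "php")


def build_options(footnotes=False, fenced_code="default", attrs=False):
    """Hypothetical helper: collect the documented options into a dict.

    ASSUMPTION: attrs defaults to False; the README does not state its default.
    """
    if fenced_code not in FENCED_CODE_STYLES:
        raise ValueError(
            "fenced_code must be one of: " + ", ".join(FENCED_CODE_STYLES)
        )
    return {
        "footnotes": bool(footnotes),
        "fenced_code": fenced_code,
        "attrs": bool(attrs),
    }


def convert(html, **kwargs):
    """Hypothetical wrapper around html2md.html2md.

    ASSUMPTION: the options are accepted as keyword arguments.
    """
    import html2md  # imported lazily so build_options works without it
    return html2md.html2md(html, **build_options(**kwargs))


print(build_options(footnotes=True, fenced_code="github"))
```

With html2md installed, and if the keyword-argument assumption holds, a call would look like `convert("<pre>x = 1</pre>", fenced_code="github")`.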

License

Short version: OK for open source projects. OK for commercial projects with my signed agreement only.

Long version: see the License file in the project.

About

Convert HTML to Markdown. In a peaceful way.
