Html Agility Pack (HAP)

This library is sponsored by Entity Framework Extensions

 

What's Html Agility Pack?

HAP is an HTML parser written in C# that reads and writes the DOM and supports plain XPath and XSLT.

What's web scraping in C#?

Web scraping is a technique for extracting data from a website programmatically. It can be done in any language, including C#.

Is web scraping legal?

That's a gray zone! There is no single official answer, and almost every company runs some kind of web scraping program. In short, crawl politely, don't flood a website with requests, and you should generally be fine.
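
As a rough illustration of what "polite crawling" could look like in code, here is a minimal sketch that pauses between consecutive requests. The `PoliteCrawler` class, its `LoadAll` method, and the delay value are hypothetical names and choices made for this example; they are not part of HAP. The loader is passed in as a delegate, so any loading call can be plugged in.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper (not part of HAP): loads URLs one at a time with a pause in between.
public static class PoliteCrawler
{
    public static List<T> LoadAll<T>(IEnumerable<string> urls, Func<string, T> load, TimeSpan delay)
    {
        var results = new List<T>();
        var first = true;
        foreach (var url in urls)
        {
            // Wait between requests, but not before the first one.
            if (!first)
            {
                Thread.Sleep(delay);
            }
            first = false;
            results.Add(load(url));
        }
        return results;
    }
}
```

With HAP, this could be called as `PoliteCrawler.LoadAll(urls, u => web.Load(u), TimeSpan.FromSeconds(2))`, where `web` is an `HtmlWeb` instance and the two-second delay is an arbitrary assumption. A suitable delay depends on the site being crawled.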

When is the v2.x coming?

There is no official date, but the work is in progress. Many improvements are already planned to make web scraping even easier!

Which 3rd party libraries?

You can enhance HAP with some third-party libraries:

Where can I find Html Agility Pack examples?

Online examples are now available!

Online Examples

We need your help to support Html Agility Pack!

Html Agility Pack is FREE and always will be.

However, last year alone, we spent over 3000 hours maintaining our free projects! We need resources to keep developing our open-source projects.

We highly appreciate any contribution!


> 3,000+ Requests answered per year
> $100,000 USD investment per year
> 500 Commits per year
> 100 Releases per year


HTML Parser

Load and parse HTML

HAP - Parser Example
    // Three alternative ways to load a document; use one of them per scope.

    // From File
    var doc = new HtmlDocument();
    doc.Load(filePath);

    // From String
    var doc = new HtmlDocument();
    doc.LoadHtml(html);

    // From Web
    var url = "http://html-agility-pack.net/";
    var web = new HtmlWeb();
    var doc = web.Load(url);

HTML Selectors

Select HtmlNode, Element, and Attributes:

HAP - Selectors Examples
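    // With XPath
    // Note: SelectNodes returns null when nothing matches, so check the result on real pages.
    var value = doc.DocumentNode
        .SelectNodes("//td/input")
        .First()
        .Attributes["value"].Value;

    // With LINQ
    // GetAttributeValue avoids a null reference when a node has no class attribute.
    var nodes = doc.DocumentNode.Descendants("input")
        .Select(y => y.Descendants()
            .Where(x => x.GetAttributeValue("class", "") == "box"))
        .ToList();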

HTML Manipulation

Manipulate HtmlNode, Element, and Attributes:

HAP - Manipulation Example
    var doc = new HtmlDocument();
    doc.LoadHtml(html);

    // InnerHtml
    var innerHtml = doc.DocumentNode.InnerHtml;

    // InnerText
    var innerText = doc.DocumentNode.InnerText;
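
The example above only reads markup. As a complement, here is a minimal sketch of changing a document: it sets an attribute on the first `input` element and appends a paragraph to `body`. The `ManipulationSketch` class, the `Rewrite` method, and the sample class name and paragraph text are assumptions made for illustration.

```csharp
using HtmlAgilityPack;

// Hypothetical helper for illustration; only the HAP calls inside are library API.
public static class ManipulationSketch
{
    public static string Rewrite(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Set (or overwrite) the class attribute on the first <input>, if there is one.
        var input = doc.DocumentNode.SelectSingleNode("//input");
        if (input != null)
        {
            input.SetAttributeValue("class", "box");
        }

        // Append a new paragraph as the last child of <body>, if there is one.
        var body = doc.DocumentNode.SelectSingleNode("//body");
        if (body != null)
        {
            body.AppendChild(HtmlNode.CreateNode("<p>Added by HAP</p>"));
        }

        // Serialize the modified tree back to a string.
        return doc.DocumentNode.OuterHtml;
    }
}
```

Both lookups are null-checked because `SelectSingleNode` returns null when the XPath matches nothing.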

HTML Traversing

Traverse HtmlNode, Element, and Attributes:

HAP - Traversing Example
    var doc = new HtmlDocument();
    doc.LoadHtml(html);

    // Descendants
    var nodes = doc.DocumentNode.Descendants("input");
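
Besides `Descendants`, a tree can also be walked one level at a time. The following minimal sketch lists the element children of the first node matching an XPath expression. The `TraversalSketch` class and the `ChildElementNames` method are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; only the HAP calls inside are library API.
public static class TraversalSketch
{
    // Returns the tag names of the element children of the first node matching the XPath.
    public static List<string> ChildElementNames(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var node = doc.DocumentNode.SelectSingleNode(xpath);
        if (node == null)
        {
            return new List<string>();
        }

        return node.ChildNodes
            .Where(c => c.NodeType == HtmlNodeType.Element) // skip text and comment nodes
            .Select(c => c.Name)
            .ToList();
    }
}
```

`ChildNodes` also contains text and comment nodes, which is why the sketch filters on `HtmlNodeType.Element`. `ParentNode` gives the opposite direction for walking back up the tree.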