Saturday, May 19, 2018

HtmlDocumentFactory library - Working with HTML documents in .NET

Sometimes you need to create or parse HTML documents programmatically. The Windows Forms framework in .NET has a handy HtmlDocument class for HTML manipulation, but it only works with documents obtained from a WebBrowser control: there is no public constructor for creating a new HtmlDocument instance from scratch or from arbitrary HTML content. This limitation can be worked around by accessing the MSHTML COM interfaces directly and invoking the non-public HtmlDocument constructor via reflection. I have implemented this approach in the HtmlDocumentFactory library.
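The reflection part of this idea can be shown in isolation. The following is a minimal sketch, not the library's actual code: Wrapper is a hypothetical stand-in for a type that has no public constructor, and Demo.Create is a hypothetical helper. The real HtmlDocument constructor takes MSHTML-related arguments that are not reproduced here.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for a class that exposes no public constructor.
sealed class Wrapper
{
    public string Payload { get; }

    private Wrapper(string payload)
    {
        Payload = payload;
    }
}

static class Demo
{
    // Looks up the non-public instance constructor and invokes it.
    public static Wrapper Create(string payload)
    {
        ConstructorInfo ctor = typeof(Wrapper).GetConstructor(
            BindingFlags.Instance | BindingFlags.NonPublic,
            null,                       // default binder
            new[] { typeof(string) },   // parameter types of the target constructor
            null);                      // no parameter modifiers

        if (ctor == null)
            throw new MissingMethodException("Wrapper", ".ctor");

        return (Wrapper)ctor.Invoke(new object[] { payload });
    }

    static void Main()
    {
        Wrapper w = Create("<p>hello</p>");
        Console.WriteLine(w.Payload);
    }
}
```

Because the constructor is not part of the public API, its signature may change between framework versions, so code of this kind should check for a null ConstructorInfo rather than assume the lookup succeeds.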

Source code: https://github.com/MSDN-WhiteKnight/HtmlDocumentFactory

Download binaries: https://yadi.sk/d/lPk5bGov3WCXsD

Usage

The library consists of a single static class, HtmlLib.HtmlDocumentFactory, which provides helper methods for creating and releasing HTML documents and for converting them to strings. Call CreateHtmlDocument to create a new document, modify its content through the usual HtmlDocument methods, call HtmlDocumentToString to get the whole document as an HTML string, and call ReleaseHtmlDocument when you no longer need the document.
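The create, modify, serialize and release workflow described above might look roughly like the sketch below. The three method names come from the library, but their exact signatures are assumptions here (a parameterless CreateHtmlDocument, and HtmlDocumentToString and ReleaseHtmlDocument each taking the document as their only argument), so check the source code before relying on them. The sketch also assumes a project that references System.Windows.Forms and the library assembly.

```csharp
using System;
using System.Windows.Forms;
using HtmlLib;

class Program
{
    [STAThread] // MSHTML is a COM component, so a single-threaded apartment is the safe choice
    static void Main()
    {
        // Assumed signature: no arguments, returns an empty document.
        HtmlDocument doc = HtmlDocumentFactory.CreateHtmlDocument();
        try
        {
            // Standard HtmlDocument/HtmlElement members from Windows Forms.
            HtmlElement paragraph = doc.CreateElement("p");
            paragraph.InnerText = "Hello from HtmlDocumentFactory";

            // Body may be null if the new document has no body element yet.
            if (doc.Body != null)
            {
                doc.Body.AppendChild(paragraph);
            }

            // Assumed signature: takes the document, returns its HTML as a string.
            string html = HtmlDocumentFactory.HtmlDocumentToString(doc);
            Console.WriteLine(html);
        }
        finally
        {
            // Assumed signature: takes the document to release.
            HtmlDocumentFactory.ReleaseHtmlDocument(doc);
        }
    }
}
```

Putting the release call in a finally block means the underlying COM object is freed even if the document manipulation throws.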