I need a function that determines whether a string contains HTML, so that I know whether I'm dealing with plain text or HTML.
It seems simple enough in C# using HtmlAgilityPack: recursively walk the tree of nodes, and if any node is an element (or a comment), we say "Yes, it's HTML".

    using System.Linq;
    using HtmlAgilityPack;

    public static class HTMLUtility
    {
        // Returns true if the text parses to at least one element or comment node.
        public static bool ContainsHTMLElements(string text)
        {
            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(text);
            return NodeContainsHTML(doc.DocumentNode);
        }

        // Depth-first check: the node itself first, then each of its children.
        private static bool NodeContainsHTML(HtmlNode node)
        {
            return node.NodeType == HtmlNodeType.Element
                || node.NodeType == HtmlNodeType.Comment
                || node.ChildNodes.Any(NodeContainsHTML);
        }
    }

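To make the intended behaviour concrete, here is a minimal sketch of how I picture it being used, assuming the HtmlAgilityPack NuGet package is referenced. The `Demo` class, the `ContainsHtml` name and the sample strings are made up for illustration. The flat `Descendants()` check is only a compact stand-in for the recursive walk, not a claim that it is the better approach.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

public static class Demo // hypothetical name, for illustration only
{
    // Same rule as the recursive version: any element or comment node counts as HTML.
    public static bool ContainsHtml(string text)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(text);

        // Descendants() enumerates every node below the document node.
        return doc.DocumentNode.Descendants().Any(n =>
            n.NodeType == HtmlNodeType.Element ||
            n.NodeType == HtmlNodeType.Comment);
    }

    public static void Main()
    {
        // Made-up sample inputs: plain text, inline markup, and a lone comment.
        Console.WriteLine(ContainsHtml("Just plain text."));
        Console.WriteLine(ContainsHtml("Some <b>bold</b> text."));
        Console.WriteLine(ContainsHtml("<!-- only a comment -->"));
    }
}
```

I would expect the first string to be reported as plain text and the other two as HTML, since the parser should produce only a text node for the first one.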
Am I missing anything? Thanks!
Can `text` be just any HTML element, or is it supposed to be an entire HTML document? – Ron Beyer Nov 17 '16 at 20:14
. – t3chb0t Nov 17 '16 at 20:32