1
vote
2 answers
260 views

How to convert an HTML file into a hash in Perl?

Is there a simple way to convert an HTML file into a Perl hash, for example with an existing Perl module? I searched cpan.org but didn't find anything that does what I want. I want to do ...
0
votes
1 answer
356 views

How do I print the text content of HTML::TreeBuilder nodes on separate lines?

I'm using TreeBuilder::XPath as shown below: use strict; use warnings; use LWP::Simple; use HTML::TreeBuilder::XPath; my $url='file:///C:/Users/Rockstar/workspace/abc/globals_func.html'; my $page ...
1
vote
0 answers
175 views

How to use Perl HTML::TableExtract with rowspans and also spans within cells

I apologize for the length here, but I thought it made sense to include my small amount of progress in addition to a description of my problem. I want to extract data from some HTML pages that have ...
0
votes
1 answer
66 views

Discovering the depth and count of tables on an HTML page with Perl

I have local copies of numerous downloaded web-pages. The pages almost certainly have only a few different types of table layouts, but before looking to extract data, I first want to print out the ...
0
votes
2 answers
613 views

How to extract a column of a table from an HTML page using Perl modules?

I have the following HTML code from part of a webpage. <h2 id="failed_process">Failed Process</h2> <table border="1"> <thead> <tr> <th> ...
1
vote
2 answers
1k views

HTML parsing with a Perl script

I am trying to parse an HTML file with my Perl script. I am using a module called HTML::TreeBuilder. Here is what I have so far: use HTML::TreeBuilder; my $tree = HTML::TreeBuilder->new; ...
2
votes
1 answer
217 views

Perl HTML::Strip whitelist

Is there a way to give the module a whitelist so that it preserves certain tags? Currently, markup like this <div><b>test</b></div> stripped with this code my $hs = ...
0
votes
2 answers
382 views

Ignore Text in HTML::TreeBuilder Output Perl

I need to ignore or remove all text between HTML elements so I can generate a blank template from a given web page. I am parsing using the Perl modules HTML::TreeBuilder and HTML::Element. I ...
2
votes
2 answers
1k views

Fetch <td> text while using WWW::Mechanize to fetch <a> within that <td> tag

I'm new to working with HTML in Perl. I'm trying to fetch both the text and the links from an HTML table. Here is the HTML structure: <td>Td-Text <br> <a href="Link-I-Want" ...