Tagged Questions
1 vote, 2 answers, 260 views
How to convert HTML file into a hash in Perl?
Is there a simple way to convert an HTML file into a Perl hash? For example, a working Perl module or something similar?
I searched cpan.org but didn't find anything that does what I want. I want to do ...
0 votes, 1 answer, 356 views
How do I print the text content of HTML::TreeBuilder nodes on separate lines?
I'm using TreeBuilder::XPath as shown below:
use strict;
use warnings;
use LWP::Simple;
use HTML::TreeBuilder::XPath;
my $url='file:///C:/Users/Rockstar/workspace/abc/globals_func.html';
my $page ...
1 vote, 0 answers, 175 views
How to use Perl HTML::TableExtract with rowspans and also spans within cells
I apologize for the length here, but I thought it made sense to include my small amount of progress in addition to a description of my problem!
I want to extract data from some HTML pages that have ...
0 votes, 1 answer, 66 views
Discovering the depth and count of tables on an HTML page with Perl
I have local copies of numerous downloaded web pages. The pages almost certainly use only a few different table layouts, but before trying to extract data, I first want to print out the ...
0 votes, 2 answers, 613 views
How to extract a column of a table from an HTML page using Perl modules?
I have the following HTML code from part of a webpage.
<h2 id="failed_process">Failed Process</h2>
<table border="1">
<thead>
<tr>
<th>
...
1 vote, 2 answers, 1k views
HTML parsing with a Perl script
I am trying to parse an HTML file in my Perl script. I am using a module called HTML::TreeBuilder.
Here is what I have so far:
use HTML::TreeBuilder;
my $tree = HTML::TreeBuilder->new;
...
2 votes, 1 answer, 217 views
Perl HTML::Strip whitelist
Is there a way to give the module a whitelist so that it preserves certain tags?
Currently, markup such as
<div><b>test</b></div>
is stripped with this code:
my $hs = ...
0 votes, 2 answers, 382 views
Ignore Text in HTML::TreeBuilder Output Perl
I need to ignore or remove all text between HTML elements so I can generate a blank template from a given web page.
I am parsing with the Perl modules HTML::TreeBuilder and HTML::Element.
I ...
2 votes, 2 answers, 1k views
Fetch <td> text while using WWW::Mechanize to fetch <a> within that <td> tag
I'm new to HTML processing in Perl. I'm trying to fetch both the text and the links from an HTML table.
Here is the HTML structure:
<td>Td-Text
<br>
<a href="Link-I-Want" ...