I am stuck with some regular expression problem.
I have a huge file in html and i need to extract some text (Model No.) from the file.
<table>......
<td colspan="2" align="center" class="thumimages"><b>SK10014</b></td></tr>
.......
<table>/.....
<td colspan="2" align="center" class="thumimages"><b>SK1998</b></td></tr>
.... so on
and this is a huge page with all webpage built in table and divless...
The class "thumimages" almost repeats in all td, so leaves no way to differentiate the require content from the page.
There are about 10000 model No and i need to extract them.
is there any way do do this with regrex... like
"/<td colspan="2" align="center" class="thumimages"><b>{[1-9]}</b></td></tr>/"
and return an array of all the matched results. Note I have tried HTML parsing but the document contains to many html validation errors.
any help would be greatly appreciated...