The process of extracting information from the HTML source code of live websites. Typically used by third-party applications to interact with a website that doesn't expose an API.
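As a rough illustration of what "extracting information from the HTML source" can look like, here is a minimal sketch that uses only Python's standard-library `html.parser`. The `LinkExtractor` class, the `extract_links` helper, and the sample markup are hypothetical names made up for this example. Real scrapers often use a dedicated parsing library instead.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in the HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = '<p><a href="/a">First</a> and <a href="/b">Second <b>bold</b></a></p>'

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(href, text)
```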
0 votes, 0 answers, 6 views
Best Option for Scraping Usability & Performance?
I am going to create a web-based media center similar to XBMC; however, I am not sure of the best practice for handling scraping.
Options are:
Automatically download the latest categories, and ...
-2 votes, 0 answers, 17 views
Scraping data from custom Google map?
I'm specifically looking at this site: http://www.matrixservice.com/locations-map
My goal is to find a way to scrape the addresses (contained in the popup bubbles), but since the bubbles are loaded ...
1 vote, 1 answer, 17 views
How do I use Scrapy to crawl within pages?
I am using Python and Scrapy for this question.
I am attempting to crawl webpage A, which contains a list of links to webpages B1, B2, B3, ... Each B page contains a link to another page, C1, C2, C3, ...
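The A → B → C structure described in this question can be sketched without Scrapy as a level-by-level link walk. This is not the asker's code and not Scrapy's API. It is a plain-Python illustration in which the `PAGES` dictionary, the `fetch` function, and the `crawl` helper are all hypothetical, and a regular expression stands in for a real HTML parser to keep the example short.

```python
import re

# Hypothetical in-memory "site" standing in for real HTTP responses.
PAGES = {
    "/A": '<a href="/B1">B1</a> <a href="/B2">B2</a>',
    "/B1": '<a href="/C1">C1</a>',
    "/B2": '<a href="/C2">C2</a>',
    "/C1": "<h1>Item one</h1>",
    "/C2": "<h1>Item two</h1>",
}

# Simplification: a regex is enough for this toy markup, not for real HTML.
HREF_RE = re.compile(r'href="([^"]+)"')


def fetch(url):
    """Return the HTML for a URL (here: a lookup in the hypothetical PAGES dict)."""
    return PAGES[url]


def crawl(start_url, depth, fetch=fetch):
    """Follow links `depth` levels below start_url; return (url, html) for the pages reached."""
    frontier = [start_url]
    for _ in range(depth):
        next_frontier = []
        for url in frontier:
            next_frontier.extend(HREF_RE.findall(fetch(url)))
        frontier = next_frontier
    return [(url, fetch(url)) for url in frontier]


if __name__ == "__main__":
    # Two levels down from A: the C pages.
    for url, html in crawl("/A", depth=2):
        print(url, html)
```

In Scrapy the same idea is usually expressed by yielding a new request from each callback, but the exact spider layout depends on the site being crawled.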
0 votes, 3 answers, 18 views
NoMethodError: ruby gem mechanize undefined method 'q= '
I'm trying to build a webscraping program for Amazon, but I'm getting tripped up on the very first step. I wrote my code like this, to just start to poke around and access Amazon and prettypage so I ...
-3 votes, 0 answers, 28 views
Scraping Methods in PHP Tips
I was assigned a task to scrape data out of this website: the company name, the address, and the phone numbers. Previously I was using software called Visual Web Ripper to scrape my way ...
1 vote, 2 answers, 28 views
C# HTML Agility Pack Single Select Node returning null
I have a web scraper developed using C#, windows forms and the HTML Agility Pack.
I had it all working great when the site changed its code and broke it. I know it happens often with web scrapers ...
-5 votes, 0 answers, 28 views
Scraping a web page with nested forms
We are on a project where we need to scrape some data from another website.
So far we have been successful, but now we need to scrape some data from a form which has another nested form. The form is used ...
-1 votes, 2 answers, 19 views
Python Urllib2 module Error
Can somebody tell me what:
URLError: <urlopen error [Errno -3] Temporary failure in name resolution>
exactly means? Could not find it in the documentation.
Does it say that the URL is ...
0 votes, 0 answers, 12 views
Problems scraping site using C# and webbrowser
I am trying to extract the table data from a site:
http://www.mini-iac.com/Home/Reports/tabid/147/Default.aspx?IMACNo=6713&ParameterBar=False
using the webbrowser control. The site appears ...
0 votes, 1 answer, 18 views
Problems with character-encoding when webscraping with scrapy
I have a problem with the encoding of the text I am scraping from a website. Specifically, the Danish letters æ, ø, and å are coming out wrong. I feel confident that the encoding of the webpage is ...
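One common cause of this symptom is decoding the response bytes with the wrong codec. The excerpt does not say which encoding the page actually uses, so the sketch below simply assumes a UTF-8 page that was mistakenly decoded as Latin-1. It is plain Python, not Scrapy-specific, and the variable names are made up for illustration.

```python
# Assumption for illustration: the server sends UTF-8 bytes.
raw = "æ, ø, å".encode("utf-8")

# Decoding with the wrong codec succeeds but yields garbled text.
garbled = raw.decode("latin-1")

# Decoding with the codec the bytes were actually written in restores the letters.
correct = raw.decode("utf-8")

# Text garbled this way can be repaired by reversing the wrong step.
repaired = garbled.encode("latin-1").decode("utf-8")

if __name__ == "__main__":
    print(repr(garbled))
    print(repr(correct))
    print(repaired == correct)
```

If the letters look wrong after scraping, comparing the encoding declared in the HTTP headers or the page's `<meta>` tag with the codec used for decoding is usually a good first check.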
0 votes, 0 answers, 11 views
Read phantom console.log without killing phantom (phantom.exit())
I am reading a webpage where I have to log in and then pass my search string.
Every time I want to read the results, I have to call phantom.exit(). I do get the results, but for every query I have to ...
-4 votes, 1 answer, 33 views
How do I scrape "Contact Info" from a Facebook public page? [closed]
I want to scrape contact info (phone, email, website, etc.) from some public Facebook pages. It is listed below the "Contact Info" heading in the About section of any Facebook page.
I am ...
0 votes, 1 answer, 37 views
Can't follow links when web-scraping
I realize that others have covered similar topics, but having read these posts, I still can't solve my problem.
I am using Scrapy to write a crawl spider that should scrape search results pages. One ...
0 votes, 1 answer, 33 views
How to automate multiple requests to a web search form using R (Java function calls / trigger)
I need to access the data in this page, the "Using lipids" link at the bottom of it.
I have already seen:
How to automate multiple requests to a web search form using R
and:
What if I want to web ...
1 vote, 0 answers, 26 views
Fetch UTF-8 Characters from a different language Website
I am trying to fetch data from a website which is in Hungarian. This site also has some UTF-8 characters, e.g. ő. So when I scrape the data from the website, it changes special characters into some ...