[Python-au] Webscraping
Ishwor Gurung
ishwor.gurung at gmail.com
Wed Apr 27 01:08:27 UTC 2011
Hi.
On 23 April 2011 21:12, <trideceth12 at gawab.com> wrote:
> Hi all,
>
> Can anyone recommend me a python package for handling webscraping
> operations. I need to be able to log-in to an https site and crawl from
> there.
BeautifulSoup or lxml for parsing.
cURL or wget for the HTTP requests (POST / GET).
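To give a rough idea of the parsing half, here is a minimal sketch that pulls the links out of a page so a crawler knows where to go next. It assumes Python 3 and uses the standard library's html.parser as a stand-in for BeautifulSoup or lxml, so it runs without installing anything. The names LinkCollector and extract_links are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags are of interest to a crawler.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they came from.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in `html`, resolved against `base_url`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

BeautifulSoup and lxml cope better with badly broken markup, so for real sites they are probably still the better choice; the overall shape stays the same.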
> I have been trying to use HtmlUnit for java and have seen some people
> using HtmlUnit and Jython, but so far HtmlUnit seems a bit flaky -
> retaining logged-in status on some sites, not on others.
I have no experience using HtmlUnit. That said, what are you trying to achieve?
Perhaps others may be able to shed more light on it.
> Is this really so hard???? I'm sure this must be a common operation.
Break it down: log in, keep the session cookies, fetch the pages, then parse them.
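As one way of breaking it down, the log-in step can be sketched separately from the crawling. This is only a sketch under assumptions: it uses Python 3's urllib and http.cookiejar instead of cURL or wget, it assumes the site uses a plain HTML form log-in with cookies, and the function names, URLs and form field names are all hypothetical. Check the real log-in form for the actual field names.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return (opener, jar); the opener stores cookies and resends them."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def build_login_request(login_url, fields):
    """Build a POST request carrying the form fields, URL-encoded.

    `fields` is a dict of form field names to values; the names are
    assumptions and must match the site's actual log-in form.
    """
    body = urllib.parse.urlencode(fields).encode("utf-8")
    # Supplying `data` makes urllib send a POST instead of a GET.
    return urllib.request.Request(login_url, data=body)


def login_and_fetch(login_url, fields, page_url):
    """Log in, then fetch `page_url` through the same cookie-aware opener."""
    opener, _jar = make_session()
    with opener.open(build_login_request(login_url, fields)) as resp:
        resp.read()  # any session cookie set here lands in the jar
    with opener.open(page_url) as resp:
        return resp.read()
```

The point is that every request goes through the same opener, so whatever session cookie the site sets at log-in is sent again on later requests. If the logged-in status still gets lost, the site may depend on JavaScript or hidden form tokens, which this sketch does not handle.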
Cheers
[...]
--
Regards
Ishwor Gurung
Key id:0xa98db35e
Key fingerprint:FBEF 0D69 6DE1 C72B A5A8 35FE 5A9B F3BB 4E5E 17B5