[Python-au] Webscraping

trideceth12 at gawab.com
Sat Apr 23 11:12:50 UTC 2011


Hi all,

Can anyone recommend a Python package for web scraping? I need to be
able to log in to an HTTPS site and crawl from there.

I have been trying to use HtmlUnit for Java and have seen some people
using HtmlUnit with Jython, but so far HtmlUnit seems a bit flaky: it
retains logged-in status on some sites but not on others.

Is this really so hard? I'm sure this must be a common operation.

Thanks in advance,
Jake




