[Python-au] Webscraping
trideceth12 at gawab.com
Wed Apr 27 08:05:02 UTC 2011
Thanks for all the help. I've got it working now: mechanize works a treat,
and I no longer have any problems with losing authentication :D
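
For anyone finding this thread later: mechanize stays logged in because it
carries the session cookies from the login response into every later request.
Below is a minimal sketch of that same idea using only the Python 3 standard
library (it is not mechanize itself). The form field names `username` and
`password`, and the helper names `make_session`, `login` and `extract_links`,
are hypothetical; a real site will have its own field names and may need
extra hidden form fields.

```python
import http.cookiejar
import urllib.parse
import urllib.request
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page they came from.
                self.links.append(urllib.parse.urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return every link found in an HTML string, as absolute URLs."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def make_session():
    """Build an opener that stores cookies and resends them on each request."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def login(opener, login_url, username, password):
    """POST the login form; field names here are assumptions, not a real API."""
    data = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("utf-8")
    with opener.open(login_url, data) as response:
        return response.read().decode("utf-8", errors="replace")
```

The same `opener` has to be used for the login and for every page fetched
afterwards, since the cookie jar attached to it is what keeps the session
alive. Each fetched page can then be passed to `extract_links` to find the
next URLs to crawl.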
jake
On Wed, 2011-04-27 at 11:08 +1000, Ishwor Gurung wrote:
> Hi.
>
> On 23 April 2011 21:12, <trideceth12 at gawab.com> wrote:
> > Hi all,
> >
> > Can anyone recommend a Python package for handling web scraping
> > operations? I need to be able to log in to an https site and crawl from
> > there.
>
> BeautifulSoup / lxml for parsing
> cURL / wget for making the HTTP requests (POST / GET)
>
> > I have been trying to use HtmlUnit for Java and have seen some people
> > using HtmlUnit and Jython, but so far HtmlUnit seems a bit flaky -
> > retaining logged-in status on some sites, not on others.
>
> I have no experience using HtmlUnit. That said, what are you trying to achieve?
> Perhaps others may be able to shed more light on it.
>
> > Is this really so hard???? I'm sure this must be a common operation.
> Break it down.
>
> Cheers
> [...]