richard at sitescraper.net
Sun Apr 24 14:22:42 UTC 2011
Note that BeautifulSoup <http://www.crummy.com/software/BeautifulSoup/3.1-problems.html> is
no longer maintained.
lxml <http://lxml.de/> is another good option.
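For the log-in-and-crawl part of the question, here is a rough sketch of one way it could be done. It uses only the current Python 3 standard library rather than BeautifulSoup or lxml, so treat it as an illustration under assumptions and not a recommendation from this thread. The helper names (make_session, login_and_fetch, extract_links) and the form field names in the usage note are made up for the example; a real site will have its own field names and may also require hidden tokens in the login form. The point is that keeping every request on one cookie-aware opener is what lets the logged-in state carry over.

```python
import urllib.parse
import urllib.request
from html.parser import HTMLParser
from http.cookiejar import CookieJar


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Turn relative links into absolute ones for later crawling.
                self.links.append(urllib.parse.urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return absolute URLs for all links found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def make_session():
    """Return an opener that stores cookies, so a login persists across requests."""
    jar = CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def login_and_fetch(opener, login_url, fields, start_url):
    """POST the login form, then fetch start_url with the same cookies."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    opener.open(login_url, data).close()  # passing data makes this a POST
    with opener.open(start_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return extract_links(html, start_url)
```

Usage would look something like login_and_fetch(make_session(), "https://example.com/login", {"username": "jake", "password": "secret"}, "https://example.com/members"), where the URLs and field names are placeholders. The link extraction step is where lxml could be swapped in.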
On Sat, Apr 23, 2011 at 9:50 PM, Chris Neugebauer <chrisjrn at gmail.com> wrote:
> I hear that many of the cool kids use BeautifulSoup --
> On Sat, Apr 23, 2011 at 11:12, <trideceth12 at gawab.com> wrote:
> > Hi all,
> > Can anyone recommend a Python package for handling web scraping
> > operations? I need to be able to log in to an HTTPS site and crawl from
> > there.
> > I have been trying to use HtmlUnit for Java and have seen some people
> > using HtmlUnit and Jython, but so far HtmlUnit seems a bit flaky -
> > retaining logged-in status on some sites, not on others.
> > Is this really so hard???? I'm sure this must be a common operation.
> > Thanks in advance,
> > Jake
> > _______________________________________________
> > python-au maillist - python-au at starship.python.net
> > http://starship.python.net/mailman/listinfo/python-au
> --Christopher Neugebauer
> Jabber: chrisjrn at gmail.com -- IRC: chrisjrn on irc.freenode.net --
> AIM: chrisjrn157 -- MSN: chris at neugebauer.id.au -- WWW:
> http://chris.neugebauer.id.au -- Twitter/Identi.ca: @chrisjrn