<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#ffffff">
    On 25/04/2011 12:22 AM, Richard Penman wrote:
    <blockquote
      cite="mid:BANLkTi=z8hS4EEMx8FQhF59P-wOqkoKfsQ@mail.gmail.com"
      type="cite">Note that <a moz-do-not-send="true"
        href="http://www.crummy.com/software/BeautifulSoup/3.1-problems.html">BeautifulSoup</a>
      is no longer maintained.</blockquote>
    Not true. It is well maintained, albeit going through some issues on
    the way to Python 3.<br>
    <br>
    <blockquote
      cite="mid:BANLkTi=z8hS4EEMx8FQhF59P-wOqkoKfsQ@mail.gmail.com"
      type="cite">
      <div><br>
      </div>
      <div><a moz-do-not-send="true" href="http://lxml.de/">lxml</a>&nbsp;is
        another good option.</div>
      <div><br>
        Richard</div>
      <div><br>
        <br>
        <div class="gmail_quote">On Sat, Apr 23, 2011 at 9:50 PM, Chris
          Neugebauer <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:chrisjrn@gmail.com">chrisjrn@gmail.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
            0.8ex; border-left: 1px solid rgb(204, 204, 204);
            padding-left: 1ex;">
            I hear that many of the cool kids use BeautifulSoup --<br>
            <a moz-do-not-send="true"
              href="http://www.crummy.com/software/BeautifulSoup/"
              target="_blank">http://www.crummy.com/software/BeautifulSoup/</a><br>
            <br>
            --Chris<br>
            <div>
              <div class="h5"><br>
                On Sat, Apr 23, 2011 at 11:12, &nbsp;&lt;<a
                  moz-do-not-send="true"
                  href="mailto:trideceth12@gawab.com">trideceth12@gawab.com</a>&gt;
                wrote:<br>
                &gt; Hi all,<br>
                &gt;<br>
                &gt; Can anyone recommend a Python package for web
                scraping?<br>
                &gt; I need to be able to log in to an HTTPS site and
                crawl from<br>
                &gt; there.<br>
                &gt;<br>
                &gt; I have been trying to use HtmlUnit for Java and
                have seen some people<br>
                &gt; using HtmlUnit with Jython, but so far HtmlUnit
                seems a bit flaky:<br>
                &gt; it retains logged-in status on some sites but not on
                others.<br>
                &gt;<br>
                &gt; Is this really so hard? I'm sure this must be a
                common operation.<br>
                &gt;<br>
                &gt; Thanks in advance,<br>
                &gt; Jake<br>
                &gt;<br>
                &gt;<br>
                &gt;<br>
                &gt; _______________________________________________<br>
                &gt; python-au maillist &nbsp;- &nbsp;<a moz-do-not-send="true"
                  href="mailto:python-au@starship.python.net">python-au@starship.python.net</a><br>
                &gt; <a moz-do-not-send="true"
                  href="http://starship.python.net/mailman/listinfo/python-au"
                  target="_blank">http://starship.python.net/mailman/listinfo/python-au</a><br>
                &gt;<br>
                <br>
                <br>
                <br>
              </div>
            </div>
            <font color="#888888">--<br>
              --Christopher Neugebauer<br>
              <br>
              Jabber: <a moz-do-not-send="true"
                href="mailto:chrisjrn@gmail.com">chrisjrn@gmail.com</a>
              -- IRC: chrisjrn on <a moz-do-not-send="true"
                href="http://irc.freenode.net" target="_blank">irc.freenode.net</a>
              --<br>
              AIM: chrisjrn157 -- MSN: <a moz-do-not-send="true"
                href="mailto:chris@neugebauer.id.au">chris@neugebauer.id.au</a>
              -- WWW:<br>
              <a moz-do-not-send="true"
                href="http://chris.neugebauer.id.au" target="_blank">http://chris.neugebauer.id.au</a>
              -- Twitter/Identi.ca: @chrisjrn<br>
            </font>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>