[triangle-zpug] Regular expressions hanging or just taking a looonng time

Edmund Moseley edmund at unc.edu
Fri Aug 25 16:03:22 CEST 2006

Hi all,

I am writing a method that takes a WordPerfect file and uses regular 
expressions to parse out the data.  I also have a very simple unittest 
that tries it out on a few different files.
The regex is pretty long and basically looks for a field name, then 
captures everything after it, until the next field name. Sample:

import re

pattern = re.compile(r"""
      NAME:              # look for name label
      (?P<name>.*?)      # capture name
      AGE:               # look for age label
      (?P<age>.*?)       # capture age
      RACE:              # look for race label
      (?P<race>.*?)      # capture race
      """, re.VERBOSE | re.DOTALL)

The actual pattern is much longer, and as I develop it, slight mistakes 
seem to make it hang.  Neither Ctrl-C nor Ctrl-D will break out of it. 
A few web searches suggested that it is not hung, but instead just 
taking a really long time. I've tried waiting for it over lunch, but 
it never finishes. I have to quit the terminal and start again.
So, I was wondering: Would it be advisable for me to add a time limit 
to my test? If so, how?
Am I doing something rather wrong with my regex?
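
On the time-limit question, one possible approach (a sketch under 
assumptions, not an established recipe) is to run the search in a child 
Python process and kill it if it overruns. The helper name 
search_with_timeout, the JSON hand-off, and the 2 second default are all 
hypothetical choices made for this example.

```python
import json
import subprocess
import sys

# Child script: reads a JSON job from stdin, runs the search, and writes
# the named groups (or null when there is no match) to stdout as JSON.
_CHILD = r"""
import json, re, sys
job = json.load(sys.stdin)
match = re.compile(job["pattern"], job["flags"]).search(job["text"])
json.dump(match.groupdict() if match else None, sys.stdout)
"""


def search_with_timeout(pattern, text, flags=0, timeout=2.0):
    """Return the match's named groups, None if no match, or raise TimeoutError."""
    job = json.dumps({"pattern": pattern, "flags": int(flags), "text": text})
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD],
            input=job,
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed when this expires
            check=True,
        )
    except subprocess.TimeoutExpired:
        raise TimeoutError("regex search exceeded %.1f s" % timeout) from None
    return json.loads(done.stdout)
```

A unittest could call this helper instead of pattern.search and treat a 
TimeoutError as a failure, so a runaway pattern fails one test instead of 
stalling the whole run.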

Thanks for any advice. These are my first dabbles with both RE and 
unittest, thanks to PyCamp! :-)


