[triangle-zpug] Regular expressions hanging or just taking a looonng time

Adam Hupp hupp at upl.cs.wisc.edu
Sat Aug 26 19:39:06 CEST 2006


On Fri, Aug 25, 2006 at 10:03:22AM -0400, Edmund Moseley wrote:

> The regex is pretty long and basically looks for a field name, then 
> captures everything after it, until the next field name. Sample:
>
> pattern = re.compile(r"""
>      NAME:              # look for name label
>      (?P<name>.*?)      # capture name
...

If your data is that consistent in every case, you can probably do
this more simply without a regex.  If each of these entries is on its
own line, then the following will work:

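# 'lines' is a file-like object; strip the newline and split on the
# first ":" only, so values may themselves contain colons
attrib = dict(i.strip().split(":", 1) for i in lines)

Or equivalently:

result = {}
for i in lines:
    key, value = i.strip().split(":", 1)
    result[key] = value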
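
For illustration, here is a slightly more defensive sketch of the same line-based idea. The helper name parse_fields and the sample data are made up for this example, and it assumes each entry looks like "LABEL: value" on its own line; lines without a colon are skipped rather than raising an error.

```python
import io

def parse_fields(lines):
    """Build a dict from 'LABEL: value' lines (hypothetical helper)."""
    result = {}
    for line in lines:
        # partition splits on the first colon only, so values may contain ':'
        label, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without a colon
        result[label.strip()] = value.strip()
    return result

# Made-up sample input standing in for the real file
sample = io.StringIO("NAME: Ann Example\n\nTIME: 10:30\n")
fields = parse_fields(sample)
```

Using partition instead of split means a malformed line never produces an unpacking error, at the cost of silently ignoring it; whether that is the right trade-off depends on how much you trust the input.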

If they are not delimited per line you can use a general regex to
split them up and then process with the above:

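# a label, then everything up to the next label (or the end of the text)
re.compile(r"([A-Z]+):\s*(.*?)\s*(?=[A-Z]+:|\Z)", re.DOTALL)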
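
As a rough sketch of that regex-based split, assuming every label is a run of capital letters followed by a colon (the real field names may differ, and the sample text here is invented), something like this would pull out label/value pairs even when a value spans several lines:

```python
import re

# Assumption: a label is one or more capital letters followed by ':'.
# The lazy group captures the value; the lookahead stops it just before
# the next label or at the end of the text, without consuming the label.
FIELD = re.compile(r"([A-Z]+):\s*(.*?)\s*(?=[A-Z]+:|\Z)", re.DOTALL)

def split_fields(text):
    """Return {label: value} for each labelled chunk (hypothetical helper)."""
    return {m.group(1): m.group(2) for m in FIELD.finditer(text)}

# Made-up sample where entries are not one per line
text = "NAME: Ann Example ADDRESS: 12 main st\nunit 4 PHONE: 555-0100"
fields = split_fields(text)
```

Because the pattern has a single lazy quantifier bounded by a lookahead, rather than a quantified group wrapped in another quantifier, it should not run into the heavy backtracking that makes a long pattern appear to hang. It will misfire if a value itself contains capital letters directly followed by a colon.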

-Adam


