[triangle-zpug] how to match strings in python
Chris Calloway
cbc at unc.edu
Thu Apr 2 18:23:21 UTC 2009
On 4/2/2009 11:50 AM, Sam Brauer wrote:
> Sometimes I love little exercises like this.
I like doing this as a hive mind, because we can not only learn, but
come up with more optimal solutions from what we all know together.
I like Chris Rossi's zip solution, because that was especially Pythonic
functional thinking which we should all be doing as it produces the most
obviously elegant *and* efficient solutions.
I like Stephan pointing out the LCS alorithm, because it solves a
superset of what Joe said he was looking for (the longest part of one
string anywhere within the other).
I like what Sam did because it inspired me to do an LCS using the find
string method and slices to do the heavy lifting. It's not as O(nm)
efficient as the solution in the Algorithms book. But I think it's more
understandable:
def inCommonNG(string1, string2):
"""Finds first longest common string."""
result = ''
if len(string1) < len(string2):
shorter = string1
longer = string2
else:
shorter = string2
longer = string1
while shorter:
backup = shorter
start = longer.find(shorter)
while(start < 0):
shorter = shorter[:-1]
start = longer.find(shorter)
if len(shorter) > len(result):
result = shorter
shorter = backup[1:]
if len(shorter) <= len(result):
break
return result
print inCommonNG("foobar","foobaz")
print inCommonNG("foobarrrr","foobaz")
print inCommonNG("foobar","foobazzzz")
print inCommonNG("foobar","")
print inCommonNG("","foobaz")
print inCommonNG("","")
print inCommonNG("foobar","snafoobazsna")
print inCommonNG("snafoobarsna","foobaz")
print inCommonNG("snufoobarsnu","snafoobazsna")
print inCommonNG("foobar","foobar")
Produces:
>>>
fooba
fooba
fooba
fooba
fooba
fooba
foobar
>>>
Note, this algorithm finds the *first* LCS. There may be more than one.
Note, this algorithm only works because of the hack in the find string
method which returns 0 instead of -1 when the match string is empty. A
useful and expected hack, that is.
I bet Rossi can do this more efficiently with zip. :)
--
Sincerely,
Chris Calloway
http://www.secoora.org
office: 332 Chapman Hall phone: (919) 599-3530
mail: Campus Box #3300, UNC-CH, Chapel Hill, NC 27599
More information about the triangle-zpug
mailing list