Question: Find longest unique substring in string python

Question

Find longest unique substring in string python

Answers 2
Added at 2017-10-06 22:10
Tags
Question

I am trying that age old question (there are multitudes of versions around) of finding the longest substring of a string which doesn't contain repeated characters. I can't work out why my attempt doesn't work properly:

def findLongest(inputStr):
    resultSet = []
    substr = []

    for c in inputStr:
        print ("c: ", c)
        if substr == []:
            substr.append([c])
            continue

        print(substr)
        for str in substr:
            print ("c: ",c," - str: ",str,"\n")
            if c in str:
                resultSet.append(str)
                substr.remove(str)
            else:
                str.append(c)
        substr.append([c])



    print("Result set:")
    print(resultSet)
    return max(resultSet, key=len)

print (findLongest("pwwkewambb"))

When my output gets to the second 'w', it doesn't iterate over all the substr elements. I think I've done something silly, but I can't see what it is so some guidance would be appreciated! I feel like I'm going to kick myself at the answer...

The beginning of my output:

c:  p
c:  w
[['p']]
c:  w  - str:  ['p']

c:  w
[['p', 'w'], ['w']]
c:  w  - str:  ['p', 'w'] # I expect the next line to say c: w - str: ['w']

c:  k
[['w'], ['w']] # it is like the w was ignored as it is here
c:  k  - str:  ['w']

c:  k  - str:  ['w']
...

EDIT:

I replaced the for loop with

for idx, str in enumerate(substr):
    print ("c: ",c," - str: ",str,"\n")
    if c in str:
        resultSet.append(str)
        substr[idx] = []
    else:
        str.append(c)

and it produces the correct result. The only thing is that the empty element arrays get set with the next character. It seems a bit pointless; there must be a better way.

My expected output is kewamb.

e.g.

c:  p
c:  w
[['p']]
c:  w  - str:  ['p']

c:  w
[['p', 'w'], ['w']]
c:  w  - str:  ['p', 'w']

c:  w  - str:  ['w']

c:  k
[[], [], ['w']]
c:  k  - str:  []

c:  k  - str:  []

c:  k  - str:  ['w']

c:  e
[['k'], ['k'], ['w', 'k'], ['k']]
c:  e  - str:  ['k']

c:  e  - str:  ['k']

c:  e  - str:  ['w', 'k']

c:  e  - str:  ['k']
...
Answers to

Find longest unique substring in string python

nr: #1 dodano: 2017-10-06 22:10

Not sure what is wrong in your attempt, but it's complex and in:

    for str in substr:
        print ("c: ",c," - str: ",str,"\n")
        if c in str:
            resultSet.append(str)
            substr.remove(str)

you're removing elements from a list while iterating on it: don't do that, it gives unexpected results.

Anyway, my solution, not sure it's intuitive, but it's probably simpler & shorter:

  • slice the string with an increasing index
  • for each slice, create a set and store letters until you reach the end of the string or a letter is already in the set. Your index is the max length
  • compute the max of this length for every iteration & store the corresponding string

Code:

def findLongest(s):
    maxlen = 0
    longest = ""
    for i in range(0,len(s)):
        subs = s[i:]
        chars = set()
        for j,c in enumerate(subs):
            if c in chars:
                break
            else:
                chars.add(c)
        else:
            # add 1 when end of string is reached (no break)
            # handles the case where the longest string is at the end
            j+=1
        if j>maxlen:
            maxlen=j
            longest=s[i:i+j]
    return longest

print(findLongest("pwwkewambb"))

result:

kewamb
nr: #2 dodano: 2017-10-06 22:10
from itertools import groupby

s = set() ## for mutable access

''.join(max((list(g) for _, g in groupby('pwwkewambb', key=lambda x: not ((s and x == s.pop()) or s.add(x)))), key=len))
'kewamb'

groupby returns an iterator grouped based on the function provided in the key argument, which by default is lambda x: x. Instead of the default we are utilizing some state by using a mutable structure (which could have been done a more intuitive way if using a normal function)

lambda x: not ((s and x == s.pop()) or s.add(x))

What is happening here is since I can't reassign a global assignment in a lambda (again I can do this, using a proper function), I just created a global mutable structure that I can add/remove. The key (no pun) is that I only keep elements that I need by using a short circuit to add/remove items as needed.

max and len are fairly self explanatory, to get the longest list produced by groupby

Another version without the mutable global structure business:

def longest(x):
     if hasattr(longest, 'last'):
         result = not (longest.last == x)
         longest.last = x
         return result
     longest.last = x
     return True


''.join(max((list(g) for _, g in groupby('pwwkewambb', key=longest)), key=len))
'kewamb'
Source Show
◀ Wstecz