Longest Substring Without Repeating Characters
Hi guys! đ” In this blog, I will go over my work on the above algorithm. Feel free to jump to the TLDR đto see my solution and explanation for each piece of the algorithm.
Description of the problem:
Initial Approach:
đĄ Utilize a sliding window to keep track of the substring where windowStart denotes the the start and windowEnd denotes the end of the substring. I will utilize a hash to keep track of the stringâs characters and count how many times a certain character has appeared. From there, I will loop through the string and do the following at each iteration:
If the character is a duplicate, increase the characterâs count and reset the substringâs length to 1. If itâs not a duplicate, add the new character to the hash with a value of 1, and increase the substring length by 1. Lastly, check whether max or the substring is larger and update the max length. After weâve iterated through the string, return the max variable
Why the above doesnât work?
âWhile the condition makes sense for certain sliding window cases, this approach becomes problematic in cases when unique characters separate duplicates like in the string âdvdfâ.
Based on the logic above, when we hit duplicate âdâ, we reset the substring to 1. This makes sense in some cases like âaaaaâ or âpwwkewâ because the substring should be reset to 1 if its preceding value is a duplicate. But in âdvdfâ weâve forgotten about âvâ! The current substring should be âvdâ, not âdâ.
One way to solve this is to keep track of the windowStart. Instead of resetting the substring to 1, we can move the windowStart whenever we find duplicate values. When we find duplicate values, move the windowStart based on the number of duplicates weâve previously found. For instance in âpwwkewâ, we can move the windowStart two places to set the window to âwâ (index 2) . Our while loop below will move windowStart one space for {w:1} and one more space to remove {w:0}.
đ The code below uses the solution described above. This solution works in some cases but not ALL cases. Further it wastes unnecessary time decrementing characterâs in our hash.
TLDR:
đEssentially the issue comes from what we store in our hash. Counting characters obviously doesnât help us here. We still need to update our windowStart when we find duplicates but instead of storing the character counts, we can store the index of the most recent occurrence of a character. Thus when we hit duplicates, we can shrink our window by reassigning our windowStart to the character after the found duplicate OR if our windowStart is past the duplicate, leave it in place and just update the characterâs index in our hash. This solves the issue we had with strings like âdvdfâ because now we have logic that shrinks our window but doesnât skip over too many characters.
đTime Complexity: O(n)
=> In all cases we need to iterate over the entire string.
đŸSpace Complexity: O(n)
=> In the worst case, our hash will need to store indices of the entire string.
Thanks for reading! Feel free to reach out if you have any questions or suggestions on ways to optimize! đœ Stay tuned for my next blog on the Best Flight Paths Algorithm âïž!!
âLinkedIn: https://www.linkedin.com/in/sean-balayan/