Longest Substring Without Repeating Characters

4 min readSep 25, 2022

Hi guys! 🐵 In this blog, I will go over my work on the above algorithm. Feel free to jump to the TLDR 👇to see my solution and explanation for each piece of the algorithm.

Description of the problem:

Initial Approach:

💡 Utilize a sliding window to keep track of the substring where windowStart denotes the the start and windowEnd denotes the end of the substring. I will utilize a hash to keep track of the string’s characters and count how many times a certain character has appeared. From there, I will loop through the string and do the following at each iteration:

If the character is a duplicate, increase the character’s count and reset the substring’s length to 1. If it’s not a duplicate, add the new character to the hash with a value of 1, and increase the substring length by 1. Lastly, check whether max or the substring is larger and update the max length. After we’ve iterated through the string, return the max variable

Why the above doesn’t work?

❌While the condition makes sense for certain sliding window cases, this approach becomes problematic in cases when unique characters separate duplicates like in the string “dvdf”.

Based on the logic above, when we hit duplicate “d”, we reset the substring to 1. This makes sense in some cases like “aaaa” or “pwwkew” because the substring should be reset to 1 if its preceding value is a duplicate. But in “dvdf” we’ve forgotten about “v”! The current substring should be “vd”, not “d”.

One way to solve this is to keep track of the windowStart. Instead of resetting the substring to 1, we can move the windowStart whenever we find duplicate values. When we find duplicate values, move the windowStart based on the number of duplicates we’ve previously found. For instance in “pwwkew”, we can move the windowStart two places to set the window to “w” (index 2) . Our while loop below will move windowStart one space for {w:1} and one more space to remove {w:0}.

😞 The code below uses the solution described above. This solution works in some cases but not ALL cases. Further it wastes unnecessary time decrementing character’s in our hash.

TLDR:

🍏Essentially the issue comes from what we store in our hash. Counting characters obviously doesn’t help us here. We still need to update our windowStart when we find duplicates but instead of storing the character counts, we can store the index of the most recent occurrence of a character. Thus when we hit duplicates, we can shrink our window by reassigning our windowStart to the character after the found duplicate OR if our windowStart is past the duplicate, leave it in place and just update the character’s index in our hash. This solves the issue we had with strings like “dvdf” because now we have logic that shrinks our window but doesn’t skip over too many characters.

🕐Time Complexity: O(n)
=> In all cases we need to iterate over the entire string.

👾Space Complexity: O(n)
=> In the worst case, our hash will need to store indices of the entire string.

Thanks for reading! Feel free to reach out if you have any questions or suggestions on ways to optimize! 👽 Stay tuned for my next blog on the Best Flight Paths Algorithm ✈️!!

❓LinkedIn: https://www.linkedin.com/in/sean-balayan/

Longest Substring Without Repeating Characters

TLDR:

Written by Sean Balayan

No responses yet