Imagine you’re working with a system that needs to send lists of strings across a network. Each string may contain special characters, including commas, numbers, or even spaces, which makes it tricky to separate one string from another. The task is to design an encoding algorithm that converts a list of strings into a single string and a decoding algorithm to reconstruct the original list from the encoded string.
This problem teaches hot to encode and decode strings efficiently while handling edge cases like empty strings or special characters.
Brute Force Approach
public class EncodeDecodeStrings {
public String encode(List<String> inputList) {
return String.join(":;", inputList); // Join elements with ":;"
}
// Decodes a single string to a list of strings
public List<String> decode(String inputString) {
return Arrays.asList(inputString.split(":;")); // Split by ":;"
}
}
• This approach fails if the input strings contain the delimiter “:;”, as it causes ambiguity during decoding.
• For robust solutions, consider encoding using length-prefix or escape sequences.
Optimized Approach: Using Two-Pass Parsing
The algorithm uses prefix encoding to separate strings clearly. A length-based prefix avoids conflicts with special characters in the strings, ensuring that decoding is unambiguous.
• Two-Pointer Technique:
• Efficiently handles the parsing of lengths and substrings.
• StringBuilder:
• Used for efficient string concatenation during encoding.
1. By adding the string’s length as a prefix, the decoder knows exactly how many characters to extract for each substring.
2. The use of a delimiter (#) separates the prefix from the string, ensuring clarity even when the string contains digits.
public class Codec {
public String encode(List<String> strs) {
StringBuilder encoded = new StringBuilder();
for (String str : strs) {
encoded.append(str.length()).append("#").append(str);
}
return encoded.toString();
}
public List<String> decode(String s) {
List<String> result = new ArrayList<>();
int i = 0;
while (i < s.length()) {
int j = i;
while (s.charAt(j) != '#') {
j++;
}
int length = Integer.parseInt(s.substring(i, j));
i = j + 1;
result.add(s.substring(i, i + length));
i += length;
}
return result;
}
}
Line-By-Line Explanation
Step 1: Convert list of strings with encode
public String encode(List<String> strs) {
StringBuilder encoded = new StringBuilder();
for (String str : strs) {
encoded.append(str.length()).append("#").append(str);
}
return encoded.toString();
}
- For each string in the list:
- Append the length of the string followed by a “#” delimiter
- Append the string itself
- Example: For [“hello”, “world”], it creates: 5#hello5#world
- Using a length prefix ensures the encoding is unambiguous, even if the strings contain special characters (like “#”).
Step 2: Decode Strings
public List<String> decode(String s) {
List<String> result = new ArrayList<>();
int i = 0;
while (i < s.length()) {
int j = i;
while (s.charAt(j) != '#') {
j++;
}
int length = Integer.parseInt(s.substring(i, j));
i = j + 1;
result.add(s.substring(i, i + length));
i += length;
}
return result;
}
- Initializes i to 0, which acts as a pointer to track the current position in the encoded string s.
- Finds the position j of the # delimiter, which separates the length prefix from the string content.
- Extracts the substring between i and j (the length prefix) and converts it into an integer length.
- Extracts the substring of length length starting at i and adds it to the result list.
Why This Code Works
The use of a length prefix makes this encoding method robust and unambiguous. The presence of # or numbers in the string content does not affect the algorithm, as it treats them as regular characters during both encoding and decoding. This approach adheres to LeetCode best practices by being clear, efficient, and handling all edge cases effectively.