SimpleUTF8 reverse challenge

Problem statement
Consider the SimpleUTF8 format, in which every character is encoded in one of the following lengths:
– 1 byte, with the format 0XXXXXXX
– 2 bytes, with the format 110XXXXX 10XXXXXX
– 3 bytes, with the format 1110XXXX 10XXXXXX 10XXXXXX
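
As an illustration of the format, the length of a character can be read off the prefix of its first byte. The sketch below is only one way to do this; the class name SimpleUtf8Length and the method charLength are hypothetical names chosen for this example, and returning -1 for anything that is not a valid first byte is an assumption rather than part of the problem statement.

```java
final class SimpleUtf8Length {

    // Returns the length in bytes (1, 2 or 3) of the character that starts
    // with the given byte, or -1 if the byte is a continuation byte
    // (10XXXXXX) or does not match any SimpleUTF8 prefix.
    static int charLength(byte first) {
        int b = first & 0xFF;              // Java bytes are signed; work with 0..255
        if ((b & 0x80) == 0x00) return 1;  // 0XXXXXXX
        if ((b & 0xE0) == 0xC0) return 2;  // 110XXXXX
        if ((b & 0xF0) == 0xE0) return 3;  // 1110XXXX
        return -1;                         // assumption: signal "not a first byte"
    }
}
```

For example, charLength((byte) 0x41) gives 1, while charLength((byte) 0xE2) gives 3.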

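You will receive a byte[] str. Your task is to reverse the order of the characters in the array, keeping the bytes of each multi-byte character in their original order.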

Sample input

Sample output

Solution
One solution, similar to the one for the reverse words in a string problem, is to implement a helper function reverse(int start, int end, byte[] array) that reverses array[start:end] in place, where end is exclusive. Then we can do the following:
Step 1:
reverse(0, str.length, str) to reverse the entire byte array:

becomes

Step 2:
Iterate over the reversed array using index i and check the bit prefix of each byte. After Step 1, the bytes of every multi-byte character are in reverse order, so each such character now begins with a continuation byte (10XXXXXX) and ends with its first byte:
– If the byte starts with 0, it is a complete 1-byte character. Increment i: i = i + 1.
– If the byte starts with 10, it belongs to either a 2-byte or a 3-byte character. Record its index as start = i and check the next byte. If the next byte starts with 110, we have a 2-byte character: call reverse(start, start + 2, str) to restore its byte order, then increment i: i = i + 2. Otherwise, if the next byte also starts with 10, we have a 3-byte character: call reverse(start, start + 3, str) to restore its byte order, then increment i: i = i + 3.
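
The two steps can be put together as in the following minimal sketch. It assumes that str is well-formed SimpleUTF8 (no truncated or stray bytes) and that reverse treats end as exclusive; the class name SimpleUtf8Reverser and the method name reverseCharacters are hypothetical names chosen for this example.

```java
final class SimpleUtf8Reverser {

    // Reverses array[start, end) in place; end is exclusive.
    static void reverse(int start, int end, byte[] array) {
        for (int lo = start, hi = end - 1; lo < hi; lo++, hi--) {
            byte tmp = array[lo];
            array[lo] = array[hi];
            array[hi] = tmp;
        }
    }

    // Reverses the order of the characters in str while keeping the
    // bytes of each character in their original order.
    // Assumption: str is well-formed SimpleUTF8.
    static void reverseCharacters(byte[] str) {
        // Step 1: reverse the whole array. Multi-byte characters are now
        // stored back to front (continuation bytes first, first byte last).
        reverse(0, str.length, str);

        // Step 2: walk the array and restore each multi-byte character.
        int i = 0;
        while (i < str.length) {
            if ((str[i] & 0x80) == 0) {
                // 0XXXXXXX: 1-byte character, nothing to fix.
                i += 1;
            } else if ((str[i + 1] & 0xE0) == 0xC0) {
                // 10XXXXXX followed by 110XXXXX: reversed 2-byte character.
                reverse(i, i + 2, str);
                i += 2;
            } else {
                // 10XXXXXX 10XXXXXX 1110XXXX: reversed 3-byte character.
                reverse(i, i + 3, str);
                i += 3;
            }
        }
    }
}
```

As an illustrative input (not the original sample), the bytes 61 C3 A9 E2 82 AC encode a 1-byte, a 2-byte and a 3-byte character. After calling reverseCharacters, the array holds E2 82 AC C3 A9 61. The approach works in place and visits each byte a constant number of times, so it runs in linear time with constant extra space.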