Take a look at Markus Kuhn’s UTF-8 decoder capability and stress test file.
You’ll find examples of many UTF-8 irregularities, including lonely start bytes, missing continuation bytes, overlong sequences, etc.
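To get a feel for these categories before opening the file, here is a minimal Python sketch. The byte strings are my own illustrative samples, not lines copied from Kuhn’s file, and `is_valid_utf8` is a hypothetical helper name. It relies only on the fact that `bytes.decode("utf-8")` is strict by default and raises `UnicodeDecodeError` on malformed input.

```python
# Illustrative samples (assumed for this sketch, not taken from the test file),
# one per kind of irregularity mentioned above.
MALFORMED = {
    # 0xC3 starts a 2-byte sequence but is followed by an ASCII space.
    "lonely start byte": b"\xc3 ",
    # First two bytes of the 3-byte encoding of U+20AC, with the last byte cut off.
    "missing continuation byte": b"\xe2\x82",
    # Two-byte overlong encoding of "/" (U+002F), which must be encoded as 0x2F.
    "overlong sequence": b"\xc0\xaf",
}


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8")  # errors="strict" is the default
    except UnicodeDecodeError:
        return False
    return True


# Map each category name to whether Python's decoder accepts the sample.
results = {name: is_valid_utf8(raw) for name, raw in MALFORMED.items()}

if __name__ == "__main__":
    for name, ok in results.items():
        print(f"{name}: {'accepted' if ok else 'rejected'}")
```

A decoder that accepts any of these samples is worth checking against the full test file, which covers many more cases than this sketch does.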