UTF-8 support #52

LilithSilver · 2022-07-29T01:34:40Z

Currently, there is a bug when parsing UTF-8 or extended ASCII characters: the C function isspace() doesn't accept negative char values (passing one is undefined behavior), and bytes above 0x7F are negative on platforms where char is signed. The simple fix is to cast the value to unsigned char first, which is safe because no ASCII whitespace character can have a negative char value anyway.
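For illustration, here is a minimal sketch of the cast described above. The helper names `is_space_byte` and `count_space_bytes` are hypothetical and are not the identifiers used in this PR; the sketch also assumes the default "C" locale.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (name not taken from the PR): converts the byte to
// unsigned char before calling std::isspace, so bytes >= 0x80 are never
// passed as negative values.
inline bool is_space_byte(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Counts the ASCII whitespace bytes in a UTF-8 encoded string.
// Multi-byte sequences are skipped safely because each of their bytes
// is >= 0x80 and is not whitespace in the "C" locale.
inline std::size_t count_space_bytes(const std::string& text) {
    std::size_t n = 0;
    for (char c : text) {
        if (is_space_byte(c)) {
            ++n;
        }
    }
    return n;
}
```

Calling `count_space_bytes` on UTF-8 text containing accented letters should then count only the real spaces and newlines, without ever passing a negative value to `std::isspace`.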

This PR also adds a test based on a modified version of Markus Kuhn's UTF-8 Demo Page, to ensure that it can parse a variety of characters. The demo is under the CC BY license, which allows unrestricted use with attribution, and the attribution is at the top of the file, so we should be good there.

JBenda · 2022-07-29T16:42:10Z

Oh, is this really the only thing that breaks with UTF-8? Quite handy.

I will try it myself this weekend, but it looks promising.

Thanks for the input.

LilithSilver · 2022-07-29T19:20:06Z

Yep, I was surprised as well, but it makes sense considering that UTF-8 was designed for full ASCII compatibility!
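To make the compatibility point concrete, here is a small sketch (not code from this PR; `encode_utf8` and `has_no_ascii_bytes` are hypothetical names) showing that code points below 0x80 encode as a single ASCII byte, while every byte of a multi-byte sequence has its high bit set and so can never be mistaken for an ASCII character such as a space.

```cpp
#include <cstdint>
#include <string>

// Encodes one Unicode code point (assumed <= 0x10FFFF; surrogates are not
// rejected in this sketch) as a UTF-8 byte sequence.
inline std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // ASCII range: one byte, identical to the ASCII encoding.
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Returns true if every byte has the high bit set, i.e. no byte falls in
// the ASCII range 0x00..0x7F.
inline bool has_no_ascii_bytes(const std::string& bytes) {
    for (char c : bytes) {
        if (static_cast<unsigned char>(c) < 0x80) {
            return false;
        }
    }
    return true;
}
```

This is why a byte-wise check for ASCII whitespace keeps working on UTF-8 input once the signed-char problem is out of the way.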

Note that if you want the UTF-8 to display properly, you'll have to reinterpret the byte data as UTF-8. Visual Studio, for example, doesn't display the output as UTF-8 and shows the strings as garbled extended ASCII. But the test confirms that the byte data produced by ink is indeed correct.

utf8

c338f44

JBenda merged commit b8b36b8 into JBenda:master Jul 31, 2022
