Formatter Serialize by gulugulubing · Pull Request #67 · schveiguy/jsoniopipe

gulugulubing · 2025-09-23T18:21:40Z

Reimplemented the formatjson example program with the new Formatter and tested it on some JSON files locally.

schveiguy

Good start.

source/iopipe/json/serialize.d

schveiguy · 2025-11-12T04:09:26Z

Hi @gulugulubing how much time do you have to finish this up? I want to get the serialization stuff in the books for release. If you have time, I can do some more review. If you don't have time, I can take what you have so far and get it to a mergeable state. No worries if you don't have time, I would completely understand.

gulugulubing · 2025-11-12T04:45:56Z

Sorry @schveiguy, I just checked the conversations but I couldn't recall what still needs to be done in this PR.

schveiguy · 2025-11-12T04:47:52Z

Yes, and that's my fault. I was going to go through this again, but wanted to first make sure you still are able to work on it. I'll review this again and get more feedback for you. Thanks!

schveiguy · 2025-11-22T05:18:54Z

source/iopipe/json/serialize.d

+    private bool isValidJSONNumber(string str) {
+        import std.regex;
+        static auto numberRegex = regex(r"^-?(0|[1-9]\d*)(\.\d+)?([eE][+-]?\d+)?$");
+        return !str.matchFirst(numberRegex).empty;


I don't love using regex here. We do have a number validator in the parser, but it's awkward to use. Probably should factor out the validation.

This is OK for now.
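As a rough illustration of what a regex-free check could look like, here is a hand-written scan of the same grammar the regex encodes. This is only a sketch: it is not the parser's existing validator, and apart from reusing the name isValidJSONNumber everything in it is an assumption made for the example.

```d
/// Sketch: returns true if `s` matches the JSON number grammar
/// -?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?
bool isValidJSONNumber(const(char)[] s)
{
    size_t i = 0;

    // Consumes a run of ASCII digits and returns how many were consumed.
    size_t digits()
    {
        immutable start = i;
        while (i < s.length && s[i] >= '0' && s[i] <= '9')
            ++i;
        return i - start;
    }

    // Optional leading minus sign (a leading plus is not allowed).
    if (i < s.length && s[i] == '-')
        ++i;

    // Integer part: a single 0, or a nonzero digit followed by more digits.
    if (i < s.length && s[i] == '0')
        ++i;
    else if (digits() == 0)
        return false;

    // Optional fraction: '.' followed by at least one digit.
    if (i < s.length && s[i] == '.')
    {
        ++i;
        if (digits() == 0)
            return false;
    }

    // Optional exponent: e or E, optional sign, at least one digit.
    if (i < s.length && (s[i] == 'e' || s[i] == 'E'))
    {
        ++i;
        if (i < s.length && (s[i] == '+' || s[i] == '-'))
            ++i;
        if (digits() == 0)
            return false;
    }

    // Anything left over (for example the "1" in "01") makes it invalid.
    return i == s.length;
}
```

Factoring the parser's own validation out, as suggested above, would probably still be preferable to keeping two separate implementations.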

schveiguy · 2025-11-22T05:41:33Z

I had a whole thing written up for how I want this to look, and I'm still mulling over it in my head, so I erased it all.

Let me get this all written down, and I will get back to you. I think this is the right path, but maybe tweak some things. The formatter needs either options for everything, or we need specific formatter types (this is what I'm leaning towards).

Will work on a review this weekend.

schveiguy

In addition to all the comments below, we need a state system.

Many of these functions are only valid if you are in the right state. For example, it only makes sense to accept string data inside a string. It only makes sense to accept an object key if you are in an object, etc.

If we add the comma function, there's another state that needs handling.

schveiguy · 2025-11-24T02:09:58Z

source/iopipe/json/serialize.d

+
+    void addMember(T)(T key, MemberOptions options) {
+
+        if (key.length == 0 ) {


I don't like this requirement to call addMember for arrays. I had expected the different value starting functions (beginObject, beginArray, beginString, addNumericData, addKeywordValue) to handle the comma.

Thinking about comments, I think actually we probably want to add a specific function to handle adding the comma. Because comments can go anywhere.

source/iopipe/json/serialize.d

schveiguy · 2025-11-24T03:01:23Z

Needs a rebase and resolve the conflict.

gulugulubing · 2025-11-25T03:33:01Z

> In addition to all the comments below, we need a state system.
>
> Many of these functions are only valid if you are in the right state. For example, it only makes sense to accept string data inside a string. It only makes sense to accept an object key if you are in an object, etc.
>
> If we add the comma function, there's another state that needs handling.

That makes sense, but I hadn't thought about it before. Would it be something like the following in struct JSONFormatter:

enum FormatterState {
    TopLevel,
    InObjectKey,    // expecting a key (addMember)
    InObjectValue,  // expecting a value after colon
    InArray,        // inside an array, expecting a value
    InString,       // inside string data
}

FormatterState state = FormatterState.TopLevel;

// track nested containers: Obj/Array
enum ContainerKind { Obj, Arr }
ContainerKind[] containerStack;

schveiguy · 2025-11-25T04:00:56Z

That's close I think. You can determine the current aggregate type (object or array) by the containerStack.

You can actually probably just reuse the state enum from the parser, though there is no "colon" state as that's already handled. At least you can reuse the state names that do apply. Note the BitArray stack as well, please copy that design.

Oh, one more thing, the formatter needs to go in its own file. This is not part of the serializer, just like the parser isn't. I think an iopipe.json.formatter module.
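For reference, a bit-per-level container stack could look roughly like the sketch below. It is not copied from the parser: the type name ContainerStack, its members, and the choice of 1 = object, 0 = array are all assumptions made for illustration, and the real BitArray-based design in the parser may differ.

```d
/// Hypothetical stack of container kinds using one bit per nesting level:
/// 1 = object, 0 = array. Not the parser's actual implementation.
struct ContainerStack
{
    private size_t[] words; // packed bits, lowest level in the lowest bit
    private size_t depth;   // number of currently open containers

    private enum bitsPerWord = size_t.sizeof * 8;

    /// True when no container is open (top level).
    bool empty() const { return depth == 0; }

    /// Records that an object (true) or an array (false) was opened.
    void push(bool isObject)
    {
        immutable word = depth / bitsPerWord;
        immutable mask = size_t(1) << (depth % bitsPerWord);
        if (word == words.length)
            words ~= 0; // grow by one word when crossing a word boundary
        if (isObject)
            words[word] |= mask;
        else
            words[word] &= ~mask; // clear a bit possibly left by an earlier push
        ++depth;
    }

    /// True if the innermost open container is an object.
    bool inObject() const
    {
        assert(depth > 0, "no open container");
        immutable top = depth - 1;
        return ((words[top / bitsPerWord] >> (top % bitsPerWord)) & 1) != 0;
    }

    /// Closes the innermost container.
    void pop()
    {
        assert(depth > 0, "no open container");
        --depth;
    }
}
```

Compared with a ContainerKind[] array this uses one bit instead of one enum value per level, which is the main point of the bit-array approach.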

gulugulubing · 2025-11-26T16:36:03Z

> You can actually probably just reuse the state enum from the parser, though there is no "colon" state as that's already handled. At least you can reuse the state names that do apply. Note the BitArray stack as well, please copy that design.

In the latest commit, I found that a comma state is not needed either.

schveiguy

Stopped looking, the state machine seems wrong. Each thing that adds data to the stream should change the state.

Maybe diagram it out? I'm not sure how it's supposed to work.

source/iopipe/json/formatter.d

schveiguy · 2025-11-28T04:29:56Z

source/iopipe/json/serialize.d

    // Parse array elements
    size_t elementCount = 0;
-    while(true) {
+    while(tokenizer.peekSkipComma() != JSONToken.ArrayEnd) {


Unsure why this is here. Maybe you need a rebase?

Since there was a conflict between this branch and master, I guess I need to accept the master version before rebasing or merging.

source/iopipe/json/formatter.d

schveiguy · 2025-11-29T02:12:53Z

source/iopipe/json/formatter.d

+            // subsequent member: comma + indent
+            putStr(",");
+            putIndent();
+            // stay in Member


Seems incorrect:

startObjectMember(); startObjectMember();

will output 2 consecutive commas.

gulugulubing · 2025-11-30T03:40:59Z

> Stopped looking, the state machine seems wrong. Each thing that adds data to the stream should change the state.
>
> Maybe diagram it out? I'm not sure how it's supposed to work.

Yeah, the process is a little bit subtle. Some functions that add data don't seem to need a state change.

Let's check a simple example:

{
  "name": "John",
  "age": 30,
  "hobbies": ["reading", "swimming"]
}

Begin → beginObject(), output "{" → First
First → addMember() → Value

In addMember(): startObjectMember() adds no comma because the state is First; First → Member → beginString(), addStringData(), endStringRaw(), output "name:"

Value → beginString(), addStringData(), endString(), output "John" → Member
Member → addMember() → Value

In addMember(): startObjectMember() adds a comma because the state is Member; Member → Member → beginString(), addStringData(), endStringRaw(), output "age:"

Value → addNumericData(), output 30 → Member
Member → addMember() → Value

In addMember(): startObjectMember() adds a comma because the state is Member; Member → Member → beginString(), addStringData(), endStringRaw(), output "hobbies:"

Value → beginArray(), output "[" → First
First → beginArrayValue(), adds no comma because the state is First → Member
Member → beginString(), addStringData(), endString(), output "reading" → Member
Member → beginArrayValue(), adds a comma because the state is Member → Member
Member → beginString(), addStringData(), endString(), output "swimming" → Member
Member → endArray(), output "]" → the stack is not empty, so Member
Member → endObject(), output "}" → the stack is empty, so End

schveiguy

So there are a few states which allow adding a "value":

Begin
First for arrays
Member for arrays
Value for objects

And a "value" can be the start of a string, a number, a keyword, an object start, or an array start.

I suggest writing a helper method called canAddValue which returns true if the next thing can be a value. Then use this in all your code where you are validating state. This will also help make the code cleaner.

In fact, helper methods to validate the state in all cases, named appropriately, would make things a lot clearer. Methods like:

canAddValue -> adding a value is allowed
canAddStringData -> adding string data (or ending string) is allowed
canAddComment -> adding a comment is allowed
canCloseAggregate -> closing the aggregate is allowed

These can be private methods. They will just contain state validation and return a boolean value. The code that uses it will read a lot cleaner. It also allows the same validation code to be used for both throwing an exception and asserting.
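To make the suggestion concrete, a minimal sketch of such helpers follows. The state names Begin, First, Member, Value and End come from the walkthrough above; the String state, the inObject flag, the type name FormatterCore and the exact rules for comments and closing aggregates are assumptions made for the example, not the PR's actual code.

```d
/// Hypothetical state set; String is an assumed name for "inside string data".
enum State { Begin, First, Member, Value, String, End }

struct FormatterCore
{
    State state = State.Begin;
    bool inObject; // kind of the innermost open container (assumed field)

    /// A value may start at the top level, in an array slot, or after an object key.
    bool canAddValue() const
    {
        final switch (state)
        {
            case State.Begin, State.Value:
                return true;
            case State.First, State.Member:
                return !inObject; // in an object these states expect a key
            case State.String, State.End:
                return false;
        }
    }

    /// String data (or the closing quote) only makes sense inside a string.
    bool canAddStringData() const { return state == State.String; }

    /// Assumption: comments are allowed anywhere except inside a string.
    bool canAddComment() const { return state != State.String; }

    /// Assumption: an aggregate can be closed when empty or after a complete member.
    bool canCloseAggregate() const
    {
        return state == State.First || state == State.Member;
    }

    /// The same predicate can back a thrown exception...
    void checkValue() const
    {
        if (!canAddValue())
            throw new Exception("a value is not allowed in the current state");
    }

    /// ...or an assert in a non-validating build.
    void assertValue() const
    {
        assert(canAddValue(), "a value is not allowed in the current state");
    }
}
```

Each public formatter function would then begin with one of these checks, so the state rules live in one place instead of being repeated in every method.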

source/iopipe/json/formatter.d

schveiguy · 2025-11-30T04:20:10Z

source/iopipe/json/formatter.d

+    // // add a comment (JSON5 only), must be a complete comment (validated)
+    void addComment(T)(T commentData) {
+        static if (is(T == string)) {
+            if (!commentData.startsWith("//") && !commentData.startsWith("/*") && !commentData.endsWith("*/")) {


// comments need to end with a newline

This also requires // comments to end with */

The logic is hard to parse as well. I suggest:

bool isValidComment = (commentData.startsWith("//") && commentData.endsWith("\n"))
    || (commentData.startsWith("/*") && commentData.endsWith("*/"));
if (!isValidComment) throw new ...
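Pulled out into a standalone function, the suggested check might look like the sketch below. The function form and the minimum-length guard are additions for illustration only: without the guard, the three-character input "/*/" would pass because its single "*" serves both the opening and the closing delimiter. The sketch does not check for embedded newlines in // comments or for an early "*/" inside block comments.

```d
import std.algorithm.searching : startsWith, endsWith;

/// Sketch: true if `commentData` is a complete line comment ("// ...\n")
/// or a complete block comment ("/* ... */").
bool isValidComment(const(char)[] commentData)
{
    // Line comments must be terminated by a newline.
    immutable lineComment = commentData.startsWith("//")
        && commentData.endsWith("\n");

    // Block comments need separate opening and closing delimiters,
    // hence at least four characters (this guard is an assumption).
    immutable blockComment = commentData.length >= 4
        && commentData.startsWith("/*")
        && commentData.endsWith("*/");

    return lineComment || blockComment;
}
```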

schveiguy · 2025-11-30T04:37:58Z

One thing here is that there are no unittests.

The serialize code has formatting unittests, but we haven't hooked up serialization yet. So maybe a few simple ones, and maybe a few that validate the state handling is correct (i.e. you get an exception if you try doing things in the wrong order).

source/iopipe/json/formatter.d

schveiguy

Sorry for the delay, I've been super busy. More changes. I haven't reviewed the unittests yet.

source/iopipe/json/formatter.d

schveiguy · 2025-12-23T03:14:39Z

source/iopipe/json/formatter.d

+        static if (validate) {
+            // Validate that the string doesn't contain invalid characters
+            foreach(char c; value) {
+                if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {


This isn't quite correct: it also needs to check for quote characters and lone backslashes.

In addition, I would expect that with both add escapes and validate set to true, characters like newline would still work and be replaced with a literal \n.

I know I proposed this API in the discussion, but I now think it's wrong. You can validate, add escapes, or do neither. You can't do both (adding escapes already ensures a valid string).

Maybe change to an enum?

passThru -> don't do anything, assume the string data is correct
addEscapes -> any invalid character is escaped into a valid sequence (this should be the default)
validate -> verify escapes are correctly used in the data.
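A possible shape for the three modes is sketched below. The enum member names come from the list above; the enum name StringHandling and the helper functions escapeJSON and isValidStringData are hypothetical, written only to show what addEscapes and validate could each do. Surrogate-pair rules for \u escapes are not checked.

```d
import std.array : appender;
import std.ascii : isHexDigit;
import std.format : formattedWrite;

/// Hypothetical replacement for the two boolean flags.
enum StringHandling { passThru, addEscapes, validate }

/// addEscapes: turn raw data into valid JSON string content.
string escapeJSON(const(char)[] value)
{
    auto result = appender!string();
    foreach (char c; value)
    {
        switch (c)
        {
            case '"':  result.put(`\"`); break;
            case '\\': result.put(`\\`); break;
            case '\n': result.put(`\n`); break;
            case '\r': result.put(`\r`); break;
            case '\t': result.put(`\t`); break;
            default:
                if (c < 0x20)
                    result.formattedWrite(`\u%04x`, cast(uint) c); // other control chars
                else
                    result.put(c); // includes UTF-8 code units, passed through as-is
        }
    }
    return result.data;
}

/// validate: the data must already be correctly escaped.
bool isValidStringData(const(char)[] value)
{
    for (size_t i = 0; i < value.length; ++i)
    {
        immutable c = value[i];
        if (c == '"' || c < 0x20)
            return false; // raw quote or raw control character
        if (c != '\\')
            continue;
        if (++i == value.length)
            return false; // lone trailing backslash
        switch (value[i])
        {
            case '"', '\\', '/', 'b', 'f', 'n', 'r', 't':
                break;
            case 'u':
                // \u must be followed by four hex digits
                if (value.length - i <= 4)
                    return false;
                foreach (h; value[i + 1 .. i + 5])
                    if (!isHexDigit(h))
                        return false;
                i += 4;
                break;
            default:
                return false; // backslash followed by an invalid escape character
        }
    }
    return true;
}
```

With this split, passThru writes the data unchanged, addEscapes writes escapeJSON(value), and validate rejects the data when isValidStringData(value) is false.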

source/iopipe/json/formatter.d

Formatter Serialize

e3f2528

schveiguy requested changes Sep 24, 2025

polish the Formatter

d209378

schveiguy reviewed Sep 26, 2025

delete unrelated modification

6a3173d

schveiguy reviewed Nov 22, 2025

schveiguy requested changes Nov 24, 2025

gulugulubing added 2 commits November 24, 2025 18:36

Refine comma, string and numeric handling

e668d25

Remove duplicate peekSkipComma, use master version

71aab95

gulugulubing added 2 commits November 26, 2025 08:43

split formatter from serialize and add state in formatter

b28db94

delete duplicated peekComma() definition

956bd94

gulugulubing requested a review from schveiguy November 26, 2025 16:36

schveiguy requested changes Nov 29, 2025

schveiguy requested changes Nov 30, 2025

use canDoXXX before doing something

4c670cc

benjones reviewed Dec 3, 2025

gulugulubing requested a review from schveiguy December 3, 2025 04:44

schveiguy requested changes Dec 23, 2025

Correct addStringData

090a5e0

gulugulubing requested a review from schveiguy December 26, 2025 04:07

