Skip to content

Conversation

@fingolfin
Copy link
Member

This simplifies the code overall.

The next step will be to move the ReaderState out of GAPState and onto the stack, too. That's a much bigger change, though, both in terms of the size of the diff as well as the complications it involves. But it also has a bigger payoff, as it will allow us to remove some code that deal with "recursion" in the reader/coder/... and will open up further refactoring opportunities towards a more modular codebase that one can reason about ore easily as well.

@fingolfin fingolfin added topic: kernel release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes labels Dec 28, 2019
@wilfwilson
Copy link
Member

Does this PR deliberately have the kernel: move IsIdent to sysstr.h commit in it?

@fingolfin
Copy link
Member Author

@wilfwilson yes, because it removes scanner.h from gapstate.h. But a bunch of files indirectly relied on scanner.h being included to get sysstr.h (this is by accident, not by design). Since PR #3807 fixes this as well, it seemed silly to repeat that work here. Once PR #3807 is merged, I can rebase this PR here.

@wilfwilson
Copy link
Member

Rebase at your pleasure!


ScannerState scanner;
ScannerState * volatile s = &scanner;
memset(s, 0, sizeof(ScannerState));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this PR causes a diff in testspecial; this is because we are now always using a fresh ScannerState; as a result, the check a few lines below for s->Symbol == S_EOF never succeeds, and so we never bail out earlier if the input file is already at EOF. The fix will be to add some other means to test for EOF in the input. I just didn't have time yet to take care of that; will do later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now had time to dig in, and my initial assumption about this being related to EOF was both right and wrong. I'll try to explain (also for my own sake, to see if I understood it...). The problematic test executed this code:

Read(InputTextString("1/0"));
quit;

Note that there are two issues in the string we Read: A division by zero; and then missing semicolon. When reading the code, it is immediately executed, so the first thing that happens is an error for the division by zero. The quit exits the resulting break loop. Now parsing continues, and we hit EOF when we expected a semicolon; so now a syntax error is raised.

Now normally, SyntaxError prints a suitable error message, then prints the offending input line and in another line prints some stuff to "underline" the exact position of the syntax error. To do this, it needs to determine the (content of the) current input line, and also the start and end offset of the problematic token.

However, in this particular case, the stream actually reached EOF. It seems that this wipes the input line buffer (though I have not yet figured out why exactly). Hence SyntaxError fails to re-print the offending input. Which was always the case. What happens next is were things are different: The code in SyntaxError which tries to "underline" the code with a syntax error did not take this possibility into account, and as a result, erroneously printed a superfluous empty line.

It did so because while on master, we see SymbolStartPos[0] == 3 when SyntaxError() runs, with this PR we see SymbolStartPos[0] == -1. That causes the variables startPos and pos to be both -1; and then SyntaxError prints an empty line when it should not have printed anything. I've now fixed this oversight by adding a check that verifies that startPos is non-negative.

So where is the difference coming from? Well, the "3" actually comes from scanning the quit in the user's input. Note that on master, we reuse the same ScannerState instance for everything, and we never bothered to clear it between uses... Hence the SymbolStartPos entries from parsing the user's input from stdin stayed around, even when we switched back to parsing input from that string stream...

Clearly that's a bug, which this PR fixed "by accident", and that unmasked another dormant bug in SyntaxError, which I've now also fixed. Phew :-).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to slightly modify my fix: startPos is allowed to be negative (!) but pos must be non-negative, so we check that now instead.

And it turns out that this also "fixed" some existing tests that had weird extra empty lines... Nice :-)

@fingolfin fingolfin force-pushed the mh/ScannerState branch 4 times, most recently from 95f98f7 to a430cb4 Compare December 29, 2019 23:55
@coveralls
Copy link

coveralls commented Dec 30, 2019

Coverage Status

Coverage increased (+0.4%) to 85.057% when pulling 803a31e on fingolfin:mh/ScannerState into 892aad8 on gap-system:master.

@fingolfin
Copy link
Member Author

Another unexpected benefit of this PR is that it fixes some incorrect interpreter profiling data, as can be seen by comparing https://codecov.io/gh/gap-system/gap/src/master/lib/init.g and https://codecov.io/gh/gap-system/gap/src/803a31eb2b9317bfea2ee12cd29dba1d0f75b321/lib/init.g

@ChrisJefferson ChrisJefferson merged commit 7364695 into gap-system:master Dec 31, 2019
@fingolfin fingolfin deleted the mh/ScannerState branch January 2, 2020 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes topic: kernel

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants