-
Notifications
You must be signed in to change notification settings - Fork 176
kernel: move ScannerState out of GAPState #3805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
980d91b to
cd10804
Compare
|
Does this PR deliberately have the |
|
@wilfwilson yes, because it removes |
|
Rebase at your pleasure! |
cd10804 to
296f460
Compare
|
|
||
| ScannerState scanner; | ||
| ScannerState * volatile s = &scanner; | ||
| memset(s, 0, sizeof(ScannerState)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now this PR causes a diff in testspecial; this is because we are now always using a fresh ScannerState; as a result, the check a few lines below for s->Symbol == S_EOF never succeeds, and so we never bail out earlier if the input file is already at EOF. The fix will be to add some other means to test for EOF in the input. I just didn't have time yet to take care of that; will do later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now had time to dig in, and my initial assumption about this being related to EOF was both right and wrong. I'll try to explain (also for my own sake, to see if I understood it...). The problematic test executed this code:
Read(InputTextString("1/0"));
quit;Note that there are two issues in the string we Read: A division by zero; and then missing semicolon. When reading the code, it is immediately executed, so the first thing that happens is an error for the division by zero. The quit exits the resulting break loop. Now parsing continues, and we hit EOF when we expected a semicolon; so now a syntax error is raised.
Now normally, SyntaxError prints a suitable error message, then prints the offending input line and in another line prints some stuff to "underline" the exact position of the syntax error. To do this, it needs to determine the (content of the) current input line, and also the start and end offset of the problematic token.
However, in this particular case, the stream actually reached EOF. It seems that this wipes the input line buffer (though I have not yet figured out why exactly). Hence SyntaxError fails to re-print the offending input. Which was always the case. What happens next is were things are different: The code in SyntaxError which tries to "underline" the code with a syntax error did not take this possibility into account, and as a result, erroneously printed a superfluous empty line.
It did so because while on master, we see SymbolStartPos[0] == 3 when SyntaxError() runs, with this PR we see SymbolStartPos[0] == -1. That causes the variables startPos and pos to be both -1; and then SyntaxError prints an empty line when it should not have printed anything. I've now fixed this oversight by adding a check that verifies that startPos is non-negative.
So where is the difference coming from? Well, the "3" actually comes from scanning the quit in the user's input. Note that on master, we reuse the same ScannerState instance for everything, and we never bothered to clear it between uses... Hence the SymbolStartPos entries from parsing the user's input from stdin stayed around, even when we switched back to parsing input from that string stream...
Clearly that's a bug, which this PR fixed "by accident", and that unmasked another dormant bug in SyntaxError, which I've now also fixed. Phew :-).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to slightly modify my fix: startPos is allowed to be negative (!) but pos must be non-negative, so we check that now instead.
And it turns out that this also "fixed" some existing tests that had weird extra empty lines... Nice :-)
95f98f7 to
a430cb4
Compare
This simplifies the code overall.
a430cb4 to
803a31e
Compare
|
Another unexpected benefit of this PR is that it fixes some incorrect interpreter profiling data, as can be seen by comparing https://codecov.io/gh/gap-system/gap/src/master/lib/init.g and https://codecov.io/gh/gap-system/gap/src/803a31eb2b9317bfea2ee12cd29dba1d0f75b321/lib/init.g |
This simplifies the code overall.
The next step will be to move the
ReaderStateout of GAPState and onto the stack, too. That's a much bigger change, though, both in terms of the size of the diff as well as the complications it involves. But it also has a bigger payoff, as it will allow us to remove some code that deal with "recursion" in the reader/coder/... and will open up further refactoring opportunities towards a more modular codebase that one can reason about ore easily as well.