Description
Currently an empty cell in the buffer is represented by a single whitespace character, which causes several issues for any functionality that operates on the string representation. Affected are at least:
- selection manager
- linkifier
- search addon
- probably copy & paste (not tested)
- reflow resize, once we support this
Most of these rely on Buffer.translateBufferLineToString, which tries to deal with empty cells via the trimRight flag. Still, there are circumstances where it cannot be decided at buffer level how to create the correct string representation, see #791 (comment).
Some workarounds are in place that try to fix the wrongly gathered string data by peeking into the buffer again; others simply break at the edge cases (the last cell at the right border is especially nasty).
To get more uniform handling of empty cells without quirky patches here and there, I suggest defining an empty cell as a value that cannot become part of a buffer string by normal means. Any of the control chars would do, since the input handler filters control chars or triggers actions for them, so they never end up as cell content in the buffer. Imho the "hottest" candidate is the null byte '\x00', for several reasons:
- a 0 value kinda implies there is nothing
- easy translation with upcoming content pointer (where integer value stands for the UTF32 value, ergo 0 translates to '\u0000')
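The second point can be sketched as follows. This is only an illustration, assuming a cell stores its content as a UTF-32 code point; `EMPTY_CELL_CODE` and `cellCodeToString` are hypothetical names, not existing xterm.js API.

```typescript
// Hypothetical sketch: a cell stores its content as a UTF-32 code point,
// and 0 is reserved to mean "empty cell".
const EMPTY_CELL_CODE = 0;

// Translate a stored code point into its JS string form.
// String.fromCodePoint handles values above 0xFFFF via surrogate pairs.
function cellCodeToString(code: number): string {
  return String.fromCodePoint(code);
}

// The empty cell maps directly onto the proposed placeholder.
const emptyCell = cellCodeToString(EMPTY_CELL_CODE);
console.log(emptyCell === '\x00');
```

With this mapping no special case is needed to produce the placeholder; it falls out of the normal code point translation.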
Why not simply use an empty string or null for an empty cell? Imho this would complicate things even further, since empty cells between others would "collapse" in a JS string, while a placeholder preserves the cell padding. (There is another reason: the empty string is needed for a third state, namely the cells following fullwidth chars.) Imagine this input:
- string 'ab'
- cursor move one right
- string 'd'
- cursor move one right
- string ' ' (one whitespace)
leads to these buffer states:
current
['a', 'b', ' ', 'd', ' ', ' ', ' '] ==> 'ab d   ' // trim would cut input whitespace
vs. ''
['a', 'b', '', 'd', '', ' ', ''] ==> 'abd ' // collapsed, padding broken
vs. '\x00'
['a', 'b', '\x00', 'd', '\x00', ' ', '\x00'] ==> 'ab\x00d\x00 \x00'
The third string resembles the buffer state better than the others. The right border problem can now be solved by simply trimming trailing '\x00', which correctly leads to
'ab\x00d\x00 '
As a last step, the placeholder can be replaced by whatever is needed for further processing (most likely whitespace).
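The trim and replace steps could look roughly like this. It is a minimal sketch that models a buffer line as a plain array of cell strings; `cellsToString` and its parameters are made up for illustration and do not reflect the actual signature of Buffer.translateBufferLineToString.

```typescript
// Proposed placeholder for an empty cell.
const EMPTY_CELL = '\x00';

// Hypothetical helper: join cell contents into a string, optionally drop
// trailing empty cells, then substitute the remaining placeholders with `fill`.
function cellsToString(cells: string[], trimRight: boolean, fill: string = ' '): string {
  let end = cells.length;
  if (trimRight) {
    // Only placeholders are trimmed; real whitespace from input stays.
    while (end > 0 && cells[end - 1] === EMPTY_CELL) {
      end--;
    }
  }
  return cells.slice(0, end).join('').split(EMPTY_CELL).join(fill);
}

// Buffer state from the example above.
const cells = ['a', 'b', '\x00', 'd', '\x00', ' ', '\x00'];

// Keep the placeholders to inspect the trimmed intermediate string.
const trimmed = cellsToString(cells, true, EMPTY_CELL);
// Replace the placeholders with whitespace for further processing.
const text = cellsToString(cells, true);

console.log(JSON.stringify(trimmed), JSON.stringify(text));
```

Because trimming only looks at the placeholder, the whitespace that came from real input survives at the right border, which is exactly the case the trimRight flag cannot decide today.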
Up for discussion. There might be other representation tricks that leverage fast built-in string methods with fullwidth chars too. Also, the additional check against the placeholder might have a negative impact on renderer speed.