fix empty cell representation #1685

@jerch

Currently an empty cell in the buffer is represented by a single whitespace character, which causes several issues for any functionality that operates on the string representation. Affected are at least:

  • selection manager
  • linkifier
  • search addon
  • probably copy & paste (not tested)
  • reflow resize, once we support this

Most of these rely on Buffer.translateBufferLineToString, which tries to deal with empty cells via the trimRight flag. Still, there are circumstances where the correct string representation cannot be decided at buffer level, see #791 (comment).
Some workarounds are in place that try to fix the wrongly gathered string data by peeking into the buffer again; others simply break at the edge cases (especially the last cell at the right border is really nasty).
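
A minimal sketch of the ambiguity, under assumptions: `RawCell`, `lineToStringWithSpaces` and the `null` marker for never-written cells are hypothetical names for illustration and not the actual Buffer code. It only mimics the "empty cell becomes one space" behaviour described above.

```typescript
// Hypothetical model: null marks a cell that was never written to.
type RawCell = string | null;

// Mimics the current behaviour: an empty cell is rendered as one space.
function lineToStringWithSpaces(cells: RawCell[], trimRight: boolean): string {
  const s = cells.map(c => (c === null ? ' ' : c)).join('');
  return trimRight ? s.replace(/ +$/, '') : s;
}

// Line A: the user typed a trailing space. Line B: nothing after 'ab'.
const typedTrailingSpace: RawCell[] = ['a', 'b', ' ', null];
const noTrailingSpace: RawCell[] = ['a', 'b', null, null];

// Untrimmed, both lines give the same string.
const untrimmedA = lineToStringWithSpaces(typedTrailingSpace, false);
const untrimmedB = lineToStringWithSpaces(noTrailingSpace, false);

// Trimmed, both lines give the same string again, and the typed space is gone.
const trimmedA = lineToStringWithSpaces(typedTrailingSpace, true);
const trimmedB = lineToStringWithSpaces(noTrailingSpace, true);
```

Once the string is built, the typed space in line A cannot be told apart from the empty cells, with or without trimming.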

To get more uniform handling of empty cells without quirky patches here and there, I suggest defining an empty cell as a value that cannot become part of a buffer string by normal means. Any of the control chars would do, since the input handler filters them or triggers actions for them, so they never end up as cell content in the buffer. IMHO the "hottest" candidate is the null byte '\x00', for several reasons:

  • a 0 value kinda implies there is nothing
  • easy translation with the upcoming content pointer (where the integer value stands for the UTF32 value, ergo 0 translates to '\u0000')
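
To illustrate the second point, here is a rough sketch. It assumes each cell stores its content as a single UTF-32 code point in a `Uint32Array`; that layout and the names `EMPTY_CODE_POINT` and `codePointsToString` are made up for this example and say nothing about how the content pointer will really look.

```typescript
// Assumption: one UTF-32 code point per cell, 0 reserved for "empty".
const EMPTY_CODE_POINT = 0;

// Translate a row of code points into a JS string, one cell at a time.
function codePointsToString(cells: Uint32Array): string {
  let result = '';
  for (let i = 0; i < cells.length; i++) {
    // String.fromCodePoint(0) yields '\u0000', so empty cells need no special case.
    result += String.fromCodePoint(cells[i]);
  }
  return result;
}

// 'a', 'b', empty cell, 'd'
const codePointLine = new Uint32Array([0x61, 0x62, EMPTY_CODE_POINT, 0x64]);
const codePointText = codePointsToString(codePointLine);
```

The empty cell comes out as '\x00' without any extra branch in the translation loop.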

Why not simply use an empty string or null for an empty cell? IMHO this would complicate things even further, since empty cells between others would "collapse" in a JS string, while a placeholder preserves the cell padding. (There is another reason: we have to cover a third state, the cells after fullwidth chars, and those would return an empty string.) Imagine this input:

  • string 'ab'
  • cursor move one right
  • string 'd'
  • cursor move one right
  • string ' ' (a single space)

leads to these buffer states:

current
['a', 'b', ' ', 'd', ' ', ' ', ' ']          ==> 'ab d   ' // trim would cut input whitespace
vs. ''
['a', 'b', '', 'd', '', ' ', '']             ==> 'abd '    // collapsed, padding broken
vs. '\x00'
['a', 'b', '\x00', 'd', '\x00', ' ', '\x00'] ==> 'ab\x00d\x00 \x00'

The third string resembles the buffer state better than the others. The right border problem can now be solved by simply trimming trailing '\x00', which correctly leads to

'ab\x00d\x00 '
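
The comparison above can be reproduced with a small sketch. Assumptions: `Cell`, `lineToString` and the `null` marker for an empty cell are hypothetical and only model the example, they are not the real buffer types.

```typescript
// Hypothetical model: null marks an empty cell.
type Cell = string | null;

// Buffer state from the example: 'ab', skip one, 'd', skip one, ' ', one cell left over.
const cells: Cell[] = ['a', 'b', null, 'd', null, ' ', null];

// Build the line string with a configurable representation for empty cells.
function lineToString(line: Cell[], emptyAs: string): string {
  return line.map(c => (c === null ? emptyAs : c)).join('');
}

const withSpace = lineToString(cells, ' ');   // current behaviour
const withEmpty = lineToString(cells, '');    // collapses, padding broken
const withNull = lineToString(cells, '\x00'); // one char per cell, empties still visible

// Trimming only trailing placeholders keeps the typed space at the end.
const trimmedNull = withNull.replace(/\x00+$/, '');
```

With the placeholder, the string index still equals the cell column, and the trailing space that was really typed survives the trim.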

As a last step, the placeholder can be replaced by whatever further processing needs (most likely whitespace).
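
A possible shape for that final step, as a sketch only: `NULL_CELL` and `finalizeLine` are invented names, and defaulting to a space as replacement is just the "most likely" case mentioned above.

```typescript
// Assumption: '\x00' is the empty cell placeholder proposed in this issue.
const NULL_CELL = '\x00';

// Trim trailing placeholders first, then swap the remaining ones for the replacement.
function finalizeLine(line: string, replacement: string = ' '): string {
  const trimmed = line.replace(/\x00+$/, '');
  return trimmed.split(NULL_CELL).join(replacement);
}

// Untrimmed line string from the example above.
const finalized = finalizeLine('ab\x00d\x00 \x00');
```

Here `finalized` is 'ab d  ' (six characters): the inner padding and the typed trailing space are kept, and only the empty cell at the right border is dropped.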

Up for discussion. There might be other representation tricks to leverage fast built-in string methods with the fullwidth chars too. Also, it might have a negative impact on renderer speed due to the additional check against the placeholder.

cc: @Tyriar, @mofux, @bgw

Labels: type/bug (Something is misbehaving)
