Commit cacdc92

further relaxation of 1590.04 and 1590.07; base R ordering of identical strings in different encodings (#4494)
1 parent f6bc553 commit cacdc92

1 file changed: +7 -7 lines changed

inst/tests/tests.Rraw

Lines changed: 7 additions & 7 deletions
@@ -8140,17 +8140,17 @@ test(1590.02, x1==x2)
 test(1590.03, forderv( c(x2,x1,x1,x2)), integer()) # desirable consistent result given identical(x1, x2)
 # ^^ data.table consistent over time regardless of which version of R or locale
 baseR = base::order(c(x2,x1,x1,x2))
-# Even though C locale and identical(x1,x2), base R considers the encoding too; i.e. orders the same-encoding together.
-# In R <= 4.0.0, base R put x2 (UTF-8) before x1 (latin1).
-# Then in R-devel around May 2020, R-devel on Windows started putting x1 before x2.
-# Jan emailed R-devel on 23 May 2020. PR#4492 retained this test of base R but relaxed the encoding to be in either order.
-# It's good to know that baseR changed. We still want to know in future if base R changes again (so we relaxed 1590.04 and 1590.07 rather than remove them).
-test(1590.04, identical(baseR, INT(1,4,2,3)) || identical(baseR, INT(2,3,1,4)))
+# Even though C locale and identical(x1,x2), base R<=4.0.0 considers the encoding too; i.e. orders the encoding together x2 (UTF-8) before x1 (latin1).
+# Then around May 2020, R-devel (but just on Windows) started either respecting identical() like data.table has always done, or put latin1 before UTF-8.
+# Jan emailed R-devel on 23 May 2020.
+# We relaxed 1590.04 and 1590.07 (tests of base R behaviour) rather than remove them, PR#4492 and its follow-up. But these two tests
+# are so relaxed now that they barely testing anything. It appears base R behaviour is undefined in this rare case of identical strings in different encodings.
+test(1590.04, identical(baseR, INT(1,4,2,3)) || identical(baseR, INT(2,3,1,4)) || identical(baseR, 1:4))
 Encoding(x2) = "unknown"
 test(1590.05, x1!=x2)
 test(1590.06, forderv( c(x2,x1,x1,x2)), INT(1,4,2,3)) # consistent with Windows-1252 result, tested further below
 baseR = base::order(c(x2,x1,x1,x2))
-test(1590.07, identical(baseR, INT(1,4,2,3)) || identical(baseR, INT(2,3,1,4)))
+test(1590.07, identical(baseR, INT(1,4,2,3)) || identical(baseR, INT(2,3,1,4)) || identical(baseR, 1:4))
 Sys.setlocale("LC_CTYPE", ctype)
 Sys.setlocale("LC_COLLATE", collate)
 test(1590.08, Sys.getlocale(), oldlocale) # checked restored locale fully back to how it was before this test
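
For readers without the surrounding test file, the relaxed check in 1590.04 can be sketched in plain base R. The definitions of x1 and x2 below are an assumption (this hunk does not show them), and the names accepted and ok are hypothetical helpers used only for illustration. The tests themselves run under the C locale; this sketch does not change the locale.

```r
# Sketch only, not the data.table test itself.
# Assumed setup: one string marked latin1, the same string converted to UTF-8.
x1 <- "fa\xE7ile"
Encoding(x1) <- "latin1"
x2 <- iconv(x1, from = "latin1", to = "UTF-8")

# identical() compares marked-encoding strings after translation,
# so the two values are identical despite different byte representations.
stopifnot(identical(x1, x2))

# Base R ordering of the mixed-encoding vector, as in the diff.
ord <- base::order(c(x2, x1, x1, x2))

# The three orderings that the relaxed tests accept:
# UTF-8 first, latin1 first, or ties left in input order.
accepted <- list(c(1L, 4L, 2L, 3L), c(2L, 3L, 1L, 4L), 1:4)
ok <- any(vapply(accepted, identical, logical(1), ord))

print(ord)
print(ok)
```

Which of the three orderings appears may depend on the R version, platform and locale, which is why the commit accepts all of them.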
