@karwa
Contributor

@karwa karwa commented Nov 14, 2020

Currently: Creating a buffer with a negative count, or a nil pointer and non-zero count (in violation of the initialiser's documentation) traps in every build mode. This is unfortunate, since these conditions are difficult for the compiler to prove and eliminate.

After this change: Violating the documentation is a user error and will only trap in debug mode. As with other operations on unsafe buffers (e.g. out-of-bounds subscripting), such violations will not trap in release mode.

Every other Unsafe(Mutable)BufferPointer precondition in this file is debug-mode only.

This information is already part of the documentation for this function, and it's really hard to get rid of these branches and traps otherwise.
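
To make the distinction concrete, here is a minimal sketch using the public `precondition` and `assert` functions as stand-ins for the standard library's internal checks. The two helper functions are hypothetical and are not the standard library's sources; they only contrast a check that survives `-O` with one that is compiled out of it.

```swift
// Hypothetical helpers for illustration; not the standard library's implementation.

/// Validates its arguments in debug and -O builds (the behaviour before this change).
func makeBufferAlwaysChecked(
  start: UnsafePointer<Int>?, count: Int
) -> UnsafeBufferPointer<Int> {
  precondition(count >= 0, "buffer with negative count")
  precondition(count == 0 || start != nil, "nil start with nonzero count")
  return UnsafeBufferPointer(start: start, count: count)
}

/// Validates its arguments in debug builds only (the behaviour after this change).
func makeBufferDebugChecked(
  start: UnsafePointer<Int>?, count: Int
) -> UnsafeBufferPointer<Int> {
  assert(count >= 0, "buffer with negative count")
  assert(count == 0 || start != nil, "nil start with nonzero count")
  return UnsafeBufferPointer(start: start, count: count)
}

let values = [1, 2, 3]
let sum = values.withUnsafeBufferPointer { source -> Int in
  // Valid arguments pass both styles of check.
  let view = makeBufferDebugChecked(start: source.baseAddress, count: source.count)
  return view.reduce(0, +)
}
print(sum) // prints 6
```

With valid arguments the two helpers behave identically; they differ only in whether the comparison and trap branches remain in optimized code.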
@karwa
Contributor Author

karwa commented Nov 14, 2020

CC @airspeedswift as stdlib code owner. I'm not sure how to properly "request a review".

@airspeedswift
Member

I’ve added @atrick and @lorentey to give an initial opinion.

Contributor

@atrick atrick left a comment

This change seems consistent with the intention of the UBP types. I can't come up with an argument against it.

@airspeedswift
Member

@swift-ci please test

@airspeedswift
Member

@swift-ci please benchmark

@swift-ci
Contributor

Build failed before running benchmark.

@swift-ci
Contributor

Build failed
Swift Test Linux Platform
Git Sha - e0b4228

@swift-ci
Contributor

Build failed
Swift Test OS X Platform
Git Sha - e0b4228

Member

@lorentey lorentey left a comment

This looks good! There should not be any performance drawback to using buffer pointers instead of naked pointers.

A tiny part of me wants to argue against this on some vague principle, but I'm failing to come up with any reasonable argument that wouldn't also obviously apply to subscripts. 😉

@lorentey
Member

******************** TEST 'Swift(macosx-x86_64) :: stdlib/ArrayTraps.swift.gyb' FAILED ********************
...
[ RUN      ] ArrayTraps_release.unsafeLength
expecting a crash, but the test did not crash
[     FAIL ] ArrayTraps_release.unsafeLength

@lorentey
Member

******************** TEST 'Swift(macosx-x86_64) :: SILOptimizer/utf8_decoding_fastpath.swift' FAILED ********************
/Users/buildnode/jenkins/workspace/swift-PR-osx/branch-main/swift/test/SILOptimizer/utf8_decoding_fastpath.swift:71:11: error: CHECK: expected string not found in input
// CHECK: function_ref {{.*}}_fromUTF8Repairing
          ^
<stdin>:505:72: note: scanning from here
sil @$s22utf8_decoding_fastpath17decodeUMRBPAsUTF8ySSSwF : $@convention(thin) (UnsafeMutableRawBufferPointer) -> @owned String {
                                                                       ^
<stdin>:513:3: note: possible intended match here
 // function_ref closure #2 in String.init<A, B>(decoding:as:)
  ^

Input file: <stdin>
Check file: /Users/buildnode/jenkins/workspace/swift-PR-osx/branch-main/swift/test/SILOptimizer/utf8_decoding_fastpath.swift

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
          500:  dealloc_stack %2 : $*_HasContiguousBytes // id: %59
          501:  return %57 : $String // id: %60
          502: } // end sil function '$s22utf8_decoding_fastpath16decodeURBPAsUTF8ySSSWF'
          503: 
          504: // decodeUMRBPAsUTF8(_:)
          505: sil @$s22utf8_decoding_fastpath17decodeUMRBPAsUTF8ySSSwF : $@convention(thin) (UnsafeMutableRawBufferPointer) -> @owned String {
check:71'0                                                                            X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
          506: // %0 "ptr" // users: %4, %1
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          507: bb0(%0 : $UnsafeMutableRawBufferPointer):
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          508:  debug_value %0 : $UnsafeMutableRawBufferPointer, let, name "ptr", argno 1 // id: %1
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          509:  %2 = alloc_stack $_HasContiguousBytes // users: %13, %12, %5, %3
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          510:  %3 = init_existential_addr %2 : $*_HasContiguousBytes, $UnsafeMutableRawBufferPointer // user: %4
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          511:  store %0 to %3 : $*UnsafeMutableRawBufferPointer // id: %4
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          512:  %5 = open_existential_addr immutable_access %2 : $*_HasContiguousBytes to $*@opened("AF5B145A-29BB-11EB-A45E-003EE1C8D498") _HasContiguousBytes // user: %7
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          513:  // function_ref closure #2 in String.init<A, B>(decoding:as:)
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:71'1       ?                                                            possible intended match
          514:  %6 = function_ref @$sSS8decoding2asSSx_q_mtcSlRzs16_UnicodeEncodingR_8CodeUnitQy_7ElementRtzr0_lufcSSSWXEfU0_ : $@convention(thin) (UnsafeRawBufferPointer) -> @owned String // user: %11
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          515:  %7 = unchecked_addr_cast %5 : $*@opened("AF5B145A-29BB-11EB-A45E-003EE1C8D498") _HasContiguousBytes to $*UnsafeMutableRawBufferPointer // user: %8
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          516:  %8 = load %7 : $*UnsafeMutableRawBufferPointer // user: %10
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          517:  // function_ref specialized UnsafeRawBufferPointer.init(_:)
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          518:  %9 = function_ref @$sSWySWSwcfCTf4nd_n : $@convention(thin) (UnsafeMutableRawBufferPointer) -> UnsafeRawBufferPointer // user: %10
check:71'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            .
            .
            .
>>>>>>

@lorentey
Member

The array test failure is triggered by this change -- it is actually testing one of these preconditions. It needs to have its skip condition updated to _isDebugAssertConfiguration().
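
As a rough illustration of such a skip condition, the sketch below decides whether a trap test should run based on `_isDebugAssertConfiguration()`, an underscored function exported by the standard library. The `TestDecision` type and the helper function are hypothetical; the real test suite uses its own harness, which is not reproduced here.

```swift
// Hypothetical mini-harness; only _isDebugAssertConfiguration() is a real stdlib function.
enum TestDecision: Equatable {
  case run
  case skip(reason: String)
}

/// A test that expects a debug-only precondition to trap can only be meaningful
/// when the standard library's debug checks are active.
func decisionForDebugOnlyTrapTest(named name: String) -> TestDecision {
  if _isDebugAssertConfiguration() {
    return .run
  }
  return .skip(reason: "\(name) depends on a debug-only precondition")
}

let decision = decisionForDebugOnlyTrapTest(named: "unsafeLength")
print(decision)
```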

@lorentey
Member

It's somewhat disconcerting that there aren't more test failures -- it looks like we may not be adequately testing these checks.

@lorentey
Member

The failing SILOptimizer test is this one:

// UnsafeMutableRawBufferPointer
//
// CHECK-LABEL: sil {{.*}}decodeUMRBPAsUTF8{{.*}} : $@convention
// CHECK-NOT:   function_ref {{.*}}_fromNonContiguousUnsafeBitcastUTF8Repairing
// CHECK-NOT:   function_ref {{.*}}_fromCodeUnits
// CHECK:       function_ref {{.*}}_fromUTF8Repairing                     ⟸ FAILURE HERE
// CHECK-NOT:   function_ref {{.*}}_fromNonContiguousUnsafeBitcastUTF8Repairing
// CHECK-NOT:   function_ref {{.*}}_fromCodeUnits
// CHECK-LABEL: end sil function{{.*}}decodeUMRBPAsUTF8
public func decodeUMRBPAsUTF8(_ ptr: UnsafeMutableRawBufferPointer) -> String {
  return String(decoding: ptr, as: Unicode.UTF8.self)
}

This is supposed to check that String(decoding:as:) correctly compiles down to its UTF-8 fast path. The previous test succeeds with an UnsafeRawBufferPointer argument, but for some reason this one fails with UnsafeMutableRawBufferPointer.

I wonder if the removal of the precondition led to the compiler randomly deciding to inline the String._fromUTF8Repairing(_:) call here.

@milseman Would it be okay if we disabled this particular check for _fromUTF8Repairing? FWIW, it looks like a subsequent check (for Substring.UTF8View) has already run into a similar problem.
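
For readers unfamiliar with the API under test, here is a small usage sketch of `String(decoding:as:)` with an `UnsafeMutableRawBufferPointer` argument. It shows only the observable behaviour (decoding, and repair of invalid UTF-8 with U+FFFD); it says nothing about which internal path the optimizer selects, which is what the SIL test checks.

```swift
// Decode well-formed UTF-8 through a mutable raw buffer.
var utf8Bytes = Array("héllo".utf8)
let decoded = utf8Bytes.withUnsafeMutableBytes { rawBuffer -> String in
  return String(decoding: rawBuffer, as: Unicode.UTF8.self)
}
print(decoded) // prints "héllo"

// 0xFF can never appear in valid UTF-8, so it is replaced by U+FFFD.
var malformedBytes: [UInt8] = [0x61, 0xFF, 0x62]
let repaired = malformedBytes.withUnsafeMutableBytes { rawBuffer -> String in
  return String(decoding: rawBuffer, as: Unicode.UTF8.self)
}
print(repaired == "a\u{FFFD}b") // prints "true"
```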

@karwa
Contributor Author

karwa commented Nov 19, 2020

I misread the comment at the top of the unsafe-buffer validation tests and thought these weren't able to be tested. Of course, since this isn't an optimisation-only behaviour right now, it can be and apparently is tested 🤦‍♂️

I don't know what to do about the SILOptimizer test, so I'll wait for @milseman 's advice on that.

It's nice to get rid of this check because basically every way of creating a UBP goes through it. Slice<UBP> also doesn't have any useful extra APIs (you're supposed to rebase the slice via UBP.init(rebasing:), which triggers these hard-to-remove checks). It seems strange to have these checks when we don't even do bounds-checking on the resulting buffer.
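
A short sketch of the rebasing pattern mentioned here, using only public API: slicing a buffer yields a `Slice` that keeps the parent's indices, and `UnsafeBufferPointer(rebasing:)` converts it back into a zero-based buffer. Constructing that rebased buffer is one of the places that goes through the initializer checks discussed in this PR.

```swift
let numbers = [10, 20, 30, 40, 50]

let middleSum = numbers.withUnsafeBufferPointer { buffer -> Int in
  // A Slice<UnsafeBufferPointer<Int>> shares indices with its base buffer.
  let slice = buffer[1..<4]
  // Rebasing produces a new buffer whose indices start at zero again.
  let rebased = UnsafeBufferPointer(rebasing: slice)
  return rebased[0] + rebased[1] + rebased[2]
}
print(middleSum) // prints 90
```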

The one possible exception that I've just thought of is the .allocate function: presumably if we fail to allocate, we will get a nil pointer with a non-zero count, which would have trapped but now won't. That's not the change that I intend to make here, so I think it's a good idea to have an independent check inside of .allocate.
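
A minimal sketch of what an independent allocation check could look like, written against `malloc` rather than the standard library's allocator. The function name is hypothetical, and whether the standard library's `.allocate` can actually produce a nil pointer is an assumption raised in this comment, not something the thread confirms.

```swift
import Foundation

/// Hypothetical helper: validates the allocation result itself instead of
/// relying on the buffer initializer to catch a nil pointer with a nonzero count.
func allocateBytesChecked(count: Int) -> UnsafeMutableRawBufferPointer {
  precondition(count >= 0, "negative byte count")
  guard count > 0 else {
    return UnsafeMutableRawBufferPointer(start: nil, count: 0)
  }
  // malloc may return nil on failure, so check it in every build mode.
  guard let base = malloc(count) else {
    fatalError("allocation of \(count) bytes failed")
  }
  return UnsafeMutableRawBufferPointer(start: base, count: count)
}

let storage = allocateBytesChecked(count: 16)
print(storage.count) // prints 16
free(storage.baseAddress)
```

Keeping the check next to the allocation means it stays active in release builds regardless of how the buffer initializer validates its arguments.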

@milseman
Member

_fromUTF8Repairing is not @inlinable, so at first blush it looks like this test is correctly detecting a failure to compile down to the fast path. What SIL is generated for that function?

@milseman
Member

@swift-ci please benchmark

@swift-ci
Contributor

Performance: -O

Regression OLD NEW DELTA RATIO
SuffixSequenceLazy 173 1783 +930.6% 0.10x
SuffixAnySequence 176 1780 +911.4% 0.10x
SuffixSequence 173 1717 +892.5% 0.10x
FlattenListFlatMap 2359 19793 +739.0% 0.12x
ArrayAppendRepeatCol 400 670 +67.5% 0.60x
CharIteration_tweet_unicodeScalars 3720 4840 +30.1% 0.77x
CharIteration_ascii_unicodeScalars 1880 2440 +29.8% 0.77x (?)
Data.init.Sequence.64kB.Count.RE.I 16 19 +18.7% 0.84x (?)
Data.init.Sequence.64kB.Count.RE 16 19 +18.7% 0.84x (?)
PrefixSequenceLazy 22 26 +18.2% 0.85x (?)
PrefixSequence 22 26 +18.2% 0.85x
SortIntPyramid 410 460 +12.2% 0.89x (?)
Data.init.Sequence.809B.Count.RE 41 46 +12.2% 0.89x (?)
Data.init.Sequence.809B.Count.RE.I 41 46 +12.2% 0.89x (?)
DropLastAnySequence 336 367 +9.2% 0.92x (?)
EqualSubstringSubstring 22 24 +9.1% 0.92x (?)
SortSortedStrings 46 50 +8.7% 0.92x (?)
 
Improvement OLD NEW DELTA RATIO
ArrayAppendLazyMap 1580 850 -46.2% 1.86x
ArrayAppendLatin1Substring 19260 11268 -41.5% 1.71x (?)
ArrayAppendAsciiSubstring 18864 11160 -40.8% 1.69x (?)
ArrayAppendUTF16Substring 18864 11196 -40.6% 1.68x
RemoveWhereSwapInts 12 10 -16.7% 1.20x
ArrayLiteral2 86 74 -14.0% 1.16x (?)
DataAppendDataSmallToSmall 3220 2780 -13.7% 1.16x (?)
LazilyFilteredArrayContains 11900 10300 -13.4% 1.16x (?)
CharIndexing_punctuated_unicodeScalars_Backwards 880 800 -9.1% 1.10x (?)
CSVParsingAltIndices2 616 561 -8.9% 1.10x (?)
Set.isDisjoint.Empty.Box 94 87 -7.4% 1.08x (?)

Code size: -O

Regression OLD NEW DELTA RATIO
Suffix.o 22341 26373 +18.0% 0.85x
DiffingMyers.o 6085 6501 +6.8% 0.94x
Diffing.o 7931 8341 +5.2% 0.95x
StringRemoveDupes.o 5373 5629 +4.8% 0.95x
ReduceInto.o 9787 10123 +3.4% 0.97x
RomanNumbers.o 5404 5580 +3.3% 0.97x
StringEdits.o 9713 10017 +3.1% 0.97x
LazyFilter.o 7688 7864 +2.3% 0.98x
DropLast.o 20350 20750 +2.0% 0.98x
StringWalk.o 32951 33519 +1.7% 0.98x
CSVParsing.o 54371 55283 +1.7% 0.98x
SetTests.o 121037 122405 +1.1% 0.99x
UTF8Decode.o 23483 23723 +1.0% 0.99x
 
Improvement OLD NEW DELTA RATIO
RemoveWhere.o 15667 14259 -9.0% 1.10x
FlattenList.o 3834 3674 -4.2% 1.04x

Performance: -Osize

Regression OLD NEW DELTA RATIO
SuffixAnySequence 628 1994 +217.5% 0.31x
SuffixSequenceLazy 624 1966 +215.1% 0.32x
SuffixSequence 625 1895 +203.2% 0.33x
SuffixArray 4 9 +125.0% 0.44x
PrefixWhileArrayLazy 20 40 +100.0% 0.50x
ArrayAppendLazyMap 520 1020 +96.2% 0.51x
PrefixWhileCountableRange 14 26 +85.7% 0.54x
DropWhileCountableRange 14 26 +85.7% 0.54x
DropFirstCountableRange 14 26 +85.7% 0.54x
PrefixCountableRange 15 26 +73.3% 0.58x
PrefixCountableRangeLazy 15 26 +73.3% 0.58x
SuffixCountableRange 6 9 +50.0% 0.67x
PrefixWhileAnyCollection 113 166 +46.9% 0.68x
DropFirstAnyCollection 86 126 +46.5% 0.68x
ArrayAppendSequence 670 850 +26.9% 0.79x (?)
DropLastCountableRangeLazy 4 5 +25.0% 0.80x (?)
DropWhileAnyCollection 99 121 +22.2% 0.82x
UTF8Decode_InitFromCustom_noncontiguous_ascii 623 750 +20.4% 0.83x
SuffixCountableRangeLazy 5 6 +20.0% 0.83x (?)
DropWhileArrayLazy 49 58 +18.4% 0.84x
DropWhileAnySeqCRangeIterLazy 135 156 +15.6% 0.87x
UTF8Decode_InitFromCustom_noncontiguous_ascii_as_ascii 720 828 +15.0% 0.87x (?)
DropLastAnyCollection 35 40 +14.3% 0.88x
PrefixWhileAnySeqCRangeIterLazy 106 121 +14.2% 0.88x
CharIndexing_ascii_unicodeScalars_Backwards 6120 6960 +13.7% 0.88x (?)
UTF8Decode_InitFromCustom_noncontiguous 288 326 +13.2% 0.88x (?)
PrefixWhileAnySeqCntRangeLazy 107 121 +13.1% 0.88x
DropWhileAnySeqCntRange 97 109 +12.4% 0.89x (?)
PrefixWhileAnyCollectionLazy 108 121 +12.0% 0.89x
SortAdjacentIntPyramids 950 1050 +10.5% 0.90x (?)
DropWhileAnySeqCntRangeLazy 135 148 +9.6% 0.91x (?)
CharIndexing_japanese_unicodeScalars_Backwards 9000 9840 +9.3% 0.91x (?)
 
Improvement OLD NEW DELTA RATIO
DropLastCountableRange 9 4 -55.5% 2.25x
DropLastArrayLazy 9 4 -55.5% 2.25x
DropFirstArrayLazy 26 13 -50.0% 2.00x
DropFirstArray 26 13 -50.0% 2.00x
StaticArray 2 1 -50.0% 2.00x
ArrayPlusEqualSingleElementCollection 564 423 -25.0% 1.33x (?)
CharIteration_ascii_unicodeScalars 2320 1760 -24.1% 1.32x
CharIteration_tweet_unicodeScalars 4560 3480 -23.7% 1.31x
PrefixAnySeqCntRange 120 93 -22.5% 1.29x (?)
DropWhileSequenceLazy 80 62 -22.5% 1.29x
PrefixWhileArray 67 54 -19.4% 1.24x
PrefixAnySeqCRangeIter 120 98 -18.3% 1.22x
DropWhileCountableRangeLazy 71 58 -18.3% 1.22x
Dictionary4 254 210 -17.3% 1.21x
CharIteration_punctuated_unicodeScalars 520 440 -15.4% 1.18x
CharIteration_korean_unicodeScalars 2760 2400 -13.0% 1.15x
CharIteration_chinese_unicodeScalars 2240 1960 -12.5% 1.14x
CharIteration_ascii_unicodeScalars_Backwards 3560 3120 -12.4% 1.14x
CharIteration_tweet_unicodeScalars_Backwards 7120 6280 -11.8% 1.13x (?)
Data.append.Sequence.809B.Count0 240 212 -11.7% 1.13x (?)
Data.append.Sequence.809B.Count0.I 240 214 -10.8% 1.12x (?)
CharIteration_russian_unicodeScalars 2680 2400 -10.4% 1.12x (?)
CharIteration_japanese_unicodeScalars 3720 3360 -9.7% 1.11x (?)
CharIteration_punctuated_unicodeScalars_Backwards 840 760 -9.5% 1.11x
CharIteration_japanese_unicodeScalars_Backwards 5960 5400 -9.4% 1.10x (?)
CharIteration_korean_unicodeScalars_Backwards 4720 4280 -9.3% 1.10x (?)
CharIteration_chinese_unicodeScalars_Backwards 3480 3160 -9.2% 1.10x (?)
CharIteration_punctuatedJapanese_unicodeScalars_Backwards 880 800 -9.1% 1.10x (?)
Set.isSubset.Seq.Empty.Int 88 80 -9.1% 1.10x (?)
CharIteration_utf16_unicodeScalars 3120 2840 -9.0% 1.10x (?)
Set.isDisjoint.Seq.Int.Empty 56 51 -8.9% 1.10x (?)
Set.isDisjoint.Seq.Empty.Int 92 84 -8.7% 1.10x (?)
DataAccessBytesMedium 58 53 -8.6% 1.09x (?)
Set.isStrictSuperset.Seq.Empty.Int 189 173 -8.5% 1.09x (?)
DropWhileAnyCollectionLazy 157 144 -8.3% 1.09x (?)
Set.isStrictSubset.Seq.Int.Empty 134 123 -8.2% 1.09x (?)
Data.append.Sequence.64kB.Count0.I 167 154 -7.8% 1.08x (?)
CharIteration_punctuatedJapanese_unicodeScalars 520 480 -7.7% 1.08x (?)
Set.isDisjoint.Empty.Int 93 86 -7.5% 1.08x
SetUnionInt50 67 62 -7.5% 1.08x (?)
DropFirstAnySequenceLazy 1768 1637 -7.4% 1.08x (?)
Set.isDisjoint.Empty.Box 95 88 -7.4% 1.08x (?)
Set.isSuperset.Seq.Int.Empty 95 88 -7.4% 1.08x (?)
Set.isDisjoint.Seq.Empty.Box 95 88 -7.4% 1.08x (?)
RemoveWhereQuadraticString 210 195 -7.1% 1.08x (?)
Set.isSuperset.Seq.Empty.Int 56 52 -7.1% 1.08x (?)
Data.hash.Medium 28 26 -7.1% 1.08x (?)
MapReduceShortString 14 13 -7.1% 1.08x (?)
Data.init.Sequence.64kB.Count.I 29 27 -6.9% 1.07x (?)
Data.append.Sequence.64kB.Count.I 29 27 -6.9% 1.07x (?)
Data.append.Sequence.64kB.Count 29 27 -6.9% 1.07x (?)
Set.isSubset.Seq.Int.Empty 134 125 -6.7% 1.07x (?)
Data.append.Sequence.64kB.Count0 164 153 -6.7% 1.07x (?)

Code size: -Osize

Regression OLD NEW DELTA RATIO
Suffix.o 21199 24741 +16.7% 0.86x
StringRemoveDupes.o 4374 4616 +5.5% 0.95x
ReduceInto.o 8398 8735 +4.0% 0.96x
RomanNumbers.o 5022 5216 +3.9% 0.96x
DiffingMyers.o 6090 6286 +3.2% 0.97x
Diffing.o 7925 8121 +2.5% 0.98x
StringWalk.o 32059 32424 +1.1% 0.99x
 
Improvement OLD NEW DELTA RATIO
RemoveWhere.o 14083 13302 -5.5% 1.06x
StringMatch.o 4144 4027 -2.8% 1.03x
LazyFilter.o 7437 7314 -1.7% 1.02x

Performance: -Onone

Regression OLD NEW DELTA RATIO
DataAppendDataSmallToSmall 3980 4380 +10.1% 0.91x (?)
 
Improvement OLD NEW DELTA RATIO
ArrayOfGenericPOD2 586 523 -10.8% 1.12x (?)

Code size: -swiftlibs

Improvement OLD NEW DELTA RATIO
libswiftFoundation.dylib 1359872 1343488 -1.2% 1.01x
How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).
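
The DELTA and RATIO columns appear to follow directly from OLD and NEW. The sketch below reproduces them for one row of the table above; the formulas are inferred from the reported numbers rather than taken from the benchmark tooling, and the function names are hypothetical.

```swift
/// Relative change in percent; positive values mean NEW is slower or larger.
func deltaPercent(old: Double, new: Double) -> Double {
  return (new - old) / old * 100
}

/// OLD divided by NEW; values below 1 indicate a regression.
func ratio(old: Double, new: Double) -> Double {
  return old / new
}

// SuffixSequenceLazy at -O: OLD 173, NEW 1783.
let delta = deltaPercent(old: 173, new: 1783) // roughly +930.6
let speed = ratio(old: 173, new: 1783)        // roughly 0.097, reported as 0.10x
print(delta, speed)
```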

Hardware Overview
  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: 6-Core Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

@lorentey
Member

lorentey commented Dec 4, 2020

@milseman @atrick I finally managed to reproduce this locally -- here is the emitted SIL:

// decodeUMRBPAsUTF8(_:)
sil @$s22utf8_decoding_fastpath17decodeUMRBPAsUTF8ySSSwF : $@convention(thin) (UnsafeMutableRawBufferPointer) -> @owned String {
// %0 "ptr"                                       // users: %4, %1
bb0(%0 : $UnsafeMutableRawBufferPointer):
  debug_value %0 : $UnsafeMutableRawBufferPointer, let, name "ptr", argno 1 // id: %1
  %2 = alloc_stack $_HasContiguousBytes           // users: %13, %12, %5, %3
  %3 = init_existential_addr %2 : $*_HasContiguousBytes, $UnsafeMutableRawBufferPointer // user: %4
  store %0 to %3 : $*UnsafeMutableRawBufferPointer // id: %4
  %5 = open_existential_addr immutable_access %2 : $*_HasContiguousBytes to $*@opened("08EFEDA4-360C-11EB-9D6A-D0817AD8D2AD") _HasContiguousBytes // user: %7
  // function_ref closure #2 in String.init<A, B>(decoding:as:)
  %6 = function_ref @$sSS8decoding2asSSx_q_mtcSlRzs16_UnicodeEncodingR_8CodeUnitQy_7ElementRtzr0_lufcSSSWXEfU0_ : $@convention(thin) (UnsafeRawBufferPointer) -> @owned String // user: %11
  %7 = unchecked_addr_cast %5 : $*@opened("08EFEDA4-360C-11EB-9D6A-D0817AD8D2AD") _HasContiguousBytes to $*UnsafeMutableRawBufferPointer // user: %8
  %8 = load %7 : $*UnsafeMutableRawBufferPointer  // user: %10
  // function_ref specialized UnsafeRawBufferPointer.init(_:)
  %9 = function_ref @$sSWySWSwcfCTf4nd_n : $@convention(thin) (UnsafeMutableRawBufferPointer) -> UnsafeRawBufferPointer // user: %10
  %10 = apply %9(%8) : $@convention(thin) (UnsafeMutableRawBufferPointer) -> UnsafeRawBufferPointer // user: %11
  %11 = apply %6(%10) : $@convention(thin) (UnsafeRawBufferPointer) -> @owned String // user: %14
  destroy_addr %2 : $*_HasContiguousBytes         // id: %12
  dealloc_stack %2 : $*_HasContiguousBytes        // id: %13
  return %11 : $String                            // id: %14
} // end sil function '$s22utf8_decoding_fastpath17decodeUMRBPAsUTF8ySSSwF'

The failing test is looking for function_ref {{.*}}_fromUTF8Repairing which indeed doesn't occur in this.

As a reminder, the test is definitely just calling the decoding initializer with the UTF8 encoding:

public func decodeUMRBPAsUTF8(_ ptr: UnsafeMutableRawBufferPointer) -> String {
  return String(decoding: ptr, as: Unicode.UTF8.self)
}

The problem is that UnsafeMutableRawBufferPointer doesn't implement withContiguousStorageIfAvailable (because it can't do that without rebinding memory), so the emitted SIL corresponds to the third branch below, which deals with _HasContiguousBytes. Closure #2 lacks the Builtin.onFastPath() workaround that is on the withContiguousStorageIfAvailable path, so it doesn't get inlined -- which is why the test doesn't find _fromUTF8Repairing.

  @inlinable
  @inline(__always) // Eliminate dynamic type check when possible
  public init<C: Collection, Encoding: Unicode.Encoding>(
    decoding codeUnits: C, as sourceEncoding: Encoding.Type
  ) where C.Iterator.Element == Encoding.CodeUnit {
    guard _fastPath(sourceEncoding == UTF8.self) else {
      self = String._fromCodeUnits(
        codeUnits, encoding: sourceEncoding, repair: true)!.0
      return
    }

    // Fast path for user-defined Collections and typed contiguous collections.
    //
    // Note: this comes first, as the optimizer nearly always has insight into
    // wCSIA, but cannot prove that a type does not have conformance to
    // _HasContiguousBytes.
    if let str = codeUnits.withContiguousStorageIfAvailable({
      (buffer: UnsafeBufferPointer<C.Element>) -> String in
      Builtin.onFastPath() // encourage SIL Optimizer to inline this closure :-(
      let rawBufPtr = UnsafeRawBufferPointer(buffer)
      return String._fromUTF8Repairing(
        UnsafeBufferPointer(
          start: rawBufPtr.baseAddress?.assumingMemoryBound(to: UInt8.self),
          count: rawBufPtr.count)).0
    }) {
      self = str
      return
    }

    // Fast path for untyped raw storage and known stdlib types
    if let contigBytes = codeUnits as? _HasContiguousBytes,
      contigBytes._providesContiguousBytesNoCopy
    {
      self = contigBytes.withUnsafeBytes { rawBufPtr in
        // **************** (no Builtin.onFastPath() here, unlike the closure above)
        return String._fromUTF8Repairing(
          UnsafeBufferPointer(
            start: rawBufPtr.baseAddress?.assumingMemoryBound(to: UInt8.self),
            count: rawBufPtr.count)).0
      }
      return
    }

    self = String._fromNonContiguousUnsafeBitcastUTF8Repairing(codeUnits).0
  }

I'm sure there is a good reason why changing a couple of _preconditions to _debugPrecondition prevents the compiler from inlining closure #2 in optimized builds. Perhaps the shorter UBP/URBP initializers trigger the inlining of another function that pushes things over a limit. ¯_(ツ)_/¯

What if we added the _onFastPath to the second closure, too? (Or removed it from both?)
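
The order of the branches in the initializer can be illustrated with public API alone. The sketch below tries `withContiguousStorageIfAvailable` first and falls back when the collection offers no contiguous storage; the function name and the returned descriptions are hypothetical, and the internal `_HasContiguousBytes` branch is deliberately left out because it is not available outside the standard library.

```swift
/// Hypothetical probe mirroring the first branch of String(decoding:as:).
func decodingPath<C: Collection>(for codeUnits: C) -> String
where C.Element == UInt8 {
  let viaContiguousStorage = codeUnits.withContiguousStorageIfAvailable {
    (buffer: UnsafeBufferPointer<UInt8>) -> String in
    return "contiguous storage (\(buffer.count) bytes)"
  }
  if let description = viaContiguousStorage {
    return description
  }
  // The default implementation returns nil for types without contiguous storage.
  return "fallback (no contiguous storage)"
}

let bytes: [UInt8] = [0x68, 0x69]
let reversedBytes: ReversedCollection<[UInt8]> = bytes.reversed()

print(decodingPath(for: bytes))         // prints "contiguous storage (2 bytes)"
print(decodingPath(for: reversedBytes)) // prints "fallback (no contiguous storage)"
```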

@lorentey
Member

lorentey commented Dec 4, 2020

With _onFastPath in both closures, the emitted function looks quite a bit more scary, but it does have the call we want:

// decodeUMRBPAsUTF8(_:)
sil @$s22utf8_decoding_fastpath17decodeUMRBPAsUTF8ySSSwF : $@convention(thin) (UnsafeMutableRawBufferPointer) -> @owned String {
// %0 "ptr"                                       // users: %4, %1
bb0(%0 : $UnsafeMutableRawBufferPointer):
  debug_value %0 : $UnsafeMutableRawBufferPointer, let, name "ptr", argno 1 // id: %1
  %2 = alloc_stack $_HasContiguousBytes           // users: %29, %28, %5, %3
  %3 = init_existential_addr %2 : $*_HasContiguousBytes, $UnsafeMutableRawBufferPointer // user: %4
  store %0 to %3 : $*UnsafeMutableRawBufferPointer // id: %4
  %5 = open_existential_addr immutable_access %2 : $*_HasContiguousBytes to $*@opened("09F1FD1C-3613-11EB-8F91-D0817AD8D2AD") _HasContiguousBytes // user: %6
  %6 = unchecked_addr_cast %5 : $*@opened("09F1FD1C-3613-11EB-8F91-D0817AD8D2AD") _HasContiguousBytes to $*UnsafeMutableRawBufferPointer // user: %7
  %7 = load %6 : $*UnsafeMutableRawBufferPointer  // user: %9
  // function_ref specialized UnsafeRawBufferPointer.init(_:)
  %8 = function_ref @$sSWySWSwcfCTf4nd_n : $@convention(thin) (UnsafeMutableRawBufferPointer) -> UnsafeRawBufferPointer // user: %9
  %9 = apply %8(%7) : $@convention(thin) (UnsafeMutableRawBufferPointer) -> UnsafeRawBufferPointer // users: %23, %11
  %10 = builtin "onFastPath"() : $()
  %11 = struct_extract %9 : $UnsafeRawBufferPointer, #UnsafeRawBufferPointer._position // user: %12
  switch_enum %11 : $Optional<UnsafeRawPointer>, case #Optional.some!enumelt: bb1, case #Optional.none!enumelt: bb2 // id: %12

// %13                                            // user: %14
bb1(%13 : $UnsafeRawPointer):                     // Preds: bb0
  %14 = struct_extract %13 : $UnsafeRawPointer, #UnsafeRawPointer._rawValue // user: %15
  %15 = struct $UnsafePointer<UInt8> (%14 : $Builtin.RawPointer) // user: %16
  %16 = enum $Optional<UnsafePointer<UInt8>>, #Optional.some!enumelt, %15 : $UnsafePointer<UInt8> // user: %17
  br bb3(%16 : $Optional<UnsafePointer<UInt8>>)   // id: %17

bb2:                                              // Preds: bb0
  %18 = enum $Optional<UnsafePointer<UInt8>>, #Optional.none!enumelt // user: %19
  br bb3(%18 : $Optional<UnsafePointer<UInt8>>)   // id: %19

// %20                                            // user: %24
bb3(%20 : $Optional<UnsafePointer<UInt8>>):       // Preds: bb2 bb1
  %21 = metatype $@thin String.Type               // user: %26
  // function_ref UnsafeRawBufferPointer.count.getter
  %22 = function_ref @$sSW5countSivg : $@convention(method) (UnsafeRawBufferPointer) -> Int // user: %23
  %23 = apply %22(%9) : $@convention(method) (UnsafeRawBufferPointer) -> Int // user: %24
  %24 = struct $UnsafeBufferPointer<UInt8> (%20 : $Optional<UnsafePointer<UInt8>>, %23 : $Int) // user: %26
  // function_ref static String._fromUTF8Repairing(_:)
  %25 = function_ref @$sSS18_fromUTF8RepairingySS6result_Sb11repairsMadetSRys5UInt8VGFZ : $@convention(method) (UnsafeBufferPointer<UInt8>, @thin String.Type) -> (@owned String, Bool) // user: %26
  %26 = apply %25(%24, %21) : $@convention(method) (UnsafeBufferPointer<UInt8>, @thin String.Type) -> (@owned String, Bool) // user: %27
  %27 = tuple_extract %26 : $(String, Bool), 0    // user: %30
  destroy_addr %2 : $*_HasContiguousBytes         // id: %28
  dealloc_stack %2 : $*_HasContiguousBytes        // id: %29
  return %27 : $String                            // id: %30
} // end sil function '$s22utf8_decoding_fastpath17decodeUMRBPAsUTF8ySSSwF'

I submitted PR #34961 that includes this change along with #34951 and what we learned from #34879.

@lorentey
Member

Closing, as these changes have already been merged as part of #34961. Thanks @karwa!

@lorentey lorentey closed this Jan 20, 2021
@karwa karwa deleted the patch-5 branch March 23, 2021 05:06