Skip to content
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
110 changes: 110 additions & 0 deletions active/0000-unsafe-api-location.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
- Start Date: (fill me in with today's date, 2014-09-15)
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

This is a *conventions RFC* for settling the location of `unsafe` APIs relative
to the types they work with, as well as the use of `raw` submodules.

The brief summary is:

* Unsafe APIs should be made into methods or static functions in the same cases
that safe APIs would be.

* `raw` submodules should be used only to provide APIs directly on low-level
representations.

# Motivation

Many data structures provide unsafe APIs either for avoiding checks or working
directly with their (otherwise private) representation. For example, `string`
provides:

* An `as_mut_vec` method on `String` that provides a `Vec<u8>` view of the
string. This method makes it easy to work with the byte-based representation
of the string, but thereby also allows violation of the utf8 guarantee.

* A `raw` submodule with a number of free functions, like `from_parts`, that

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are problems with using a raw module that are not addressed. There are modules are btree treemap, trie exposing more than one type and a raw module would fail to distinguish between these. A raw module also interacts very poorly with re-exports and adds verbosity to usage of the API.

The motivation for making unsafe code uglier in these cases is not laid out. The point of unsafe is to draw a boundary between safe and unsafe code without the need for conventions to distinguish between these. The rule about construction from raw pointer representation is very obscure, and doesn't change the fact that it makes the API less consistent and more painful to use without gaining anything in return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thestinger

There are problems with using a raw module that are not addressed. There are modules are btree treemap, trie exposing more than one type and a raw module would fail to distinguish between these. A raw module also interacts very poorly with re-exports and adds verbosity to usage of the API. The motivation for making unsafe code uglier in these cases is not laid out.

I'm a little confused by this comment. The RFC recommends making unsafe functions into methods or static functions in the same cases you'd do so for safe APIs. It cites as an explicit benefit that importing/using these APIs becomes easier by doing so:

The benefit to moving unsafe APIs into methods (resp. static functions) is the usual one: you can gain easy access to these APIs merely by having a value of the type (resp. importing the type).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You keep asserting that this makes the API harder to use, but that's extremely subjective. I've always found the raw modules in str/slice/string/vec make the API much easier to use, because any time I need to construct a value from raw pointers, I know precisely where to look.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aturon: Sorry, I misinterpreted the proposed guidelines. The reason I didn't find the details about how to handle those edge cases and a detailed rationale is because it's not what you're proposing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think much of the "know where to look" problem would be fixed by segmenting the things in the source, and also allowing rustdoc to divide functions/methods into user-defined subsections. This would mean it is clear both in the .rs file and in the docs.

construct `String` instances from various raw-pointer-based representations,
as well as a `from_utf8` variant that does not actually check for utf8
validity.

The problem is that currently, there is no consistent guideline about which of
these APIs should live as methods/static functions associated with a type, and
which should live in a `raw` submodule. Both forms appear throughout the
standard library.

# Detailed design

The proposed convention is:

* When an unsafe function/method is clearly "about" a certain type (as a way of
constructing, destructuring, or modifying values of that type), it should be a
method or static function on that type. This is the same as the convention for
placement of safe functions/methods.

* When an unsafe function/method is specifically for producing or consuming the
underlying representation of a data structure (which is otherwise private),
the API should use `raw` in its name. Specifically, `from_raw_parts` is the
typical name used for constructing a value from e.g. a pointer-based
representation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused here. Vec::from_raw_parts seems to be covered under the convention for raw submodules, as it is constructing a Vec from the low-level representation (pointer, length, capacity). If it's put in the raw submodule then it shouldn't have raw in the name, because that's unnecessarily verbose (raw::from_raw_parts?),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kballard It seems that the RFC text is unclear. I meant for from_raw_parts to be a static function (according to the first bullet). The second bullet was just a clarification about the name.

I'm going to revise the RFC text and push; will ping when done.


* When an unsafe function/method is an unchecked variant of an otherwise safe
API, it should be marked using an `_unchecked` suffix.

* `raw` submodules should only be used to provide APIs directly on low-level
representations, separately from functions/methods for converting to/from such
raw representations.

The benefit to moving unsafe APIs into methods (resp. static functions) is the
usual one: you can gain easy access to these APIs merely by having a value of
the type (resp. importing the type).

The perspective here is that marking APIs `unsafe` is enough to deter their use
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thestinger Perhaps the RFC text is unclear, but this is not what I'm proposing: this RFC proposes that from_raw_parts should be a static function on e.g. String.

in ordinary situations; they don't need to be further distinguished by placement
into a separate module.

# Drawbacks

One potential drawback of these conventions is that the documentation for a
module will be cluttered with rarely-used `unsafe` APIs, whereas the `raw`
submodule approach neatly groups these APIs. But rustdoc could easily be
changed to either hide or separate out `unsafe` APIs by default.

More generally, these conventions give `unsafe` APIs more equal status with safe
APIs. Whether this is a *drawback* depends on your philosophy about the status
of unsafe programming. But on a technical level, the key point is that the APIs
are marked `unsafe`, so users still have to opt-in to using them. *Ed note: from
my perspective, low-level/unsafe programming is important to support, and there
is no reason to penalize its ergonomics given that it's opt-in anyway.*

# Alternatives

There are two main alternatives:

* Rather than providing unsafe APIs directly as methods/static functions, they
could be grouped into a single extension trait. For example, the `String` type
could be accompanied by a `StringRaw` extension trait providing APIs for
working with raw string representations. This would allow a clear grouping of
unsafe APIs, while still providing them as methods/static functions and
allowing them to easily be imported with e.g. `use std::string::StringRaw`.
On the other hand, it still further penalizes the raw APIs (beyond marking
them `unsafe`), and given that rustdoc could easily provide API grouping, it's
unclear exactly what the benefit is.

* Use `raw` submodules to group together *all* manipulation of low-level

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The raw modules that exist in the standard library were mostly recent additions without consensus based on the same push as this RFC. The previous consensus was to turn everything into methods, and only a few old raw:: functions remained because no one had taken the time to port away from them yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous consensus was to turn everything into methods

You keep stating that, but where are you getting that from? As far as I'm aware there was no consensus at all, because there had been no real discussion about this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was plenty of real discussion about methods. You would have to rewrite history to ignore the work myself, @cmr, @huonw and @Kimundi spent implementing the consensus from the mailing list. There were also many issues about this on the bug tracker like rust-lang/rust#6045 and rust-lang/rust#2868 but most of the discussion occurred on rust-dev.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thestinger Why are you being so antagonistic? I'm not trying to "rewrite history". I've asked repeatedly for you to provide some source for your claim that there has been general consensus, and this is the first time you've actually provided any source whatsoever, and it doesn't even include the alleged rust-dev conversation.

rust-lang/rust#6045 is a very general "don't use functions where methods are appropriate" issue. It's completely non-controversial and doesn't say anything in particular about the raw case. rust-lang/rust#2868 is even less relevant, that's just "don't have functions that duplicate methods".

As for the rust-dev conversation, you still haven't actually linked it. If you want to use a rust-dev conversation to support your claim, you need to actually link it so we can read it.

Also, FWIW, 4 people does not a "general consensus" make.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You only have to look in the git history to see that we migrated away from raw modules as part of replacing free functions with methods. It wasn't treated as a special case because no one thought unsafe was inadequate. I was only the listing some of the people involved in implementing the consensus, not everyone involved in the discussion.

If you don't believe what I'm saying, then I have no interest in continuing to talk. I'm not going to waste time digging up all of the mailing list threads about functions vs. methods and UFCS to satisfy you so you can move on to nitpicking or misrepresenting something else I said. You can confirm it yourself with a search engine.

representations. No module in `std` currently does this; existing modules
provide some free functions in `raw`, and some unsafe methods, without a clear
driving principle. The ergonomics of moving *everything* into free functions

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ergonomics of moving anything into free functions is poor. Downgrading more methods is worse, but doing it to a few is still more verbose and harder to navigate - especially due to re-exports. It also means introducing type prefixes in modules exposing more than one type.

in a `raw` submodule are quite poor.

# Unresolved questions

The `core::raw` module provides structs with public representations equivalent

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Box, Closure and Procedure will end up being removed since they're becoming obsolete. That will only leave behind Slice (could just be in the slice module) and TraitObject so it probably does need to be replaced.

to several built-in and library types (boxes, closures, slices, etc.). It's not
clear whether the name of this module, or the location of its contents, should
change as a result of this RFC. The module is a special case, because not all of
the types it deals with even have corresponding modules/type declarations -- so
it probably suffices to leave decisions about it to the API stabilization
process.