Skip to content

Conversation

@colin-kiegel
Copy link
Contributor

implement #10

The design principle is very simple, e.g. quote!(A #{ B } C) will expand to

let tokens = Tokens::new();
tokens.append(A);
tokens.append({B});
tokens.append(C);

Here { B } can be any valid Rust code block as long as its result implements ToTokens.

This simple implementation just ignores all quote-variables #x in B. Therefore #{ B } will just behave like a black box inside of e.g. a loop.

This black-box behaviour theoretically allows to nest a quote!(#x) or other macros inside of interpolation blocks and allows for "local reasoning" about the meaning of some tokens. Whether deep nesting would be good practice is on a different page.

To answer the design questions

  1. A computation inside of a repetition is just treated as a constant expression! Something like quote! { #( #{...} )* } will therefore evaluate to zero tokens, because the loop does not find any iterator. If #{...} evaluates to an iterator you still need to bind the result to a variable first and loop over that. Sorry. (*)
  2. The heuristic to find loop variables ignores the inside of interpolation blocks.
  3. No limit on what computation supports.

(*) In theory we could try to interpret #{...} in loops as iterators, as we do with any #foo. But we will lack a proper identifier for #{...}, since the current macro_rules system does not allow generation of identifiers like __my_local_iterator_005. We could implement a helper macro with a global state (yes that is possible via higher order macros) like pop_iterator_ident(). The problem with that is, that it will only support a finite amount of calls until the supply is exhausted. I'd say we circumvent that trouble and don't support this pattern until Rust support generation of identifiers in macros.

@colin-kiegel colin-kiegel changed the title computation in implementation computation in interpolation Mar 5, 2017
@colin-kiegel
Copy link
Contributor Author

colin-kiegel commented Mar 5, 2017

A disadvantage of this approach is this: If support for quote! { #( #{...} )* } is added later, this would be a breaking change. It would alter the behaviour of computation #{...} inside of a loop #( ... )*.

To keep forward compatibility, we could disallow computation inside of loops altogether.


The following example compiles with the current PR, but would not compile if #{...} inside a loop is treated as an iterator in the future, because "+" is not an iterator:

    let foo = ["1", "2"];
    let iter = foo.iter();
    let tokens = quote!(#( #iter, #{"+"}, )*);
    let expected = r#""1" , "+" , "2" , "+" ,"#;

    assert_eq!(expected, tokens.as_str());

@colin-kiegel
Copy link
Contributor Author

The current design has one advantage, say a is an Iterator and b is not, then something like
quote!( (#( #a, #b ),*) ) will not work (because it requires a and b to be an Iterator), but quote!( (#( #a, #{b} ),*) ) indeed works and just repeats b.

I added this documentation

You can use #{...} for arbitrary computations inside of quotations — the result will then be spliced into the token stream:

  • #{a[0]} - index arrays
  • #{x.foo} - access fields
  • #{format!("{:?}", ::std::time::Instant::now())} - arbitrary computations and nested macros, which are valid Rust

Note:

  • interpolation inside of #foo inside of #{...} is disabled by design. #{#foo} is illegal, but on the other hand you can #{quote!(#foo)} (if you really want to).
  • computations #{...} inside of repetitions #(...)* are evaluated each time.

Let's assume for one moment, that we adopt the alternatively and scan for #foo inside of #{ ... }.

  • we would replace it with foo and push it to our iterator stack if we are inside of an repetition #( ... )*
  • replacing #foo inside of #{ B } is possible but not as simple, because we need to collect the changed tokens B' before we call tokens.append({B'}); - where I'm using B as a shorthand meta-variable for something like $($x:tt)* in macro-syntax. It's a bit difficult, but possible with the push-down accumulation pattern.
  • but if inside of the computation #{ ... } we have another nested macro call which interprets #foo - it can't, because we'd replace it. The nested macro could either be another quote!( #( #foo )* ) or a completely foreign macro, which makes things much more complicated.
  • we could try to mitigate the problem of nested macros with double-escaping ##foo, but this would make everything even more complex.

Disadvantages:

  • macro call stack blowing up (not a big deal)
  • macro getting more complex (not a problem for people just using the crate - but for maintaining it)
  • weird effects on otherwise valid nested Rust code (or complicated escaping like ##foo)

IMO these are a lot of disadvantages only to support iterators foo inside of computations, like quote!( (#( #{#foo.name} ),*) )..

If you look at the code, the implementation of the current proposal is pretty straight forward.

@dtolnay how do you currently feel about this approach?

@colin-kiegel
Copy link
Contributor Author

colin-kiegel commented Mar 5, 2017

PS: To wrap things up I think we have three alternatives

  1. the proposal at hand - no magic interpolation inside of computation and #{...} is just spliced into the token stream. See first comment
  2. like 1., but treat computations inside of repetitions as iterators, e.g. quote! { #( #{foo} )* } would be exactly identical to quote! { #( #foo )* } - no magic interpolation inside of computation but #{...} is treated like #foo. See discussion in second comment / first half.
  3. like 1., but interpolation happens inside of computation. See third comment / second half.

When I'd ignore the limitations of the current macro system for one moment, I'd rate them 2. > 1. > 3.. Because I think 2. is more consistent than 1.. But IMO 2. is impracticable with the current macro system. That leaves 1. as my current preference.

@dtolnay
Copy link
Owner

dtolnay commented Mar 5, 2017

@dtolnay how do you currently feel about this approach?

As far as I can tell, this approach does not simplify the use case linked to at the top of #10. Is that correct?

let mut field_names = fields.iter().map(|f| &f.name);
let mut field_idents = fields.iter().map(|f| &f.ident);

quote! {
    for field in fields {
        match field.name() {
            #(
                #field_names => /* ... */,
            )*
            _ => unreachable!(),
        }
    }

    ::std::result::Result::Ok(#ident {
        #(
            #field_idents: /* ... */,
        )*
    })
}

@dtolnay
Copy link
Owner

dtolnay commented Mar 5, 2017

My experience has been that code that could be written using computation inside of interpolation is almost always clearer factored a different way. The single exception to this is computations of the form #{a.b.c.d}. The nice thing about a.b.c.d is that we can make them work nicely inside of repetitions, as in the rust-postgres-derive case. What do you think of implementing that restriction?

@colin-kiegel
Copy link
Contributor Author

colin-kiegel commented Mar 5, 2017

As far as I can tell, this approach does not simplify the use case linked to at the top of #10. Is that correct?

Correct, with Option 1 that's not possible by design.

With my 'secret favourite' Option 2 however, this would be simply #{ /*.. create an iterator .. */ }.

quote! {
    for field in fields {
        match field.name() {
            #(
                #{ fields.iter().map(|f| &f.name) } => /* ... */,
            )*
            _ => unreachable!(),
        }
    }

    ::std::result::Result::Ok(#ident {
        #(
            #{ fields.iter().map(|f| &f.ident) } : /* ... */,
        )*
    })
}

I feel that's intuitive since the compiler will give hints, that it expected an iterator if #{ ... } returns something different.

That would really be the best option IMO, but #{ ... } is an anonymous expression from the perspective of quote. We lack an identifier (to iterate over it) and that's a big pain with the current macro system, because we can't generate identifiers. It might be possible to hack around it with some limitations. But that would require pushdown accumulation and higher order macros and I expect that to be quite a mess in comparison with the net benefit. :-)

With Option 3 we would need to inject two independent iterator instances, but then we could manipulate their values inside of computation blocks:

let iter1 = fields.iter();
let iter2 = fields.iter();

quote! {
    for field in fields {
        match field.name() {
            #(
                #{ #iter1.name } => /* ... */,
            )*
            _ => unreachable!(),
        }
    }

    ::std::result::Result::Ok(#ident {
        #(
            #{ #iter2.ident } : /* ... */,
        )*
    })
}

@colin-kiegel
Copy link
Contributor Author

My experience has been that code that could be written using computation inside of interpolation is almost always clearer factored a different way. The single exception to this is computations of the form #{a.b.c.d}. The nice thing about a.b.c.d is that we can make them work nicely inside of repetitions, as in the rust-postgres-derive case. What do you think of implementing that restriction?

Sorry, I am completely lost here. Lol. :-)

  • Can you give an example for code being clearer factored a different way?
  • Could you give some context for #{a.b.c.d}? It looks like field access on nested structs. Is one of them supposed to be an iterator over struct instances? Which one,a? Or are these supposed to be chained method calls? What would you expect from quote!(#(#{a.b.c.d})*)?
  • Whats the connection to the previous example from postgres-sql?

@colin-kiegel
Copy link
Contributor Author

PS: My use case is mostly the following:

struct Foo { /* some fields elided */ }

impl ToTokens for Foo {
    fn to_tokens(&self, tokens: &mut Tokens) {
        tokens.append(quote!(
            /* ... */ #{self.lorem}
            /* ... */ #{self.ipsum}
            /* ... etc. ... */
        ))
   }
}

I think it's a significant pain that we can't use #{self.lorem}, given that there is a trait ToTokens. For me that's really the number one use case. The next relevant use case would be access by some index #{a[0]}.

I think the customized iterator use case is not as important as struct/array access. Because in the iterator use case, there is a significant benefit in readability to pull that logic out of quote!(...) IMO.

@dtolnay
Copy link
Owner

dtolnay commented Mar 5, 2017

Can you give an example for code being clearer factored a different way?

Sure, here is an example from real Serde code. It assigns a bunch of variables, then collects them into a struct or tuple result. Any place I see quote!, the inside of it closely resembles what part of the resulting output code will look like.

let result = if is_struct {
    let names = fields.iter().map(|f| &f.ident);
    quote! {
        #type_path { #( #names: #vars ),* }
    }
} else {
    quote! {
        #type_path ( #(#vars),* )
    }
};

quote! {
    /* assign some variables */

    _serde::export::Ok(#result)
}

Compare this to a hypothetical use of this PR where the generated code and generating code are jumbled together. I would not want to encourage this style because to me it seems optimized for the writer and not for the reader, which is not how things should be.

quote! {
    /* assign some variables */

    _serde::export::Ok(#{
        if is_struct {
            let names = fields.iter().map(|f| &f.ident);
            quote! {
                #type_path { #( #names: #vars ),* }
            }
        } else {
            quote! {
                #type_path ( #(#vars),* )
            }
        }
    })
}

@dtolnay
Copy link
Owner

dtolnay commented Mar 5, 2017

Could you give some context for #{a.b.c.d}? It looks like field access on nested structs. Is one of them supposed to be an iterator over struct instances? Which one,a? Or are these supposed to be chained method calls? What would you expect from quote!(#(#{a.b.c.d})*)?

It represents field access on nested structs, for example #{self.lorem} or #{field.ident}. I would expect a[0].b.c.d + a[1].b.c.d + a[2].b.c.d.

@dtolnay
Copy link
Owner

dtolnay commented Mar 5, 2017

Whats the connection to the previous example from postgres-sql?

The postgres-derive code would be simplified to:

quote! {
    for field in fields {
        match field.name() {
            #(
                #{fields.name} => /* ... */,
            )*
            _ => unreachable!(),
        }
    }

    ::std::result::Result::Ok(#ident {
        #(
            #{fields.ident}: /* ... */,
        )*
    })
}

@colin-kiegel
Copy link
Contributor Author

colin-kiegel commented Mar 5, 2017

Ok, thanks - now I get it. :-)

Regarding overly confusing nested stuff: Is it a hard constraint for you, to block this, or is it just a nice-to-have to be a bit more restrictive?

Regarding quote!(#(#{a.b.c.d})*) I feel it's arbitrary to pick a as iterator. Why not d as in a.b.c.d[0] + a.b.c.d[1] + a.b.c.d[2]? Again my use-case would be self.some_iterator or self.foo.bar.baz.iter().

We could of course special-case self or a trailing method(), but that still feels a bit magic to me. It seems that #{...} would come with some very special rules, as opposed to 'just put any valid Rust code here'.

In comparison Option 3 from my list is a bit similar to your proposal, but it would make it explicit what is iterated over:

quote! {/* ... */
    $(
        #{#fields.ident}: /* ... */,
    )*
/* ... */}

Note: In your last comment you would iterate twice over fields which is only possible if that iterator implements Copy. Otherwise you would need something like fields1 and fields2 - or to be able to clone.

I also corrected my example for Option 2 - it now reads like this: EDIT: Here is the original Option 2 again:

quote! {/* ... */
    #(
        #{ fields.iter().map(|f| &f.ident) }: /* ... */,
    )*
/* ... */}

It would expand to something like this:

{
    let mut _s = $crate::Tokens::new();
    for _x in { fields.iter().map(|f| &f.ident) } {
        $crate::ToTokens::to_tokens(&_x, &mut _s);
        _s.append(":");
        _s.append(/* ... */);
        _s.append(",");
    }
    _s
}

Note the introduction of a local identifier _x. While writing it out this clearly I realized that it should be possible to implement Option 2 without a headache. I'll try it and hopefully I can convince you that this is elegant IMO. ;-)

@colin-kiegel
Copy link
Contributor Author

PS: Sorry for my super long comments. ^^

@colin-kiegel
Copy link
Contributor Author

Ok, I'm starting to make mistakes - I'll stop after this comment for now. I just re-discovered the problem which I think is blocking Option 2:

Say we have multiple computations in a repetition like

quote! {/* ... */
    #(
        #{ fields1.iter().map(|f| &f.ident) }: #{ fields2.iter().map(|f| &f.ident) },
    )*
/* ... */}

It would expand to something like this:

{
    let mut _s = $crate::Tokens::new();
    for (_x, _y) in { fields1.iter().map(|f| &f.ident) }.into_iter().zip({ fields2.iter().map(|f| &f.ident) }) {
        $crate::ToTokens::to_tokens(&_x, &mut _s);
        _s.append(":");
        $crate::ToTokens::to_tokens(&_y, &mut _s);
        _s.append(",");
    }
    _s
}

The headache is, how would we get distinct identifiers _x, _y. Also the postgres use case for fields on iterator items does not really look nice in-lined.

@dtolnay
Copy link
Owner

dtolnay commented Mar 15, 2017

Thanks for all your work on this and for exploring the design space so thoroughly. At this point I think that while some of the options improve readability in some cases, on balance this feature would be detrimental to readability and maintainability of libraries using quote!. The tradeoffs and unintuitive behavior in edge cases overflows the complexity budget I had in mind for this feature. I would prefer to address some of the use cases for computation instead with better idioms and conventions atop the existing functionality.

@colin-kiegel
Copy link
Contributor Author

colin-kiegel commented Mar 15, 2017

FWIW the reasoning behind my preference against a sandboxed approach was based on two concerns

  • implementation complexity
  • and yet still surprising/inconsistent behaviour depending on context

When I was giving complex examples for the non sandboxed version I was trying to argue that it doesn't have these problems. In fact the implementation is really straight forward and it's API surface can be explained well in a terms of existing concepts. It was never my intention to say that complex computations in interpolation would be desirable.

Also I disagree that computations in repetitions are particularly important. IMO support for the pattern #{self.x} should be the prime target to better support structs that implement ToTokens. But I am biased on this, since my crate derive_builder basically goes all-in on this pattern and the most boilerplate is stuff outside of repetitions. If I had an iterateable self.x it would be of some type Vec<T> where T: ToTokens, which would have worked as expected with Option 2. But due to implementation complexity of unnamed iterators I argued not to support it.

I used complex examples but I wouldn't advocate anything beyond #{self.x}. In fact better supporting structs which implement ToTokens might even help to promote more idiomatic code, don't you think?

@colin-kiegel
Copy link
Contributor Author

colin-kiegel commented Mar 15, 2017

Instead of sandboxing it would be easier to implement a heuristic validation pass on #{...} and only accept &, ., (, ) and identifiers. That should be straightforward to implement and stop complex computations, which would require |,+, etc.

@colin-kiegel
Copy link
Contributor Author

The error message for trying to use e.g. a closure inside a computation would then be something like unexpected token |

@colin-kiegel
Copy link
Contributor Author

We wouldn't even need to whitelist & since we can always insert that automatically.

@colin-kiegel
Copy link
Contributor Author

I think I am onto something - I will update this PR later. :-)

@colin-kiegel colin-kiegel deleted the feature/computation_in_interpolation branch March 19, 2017 17:09
Repository owner locked and limited conversation to collaborators Jan 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants