Skip to content

Conversation

@rjtobin
Copy link
Contributor

@rjtobin rjtobin commented Apr 22, 2019

This is supposed to address #2915 (though it covers Python functions only). I'm new to the Cython codebase, so any guidance is appreciated. I've marked it as WIP since the tests need a little more work (see below). The core code is supposed to be functional though, and passes the current set of tests.

The tests are adapted from those in the reference CPython implementation. I wasn't sure how to express some of these tests in Cython's framework (eg. test_invalid_syntax_lambda): certain syntax errors cause the parser to exit immediately (eg. lambda *args, /: None), and so subsequent errors don't get tested. I know I could make many different test files, but I don't want to clutter up the tests directory with many files for one new feature. Is there a better way to implement these tests? These tests have been commented out, along with some others that don't seem applicable to Cython.

I noticed when implementing this that for single-argument methods with no default value, Cython already compiles these to functions that do not allow keyword arguments. Eg.

def foo(x):
    return x
f(x=2) # raises no exception in CPython, but does in Cython

I'm guessing since this allows for an optimization that it is not considered a bug? (ie. the resulting method can be METH_O instead of METH_VARARGS | METH_KEYWORDS and needs no argument parsing).

Thanks for maintaining such a great project!

@scoder
Copy link
Contributor

scoder commented Apr 22, 2019

Thanks, I'll look through it soon.

single-argument methods with no default value

There's a compiler directive that can disable this behaviour, but yes, it's done by default for performance reasons.

@scoder
Copy link
Contributor

scoder commented Apr 23, 2019

subsequent errors don't get tested

You can declare some or all of the errors as non-fatal to keep the parser going by passing fatal=False.

Copy link
Contributor

@scoder scoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, sorry for the heap of comments.

"""
return a + b + c

# TODO: remove the test below? would need to hard-code the function with > 255 posonly args
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OT1H, why not? OTOH, this tests for a specific limitation that CPython used to have at some point and that was lifted. Cython never had it.

Still, I think it would be nice to have a test for a function with a large set of arguments. Just generate a long signature line with arguments a1 through a300, text-wrap it, and paste it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


import cython

# TODO: add the test below to an 'error' test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can create an error test in the tests/errors/directory (and also use # mode: error for it). See the examples there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, but with some tests commented out (see my overall comment above)

# tag: posonly

# TODO: remove posonly tag before merge (and maybe remove this test,
# since it seems covered by the runs/ test)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the runnable tests seem enough. You can always use the run tests as translate-only tests by passing --cython-only to the test runner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the compile only tests

pass

#def test_change_default_pos_only():
# TODO: probably remove this, since we have no __defaults__ in Cython?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we do, if you set the compiler directive binding=True (which would be ok for this test).

Copy link
Contributor Author

@rjtobin rjtobin Apr 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know, done. Some of these are tests are still commented out for two reasons:
(1) In the CPython tests, __defaults__ is modified during the test and this affects later calls to the function. Is there a way to enable this feature (writability of __defaults__) in Cython?
(2) The tests refer to the member f.__code__.co_posonlyargcount. This is a new addition to PyCodeObject as part of the posonly change. Since the posonly change has yet to be merged into CPython, this member is missing currently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I think writing to __defaults__ is a separate feature. Not worth testing here.
  2. Sounds like a hasattr() might help. We are testing a CPython feature here, after all, not really a Cython feature. It's more like "if CPython supports it, then Cython should do the right thing. If not, then that's fine, too".

import cython

# TODO: add the test below to an 'error' test
#def test_invalid_syntax_errors():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I don't see an error test that uses more than one slash in the signature. That seems worth testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like such tests have been added to the CPython implementation tests too, and I've copied them here. They are all commented out currently though.

Traceback (most recent call last):
TypeError: test_use_positional_as_keyword1() takes no keyword arguments
"""
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't actually need the pass here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted, I've removed this in many places.

# lambda *, a, /: None
# lambda *, /, a: None

class Example:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class Example(object):, same for class X(object) below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

def f(self, a, b, /):
return a, b

def test_posonly_methods():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also test this in the docstring of the class, no need for a separate test function here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

def global_pos_only_f(a, b, /):
pass

def test_module_function():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use the docstring of the function itself, no need for a separate test function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@rjtobin
Copy link
Contributor Author

rjtobin commented Apr 25, 2019

Thanks for the feedback! I've pushed a commit which addresses your comments. There is also a change to the main logic itself to fix an indexing issue resulting from the change to py_kwargnames in the first commit (this array is no longer the same length as values, so various old assumptions were broken). Also, since I wrote the first commit there have been quite a few new tests added to the CPython implementation, which I have now added here too.

The remaining issues:
(1) I'm still working on the error tests, trying to follow your suggestion of using fatal=False to avoid a bunch of separate error test files. The main issue are examples like def f(a,/,b,/): pass. This trips an Expected ')', found '/' error, which I cannot easily make non-fatal.

(2) One of the tests refers to __code__.co_posonlyargcount. To support this we'll need to wait for the posonly PR to be merged into CPython 3.8. At that point the test will work in CPython 3.8 but not earlier versions. This raises another issue: the calls to PyCode_New in Cython/Utility/{ModuleSetupCode.c,Profile.c} are going to have to be updated for 3.8 (since the posonly change adds a new argument to this function). Should these calls be updated as part of this PR or can it come later?

(3) Lastly I want to carefully check all possible code-generation paths for argument unpacking are flexed by the tests.

@rjtobin rjtobin force-pushed the pos_args branch 2 times, most recently from 0b0042b to 206565e Compare April 26, 2019 05:18
@pablogsal
Copy link
Contributor

pablogsal commented Apr 29, 2019

(2) One of the tests refers to code.co_posonlyargcount. To support this we'll need to wait for the posonly PR to be merged into CPython 3.8. At that point the test will work in CPython 3.8 but not earlier versions. This raises another issue: the calls to PyCode_New in Cython/Utility/{ModuleSetupCode.c,Profile.c} are going to have to be updated for 3.8 (since the posonly change adds a new argument to this function). Should these calls be updated as part of this PR or can it come later?

(Author of PEP570 here): I have merged the implementation of PEP570 in upstream Python today, so this should work against the latest master.

Thank you very much for working on this and ping me if you need anything :)

@scoder
Copy link
Contributor

scoder commented Apr 29, 2019

@rjtobin, regarding 1), I think we can afford having some additional error test files for special cases. Just give them a common name prefix and some _suffix to distinguish them.
2) sounds like we need to do something anyway in order to keep things working at all. If you can do it now, the better.
3) That seems a bit of an undertaking. :) I'm sure there are cases that are tested several times throughout the test suite, and my guess is that there are also cases that are not touched by any of the test cases. Although the sheer size of the test suite makes that appear somewhat less likely. Either way, if you find something that's not sufficiently tested, I'd warmly welcome a new test. The argument parsing code is fairly specialised and complex.

Copy link
Contributor

@scoder scoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! I think we can fix the C-API issue separately.

"""
return a + b + c

# TODO: this causes a line that is too long for old versions of Clang
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. See? It was worth testing. :)

@scoder scoder merged commit f0081bd into cython:master Apr 29, 2019
@scoder
Copy link
Contributor

scoder commented Apr 30, 2019

Very nice improvement. I added one little optimisation, which is to stick to a METH_O signature for single argument pos-only functions when the always_allow_keywords directive would normally disable it. That makes that directive more of a legacy option, because users can now declare METH_O signatures directly with /.

Another thing I noticed: The positional argument parsing code looks less efficient now. For example, this signature:

def test_use_positional_as_keyword3(a, b, /):

generates this code:

    if (unlikely(__pyx_kwds)) {
      Py_ssize_t kw_args;
      const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args);
      switch (pos_args) {
        case  2: values[1] = PyTuple_GET_ITEM(__pyx_args, 1);
        CYTHON_FALLTHROUGH;
        case  1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0);
        CYTHON_FALLTHROUGH;
        case  0: break;
        default: goto __pyx_L5_argtuple_error;
      }
      kw_args = PyDict_Size(__pyx_kwds);
      switch (pos_args) {
        case  0:
        CYTHON_FALLTHROUGH;
        case  1:
        goto __pyx_L5_argtuple_error;
      }
      if (unlikely(kw_args > 0)) {
        const Py_ssize_t kwd_pos_args = (pos_args < 2) ? 0 : (pos_args - 2);
        if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, kwd_pos_args, "test_use_positional_as_keyword3") < 0)) __PYX_ERR(0, 124, __pyx_L3_error)
      }
    } else if (PyTuple_GET_SIZE(__pyx_args) != 2) {
      goto __pyx_L5_argtuple_error;
    } else {
      values[0] = PyTuple_GET_ITEM(__pyx_args, 0);
      values[1] = PyTuple_GET_ITEM(__pyx_args, 1);
    }

First of all, no keywords are allowed here, so passing them can raise an error straight away. Secondly, note how the switch accepts a tuple of length 1. That's a clear error case, and we don't need to generate code for that at all. We can basically stop the switch-cases when we hit the first pos-only argument from the back, and raise an arg-tuple error for all other cases.

Would you want to write up a separate PR that optimises this?

@rjtobin
Copy link
Contributor Author

rjtobin commented Apr 30, 2019

Sure, I’m happy to make another PR. Will work on it during the week. Thanks for the all the feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature Python Semantics General Python (3) language semantics

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants