Skip to content

Rewind body filehandle before reading JSON#186

Open
bigpresh wants to merge 3 commits intoperl-catalyst:masterfrom
bigpresh:bigpresh/rewind_filehandle_in_default_data_handlers
Open

Rewind body filehandle before reading JSON#186
bigpresh wants to merge 3 commits intoperl-catalyst:masterfrom
bigpresh:bigpresh/rewind_filehandle_in_default_data_handlers

Conversation

@bigpresh
Copy link

If something else had already read from the filehandle, we'd get
nothing, and end up throwing an error like:

  [error] Caught exception in engine "Error Parsing POST 'undef',
    Error: malformed JSON string, neither tag, array, object, number,
    string or atom, at character offset 0 (before "(end of string)")
    at /home/davidp/perl5/lib/perl5/Catalyst.pm line 4092, <$fh> chunk 11."

... that sounds like we got an empty POST or something, but in fact the
problem was that a plugin had caused the request body to have already
been read, so the filehandle wasn't at the beginning.

This seek means that we will read and parse the whole body content as
intended, even if something had already read from the filehandle.

I think this will also likely solve #183.

bigpresh added a commit to bigpresh/Catalyst-Plugin-HTML-Scrubber that referenced this pull request Sep 14, 2023
If Catalyst::Action::REST / Catalyst::Controller::REST is in use, the
request object will have a `data()` method for deserialised data as added by
the Catalyst::TraitFor::Request::REST role which ought to be scrubbed too.

To support this,

(a) the scrubbing needs to happen later in the request flow - now
    `hooking dispatch()` instead of `prepare_parameters()`
(b) to avoid the data not being read if the request body had already
    been read by `$c->req->body_data`, the fix in this PR is needed:
    perl-catalyst/catalyst-runtime/pull/186
    Until such time, dirtily monkey-patch the `seek()` in.
bigpresh added a commit to bigpresh/Catalyst-Plugin-HTML-Scrubber that referenced this pull request Sep 14, 2023
If Catalyst::Action::REST / Catalyst::Controller::REST is in use, the
request object will have a `data()` method for deserialised data as added by
the Catalyst::TraitFor::Request::REST role which ought to be scrubbed too.

To support this,

(a) the scrubbing needs to happen later in the request flow - now
    `hooking dispatch()` instead of `prepare_parameters()`
(b) to avoid the data not being read if the request body had already
    been read by `$c->req->body_data`, the fix in this PR is needed:
    perl-catalyst/catalyst-runtime/pull/186
    Until such time, dirtily monkey-patch the `seek()` in.
bigpresh added a commit to bigpresh/Catalyst-Plugin-HTML-Scrubber that referenced this pull request Sep 14, 2023
If Catalyst::Action::REST / Catalyst::Controller::REST is in use, the
request object will have a `data()` method for deserialised data as added by
the Catalyst::TraitFor::Request::REST role which ought to be scrubbed too.

To support this,

(a) the scrubbing needs to happen later in the request flow - now
    `hooking dispatch()` instead of `prepare_parameters()`
(b) to avoid the data not being read if the request body had already
    been read by `$c->req->body_data`, the fix in this PR is needed:
    perl-catalyst/catalyst-runtime/pull/186
    Until such time, dirtily monkey-patch the `seek()` in.
bigpresh added a commit to bigpresh/Catalyst-Plugin-HTML-Scrubber that referenced this pull request Sep 18, 2023
If Catalyst::Action::REST / Catalyst::Controller::REST is in use, the
request object will have a `data()` method for deserialised data as added by
the Catalyst::TraitFor::Request::REST role which ought to be scrubbed too.

To support this,

(a) the scrubbing needs to happen later in the request flow - now
    `hooking dispatch()` instead of `prepare_parameters()`
(b) to avoid the data not being read if the request body had already
    been read by `$c->req->body_data`, the fix in this PR is needed:
    perl-catalyst/catalyst-runtime/pull/186
    Until such time, dirtily monkey-patch the `seek()` in.
@dboehmer
Copy link
Contributor

Your PR looks to me like a step into the right direction but it breaks for requests with header: Content-Type: application/json and an empty message body (what is a bad request, yes) once the app calls $c->req->body_data. The exception is:

[error] Error Parsing POST 'undef', Error: Can't locate object method "seek" via package "0" (perhaps you forgot to load "0"?) at .../lib/Catalyst.pm line 4092.

The data_handler seems not to receive a filehandle if the HTTP message body is empty. So it can’t seek.

@bigpresh
Copy link
Author

Hmm - good find. I think this would have just exploded afterwards, though, at the next line - $slurped = $fh->getline; - so it would have failed anyway, just in a different manner - and either way will be caught by the eval and thrown, as you saw.

If it's reasonable to send a JSON request with no body, then it'd be easy enough to add a special case - e.g. return {} unless $fh; before trying to seek - but I suspect it's more a case of it isn't reasonable to send such a request, and we should detect that and return a 4xx response indicating that the request wasn't sensible.

@dboehmer
Copy link
Contributor

I think your PR should not change the behavior for invalid JSON request data. If the message body is empty Catalyst should do the same thing, even after trying to seek().

For a general discussion what to do with empty JSON request body see my issue #196.

@jjn1056
Copy link
Member

jjn1056 commented Nov 8, 2024

@bigpresh I'm not 100% sure about the seek there but I think I should change Catalyst::Request to detect when $fh is empty. Can you work up a broken test case, where something reads body and then tries to use ->body_data for example?

I suspect the best answer here is for C::Request to do the rewind so that it works with all body handlers. But I need a test case to ponder it.

If something else had already read from the filehandle, we'd get
nothing, and end up throwing an error like:

```
  [error] Caught exception in engine "Error Parsing POST 'undef',
    Error: malformed JSON string, neither tag, array, object, number,
    string or atom, at character offset 0 (before "(end of string)")
    at /home/davidp/perl5/lib/perl5/Catalyst.pm line 4092, <$fh> chunk 11."
```

... that sounds like we got an empty POST or something, but in fact the
problem was that a plugin had caused the request body to have already
been read, so the filehandle wasn't at the beginning.

This seek means that we will read and parse the whole body content as
intended, even if something had already read from the filehandle.

I think this will also likely solve perl-catalyst#183.
If we got a JSON request with an empty body, then we won't have a
filehandle to seek.

That's not a valid request, but we ought to avoid blowing up.
For PR perl-catalyst#186 - make sure that we can POST body data (both normal HTTP
POST and a JSON body), have a hook run before the handler which reads
from the body filehandle (simulating e.g. a plugin that looks at the
request body before the handler runs) and verify that the handler still
gets the data via body_data successfully.
@bigpresh bigpresh force-pushed the bigpresh/rewind_filehandle_in_default_data_handlers branch from 635b273 to 69b1f56 Compare January 27, 2026 00:14
@bigpresh
Copy link
Author

Right - this one was waiting on me for a little while, apologies!

@dboehmer the rewind now only happens if there was a filehandle to rewind, solving the invalid empty JSON request you describe; t/consumes.t already has a test for that, which now passes with that.

@jjn1056 I've added a test file that demonstrates this working. Run against master, it fails in the expected way; run against this branch, it passes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants