Astro middleware #555

ematipico · 2023-04-14T14:21:56Z

This RFC introduce a new middleware API in Astro

Links

Fully rendered proposal
Stage 2 proposal
Stage 1 proposal
- Astro middleware #469
- Astro SSR middleware #174

proposals/0032-middleware-api.md

matthewp · 2023-04-14T19:48:07Z

One other comment, if you call next() but don't care about the result (because you don't need to do anything to the response), are you expected to return the return value? Ie do you have to do return next(); or is just next(); ok? I don't have an opinion on this and am kind of assuming that the first way is how it's currently implemented. If so I could see this as being a good thing, that you must return some response.

ematipico · 2023-04-18T13:10:20Z

One other comment, if you call next() but don't care about the result (because you don't need to do anything to the response), are you expected to return the return value? Ie do you have to do return next(); or is just next(); ok? I don't have an opinion on this and am kind of assuming that the first way is how it's currently implemented. If so I could see this as being a good thing, that you must return some response.

I suppose it really depends on how the user will it. next(); and return next() will yield two different results. Without the return, the next middleware won't access to the Response object. If the user knows that that middleware won't need it, I suppose next(); is a correct pattern.

lilnasy · 2023-04-29T15:07:35Z

Could a middleware tell if it's running in dev mode?

The use case I'm thinking of is where a middleare caches responses in memory, similar how to how Next's ISR works. The middleware should not act during dev.

cc: #228

lilnasy · 2023-04-29T15:09:18Z

Could a middleware communicate with an integration?

The use case I'm thinking of where an integration hashes inline resources, and a middleare uses the hashes to create a strict Content-Security-Policy header.

I understand letting middleware be added by integrations was made out of scope for now, but receiving data from an integration alone would be valuable.

cc: #377

ematipico · 2023-05-01T07:27:09Z

Could a middleware tell if it's running in dev mode?

The use case I'm thinking of is where a middleare caches responses in memory, similar how to how Next's ISR works. The middleware should not act during dev.

cc: #228

Not at this stage. It wasn't part of the original RFC. Currently, the middleware is code that goes in production, and the mode is a piece of information that is lost after the build.

ematipico · 2023-05-01T07:29:40Z

Could a middleware communicate with an integration?

The use case I'm thinking of where an integration hashes inline resources, and a middleare uses the hashes to create a strict Content-Security-Policy header.

I understand letting middleware be added by integrations was made out of scope for now, but receiving data from an integration alone would be valuable.

cc: #377

We can explore this valid use case after the middleware is released under an experimental flag. During the exploration phase, I evaluated creating a middleware as an integration, but that was too far from the initial RFC.

lilnasy · 2023-05-01T14:05:14Z

Could a middleware tell if it's running in dev mode?
The use case I'm thinking of is where a middleare caches responses in memory, similar how to how Next's ISR works. The middleware should not act during dev.
cc: #228

Not at this stage. It wasn't part of the original RFC. Currently, the middleware is code that goes in production, and the mode is a piece of information that is lost after the build.

@ematipico I looked around a little more, and matthewp gave an example for a dev-only route in the issue. If that example works, prod-only ISR could too.

matthewp · 2023-05-01T14:45:12Z

Yeah, I would expect import.meta.env.MODE to work just like it does in any other file in the project.

ematipico · 2023-05-03T14:32:57Z

I wasn't aware of that metadata, and I personally haven't tried. But if that metadata is available in endpoints, it will definitely be available in middleware too!

snimavat · 2023-05-05T06:25:18Z

I come from Java background, and middleware word has always been confusing to me.
If it does the job of intercepting requests/response - "Interceptors" seems more meaningful

ematipico · 2023-05-05T15:32:49Z

I come from Java background, and middleware word has always been confusing to me. If it does the job of intercepting requests/response - "Interceptors" seems more meaningful

Yeah, there are different names to address the same pattern. SvelteKit calls it "hooks", like fastify.

raulfdm · 2023-05-05T17:00:59Z

I come from Java background, and middleware word has always been confusing to me. If it does the job of intercepting requests/response - "Interceptors" seems more meaningful

Yeah, there are different names to address the same pattern. SvelteKit calls it "hooks", like fastify.

In Next.js world they also call middleware, but their middleware is less capable than the pattern we're seeing here.

https://nextjs.org/docs/pages/building-your-application/routing/middleware

gkatsanos · 2023-05-12T10:09:47Z

how safe are we to use a middleware in production that intercepts a request and sets a global variable or a value in a nano store based on the request domain name?

ematipico · 2023-05-12T12:46:06Z

how safe are we to use a middleware in production that intercepts a request and sets a global variable or a value in a nano store based on the request domain name?

The feature is under an experimental flag for different reasons:

it might contain bugs
the APIs might still change, which are not semver complaint

Although, the code written inside the middleware is stored in the backend (like the rest of the Astro code). It's advised to ensure you don't write any sensible data inside Astro.locals

pilcrowonpaper · 2023-05-15T00:00:18Z

Quick question, why should locals be serializable?

The information must be serializable because storing information that evaluates at runtime is unsafe. If, for example, we were able to store a JavaScript function, an attacker could exploit the victim’s website and execute some unsafe code

How would an attacker inject or execute unsafe code?

pilcrowonpaper · 2023-05-15T01:24:56Z

It also looks like API routes run before the middleware?

ematipico · 2023-05-17T13:54:17Z

Quick question, why should locals be serializable?

The information must be serializable because storing information that evaluates at runtime is unsafe. If, for example, we were able to store a JavaScript function, an attacker could exploit the victim’s website and execute some unsafe code

How would an attacker inject or execute unsafe code?

It's mostly about third-party libraries. Let's suppose that there's an astro-magic-auth library that does something with locals; then, this library gets hacked and decides to push some sketchy code in thelocals:

locals.email = "<script>eval(alert('hacked'))</script>"

Nobody prevents a third-party library from doing so.

ematipico · 2023-05-17T13:55:49Z

It also looks like API routes run before the middleware?

That was a bug :) I believe it's been fixed in withastro/astro#7106

pilcrowonpaper · 2023-05-17T15:35:28Z

@ematipico

I'm not sure serializing data prevents this?

locals.email = returnsUnsafeScript(); // <script>eval(alert('hacked'))</script>

if we were able to store a JavaScript function, an attacker could exploit the victim’s website and execute some unsafe code

But if you're using third party dependencies, they can already run unsafe code:

// node_modules/some-library/index.js

export someFunction = () => {};

console.log("unsafe code");

Not having the ability to store any data type seems to be a massive limitation IMO.

ematipico · 2023-05-17T15:54:24Z

I'm not sure serializing data prevents this?

It indeed doesn't; probably, we should also escape the strings, just to be sure.

But if you're using third-party dependencies, they can already run unsafe code:

It's more about data that gets rendered on the page, not about the code per se.

Not having the ability to store any data type seems to be a massive limitation IMO.

It is, I know! What would you suggest instead? How can we support more data types without overlooking the security?

tropperstyle · 2023-05-17T18:35:59Z

I have a request to expose the exports of the matched route module to the middleware.

Since auth guards are a primary example for this proposal, this could help avoid path logic growing in the middleware and allow explicitly flagging pages as needed:

---
export const requireAdmin = true;
---

// Make all non-Astro defined exports available to context
export const onRequest = ({ exports, request, redirect, locals }, next) => {
  if (exports.requireAdmin) {
    locals.admin = await fetchAdmin(request);
    if (!locals.admin) return redirect('/login');
  }
  return next();
}

pilcrowonpaper · 2023-05-17T23:37:50Z

@ematipico

Doesn't Astro escape strings inside html templates anyway? If you’re using set:html, it should be your responsibility to make sure your dependencies are safe. I really think this is a non issue that doesn’t need to be addressed. SvelteKit has locals for like 2 years and I’m not aware of any html injection, and Express has had it for even longer.

matthewp · 2023-05-18T00:52:16Z

I think the primary driver for the serializable restriction is that we plan to make it possible for adapters (like Vercel) to split the middleware into a separate bundle to deploy to an Edge. This way you can have things like auth-checks happen at edge locations and then your app code in serverless.

For this to work we need to be able to serialize locals to send from the edge worker to the origin. We weren't sure that people would even notice this!

But it sounds like a lot of people are wanting to put complex objects onto locals. This would just mean you couldn't use the edge feature. But for a lot of people that's a fine tradeoff. So I think this user feedback is a good reason to lift the restriction.

matthewp · 2023-05-19T15:50:03Z

cc @ematipico Curious what you think about the above 👆

gkatsanos · 2023-05-22T07:30:13Z

Sorry to jump into the RFC with a slightly different topic. We have a use-case that combines nanostores with the middleware and we wanted to share it with you to get some feedback ("will it work"). :

import { setBrand } from "@store/brand";

export function onRequest(context, next) {
  if (context.request.includes === "europages") {
    setBrand("ep");
  } else {
    setBrand("wlw");
  }

  return next();
}

store:

// store/brand.js
import { atom, action } from "nanostores";

export const brand = atom("wlw");

export const setBrand = action(brand, "setBrand", (store, brand) => {
  store.set(brand);
});

then use the store value in a root Layout to provide a 'theme' based on the route base url:

<html lang="en" data-brand={brand.value}>

That's in a node/standalone SSR mode. I was told it's not a reliable solution over Discord.

ematipico · 2023-05-22T12:41:22Z

cc @ematipico Curious what you think about the above 👆

We could lift off the check, I don't mind at all, as long as it's always secure to store "complex" types.

Regarding the serialisation and the edge worker, maybe we should move the check inside the integration itself.

matthewp · 2023-05-22T13:13:32Z

@ematipico Ah yeah, if that's a restriction of that use-case it does make senes to move the check there. I'm not sure exactly how that will be implemented, but I imagine that the adapter will add it's own middleware at the end of the chain. It can do the check there.

matthewp · 2023-05-22T13:37:04Z

@gkatsanos I'm not very familiar with nanostores, but looking at the code it appears that brand is a global which doesn't work well in a server environment, you have multiple requests all being processed at the same time.

I would recommend putting branch onto the locals object.

pilcrowonpaper · 2023-05-22T13:43:48Z

For the Vercel adapter, if we're going to add it, please make deploying to Vercel's Edge middleware optional. The biggest limitation with Next.js' Middleware is that you can't deploy it to just a regular serverless server(?).

wassfila · 2023-05-22T15:42:51Z

@gkatsanos I'm not very familiar with nanostores, but looking at the code it appears that brand is a global which doesn't work well in a server environment, you have multiple requests all being processed at the same time.

I would recommend putting branch onto the locals object.

right, that's where the discord support post is going to, thanks for confirmation https://discord.com/channels/830184174198718474/1109101214802653224

matthewp · 2023-05-23T12:24:17Z

@pilcrowonpaper Yeah, that will definitely be an opt-in feature.

matthewp · 2023-05-23T14:50:03Z

The serialization restriction has been lifted from the RFC.

With that being the case, I'd like to do a call for consensus on this RFC. This will be the final comment period, 3 days, if there are still issues that need to be addressed please bring those forward now.

chore: first draft

0ba226d

ematipico force-pushed the rfc/stage-3-middleware branch from 85a3b3d to d13e252 Compare April 14, 2023 14:23

Middleware API

9395cb9

ematipico force-pushed the rfc/stage-3-middleware branch from d13e252 to 9395cb9 Compare April 14, 2023 14:27

ematipico marked this pull request as ready for review April 14, 2023 14:31