Skip to content

Fix data leak between mux.cool connections#3718

Merged
mmmray merged 2 commits intoXTLS:mainfrom
mmmray:mux-routing-leak
Aug 25, 2024
Merged

Fix data leak between mux.cool connections#3718
mmmray merged 2 commits intoXTLS:mainfrom
mmmray:mux-routing-leak

Conversation

@mmmray
Copy link
Contributor

@mmmray mmmray commented Aug 22, 2024

Fix #116

This issue does not appear at all if mux.cool is disabled. Maybe the issue can be avoided if routing is disabled too, but it's possible that the same issue appears elsewhere.

I have debugged this issue in prod with a lot of print statements, and can confirm that this patch fixes it. Tests should be added. The mux server has no tests at all!

Step-by-step:

  1. The mux server is calling handleStatusNew with the same context multiple times. That function then calls w.dispatcher.Dispatch

  2. Dispatch assigns some values specific to the connection onto the outbound here:

    ob.OriginalTarget = destination
    ob.Target = destination

  3. Because the outbounds array on the connection object is the same across multiple connections (it's the same context), the value of outbound.Target flips back and forth between values such as "1.1.1.1" or "chat.facebook.com" (IPAddress vs DomainAddress)

  4. Many layers deeper in routing, as part of this Dispatch call, this function is called:

    if ctx.Outbound.Target.Address.Family().IsIP() {
    return []net.IP{ctx.Outbound.Target.Address.IP()}
    }

    It does two things, in order:

    1. Checks if Target is an IPAddress (IsIP)
    2. If it's an IP, call IP()

    Because Target keeps mutating back and forth between multiple kinds
    of address, it sometimes is an IP for Step 1, and a DomainAddress for
    Step 2

  5. Calling IP() on a DomainAddress panics:

     panic: Calling IP() on a DomainAddress.
    
     goroutine 123018 [running]:
     github.com/xtls/xray-core/common/net.domainAddress.IP(...)
         github.com/xtls/xray-core/common/net/address.go:172
     github.com/xtls/xray-core/features/routing/session.(*Context).GetTargetIPs(0xc004175428)
         github.com/xtls/xray-core/features/routing/session/context.go:54 +0x5f
     github.com/xtls/xray-core/app/router.(*MultiGeoIPMatcher).Apply(0xc0003a63e0, {0x156c770?, 0xc004175428?})
         github.com/xtls/xray-core/app/router/condition.go:143 +0x3e
     github.com/xtls/xray-core/app/router.(*ConditionChan).Apply(0x495c9d?, {0x156c770, 0xc004175428})
         github.com/xtls/xray-core/app/router/condition.go:32 +0x5c
     github.com/xtls/xray-core/app/router.(*Rule).Apply(...)
         github.com/xtls/xray-core/app/router/config.go:30
     github.com/xtls/xray-core/app/router.(*Router).pickRouteInternal(0xc000367a40, {0x156c770, 0xc004175428})
         github.com/xtls/xray-core/app/router/router.go:196 +0x176
     github.com/xtls/xray-core/app/router.(*Router).PickRoute(0x1562810?, {0x156c770?, 0xc004175428?})
         github.com/xtls/xray-core/app/router/router.go:84 +0x25
     github.com/xtls/xray-core/app/dispatcher.(*DefaultDispatcher).routedDispatch(0xc000398fc0, {0x1562810, 0xc0050cd350}, 0xc00418dea0, {{0x15625a8, 0xc005974c10}, 0x1bb, 0x2})
         github.com/xtls/xray-core/app/dispatcher/default.go:420 +0x307
     created by github.com/xtls/xray-core/app/dispatcher.(*DefaultDispatcher).Dispatch in goroutine 123015
         github.com/xtls/xray-core/app/dispatcher/default.go:252 +0x56c
    

Fix XTLS#116

This issue does not appear at all if mux.cool is disabled. Maybe the
issue can be avoided if routing is disabled too, but it's possible that
the same issue appears elsewhere.

Tests should be added. ***The mux server has no tests at all!***

Step-by-step:

1. The mux server is calling handleStatusNew with the same context
   multiple times. That function then calls w.dispatcher.Dispatch

2. Dispatch assigns some values specific to the connection onto the
   outbound here:
   https://github.com/XTLS/Xray-core/blob/83eef6bc1f554be84aeb799417688a070cd32ab8/app/dispatcher/default.go#L241-L242

3. Because the outbounds array on the connection object is the same
   across multiple connections (it's the same context), the value of
   outbound.Target flips back and forth between values such as "1.1.1.1"
   or "chat.facebook.com" (IPAddress vs DomainAddress)

4. Many layers deeper in routing, as part of this Dispatch call, this
   function is called: https://github.com/XTLS/Xray-core/blob/83eef6bc1f554be84aeb799417688a070cd32ab8/features/routing/session/context.go#L53-L55

   It does two things, in order:

   1. Checks if Target is an IPAddress (IsIP)
   2. If it's an IP, call IP()

   Because Target keeps mutating back and forth between multiple kinds
   of address, it sometimes is an IP for Step 1, and a DomainAddress for
   Step 2

5. Calling IP() on a DomainAddress panics:

        panic: Calling IP() on a DomainAddress.

        goroutine 123018 [running]:
        github.com/xtls/xray-core/common/net.domainAddress.IP(...)
            github.com/xtls/xray-core/common/net/address.go:172
        github.com/xtls/xray-core/features/routing/session.(*Context).GetTargetIPs(0xc004175428)
            github.com/xtls/xray-core/features/routing/session/context.go:54 +0x5f
        github.com/xtls/xray-core/app/router.(*MultiGeoIPMatcher).Apply(0xc0003a63e0, {0x156c770?, 0xc004175428?})
            github.com/xtls/xray-core/app/router/condition.go:143 +0x3e
        github.com/xtls/xray-core/app/router.(*ConditionChan).Apply(0x495c9d?, {0x156c770, 0xc004175428})
            github.com/xtls/xray-core/app/router/condition.go:32 +0x5c
        github.com/xtls/xray-core/app/router.(*Rule).Apply(...)
            github.com/xtls/xray-core/app/router/config.go:30
        github.com/xtls/xray-core/app/router.(*Router).pickRouteInternal(0xc000367a40, {0x156c770, 0xc004175428})
            github.com/xtls/xray-core/app/router/router.go:196 +0x176
        github.com/xtls/xray-core/app/router.(*Router).PickRoute(0x1562810?, {0x156c770?, 0xc004175428?})
            github.com/xtls/xray-core/app/router/router.go:84 +0x25
        github.com/xtls/xray-core/app/dispatcher.(*DefaultDispatcher).routedDispatch(0xc000398fc0, {0x1562810, 0xc0050cd350}, 0xc00418dea0, {{0x15625a8, 0xc005974c10}, 0x1bb, 0x2})
            github.com/xtls/xray-core/app/dispatcher/default.go:420 +0x307
        created by github.com/xtls/xray-core/app/dispatcher.(*DefaultDispatcher).Dispatch in goroutine 123015
            github.com/xtls/xray-core/app/dispatcher/default.go:252 +0x56c
@mmmray
Copy link
Contributor Author

mmmray commented Aug 22, 2024

Without mux, the outbounds array is reset entirely for every connection: https://github.com/mmmray/Xray-core/blob/7a80df07b2aa83685d41d72885d201ab47ef9c05/app/proxyman/inbound/worker.go#L83

maybe it's a more appropriate fix?

@mmmray
Copy link
Contributor Author

mmmray commented Aug 23, 2024

@yuhan6665 or @RPRX please review, if you have ideas on how to test, tell me as well.

@yuhan6665
Copy link
Member

yuhan6665 commented Aug 24, 2024

Thanks for your great investigation @mmmray!
I have another thought:
These addresses has been confusing. As far as I know, we will only have three different target addresses (at most): original from protocol, sniffed, and for routing. Why do we need to mutate? We can refactor so that these addresses should never change?
Maybe, add a convenient function like check if original is fake IP and return sniffed instead..
What do you think @RPRX?

@mmmray
Copy link
Contributor Author

mmmray commented Aug 24, 2024

Going even further, I don't know why the context is used to carry Target at all. I suppose that somewhere, it was too annoying to pass as function argument, but I did not notice it. But i'm also new to the code.

@yuhan6665
Copy link
Member

In general, context should contain (as much as) session level information. I think inbound, outbounds so on and so forth info is ok. XTLS also use it pass raw conn object. A bit strange but I think it is also ok.

@mmmray
Copy link
Contributor Author

mmmray commented Aug 25, 2024

I think some refactoring in this area is really needed. The other one is around casting to PacketConnWrapper (see #3711 #3559). Anyway, maybe for another time.

@mmmray mmmray merged commit 3dd3bf9 into XTLS:main Aug 25, 2024
@mmmray mmmray deleted the mux-routing-leak branch August 25, 2024 19:02
@ssmetall
Copy link

ssmetall commented Aug 25, 2024

I think some refactoring in this area is really needed. The other one is around casting to PacketConnWrapper (see #3711 #3559). Anyway, maybe for another time.

thanks for investigating and fixing , please build new version of core soon as possible
It has been almost a month that the core with new merges has not been released
@RPRX not planning for build new versions?

@mmmray
Copy link
Contributor Author

mmmray commented Aug 25, 2024

Indeed I think a new version should be released soon. There are a few bugfixes, and this new fragment PR sitting around in master doesn't help with its lifetime. I was hoping to wait for #3711 though. Let's see.

I think the next version was planned to be related to splithttp mux settings, but... it's ready when it's ready IMO and shouldn't hold up critical bugfixes.

@ssmetall
Copy link

ssmetall commented Aug 25, 2024

Indeed I think a new version should be released soon. There are a few bugfixes, and this new fragment PR sitting around in master doesn't help with its lifetime. I was hoping to wait for #3711 though. Let's see.

The new fragment pr has no problem with ws or httpupgrade any operator in I r a n i used now, please don't make a cover for it 😆

and splithttp almost blocked in iran , beacuse udp blocked in many isp

@mikeesierrah
Copy link

@sssagsag nope , thats not true

splithttp isnt based on quic , udp mode - quic - h3 is optional

@ssmetall
Copy link

ssmetall commented Aug 25, 2024

@sssagsag nope , thats not true

splithttp isnt based on quic , udp mode - quic - h3 is optional

yes iknow , but i cant used without h3 , with many isp

@ssmetall
Copy link

ssmetall commented Aug 25, 2024

Indeed I think a new version should be released soon. There are a few bugfixes, and this new fragment PR sitting around in master doesn't help with its lifetime. I was hoping to wait for #3711 though. Let's see.

I think the next version was planned to be related to splithttp mux settings, but... it's ready when it's ready IMO and shouldn't hold up critical bugfixes.

maybe this pr helped #3677

zxspirit pushed a commit to zxspirit/Xray-core that referenced this pull request Aug 30, 2024
@dyhkwong
Copy link
Contributor

Why not just

func (ctx *Context) GetTargetIPs() []net.IP {
	if ctx.Outbound == nil {
		return nil
	}
	target := ctx.Outbound.Target
	if !target.IsValid() {
		return nil
	}

	if target.Address.Family().IsIP() {
		return []net.IP{target.Address.IP()}
	}

	return nil
}

to prevent ctx.Outbound.Target to be called for multiple times? Is a complex "fix" necessary?

And I think this is introduced in 017f53b, not a bug report from 2020.

@mmmray
Copy link
Contributor Author

mmmray commented Oct 13, 2024

To me the problem is the racing write in Dispatch, not the racing read in GetTargetIPs. If I only fix GetTargetIPs as you said, the following sequence is still possible. (Sub-)Connection A and B are part of the same physical mux.cool connection:

  • Dispatch from connection A is writing
  • Dispatch from connection B is writing
  • GetTargetIPs from connection A is reading (it reads the result from Dispatch B)
  • GetTargetIPs from connection B i reading (it reads the result from Dispatch B)

So now, xray may route connection A based on information found in connection B! And beside GetTargetIPs, there are many other reads for multiple routing rules. Really, the most problematic part of this bug is not the crash, but the potential mis-routing as a result of the race condition.

leninalive pushed a commit to amnezia-vpn/amnezia-xray-core that referenced this pull request Oct 29, 2024
it2konst pushed a commit to it2konst/gametunnel-core that referenced this pull request Mar 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

panic: Calling IP() on a DomainAddress

5 participants