Strange reverse proxy behaviour: intermittent 502 errors

Describe the problem

Access to my service (domain) through the reverse proxy returns 502 errors, but only intermittently and not from all clients at once. The failures do not seem to depend on the individual request; they seem to depend on the auth session, the timing, and the client.

After a while it may start working, and after another while it stops working again.

When it is not working from my client 1, it can still be working from client 2 on the same network. Sometimes it works from both. It's the same across different networks (such as local and mobile).

When my target service works in, let's say, a Firefox tab on my desktop client, I can open an incognito tab in the same browser, enter my password, and get NetBird error pages with 502. The other tab continues working. If I open another browser, I get 502 again, whatever I do.

Sometimes clearing the browser cache and restarting the browser helps, sometimes not.

It happens quite regularly. My feeling is that there is some kind of limit on auth or connections: once it works and I keep using the target resource, it keeps working. As soon as I pause for some time, the chance of failure is high. Maybe it is some kind of rate limiting, an auth session issue, or some other timing issue in the network or DNS.
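To narrow down the timing aspect, the proxied URL could be probed at a fixed interval and the status codes logged with timestamps. The following is a minimal sketch in Python (standard library only), not anything provided by NetBird; the function names are made up for illustration. Note that with password authentication enabled, an unauthenticated probe will probably only see the auth page, so this is most useful while auth is switched off for testing.

```python
import time
import urllib.error
import urllib.request
from collections import Counter


def probe(url, timeout=10.0):
    """Send one GET request; return the HTTP status code or a short error label."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses (for example the 502 from the proxy) land here.
        return err.code
    except OSError as err:
        # DNS failures, timeouts, refused connections (URLError is an OSError).
        return f"error: {type(err).__name__}"


def run_probes(url, count, interval_s, fetch=probe, sleep=time.sleep):
    """Probe `url` `count` times, waiting `interval_s` seconds between requests.

    Returns a list of (HH:MM:SS, status) tuples in request order.
    """
    results = []
    for i in range(count):
        results.append((time.strftime("%H:%M:%S"), fetch(url)))
        if i < count - 1:
            sleep(interval_s)
    return results


def summarize(results):
    """Count how often each status (or error label) occurred."""
    return Counter(status for _, status in results)
```

For example, `run_probes("https://myservice.zzz.mydomain.de/", 60, 30)` (placeholder hostname) would send one request every 30 seconds for about half an hour. Running the same loop with a long interval (several minutes) and a short one could show whether idle time between requests really correlates with the 502s.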

To Reproduce

Steps to reproduce the behavior:

  1. Target service in Docker (bridge mode) on the same Synology (IP 172.22.0.6)

  2. NetBird peer in Docker (bridge mode, rootless) on the same Synology (IP 172.17.0.3). I also attached the NetBird peer to the target service's network so that the two can see each other (IP 172.22.0.7).

  3. NetBird Cloud management setup:

     • 2 peers (only one active, the other is for testing purposes)
     • Network “xyz” with resource “172.22.0.6/32”
     • Policy for TCP, port 8080 only
     • Custom domain zzz.mydomain .de (validated)
     • A reverse proxy service for the subdomain myservice.zzz.mydomain .de/ that targets the resource above on port 8080, with password authentication
     • Wherever I could assign a group to an entity, I assigned one, let's say “yyy”, to tie everything together.

I can always reach the target service with wget from the NetBird peer console! The Synology firewall is disabled for now.

The target service is also available on the Docker host network, let's say at 192.168.178.17:8080, with no problem at all.

The target service log is not showing any problems.
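The manual wget check could be repeated automatically for both addresses of the target service, so that a reachability problem inside Docker can be ruled out at the moment a 502 occurs. This is a rough sketch that assumes Python is available wherever it runs (for example on the Docker host or in a helper container attached to the same network; the NetBird peer image itself may not ship Python). The function names are hypothetical.

```python
import socket
import time


def tcp_check(host, port, timeout=3.0):
    """Attempt one TCP connection; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as err:
        # Covers refused connections, timeouts, and unreachable networks.
        return False, time.monotonic() - start, str(err)


def check_targets(targets, timeout=3.0):
    """Check each (host, port) pair; return a dict keyed by 'host:port'."""
    return {f"{host}:{port}": tcp_check(host, port, timeout) for host, port in targets}
```

Calling `check_targets([("172.22.0.6", 8080), ("192.168.178.17", 8080)])` covers the bridge-network address and the host-network address mentioned above. This only tests that the port accepts TCP connections; it says nothing about what the reverse proxy sees on its side.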

Expected behavior

After authentication I expect the target service to stay reachable continuously. Ideally, even when I come back to my browser tab 3 days later, I expect to see the auth page again and then the target service. I also expect the target service to work in several browsers or incognito tabs on the same client at the same time (after successful auth).

Are you using NetBird Cloud?

Yes

NetBird version

latest / today: 0.6.8 (but same problem on 0.6.7, did a lot of troubleshooting on that version before)

(rootless and non-rootless)

Is any other VPN software installed?

No.

Debug output

To help us resolve the problem, please attach the following anonymized status output

netbird status -dA:

Peers detail:                                                                                                                                                      
 proxy-<<<removed>>>-28-69.netbird.cloud:                                                                                                                   
  NetBird IP: <<<removed>>>                                                                                                                                         
  Public key: Tapmxq9B/0+2m+jEIlHCW36kAXxCUBZMfYcLfaF5rmM=                                                                                                         
  Status: Connected                                                                                                                                                
  -- detail --                                                                                                                                                     
  Connection type: P2P                                                                                                                                             
  ICE candidate (Local/Remote): host/srflx                                                                                                                         
  ICE candidate endpoints (Local/Remote): <<<removed>>>                                                                                      
  Relay server address: rels://streamline-de-fra1-3.relay.netbird.io:443                                                                                           
  Last connection update: 23 minutes, 18 seconds ago                                                                                                               
  Last WireGuard handshake: 1 minute, 45 seconds ago                                                                                                               
  Transfer status (received/sent) 9.7 KiB/13.2 KiB                                                                                                                 
  Quantum resistance: false                                                                                                                                        
  Networks: -                                                                                                                                                      
  Latency: 10.993254ms                                                                                                                                             
                                                                                                                                                                   
 m-pxl9.netbird.cloud:                                                                                                                                             
  NetBird IP: <<<removed>>>                                                                                                                                         
  Public key: <<<removed>>>                                                                                                         
  Status: Connecting                                                                                                                                               
  -- detail --                                                                                                                                                     
  Connection type: -                                                                                                                                               
  ICE candidate (Local/Remote): -/-                                                                                                                                
  ICE candidate endpoints (Local/Remote): -/-                                                                                                                      
  Relay server address:                                                                                                                                            
  Last connection update: 26 minutes, 26 seconds ago                                                                                                               
  Last WireGuard handshake: -                                                                                                                                      
  Transfer status (received/sent) 0 B/0 B                                                                                                                          
  Quantum resistance: false                                                                                                                                        
  Networks: -                                                                                                                                                      
  Latency: 0s                                                                                                                                                      
                                                                                                                                                                   
Events:                                                                                                                                                            
  [INFO] SYSTEM (601ad28f-5660-466d-b50e-1ccf1eb4478a)                                                                                                             
    Message: Network map updated                                                                                                                                   
    Time: 23 minutes, 18 seconds ago                                                                                                                               
  [...<<<removed>>>....] 
  [INFO] SYSTEM (a7784121-13be-4130-a6de-29f1d8ac9978)                                                                                                             
    Message: Network map updated                                                                                                                                   
    Time: 13 minutes, 43 seconds ago                                                                                                                               
  [INFO] SYSTEM (15ce3c25-5bd6-4c93-998f-694cf705f1f4)                                                                                                             
    Message: Network map updated                                                                                                                                   
    Time: 13 minutes, 43 seconds ago                                                                                                                               
OS: linux/amd64                                                                                                                                                    
Daemon version: 0.68.0                                                                                                                                             
CLI version: 0.68.0                                                                                                                                                
Profile: default                                                                                                                                                   
Management: Connected to https://api.netbird.io:443                                                                                                                
Signal: Connected to https://signal.netbird.io:443                                                                                                                 
Relays:                                                                                                                                                            
  [stun:stun.netbird.io:443] is Checking...                                                                                                                        
  [stun:stun.netbird.io:5555] is Checking...                                                                                                                       
  [turns:turn.netbird.io:443?transport=tcp] is Checking...                                                                                                         
  [rels://streamline-de-fra1-3.relay.netbird.io:443] is Available                                                                                                  
Nameservers:                                                                                                                                                       
FQDN: <<<removed>>>.netbird.cloud                                                                                                                              
NetBird IP: <<<removed>>>                                                                                                                                      
Interface type: Userspace                                                                                                                                          
Quantum resistance: false                                                                                                                                          
Lazy connection: false                                                                                                                                             
SSH Server: Disabled                                                                                                                                               
Networks: 172.22.0.6/32                                                                                                                         
Peers count: 1/2 Connected          

Create and upload a debug bundle, and share the returned file key:

f79e391890ab27fb37c88b3b4be7011e22aa2e5ca6f38ffa9c4481884941f726/0ccbc854-f02e-4156-8a5d-80da06292117

Additional context

  1. I walked through the troubleshooting doc for 502 issues → in my eyes, none of the scenarios match my issue
  2. I do not see any critical issues (afaik) in the peer log files
  3. The uploaded log was captured at TRACE level. The peer log shows entries for the successful requests from the reverse proxy but none for the failed ones (those appear only in the NetBird proxy activity), so I assume the failed requests never reached the peer at all.
  4. I used both the rootless and the “standard” peer image
  5. I tried without auth, no change
  6. I tried with the NetBird domain instead of the custom domain, no change
  7. I tried with an open policy (all instead of TCP 8080)
  8. I tried targeting the Synology IP/Docker host on port 8080; this does not help from outside (works locally)
  9. I tried different browsers/incognito modes…, which does not help consistently

Have you tried these troubleshooting steps?

  • Reviewed client troubleshooting (if applicable)
  • Checked for newer NetBird versions
  • Searched for similar issues on GitHub (including closed ones)
  • Restarted the NetBird client
  • Disabled other VPN software
  • Checked firewall settings

I cleaned up my configuration in the netbird-management-center (removed an orphaned peer and an unnecessary DNS zone configuration) and, wow, since then the problem seems resolved.

My target resource is a Docker container on 172.22… and I think the DNS entries may have pointed the same subdomain to a 192.168… address, so maybe this was it? If so, I would recommend adding a check and a warning for such configurations.
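If that suspicion is right, a client affected by the extra DNS zone would presumably have resolved the proxied subdomain to a private address instead of the public proxy endpoint. That is an assumption on my part, not something confirmed by NetBird. A small Python sketch to check what a given client resolves (the function names are made up for illustration):

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def private_addresses(addresses):
    """Return only the addresses in private ranges (e.g. 192.168.x.x, 172.16-31.x.x)."""
    return [a for a in addresses if ipaddress.ip_address(a).is_private]
```

Running `private_addresses(resolve_ipv4("myservice.zzz.mydomain.de"))` (placeholder hostname) on a failing client and on a working one would show whether they resolve the subdomain differently. A non-empty result on the failing client would support the conflicting-DNS-entry theory.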

I actually tried to reproduce the error again, but the behaviour did not return. Should I say unfortunately?

It might also be related to a potential bug (see GitHub #5626). A community user contributed a fix for it, but it has not been tested yet.

Maybe this helps somebody.

Hey @ma334! I’m experiencing the same issue but can’t resolve it. I don’t have any orphaned peers or unnecessary DNS zones, everything seems configured properly, and I still get random 502s that sometimes go away on their own. I did exactly what you did in terms of troubleshooting, but nothing showed up. Any ideas? :sweat_smile: