Resolving DNAT Rule Shadowing and Port Mismatch on Eltex Gateways

Summary

A critical connectivity failure occurred while implementing a Destination Network Address Translation (DNAT) policy on an Eltex gateway. The intent was to expose internal services (HTTP and SSH) to a public network by mapping a public “UPLINK” address to private server IPs. However, the ruleset contained overlapping rules and mismatched ports, so traffic was translated inconsistently and SSH sessions could not be established.

Root Cause

The failure stemmed from three primary configuration errors:

  • Rule Shadowing/Ordering: The DNAT ruleset is evaluated in order, so a generic rule placed first intercepted traffic intended for a more specific rule below it.
  • Port-Protocol Mismatch: The service objects did not align with the actual traffic requirements (e.g., a single group covered both HTTP and SSH, which is broader than either rule needed).
  • Incomplete Address Mapping: If the destination port in a DNAT pool does not match the port the backend server actually listens on, the translated SYN reaches a port with no listener and the TCP handshake fails.

Why This Happens in Real Systems

In high-pressure production environments, these issues arise due to:

  • Complexity of Statefulness: NAT is not just a static mapping; it is a stateful operation in which the gateway tracks each translated connection so that replies can be rewritten on the way back. If the destination port is translated incorrectly, the client's SYN is delivered to a backend port where nothing is listening. The server typically answers with a RST (or the packet is dropped), so the handshake never completes.
  • Abstraction Leaks: Engineers often assume that defining a “service” (like HTTP) automatically handles all underlying port logic, forgetting that the NAT engine requires an explicit mapping between the external listener port and the internal server port.
  • Policy Ordering: Most network operating systems process firewall and NAT rulesets top-down. A broad rule placed at rule 1 will “shadow” a specific rule at rule 2, making the second rule unreachable.

Real-World Impact

  • Service Unavailability: Users attempting to reach the web server via the public IP receive “Connection Refused” errors.
  • Security Blind Spots: Misconfigured DNAT can inadvertently expose unintended ports if the object-group ranges are too wide.
  • Troubleshooting Latency: Because the fault is logical (packets are translated to the wrong port) rather than physical (the link itself is up), standard “ping” tests still pass. Ping uses ICMP and never exercises the port-based translation, so engineers can waste hours looking at the wrong layer of the OSI model.
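
Since ping cannot reveal a wrong port translation, a TCP-level probe is a more useful first check. The following is a minimal sketch in Python using only the standard library; the function name `tcp_port_open` and the address in the usage note are illustrative assumptions, not part of the original incident.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a full TCP handshake to host:port completes."""
    try:
        # create_connection performs the SYN / SYN-ACK / ACK exchange.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused (RST), timeouts, and unreachable hosts.
        return False
```

For example, `tcp_port_open("203.0.113.10", 22)` (a documentation-range address standing in for the public UPLINK IP) would be expected to return False while the DNAT port mapping is wrong, even though a ping to the same address succeeds.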

Example or Code

# INCORRECT CONFIGURATION LOGIC

# Rule 1: Matches HTTP/SSH (Broad)
rule 1
  match destination-port HTTP/SSH
  action destination-nat pool S2_HTTP
# This rule "swallows" SSH traffic because SSH is included in the HTTP/SSH group

# Rule 2: Matches SSH (Specific)
rule 2
  match destination-port SSH
  action destination-nat pool S2_SSH
# This rule is NEVER reached for SSH traffic
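
The first-match behaviour in the listing above can be reproduced with a small simulation. This is a sketch only: the port numbers (80 for HTTP, 22 for SSH) and the names `SERVICE_GROUPS`, `RULESET`, and `first_match` are assumptions made for illustration, and a real gateway also matches on zones, addresses, and protocol.

```python
from typing import Optional

# Assumed contents of the service groups used in the example above.
SERVICE_GROUPS = {
    "HTTP/SSH": {80, 22},
    "SSH": {22},
}

# Ordered ruleset: (rule number, service group, DNAT pool).
RULESET = [
    (1, "HTTP/SSH", "S2_HTTP"),
    (2, "SSH", "S2_SSH"),
]


def first_match(dst_port: int) -> Optional[tuple[int, str]]:
    """Return (rule number, pool) for the first rule matching dst_port."""
    for number, group, pool in RULESET:
        if dst_port in SERVICE_GROUPS[group]:
            return number, pool
    return None  # no rule matched, so no translation is applied


print(first_match(22))  # (1, 'S2_HTTP'): SSH is captured by rule 1
print(first_match(80))  # (1, 'S2_HTTP')
```

Rule 2 never appears in the output for any port, which is exactly the shadowing described above.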

How Senior Engineers Fix It

Senior engineers apply the principle of least privilege and strict rule ordering to resolve these issues:

  • Specific-to-General Ordering: Always place the most specific rules (e.g., a single IP/Port) at the top of the ruleset and the most general rules (e.g., entire subnets) at the bottom.
  • Atomic Rule Definition: Instead of grouping disparate services into a single object-group for a NAT rule, create dedicated rules for each service to ensure the Destination Port Translation is precise.
  • Verification via Packet Tracing: Instead of just “committing” the config, use built-in diagnostic tools to simulate a packet flow through the NAT engine to see exactly which rule is being hit.
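  • Validation of Port Symmetry: Ensure that the pool port matches the port the backend actually listens on. If the external port is 80, the internal pool must map to the server’s real listening port (e.g., 80 or 8080); otherwise the translated connection arrives at a port with no listener.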
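
The ordering and atomic-rule guidelines can be checked mechanically before a ruleset is committed. Below is a minimal sketch of such a check, assuming each rule is reduced to the set of destination ports it matches; the function name `find_shadowed` and the port values are hypothetical, and a real check would also need to compare addresses, protocols, and zones.

```python
def find_shadowed(rules: list[tuple[int, set[int]]]) -> list[int]:
    """Return the numbers of rules fully covered by earlier rules.

    Each rule is (rule number, set of destination ports it matches).
    """
    seen: set[int] = set()
    shadowed: list[int] = []
    for number, ports in rules:
        # A rule is unreachable if every port it matches was already
        # matched by some rule evaluated before it.
        if ports and ports <= seen:
            shadowed.append(number)
        seen |= ports
    return shadowed


broken = [(1, {80, 22}), (2, {22})]  # grouped HTTP/SSH rule comes first
fixed = [(1, {22}), (2, {80})]       # one dedicated rule per service

print(find_shadowed(broken))  # [2]
print(find_shadowed(fixed))   # []
```

With one rule per service, no rule's match set overlaps another's, so the order no longer decides which pool receives the traffic.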

Why Juniors Miss It

  • Focusing on Connectivity vs. Translation: Juniors often check if the “IP is reachable” via Ping (Layer 3) but forget that DNAT is a Layer 4 (Transport) operation. If the port translation is wrong, Ping will work, but the service will fail.
  • Ignoring Rule Precedence: A common mistake is assuming all rules in a ruleset run “simultaneously.” In reality, the first match wins.
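  • Misunderstanding the “Pool” concept: Juniors often view a NAT pool as just a list of IPs, failing to realize that in DNAT the pool port is part of the translation itself. The gateway rewrites both the destination IP and the destination port of the connection tuple (Source IP, Source Port, Destination IP, Destination Port), so a wrong pool port sends the connection to the wrong service.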
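
To make the role of the pool port concrete, the sketch below models DNAT as a rewrite of the destination half of the tuple. The `Packet` and `Pool` types, the `apply_dnat` function, and all addresses are illustrative assumptions rather than anything taken from the gateway's configuration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class Pool:
    ip: str
    port: int


def apply_dnat(pkt: Packet, pool: Pool) -> Packet:
    """Rewrite only the destination IP and port; the source is untouched."""
    return replace(pkt, dst_ip=pool.ip, dst_port=pool.port)


# A client connecting to SSH on the (assumed) public address.
pkt = Packet("198.51.100.7", 51000, "203.0.113.10", 22)

good = apply_dnat(pkt, Pool("192.168.1.20", 22))  # pool port matches sshd
bad = apply_dnat(pkt, Pool("192.168.1.20", 80))   # right IP, wrong pool port

print(good)
print(bad)
```

Both translations reach the same server IP, but only `good` arrives on the port where the SSH daemon is assumed to listen; `bad` is delivered to port 80 and the session cannot be established.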
