TCP/IP connection issue at port 1433 in Azure Databricks

Summary

A misconfigured network access path between Azure Databricks and Azure SQL Database caused the TCP/IP connection failure. Although the credentials and SQL objects were correctly configured, the SQL server was not reachable from the Databricks workspace due to firewall, VNet, or region-level connectivity restrictions.

Root Cause

The failure stems from a blocked network path between Databricks and Azure SQL — most commonly the Azure SQL Database firewall rejecting inbound traffic from the Databricks cluster. In real deployments, this usually happens because:

  • The SQL Server firewall does not allow Azure services (Allow Azure services and resources to access this server = OFF)
  • The Databricks workspace is deployed in a different region or different VNet than the SQL server
  • The SQL server is configured with private endpoint, but Databricks is not peered to that VNet
  • Outbound port 1433 is blocked by a network security group (NSG)
  • The SQL server public endpoint is disabled
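Any of the causes above produces the same generic error, so the fastest first check is a raw TCP probe from a Databricks notebook. The sketch below uses only the Python standard library; the server name in the commented example is a placeholder, not a real host:

```python
import socket

def check_port(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a raw TCP connection to host:port succeeds within timeout.

    A timeout or refusal here means the problem is the network path
    (firewall, NSG, private endpoint), not SQL credentials.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example with a placeholder server name:
# check_port("<server-name>.database.windows.net", 1433)
```

If this returns False, no amount of credential debugging will help — fix the network path first.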

Why This Happens in Real Systems

These issues appear frequently because:

  • Azure SQL uses strict firewall defaults
  • Databricks clusters run on ephemeral VMs, each with different outbound IPs
  • Engineers assume “same region = automatically reachable,” which is not true
  • Private endpoints require explicit VNet peering, which is easy to overlook
  • Azure’s “Allow Azure services” toggle is misleading—it does not allow all Azure resources, only a subset of Microsoft-owned IPs

Real-World Impact

When this misconfiguration occurs:

  • Full loads and ETL pipelines fail
  • JDBC connections time out
  • Clusters waste compute time, increasing cost
  • Production jobs stall, causing downstream SLA breaches
  • Debugging becomes slow because the error message is generic (“TCP/IP connection failed”)

Example Code

Below is a minimal Databricks (PySpark) JDBC test that fails with a TCP/IP timeout when the network path is blocked. Replace the angle-bracketed placeholders with your own values:

jdbcUrl = "jdbc:sqlserver://<server-name>.database.windows.net:1433;database=<database-name>"
connectionProperties = {
  "user": "<sql-user>",
  "password": dbutils.secrets.get("scope", "sql-password"),
  "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

spark.read.jdbc(jdbcUrl, "dbo.Product", connectionProperties).show()
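When the path is blocked, the test above can hang for minutes before the generic timeout surfaces. One mitigation is the Microsoft JDBC driver's documented `loginTimeout` connection property (in seconds), which makes the connection fail fast. The helper and names below are illustrative, not an official API:

```python
def build_jdbc_url(server: str, database: str, login_timeout_s: int = 15) -> str:
    """Build an Azure SQL JDBC URL that fails fast on an unreachable server.

    loginTimeout is a documented Microsoft JDBC driver property; without it,
    a blocked port can hang until the OS-level TCP timeout fires.
    """
    return (
        f"jdbc:sqlserver://{server}.database.windows.net:1433;"
        f"database={database};loginTimeout={login_timeout_s}"
    )

# jdbcUrl = build_jdbc_url("<server-name>", "<database-name>")  # placeholders
```

A 15-second failure instead of a multi-minute hang shortens each debugging iteration considerably.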

How Senior Engineers Fix It

Experienced engineers validate the network path first, not the SQL credentials. They check:

  • SQL Server Firewall
    • Add Databricks outbound IPs to the SQL firewall
    • OR enable Allow Azure services
  • Region Alignment
    • Ensure Databricks and SQL are in the same region
  • Private Endpoint Setup
    • If using private endpoints, ensure:
      • VNet peering is configured
      • DNS resolution points to the private endpoint
  • NSG Rules
    • Confirm outbound port 1433 is open
  • Connectivity Tests
    • Run nc -zv <server> 1433 from a cluster node (e.g., a notebook %sh cell or an init script)
    • Use Azure Network Watcher “Connection Troubleshoot”

Key takeaway: Senior engineers treat this as a network reachability problem, not a SQL problem.

Why Juniors Miss It

Junior engineers often focus on credentials and SQL objects, not the underlying network path. They typically:

  • Assume “login works locally, so it must work from Databricks”
  • Do not know Databricks uses ephemeral outbound IPs
  • Forget to check firewall rules or private endpoint DNS
  • Misinterpret the generic TCP/IP error as a SQL configuration issue

They troubleshoot the wrong layer, which delays resolution.
