Spark JDBC insert: 1M rows fast, 2M rows extremely slow, 13M rows fast — same code
Summary

We observed a highly non-linear performance anomaly when writing data via Spark JDBC to PostgreSQL. With identical code, a small dataset (1M rows) completed in minutes, a medium dataset (2M rows) hung or stalled for over an hour, and a large dataset (13M rows) resumed normal performance. The root cause was unbounded memory accumulation …
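For orientation, here is a minimal sketch of the kind of Spark JDBC write the summary refers to. It is not the code from the original report: the helper names `jdbc_write_options` and `write_to_postgres`, the default values, and the connection details in the usage note are all assumptions made for illustration. The option keys themselves (`url`, `dbtable`, `user`, `password`, `driver`, `batchsize`, `numPartitions`) are standard Spark JDBC data source options.

```python
from typing import Dict


def jdbc_write_options(url: str, table: str, user: str, password: str,
                       batch_size: int = 1000,
                       num_partitions: int = 8) -> Dict[str, str]:
    """Build the option map for a Spark JDBC write.

    Hypothetical helper: the defaults are placeholders, not tuned values.
    """
    if batch_size <= 0 or num_partitions <= 0:
        raise ValueError("batch_size and num_partitions must be positive")
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
        # Rows sent per JDBC batch on each partition.
        "batchsize": str(batch_size),
        # Upper bound on parallel write connections.
        "numPartitions": str(num_partitions),
    }


def write_to_postgres(df, options: Dict[str, str]) -> None:
    """Append df (expected to be a pyspark.sql.DataFrame) to the target table."""
    df.write.format("jdbc").options(**options).mode("append").save()
```

A call might look like `write_to_postgres(df, jdbc_write_options("jdbc:postgresql://localhost:5432/mydb", "public.events", "spark", "secret"))`, where the URL, table, and credentials are placeholders. Keeping the options in one place makes it easier to run the same write against the 1M, 2M, and 13M row datasets and compare timings with nothing else changed.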