Fixing GDAL/PROJ Precision Loss Due to Missing proj-data

Summary

MaxRev.Gdal packages ship a bundled proj.db but omit the proj-data directory that contains the datum grid shift files.
When PROJ cannot locate these grids, transformations that would normally use them fall back to less accurate operations. The resulting errors are small, but large enough to fail precision checks in GDAL/PROJ tests.

Root Cause

  • Missing data path: proj-data is not included in the NuGet package, so PROJ only has proj.db.
  • Environment configuration: Setting PROJ_LIB or PROJ_DATA from application code may never reach PROJ. On Unix-like systems, .NET's Environment.SetEnvironmentVariable does not change the environment that native libraries see, and search paths registered explicitly through the API take precedence over the environment variable.
  • Built‑in search order: PROJ checks search paths set through its API first, then the PROJ_DATA environment variable (PROJ_LIB before PROJ 9.1), then a share/proj directory relative to the library, and finally the data path compiled into the build.
  • Grid file name mismatch: Even after placing proj-data next to proj.db, PROJ will not use a grid whose file name differs from the one the bundled proj.db lists, which can happen when the grid set comes from a different PROJ release.
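
A quick way to confirm the first root cause is to inspect the candidate directories directly. The sketch below is a minimal, GDAL-independent probe: it reports whether each directory holds proj.db and how many GeoTIFF grids (the format PROJ 7 and later use) sit beside it. The class name ProjDataProbe and the proj-data folder next to the application are assumptions for illustration. Note that it reads the environment as managed code sees it, which is not necessarily what the native PROJ library sees.

```csharp
using System;
using System.IO;
using System.Linq;

public static class ProjDataProbe
{
    // Summarizes what a candidate PROJ data directory actually contains.
    public static string Describe(string directory)
    {
        if (!Directory.Exists(directory))
            return $"{directory}: missing";

        bool hasDb = File.Exists(Path.Combine(directory, "proj.db"));
        // PROJ 7 and later ship datum grids as GeoTIFF (.tif) files.
        int grids = Directory.EnumerateFiles(directory, "*.tif").Count();
        return $"{directory}: proj.db={(hasDb ? "yes" : "no")}, tif grids={grids}";
    }

    public static void Main()
    {
        // PROJ_DATA is read by PROJ 9.1 and later, PROJ_LIB by earlier releases.
        foreach (string name in new[] { "PROJ_DATA", "PROJ_LIB" })
        {
            var value = Environment.GetEnvironmentVariable(name);
            Console.WriteLine($"{name} = {value ?? "(not set)"}");
            if (string.IsNullOrEmpty(value)) continue;
            foreach (string dir in value.Split(Path.PathSeparator))
                Console.WriteLine("  " + Describe(dir));
        }

        // Hypothetical layout: a proj-data folder next to the application.
        Console.WriteLine(Describe(Path.Combine(AppContext.BaseDirectory, "proj-data")));
    }
}
```

A directory that reports proj.db=yes with zero grids matches the situation described above: the database is present but the grid files are not.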

Why This Happens in Real Systems

  • Package boundaries: Commercial or embedded builds often strip optional data to reduce size.
  • Runtime environment isolation: Docker containers or isolated runtimes may not expose the host’s data paths.
  • Dependency mismatch: A library may bundle a stale proj.db that references older grid files, while the user supplies a newer proj-data set.

Real-World Impact

  • Geospatial accuracy loss: Transformations can drift by centimeters to meters, unacceptable for surveying, navigation, or cadastral applications.
  • Reproducibility issues: Tests that pass on a developer’s machine may fail in CI/CD pipelines where the environment differs.
  • Operational risk: Mapping software shipped to end users may deliver incorrect positions, leading to legal or safety incidents.
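
To turn "centimeters to meters" into a number a test can assert on, a coordinate difference in degrees can be converted to an approximate ground distance. The helper below is a rough sketch using a spherical approximation (about 111,320 m per degree of latitude); the name ApproxOffsetMeters is hypothetical, and the result is only suitable for tolerance checks, not for geodetic work.

```csharp
using System;

public static class DriftCheck
{
    // Rough length of one degree of latitude in meters (spherical approximation).
    private const double MetersPerDegree = 111_320.0;

    // Approximate ground distance for a small latitude/longitude difference.
    public static double ApproxOffsetMeters(double latitudeDeg, double dLatDeg, double dLonDeg)
    {
        double north = dLatDeg * MetersPerDegree;
        // A degree of longitude shrinks with the cosine of the latitude.
        double east = dLonDeg * MetersPerDegree * Math.Cos(latitudeDeg * Math.PI / 180.0);
        return Math.Sqrt(north * north + east * east);
    }

    public static void Main()
    {
        // A latitude difference of 0.00001 degrees is roughly 1.1 m on the ground.
        Console.WriteLine(ApproxOffsetMeters(40.0, 0.00001, 0.0).ToString("F2"));
    }
}
```

Comparing a transformed point against a trusted reference value with a tolerance expressed in meters makes a missing grid show up as an explicit test failure in CI rather than as unexplained drift.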

Example or Code

// Example: attempting to override the PROJ data path in code
Environment.SetEnvironmentVariable("PROJ_LIB", @"C:\Data\proj-data");
Environment.SetEnvironmentVariable("PROJ_DATA", @"C:\Data\proj-data");

// This may have no effect: if configuration registers its own search paths,
// they take precedence over the environment, and on Unix-like systems .NET
// does not pass these variables on to native libraries.
GdalBootstrapper.Configure(); // PROJ is loaded here

How Senior Engineers Fix It

  • Verify search path

    1. Call OSGeo.OSR.Osr.GetPROJSearchPaths() after GdalBootstrapper.Configure() and print the returned directories.
    2. Ensure the proj-data directory appears in the printed list.
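  • Register the PROJ search paths explicitly

    GdalBootstrapper.Configure(); // loads the native libraries first
    string projData = Path.Combine(AppContext.BaseDirectory, "proj-data");
    // Keep the existing paths (they hold proj.db) and append the grid directory. Requires System.Linq.
    string[] paths = OSGeo.OSR.Osr.GetPROJSearchPaths().Append(projData).ToArray();
    OSGeo.OSR.Osr.SetPROJSearchPaths(paths);

    This registers the directory through GDAL's OSR API, the route GDAL provides for setting PROJ's search paths, rather than relying on a GDAL config option. Explicitly registered paths are consulted before the PROJ_DATA/PROJ_LIB environment variables.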

  • Bundle a proj.db that matches the grid files you ship.
    proj.db records the grid file names it expects in its grid_alternatives table, so take proj.db and proj-data from matching PROJ releases (PROJ 7 and later read the GeoTIFF grids distributed in the PROJ-data package).

  • Set environment variables in the host environment (CI, Dockerfile, service startup script) instead of in application code.

    ENV PROJ_DATA=/opt/proj-data
    ENV PROJ_LIB=/opt/proj-data
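  • Check that the grid a transformation needs is reachable, and log where it was found. The projinfo command-line tool reports grids it cannot find (for example, projinfo -s EPSG:4267 -t EPSG:4269); from C#, a file check against the search paths is a practical stand-in.

    string grid = "us_noaa_conus.tif"; // NADCON grid for NAD27 -> NAD83 (conterminous US)
    string dir = OSGeo.OSR.Osr.GetPROJSearchPaths().FirstOrDefault(d => File.Exists(Path.Combine(d, grid)));
    Console.WriteLine(dir is null ? $"Grid not found: {grid}" : $"Using grid: {Path.Combine(dir, grid)}");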
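
The requirement that proj.db agree with the shipped grid files can also be checked mechanically. The sketch below assumes the Microsoft.Data.Sqlite NuGet package is referenced and that proj.db exposes a grid_alternatives table with a proj_grid_name column; the class name GridInventory and its command-line arguments are hypothetical. It lists every grid name recorded in proj.db that is absent from a given directory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.Data.Sqlite; // assumption: the Microsoft.Data.Sqlite package is referenced

public static class GridInventory
{
    // Returns the grid file names listed in proj.db that are absent from gridDirectory.
    public static List<string> MissingGrids(string projDbPath, string gridDirectory)
    {
        var missing = new List<string>();
        using var connection = new SqliteConnection($"Data Source={projDbPath};Mode=ReadOnly");
        connection.Open();
        using var command = connection.CreateCommand();
        // Assumed schema: grid_alternatives.proj_grid_name holds the file name PROJ looks for.
        command.CommandText = "SELECT DISTINCT proj_grid_name FROM grid_alternatives ORDER BY 1";
        using var reader = command.ExecuteReader();
        while (reader.Read())
        {
            if (reader.IsDBNull(0)) continue;
            string name = reader.GetString(0);
            if (!File.Exists(Path.Combine(gridDirectory, name)))
                missing.Add(name);
        }
        return missing;
    }

    public static void Main(string[] args)
    {
        // Hypothetical usage: GridInventory <path-to-proj.db> <path-to-proj-data>
        List<string> missing = MissingGrids(args[0], args[1]);
        Console.WriteLine($"{missing.Count} grids listed in proj.db are not in {args[1]}");
        foreach (string name in missing) Console.WriteLine("  " + name);
    }
}
```

A non-empty result is normal when only a regional subset of grids is shipped; what matters is that the grids your own transformations rely on are not in the list.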

Why Juniors Miss It

  • Assuming bundled data is complete: They think proj.db alone suffices.
  • Overlooking static initialization: They set environment variables after the library loads.
  • Ignoring PROJ’s search algorithm: Unfamiliar with the documented search order, they assume all data will be found automatically.
  • Dismissing small discrepancies: They neglect grid files because the resulting test failures are subtle and easy to overlook in quick development cycles.

By configuring the PROJ search path correctly, verifying the effective data directories, and ensuring that proj.db matches the grid files, senior engineers keep geospatial transformations accurate and reproducible in production systems.
