Milvus standalone doesn’t create configured MinIO bucket after data write

Summary

Milvus standalone relies on the MinIO SDK to create its configured bucket when the bucket does not already exist. The CreateBucket call is always available on a MinIO server; what varies is whether the credentials Milvus uses are allowed to call it. Many MinIO deployments, especially hardened or custom installations, restrict bucket creation for security and governance reasons. If the Milvus access key lacks that privilege (for example, because of the policy attached with mc admin policy), writes can appear to succeed at the database level while nothing shows up in the intended bucket. The data may sit somewhere else (such as a local data directory) or may never be persisted, which risks data loss.

Root Cause

The core issue stems from a mismatch between Milvus’s bucket-creation expectation and MinIO’s default security posture:

  • Milvus is designed to auto-create the bucket if it doesn’t exist.
  • This auto-creation relies on the MinIO server’s CreateBucket API endpoint.
  • By default, MinIO restricts bucket creation to the root user or explicitly authorized accounts.
  • If the access key/secret key provided in milvus.yaml does not have the s3:CreateBucket permission, MinIO rejects the request with HTTP 403 (AccessDenied). This is an explicit API response, not a network failure, but it may not surface clearly in the Milvus logs.
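
The permission mismatch described above can be checked offline against a policy document. The following is a minimal sketch, not MinIO's actual evaluation logic: it assumes a simplified model in which any matching Allow grants access and any matching Deny overrides it, and it ignores Condition, NotAction, NotResource, and Principal. The function name allows and the sample policy are hypothetical.

```python
import fnmatch


def _as_list(value):
    """Policy fields may hold a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def allows(policy: dict, action: str, resource: str) -> bool:
    """Simplified check: a matching Allow grants access, a matching Deny wins."""
    allowed = False
    for stmt in policy.get("Statement", []):
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        # Action names are compared case-insensitively; "*" acts as a wildcard.
        action_hit = any(
            fnmatch.fnmatchcase(action.lower(), a.lower()) for a in actions
        )
        resource_hit = any(fnmatch.fnmatchcase(resource, r) for r in resources)
        if not (action_hit and resource_hit):
            continue
        if stmt.get("Effect") == "Deny":
            return False
        if stmt.get("Effect") == "Allow":
            allowed = True
    return allowed


# Hypothetical least-privilege policy: object read/write only, no bucket creation.
LIMITED_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject"],
            "Resource": ["arn:aws:s3:::a-bucket/*"],
        }
    ],
}

BUCKET_ARN = "arn:aws:s3:::a-bucket"
print(allows(LIMITED_POLICY, "s3:PutObject", BUCKET_ARN + "/segment-1"))
print(allows(LIMITED_POLICY, "s3:CreateBucket", BUCKET_ARN))
```

With a policy like this, object writes are permitted once the bucket exists, but the bucket-creation request itself is refused.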

Why This Happens in Real Systems

In production environments, MinIO is often deployed with least-privilege principles to prevent accidental bucket sprawl. This leads to:

  • Limited IAM policies: Service accounts (e.g., the one used by Milvus) are granted only s3:PutObject and s3:GetObject but not s3:CreateBucket.
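  • Standalone vs. Distributed MinIO: The deployment mode itself does not change the permission model. What usually differs between environments is the user and policy Milvus is given: a local test setup often runs with root credentials, while a shared or managed cluster hands out a restricted account.
  • Silent Failures: MinIO answers a denied request with 403 Forbidden and the error code AccessDenied. Milvus might log this as a non-critical warning, which is easy to miss, especially if log.level is not set to debug.
  • Data in an Unexpected Location: Segments may end up on the local file system (for example, under /data/milvus) instead of the remote bucket, for instance when common.storageType is set to local. This causes confusion because the data exists, just not in the expected bucket.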
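
To make the 403 case less opaque, it helps to read the error code in the response body rather than the HTTP status alone. The sketch below assumes the standard S3 error document shape (an Error element with Code and Message children). The hint texts and function names are illustrative, not part of Milvus or MinIO.

```python
import xml.etree.ElementTree as ET

# Common S3 error codes and what they usually mean for this problem.
HINTS = {
    "AccessDenied": "credentials are valid but the policy does not allow this action",
    "InvalidAccessKeyId": "the access key is unknown to the server",
    "SignatureDoesNotMatch": "the secret key is wrong",
    "NoSuchBucket": "the bucket does not exist (and was not created)",
    "BucketAlreadyOwnedByYou": "the bucket already exists; usually harmless",
}


def s3_error_code(body: str) -> str:
    """Extract <Code> from an S3 error document; return "" if it cannot be parsed."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return ""
    for child in root:
        # Strip an XML namespace prefix such as "{...}Code" if one is present.
        if child.tag.rsplit("}", 1)[-1] == "Code":
            return (child.text or "").strip()
    return ""


def explain(status: int, body: str) -> str:
    """Turn an HTTP status plus error body into a one-line diagnosis."""
    code = s3_error_code(body)
    hint = HINTS.get(code, "unrecognized error; inspect the full response")
    return f"HTTP {status} {code or 'unknown'}: {hint}"


# Illustrative response body in the standard S3 error format.
SAMPLE = "<Error><Code>AccessDenied</Code><Message>Access Denied.</Message></Error>"
print(explain(403, SAMPLE))
```

Separating AccessDenied from InvalidAccessKeyId and SignatureDoesNotMatch distinguishes a policy problem from a wrong key pair, which otherwise look similar from the Milvus side.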

Real-World Impact

  • Data Inconsistency: Write operations appear successful, but query results are empty or inconsistent because data isn’t persisted in the correct bucket.
  • Operational Blindness: Without clear errors in Milvus logs (which often only log S3 errors at WARN level), engineers assume configuration is correct.
  • Scaling Issues: When Milvus is scaled horizontally (with multiple query nodes or data nodes), the issue compounds because each node may attempt to create the bucket, leading to race conditions or repeated failed attempts.
  • Debugging Overhead: Engineers waste hours checking network connectivity, DNS resolution, and TLS settings when the root cause is a permissions issue.

Example or Code

This is a configuration and permissions problem rather than a code defect, so there is no code to fix. The following commands help with validation and troubleshooting:

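  1. Verify MinIO connectivity and check whether the bucket exists (using root or admin credentials):

    mc alias set myminio http://localhost:9000 MINIO_ROOT_USER MINIO_ROOT_PASSWORD
    mc ls myminio/  # Check if the bucket exists
    mc admin info myminio  # Verify MinIO server status

  2. Test bucket creation with the Milvus service account:

    # Use a separate alias (the name milvus-sa is arbitrary) so the test runs
    # with the Milvus credentials rather than the root credentials above
    mc alias set milvus-sa http://localhost:9000 <milvus_access_key> <milvus_secret_key>
    mc mb milvus-sa/a-bucket  # Attempt to create bucket

    • If this fails with Access Denied, the Milvus credentials lack permissions.

  3. Check Milvus logs for S3 errors (set log level to debug in milvus.yaml):

    grep -i "s3\|minio\|bucket" /var/log/milvus/milvus.log

    • Look for bucket-creation failures or AccessDenied errors. Adjust the log path to match your installation.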
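
When the log is large, counting error codes can be quicker than reading grep output. The following sketch mirrors the grep filter above and tallies a few well-known S3 error codes. The sample lines are made up for illustration and do not reproduce the real Milvus log format.

```python
import re
from collections import Counter

# S3 error codes worth counting; extend as needed.
ERROR_CODES = (
    "AccessDenied",
    "NoSuchBucket",
    "InvalidAccessKeyId",
    "SignatureDoesNotMatch",
)
# Same keywords as the grep command, plus the error codes themselves.
KEYWORDS = re.compile("|".join(["s3", "minio", "bucket", *ERROR_CODES]), re.IGNORECASE)


def scan(lines):
    """Return (matching_lines, Counter of S3 error codes) for an iterable of log lines."""
    hits, codes = [], Counter()
    for line in lines:
        if not KEYWORDS.search(line):
            continue
        hits.append(line.rstrip("\n"))
        for code in ERROR_CODES:
            if code in line:
                codes[code] += 1
    return hits, codes


# Made-up lines for illustration only; real Milvus log lines look different.
SAMPLE_LOG = [
    "WARN storage: bucket check failed: AccessDenied\n",
    "INFO proxy: insert ok\n",
    "WARN storage: put object failed: NoSuchBucket\n",
]

hits, codes = scan(SAMPLE_LOG)
print(len(hits), dict(codes))
```

To scan a real file, pass an open file handle to scan, for example the log path used in step 3. A high AccessDenied count points at the policy rather than at connectivity.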

How Senior Engineers Fix It

  1. Grant Explicit Permissions:

    • Log into MinIO as the root user (or an admin account) using mc.
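    • Attach a policy to the Milvus service account that allows s3:CreateBucket. The built-in readwrite policy does this, although it grants every S3 action on every bucket:
      mc admin policy attach myminio readwrite --user <milvus_access_key>
    • Alternatively, create a custom policy scoped to the one bucket: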
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "s3:CreateBucket",
              "s3:PutObject",
              "s3:GetObject",
              "s3:ListBucket"
            ],
            "Resource": ["arn:aws:s3:::a-bucket", "arn:aws:s3:::a-bucket/*"]
          }
        ]
      }

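      Save the JSON as policy.json, then apply it: mc admin policy create myminio milvus-policy policy.json, followed by mc admin policy attach myminio milvus-policy --user <milvus_access_key>. Milvus also deletes objects during compaction and garbage collection, so add s3:DeleteObject to the action list if deletions are denied.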

  2. Pre-create the Bucket Manually:

    • Create the bucket with mc mb myminio/a-bucket before starting Milvus. This is a common production practice because Milvus then needs no bucket-creation permission at all.
  3. Verify Milvus Configuration:

    • Ensure milvus.yaml has:
      minio:
        address: "localhost"
        port: 9000
        accessKeyID: "your_access_key"
        secretAccessKey: "your_secret_key"
        bucketName: "a-bucket"
        useSSL: false  # Set to true if MinIO is served over TLS
        useIAM: false  # Set to true only when authenticating with IAM roles instead of keys
    • Set log.level: "debug" temporarily to capture S3 errors, and revert it once the issue is resolved.
  4. Restart and Validate:

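    • Restart Milvus: systemctl restart milvus (or restart the container if Milvus runs under Docker).
    • Write data and check the MinIO console for the bucket and objects. Milvus persists segments to object storage only when they are flushed, so trigger a flush or wait before concluding that nothing was written.
    • If data still doesn’t appear, check MinIO server logs with mc admin logs myminio, or watch incoming API calls with mc admin trace myminio.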
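
The custom policy from step 1 can also be generated rather than hand-edited, which avoids typos in the bucket ARN. This is a minimal sketch: the function name build_policy and the allow_create switch are hypothetical, and the action list simply mirrors the policy shown above.

```python
import json


def build_policy(bucket: str, allow_create: bool = True) -> dict:
    """Build a bucket-scoped policy like the custom one in step 1."""
    actions = ["s3:PutObject", "s3:GetObject", "s3:ListBucket"]
    if allow_create:
        # Leave this out when an administrator pre-creates the bucket (step 2).
        actions.insert(0, "s3:CreateBucket")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }


policy = build_policy("a-bucket")
print(json.dumps(policy, indent=2))
```

Redirect the output to policy.json and register it with the mc admin policy create command from step 1. Passing allow_create=False yields the narrower policy that fits the pre-created bucket approach.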

Why Juniors Miss It

  • Over-reliance on Documentation: Juniors often follow the Milvus documentation but assume MinIO permissions are open by default, not realizing MinIO’s security-first design.
  • Log Interpretation: They may not scrutinize Milvus logs for subtle S3-related warnings (e.g., AccessDenied or NoSuchBucket), assuming “no error” means success.
  • Configuration Assumptions: A wrong bucket name or MinIO endpoint produces different errors and draws attention to those settings, while the underlying problem is often permissions.
  • Lack of S3 Protocol Knowledge: Understanding that S3 APIs like CreateBucket require specific permissions is often overlooked in favor of focusing on Milvus-specific settings.
  • Testing in Isolation: They might test with a root MinIO user (where bucket creation works) but fail to validate with the actual service account credentials used in production.