How to Preserve Duplicate Tags When Converting WAV to FLAC with ffmpeg

Summary

ffmpeg drops duplicate metadata fields when converting WAV (RIFF ID3v2.3) to FLAC. FLAC itself is not the limitation: its Vorbis comment block can hold the same key several times. The loss comes from ffmpeg’s metadata handling in libavformat, which stores tags in a dictionary with one value per key, so repeated keys collapse into a single value before the FLAC muxer ever writes them. This behavior is by design.

Root Cause

  • FLAC’s Vorbis comment spec allows a key to appear multiple times, but ffmpeg’s libavformat normalises metadata into a dictionary (AVDictionary) keyed by string, in which each key normally holds a single value.
  • With -map_metadata, ffmpeg copies the metadata dictionary from the input, which by that point already contains only one value for each duplicated key. Values passed with -metadata go into the same kind of dictionary.
  • The WAV RIFF/ID3v2 metadata format permits duplicate keys and FLAC can store them, but the dictionary that sits between the demuxer and the muxer does not carry them across.

Why This Happens in Real Systems

  • Many audio conversion tools rely on ffmpeg’s default metadata handling, assuming a 1‑to‑1 key/value mapping.
  • The metadata abstraction layer in FFmpeg was built for simplicity and speed, not for preserving tag multiplicity.
  • Real‑world pipelines often ignore the nuance because files in most consumer audio formats (MP3, AAC, etc.) are usually tagged with a single value per field, so the collapse goes unnoticed.

Real-World Impact

  • Loss of genre/sub‑genre, multiple artists, or custom tags during batch conversions, leading to incomplete libraries.
  • Tag editors and players that display every value of a tag (e.g., MusicBrainz Picard) will show missing information, breaking workflows that rely on exact tagging.
  • Automated playlists or recommendation engines that parse multiple values may produce inaccurate results.

Example or Code

ffmpeg -i input.wav -map_metadata 0 -c:a flac output.flac

How Senior Engineers Fix It

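  • Do not rely on repeating the -metadata option for the same key (e.g., -metadata genre=Trance -metadata genre=TBG): both options are written into the same one‑value‑per‑key dictionary, so typically only one of them reaches the output. If the values must pass through ffmpeg, join them into one field with a separator (e.g., -metadata genre="Trance; TBG") and split them again afterwards.
  • Write a pre‑processing script (PowerShell, Python, Bash) that extracts the duplicate keys from the source file and re‑applies them after conversion. Check what ffprobe reports first: it reads tags through the same libavformat dictionary, so it may already show only one value per key, in which case a tag reader that parses the ID3 chunk directly is needed.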
  • Use external tagging tools (e.g., metaflac, kid3) after conversion to re‑inject the lost tags.
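  • Contribute a patch to FFmpeg’s libavformat to carry duplicate keys through as a list. The stream‑specific syntax (-metadata:s:a:0 for the first audio stream; s:v would address a video stream) gives finer control over where a tag is written, but does not by itself allow duplicate keys.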
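
The post‑conversion route can be sketched in Python as follows. This is a minimal sketch under stated assumptions: ffmpeg and metaflac are on PATH, the duplicate values are already known (the tags mapping is supplied by the caller), and metaflac applies its options in the order given. The function names are hypothetical helpers; only the ffmpeg options from the command above and metaflac’s --remove-tag and --set-tag options are used.

```python
import subprocess


def build_ffmpeg_cmd(src, dst):
    """Plain WAV -> FLAC conversion; duplicate tags are restored afterwards."""
    return ["ffmpeg", "-i", src, "-map_metadata", "0", "-c:a", "flac", dst]


def build_metaflac_cmd(flac_path, tags):
    """Build a metaflac call that rewrites each key with all of its values.

    `tags` maps a field name to a list of values,
    e.g. {"GENRE": ["Trance", "TBG"]}.
    """
    cmd = ["metaflac"]
    for name in tags:
        # Remove the single value ffmpeg wrote, so no value ends up twice.
        cmd.append(f"--remove-tag={name}")
    for name, values in tags.items():
        for value in values:
            # One --set-tag per value yields one Vorbis comment per value.
            cmd.append(f"--set-tag={name}={value}")
    cmd.append(flac_path)
    return cmd


def convert_and_retag(src, dst, tags):
    """Run both steps in sequence; raises CalledProcessError on failure."""
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
    subprocess.run(build_metaflac_cmd(dst, tags), check=True)
```

Calling convert_and_retag("input.wav", "output.flac", {"GENRE": ["Trance", "TBG"]}) would convert the file and then write both genre values. Building the argument lists separately from running them keeps the command construction easy to inspect and test without touching any audio files.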

Why Juniors Miss It

  • They assume metadata is a simple key/value store, overlooking the difference between container specifications.
  • They usually rely on the default -map_metadata behaviour without verifying the output, so they miss the fact that ffmpeg silently collapses duplicates.
  • They lack experience with post‑conversion tagging tools and do not yet have the habit of validating tags after batch operations.
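
Validating tags after a batch run can be automated. The sketch below, in Python, assumes the tags of the converted file have been dumped as NAME=VALUE lines, for example with metaflac --export-tags-to=- output.flac, and that each value fits on one line. The function names and sample values are hypothetical.

```python
from collections import defaultdict


def parse_vorbis_comments(text):
    """Parse NAME=VALUE lines into {NAME: [values...]}.

    Field names are upper-cased because Vorbis comment names are
    case-insensitive.
    """
    tags = defaultdict(list)
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)
        tags[name.upper()].append(value)
    return dict(tags)


def missing_values(expected, actual):
    """Return {NAME: [values]} for expected values absent from the output."""
    missing = {}
    for name, values in expected.items():
        present = actual.get(name.upper(), [])
        lost = [v for v in values if v not in present]
        if lost:
            missing[name.upper()] = lost
    return missing


# Hypothetical dump of a converted file that kept only one genre.
exported = "ARTIST=Example Artist\nGENRE=TBG\n"
report = missing_values({"genre": ["Trance", "TBG"]}, parse_vorbis_comments(exported))
print(report)
# {'GENRE': ['Trance']}
```

An empty report means every expected value made it into the output; anything else lists exactly which values need to be re‑injected.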

Key takeaway: ffmpeg’s default metadata handling collapses duplicate tags; to retain all values you must explicitly re‑emit each tag or apply a post‑processing tagging step.
