Summary
When Robocopy runs with /UNICODE, it writes its status output as UTF-16LE, which causes problems when that output is redirected or piped in PowerShell scripts. Changing the console encoding or using Set-Content -Encoding utf8 still produces gibberish, because the bytes are decoded with an encoding that does not match what Robocopy wrote.
Root Cause
- With /UNICODE, Robocopy writes UTF-16LE to standard output regardless of system settings.
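- PowerShell generally treats native-command output sent through redirection (>) or pipes (|) as text: it decodes the bytes with [Console]::OutputEncoding, which is not UTF-16LE by default, and re-encodes them on output (UTF-8 by default in PowerShell 7), so the UTF-16LE bytes are misread.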
Why This Happens in Real Systems
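- Encoding mismatch: UTF-16LE uses two bytes per character, so ASCII text carries a 0x00 byte after every character. A decoder expecting UTF-8 reads each of those bytes as a NUL character instead of reassembling the text.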
- Redirection limitations: PowerShell's > and | operators do not preserve UTF-16LE encoding.
Real-World Impact
- Data corruption: Log files or displayed output appear as gibberish.
- Pipeline failures: Filtering or processing output fails due to invalid characters.
- Debugging challenges: Real-time logging and monitoring become unreliable.
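The first two effects can be reproduced without running Robocopy at all. The following is a minimal sketch, assuming the consumer decodes the bytes as UTF-8; the sample line is hypothetical and only stands in for real Robocopy output.

```powershell
# Hypothetical sample line standing in for one line of Robocopy output.
$line = 'Files : 3'

# UTF-16LE stores each of these ASCII characters as two bytes, the second being 0x00.
$bytes = [System.Text.Encoding]::Unicode.GetBytes($line)

# Assumption: the reader decodes as UTF-8. Every 0x00 byte becomes a NUL character.
$misread = [System.Text.Encoding]::UTF8.GetString($bytes)

# Decoding with the matching encoding restores the original text.
$decoded = [System.Text.Encoding]::Unicode.GetString($bytes)

# The misread string is twice as long, and a plain filter no longer matches it.
$misread.Length              # 18
$misread -match 'Files'      # False
$decoded -match 'Files'      # True
```

The interleaved NUL characters are why the text looks like gibberish or oddly spaced letters, and why a filter such as Select-String or -match can miss lines that visibly contain the search term.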
Example or Code
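$logPath = "$env:USERPROFILE\robocopy-log.txt"
# Start-Process has no -Encoding parameter; the redirected file is expected to receive Robocopy's UTF-16LE bytes unchanged. -PassThru is needed for $process to be populated.
$process = Start-Process -FilePath "robocopy.exe" -ArgumentList @("$env:PUBLIC", "$env:TEMP", "/E", "/L", "/UNICODE") -RedirectStandardOutput $logPath -NoNewWindow -Wait -PassThru
Get-Content -Path $logPath -Encoding Unicode   # decode the log as UTF-16LE when reading it back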
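If downstream tools expect UTF-8, a log captured as UTF-16LE can be converted after the fact. This is a minimal sketch; Convert-Utf16LogToUtf8 is a hypothetical helper name, not part of PowerShell or Robocopy, and it assumes the input file really is UTF-16LE.

```powershell
# Hypothetical helper: re-encodes a UTF-16LE log file as UTF-8 without a BOM.
function Convert-Utf16LogToUtf8 {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Destination
    )
    # Decode the raw bytes as UTF-16LE (a BOM, if present, is detected and skipped).
    $text = [System.IO.File]::ReadAllText($Path, [System.Text.Encoding]::Unicode)
    # Write UTF-8 without a BOM so other tools see plain UTF-8.
    $utf8NoBom = New-Object System.Text.UTF8Encoding($false)
    [System.IO.File]::WriteAllText($Destination, $text, $utf8NoBom)
}
```

A call might look like Convert-Utf16LogToUtf8 -Path $logPath -Destination "$env:USERPROFILE\robocopy-log.utf8.txt". Full paths are safer here, because the .NET file APIs resolve relative paths against the process working directory rather than the current PowerShell location.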
How Senior Engineers Fix It
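- Use Start-Process with -RedirectStandardOutput so that Robocopy's UTF-16LE bytes are written to the log file unchanged. Start-Process has no -Encoding parameter, so apply -Encoding Unicode when reading the log back, for example with Get-Content.
- Avoid redirection (>) or pipes (|) for Robocopy output.
- Explicitly handle encoding in logging:
Out-File -FilePath $logPath -Encoding Unicode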
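Once the text has been decoded correctly, writing it with an explicit encoding keeps the result predictable across PowerShell versions. The sketch below uses hypothetical sample lines and a hypothetical temp-file name, and checks the file's first two bytes to confirm what Out-File produced.

```powershell
# Hypothetical, already-decoded lines standing in for Robocopy output.
$lines = @('Dirs : 1', 'Files : 3')

# Hypothetical output path in the temp directory.
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'robocopy-filtered.txt'

# Filter in PowerShell, then state the output encoding explicitly.
$lines | Where-Object { $_ -match 'Files' } | Out-File -FilePath $outPath -Encoding Unicode

# A UTF-16LE file written by Out-File starts with the byte order mark FF FE.
$bom = [System.IO.File]::ReadAllBytes($outPath)[0..1]
($bom | ForEach-Object { $_.ToString('X2') }) -join ' '   # FF FE

Remove-Item $outPath
```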
Why Juniors Miss It
- Assumption of default encoding: Juniors often assume Robocopy uses system default encoding.
- Overlooking PowerShell limitations: Lack of awareness about PowerShell’s UTF-8 default for redirection.
- Improper use of Set-Content: Assuming that Set-Content -Encoding utf8 fixes UTF-16LE data, when it only re-encodes text that has already been decoded incorrectly.
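The Set-Content pitfall can be checked directly. This sketch assumes the text was already misread as UTF-8 before being saved; the sample line and file name are hypothetical.

```powershell
# Simulate UTF-16LE output that was decoded with the wrong encoding (assumed UTF-8).
$bytes   = [System.Text.Encoding]::Unicode.GetBytes('Files : 3')
$misread = [System.Text.Encoding]::UTF8.GetString($bytes)

# Hypothetical output path in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'robocopy-misread.txt'

# Saving as UTF-8 re-encodes the damaged string; it does not undo the bad decode.
Set-Content -Path $path -Value $misread -Encoding utf8

# The NUL characters survive as 0x00 bytes in the saved file.
$nulCount = @([System.IO.File]::ReadAllBytes($path) | Where-Object { $_ -eq 0 }).Count
$nulCount   # 9

Remove-Item $path
```

The encoding has to be right at the point where the bytes are first decoded; choosing an encoding at write time comes too late.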