On Windows the ancient
ANSI APIs are generally constrained in the length of
file paths to ~260 characters. In the
Windows.h header this limit is defined
So how does this manifest itself? Usually as a
file not found error as the
path is silently truncated to this maximum when these APIs are called. See for
example how GHC
8.0.1 handles long paths:
> C:\ghc\ghc-8.0.1\bin\ghc.exe .\hello.hs -o "R:\<very long path>\a.exe" -fforce-recomp <command line>: error: directory portion of "R:\\<very long path>\\a.exe" does not exist (used with "-o" option.)
When Microsoft introduced the
Wide versions of the
Win32 APIs they added the
ability to bypass this limit (and other naming restrictions) by using an extended
format for paths.
Windows PATHs are categorized in
Namespaces and the namespaces we’re interested
in using are the
Win32 device and
Win32 file namespaces.
Paths using these namespaces start with
\\?\ respectively. These
are the paths you see when you look at the Windows mount table (using the
device namespace is used when you require raw/direct access to a device, e.g.
file namespace is used for normal filesystem access.
are accessible as
\\?\UNC\. More at MSDN.
Using these namespaces bypass all API level limitations as the path is passed directly to the
underlying filesystem driver. This lifts the filepath length limit to about
There is however one downside. Since the Win32 API no longer processes the paths, having a
/ in a path is invalid along with relative paths and special components such as
.. in paths.
Starting with GHC
8.6 I added code into GHC that wraps the usage of the deprecated POSIX
APIs with their modern equivalent. This allowed Haskell programs like
to be able to use these longer paths. The code added to GHC does some processing to ensure that any paths it gives to the filesystem is valid. This processing is disabled if a
namespaced path is used as an input.
So what happens when we run GHC
8.6.x on a long path then?
> ghc .\hello.hs -o "R:\<very long path>\a.exe" -fforce-recomp [1 of 1] Compiling Main ( hello.hs, hello.o ) Linking R:\<very long path>\a.exe ... R:\ghc-8.6.4\lib\../mingw/bin/windres.exe: can't open file `R:\<very long path>\a.exe.manifest': No such file or directory `windres.exe' failed in phase `Windres'. (Exit code: 1)
Still crashed, not quite what was expected…
The problem is that while GHC now understands long paths, the underlying toolchain that we use does not. More specifically GCC and Binutils do not support the long paths.
Unfortunately patching GCC and Binutils to support these newer APIs pose both technical and
legal challenges. These projects do not centralize the way they open files. While
exists and is in use for some paths, most places just do
For the longest time the tickets and frustrations of Windows users have been piling up with the
"upstream". Others have devised workaround such as using
DOS 8.3 shortnames, however these
have multiple issues:
- They’re a performance bottleneck. You do extra writes in your
MFTfor each entry and the algorithm to find a unique shortname does a linear probe.
- They can be disabled, and then things won’t work.
- A shortname is extra data to an existing path, which means you can’t have a shortname to a non-existent file.
So what can we do about it without changing the source? We can change the binaries!
Essentially what we’re going to do is replace every call to certain functions in GCC and Binutils with new and improved ones. There are three ways this “hooking” can be done:
- Dynamic: The program is created suspended, the headers of the program is modified in memory and addresses for the functions we want to hook will be overridden by the new ones.
- Static: The header is modified directly in the file on disk.
- Using a local redirect: Windows has a feature where a
.localforwarder file can be used to redirect access to a DLL with one that’s local (in a subfolder of the local file). This won’t work as certain system DLLs such as
msvcrt.dllare exempt from being redirected.
Dynamic is nice as it makes no lasting changes to the binary, but it’s also slower and requires changes to
Static on the other hand will have no performance overhead but if the binaries are signed then the signatures would be invalidated.
Luckily the binaries are not signed, so I went with the
Essentially the game plan is this, in every binary in the
mingw folder of
we do the following:
Find the IAT (
Import Address Table) entry for
msvcrt.dll (the Microsoft C runtime) and replace these with a new runtime
phxcrt.dll. That name was chosen as it has the same length as the original file we’re replacing. After the replacement is done the
PE header checksum is re-calculated and the new values written out.
phxcrt.dll is a thin
forwarder only DLL. It uses a little known feature of the Windows system loader to allow one to forward a call to another DLL. The final address is the one that the loader will write back into the
import descriptor, so this cost is small and paid only once. (Note that this does require strict binding of functions, but lazy binding can also be overridden but by a different mechanism.)
phxcrt.dll will forward the calls we don’t care about to
msvcrt.dll and the ones we do will be forwarded to
muxcrt.dll (I’ll leave the details of this to another post).
So let’s see it in action:
> rm -rf /r/ghc-8.6.4/ > tar xJf ghc-8.6.4-x86_64-unknown-mingw32.tar.xz > make gcc -O2 -DUNICODE -D__USE_MINGW_ANSI_STDIO=1 -std=c99 -municode -DWINVER=0x0601 -D_WIN32_WINNT=0x0601 fs.c iat-patcher.c -o iat-patcher.exe -ldbghelp -limagehlp gcc -shared phxcrt.c fs.c -o x86_64/muxcrt.dll gcc -shared x86_64/msvcrt.def -o x86_64/phxcrt.dll > Import-Module .\Patch-GHC.psm1 -Force; Use-PatchGHC ghc "iat-patcher.exe" "ghc-jailbreak\x86_64" $true Done patching GHC's Mingw-w64 distribution. Good to go.
At this point GCC etc. should be good to go! Let’s try out that command from before:
> ghc .\hello.hs -o "R:\<very long path>\a.exe" -fforce-recomp [1 of 1] Compiling Main ( hello.hs, hello.o ) Linking R:\<very long path>\a.exe ... > /r/<very long path>/a.exe Hello World
Success! To check if as expected the redirects are cheap we inspect the API trace for
And as expected, the
IAT has a mix of
msvcrt.dll but no
phxcrt.dll apart from the initialization calls.
The patcher is currently available for wide testing and can be installed through
choco install ghc-jailbreak
and you’re good to go! To undo the patching just uninstall the patcher using
choco uninstall ghc-jailbreak
This will undo the changes to all the binaries.
For the full source visit the project GitHub.