[Support] Avoid calling CommandLineToArgvW from shell32.dll
author Reid Kleckner <rnk@google.com>
Tue, 11 Sep 2018 20:22:39 +0000 (20:22 +0000)
committer Reid Kleckner <rnk@google.com>
Tue, 11 Sep 2018 20:22:39 +0000 (20:22 +0000)
commit f6968d886a4f2875d15c51d4f86754ebb1084936
tree f58bcdbd0332981b6d75efaa203b93ac2c3ac8d8
parent 7ce60324321a34c49aaf4f540038c6184253502c

Summary:
Shell32.dll depends on gdi32.dll and user32.dll, which mostly provide
Windows GUI functionality. LLVM's utilities don't typically need GUI
functionality, and loading these DLLs seems to slow down startup.
Also, we already have an implementation of Windows command line
tokenization in cl::TokenizeWindowsCommandLine, so we can just use it.

The goal is to get the original argv in UTF-8, so that it can pass
through most LLVM string APIs. A Windows process starts life with a
UTF-16 string for its command line, and it can be retrieved with
GetCommandLineW from kernel32.dll.
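
For illustration only, the UTF-16 to UTF-8 step can be sketched in
portable C++ as below. This is not the code in this patch; the helper
name utf16ToUtf8 and the choice to replace unpaired surrogates with
U+FFFD are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an LLVM API): convert UTF-16 to UTF-8.
// Assumption: unpaired surrogates are replaced with U+FFFD.
std::string utf16ToUtf8(const std::u16string &In) {
  std::string Out;
  for (std::size_t I = 0; I < In.size(); ++I) {
    char32_t CP = In[I];
    if (CP >= 0xD800 && CP <= 0xDBFF) {
      // High surrogate: combine with a following low surrogate if present.
      if (I + 1 < In.size() && In[I + 1] >= 0xDC00 && In[I + 1] <= 0xDFFF) {
        CP = 0x10000 + ((CP - 0xD800) << 10) + (In[I + 1] - 0xDC00);
        ++I;
      } else {
        CP = 0xFFFD;
      }
    } else if (CP >= 0xDC00 && CP <= 0xDFFF) {
      CP = 0xFFFD; // Low surrogate with no preceding high surrogate.
    }
    // Encode the code point as 1 to 4 UTF-8 bytes.
    if (CP < 0x80) {
      Out += static_cast<char>(CP);
    } else if (CP < 0x800) {
      Out += static_cast<char>(0xC0 | (CP >> 6));
      Out += static_cast<char>(0x80 | (CP & 0x3F));
    } else if (CP < 0x10000) {
      Out += static_cast<char>(0xE0 | (CP >> 12));
      Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
      Out += static_cast<char>(0x80 | (CP & 0x3F));
    } else {
      Out += static_cast<char>(0xF0 | (CP >> 18));
      Out += static_cast<char>(0x80 | ((CP >> 12) & 0x3F));
      Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
      Out += static_cast<char>(0x80 | (CP & 0x3F));
    }
  }
  return Out;
}
```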

Previously, we would:
1. Get the wide command line
2. Call CommandLineToArgvW to handle quoting rules and separate it into
   arguments.
3. For each wide argument, expand wildcards (* and ?) using
   FindFirstFileW.
4. Convert each argument to UTF-8

Now we:
1. Get the wide command line and convert the whole thing to UTF-8
2. Tokenize the UTF-8 command line with cl::TokenizeWindowsCommandLine
3. For each argument, expand wildcards if present
   - This requires converting back to UTF-16 to call FindFirstFileW
   - Results of FindFirstFileW must be converted back to UTF-8
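
Steps 2 and 3 of the new flow can be sketched roughly as follows. This
is a simplified stand-in for cl::TokenizeWindowsCommandLine, not its
actual implementation: the names tokenizeWindowsStyle and hasWildcard
are hypothetical, only the common backslash and double-quote rules are
modeled, and the FindFirstFileW expansion itself is omitted.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: split a UTF-8 command line using the common
// Windows quoting rules. 2N backslashes before a quote yield N
// backslashes and toggle quoting; 2N+1 yield N backslashes and a
// literal quote; backslashes elsewhere are literal.
std::vector<std::string> tokenizeWindowsStyle(const std::string &Cmd) {
  std::vector<std::string> Args;
  std::string Cur;
  bool InArg = false, InQuotes = false;
  for (std::size_t I = 0; I < Cmd.size(); ++I) {
    char C = Cmd[I];
    if (C == '\\') {
      std::size_t N = 0;
      while (I < Cmd.size() && Cmd[I] == '\\') {
        ++N;
        ++I;
      }
      if (I < Cmd.size() && Cmd[I] == '"') {
        Cur.append(N / 2, '\\');
        if (N % 2)
          Cur += '"'; // Escaped quote becomes a literal character.
        else
          InQuotes = !InQuotes;
        // The loop increment skips the quote itself.
      } else {
        Cur.append(N, '\\');
        --I; // Let the loop revisit the character after the backslashes.
      }
      InArg = true;
    } else if (C == '"') {
      InQuotes = !InQuotes;
      InArg = true; // An empty "" still produces an argument.
    } else if ((C == ' ' || C == '\t') && !InQuotes) {
      if (InArg) {
        Args.push_back(Cur);
        Cur.clear();
        InArg = false;
      }
    } else {
      Cur += C;
      InArg = true;
    }
  }
  if (InArg)
    Args.push_back(Cur);
  return Args;
}

// Hypothetical check: only arguments containing * or ? need the
// round trip to UTF-16 for wildcard expansion.
bool hasWildcard(const std::string &Arg) {
  return Arg.find_first_of("*?") != std::string::npos;
}
```

Checking for wildcards before converting means arguments without * or
? never need the UTF-16 round trip.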

Reviewers: zturner

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D51941

llvm-svn: 341988
llvm/lib/Support/Windows/Process.inc