delegate parallelism to Ninja when possible (#64733)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64733
The previous implementation was wrong when CPU scheduling affinity is
set. In fact, it is still wrong if Ninja is not being used.
When there is CPU scheduling affinity set, the number of processors
available on the system likely exceeds the number of processors that
are usable to the build. We ought to use
`len(os.sched_getaffinity(0))` to determine the effective parallelism.
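
As a rough illustration of the difference (a sketch, not code from this change), the helper below compares the affinity-aware count with the system-wide count. The name `effective_cpu_count` is hypothetical, and the fallback branch is an assumption for platforms where `os.sched_getaffinity` does not exist:

```python
import os


def effective_cpu_count() -> int:
    """Return the number of CPUs this process may actually be scheduled on."""
    # os.sched_getaffinity is only provided on some platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Assumed fallback: total CPU count; os.cpu_count() may return None.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("system CPUs:", os.cpu_count())
    print("usable CPUs:", effective_cpu_count())
```

On a machine restricted with CPU affinity, the second number would be expected to be smaller than the first.
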
This change is more minimal: it simply delegates to Ninja (which
handles this correctly) when Ninja is in use.
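
The delegation idea can be sketched roughly as follows. This is not the actual diff: the function name `parallel_build_args` and the optional explicit job count are assumptions made for illustration only.

```python
import os
from typing import List, Optional


def parallel_build_args(use_ninja: bool, max_jobs: Optional[str] = None) -> List[str]:
    """Hypothetical helper: choose the parallelism flags for the build command."""
    # Assumption: an explicitly requested job count always wins.
    if max_jobs is not None:
        return ["-j", max_jobs]
    # With Ninja, pass no "-j" and let Ninja pick its own parallelism.
    if use_ninja:
        return []
    # Legacy behavior: use the system-wide CPU count, which ignores affinity.
    return ["-j", str(os.cpu_count() or 1)]
```

In this sketch, `parallel_build_args(True)` yields no flags at all, which matches the "-j" check described in the test plan.
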
Test Plan:
I verified this works correctly with Ninja on a 96-core machine
with 24 cores available for scheduling by checking:
* the cmake command did not specify "-j"
* the number of top-level jobs in top/pstree never exceeded 26 (24 +
2)
And I verified we get the legacy behavior by specifying USE_NINJA=0 on
the build.
Reviewed By: jbschlosser, driazati
Differential Revision: D30968796
Pulled By: dagitses
fbshipit-source-id: 29547dd378fea793957bcc2f7d52d5def1ecace2