|
6 | 6 | # Modified version of: https://chromium.googlesource.com/chromium/tools/depot_tools.git/+/refs/heads/main/post_build_ninja_summary.py |
7 | 7 | """Summarize the last ninja build, invoked with ninja's -C syntax. |
8 | 8 |
|
9 | | -This script is designed to be automatically run after each ninja build in |
10 | | -order to summarize the build's performance. Making build performance information |
11 | | -more visible should make it easier to notice anomalies and opportunities. To use |
12 | | -this script on Windows just set NINJA_SUMMARIZE_BUILD=1 and run autoninja.bat. |
13 | | -
|
14 | | -On Linux you can get autoninja to invoke this script using this syntax: |
15 | | -
|
16 | | -$ NINJA_SUMMARIZE_BUILD=1 autoninja -C out/Default/ chrome |
17 | | -
|
18 | | -You can also call this script directly using ninja's syntax to specify the |
19 | | -output directory of interest: |
20 | | -
|
21 | | -> python3 post_build_ninja_summary.py -C out/Default |
| 9 | +> python3 tools/report_build_time_ninja.py -C build/.. |
22 | 10 |
|
23 | 11 | Typical output looks like this: |
24 | | -
|
25 | | ->ninja -C out\debug_component base |
26 | | -ninja.exe -C out\debug_component base -j 960 -l 48 -d keeprsp |
27 | | -ninja: Entering directory `out\debug_component' |
28 | | -[1 processes, 1/1 @ 0.3/s : 3.092s ] Regenerating ninja files |
29 | | -Longest build steps: |
30 | | - 0.1 weighted s to build obj/base/base/trace_log.obj (6.7 s elapsed time) |
31 | | - 0.2 weighted s to build nasm.exe, nasm.exe.pdb (0.2 s elapsed time) |
32 | | - 0.3 weighted s to build obj/base/base/win_util.obj (12.4 s elapsed time) |
33 | | - 1.2 weighted s to build base.dll, base.dll.lib (1.2 s elapsed time) |
34 | | -Time by build-step type: |
35 | | - 0.0 s weighted time to generate 6 .lib files (0.3 s elapsed time sum) |
36 | | - 0.1 s weighted time to generate 25 .stamp files (1.2 s elapsed time sum) |
37 | | - 0.2 s weighted time to generate 20 .o files (2.8 s elapsed time sum) |
38 | | - 1.7 s weighted time to generate 4 PEFile (linking) files (2.0 s elapsed |
39 | | -time sum) |
40 | | - 23.9 s weighted time to generate 770 .obj files (974.8 s elapsed time sum) |
41 | | -26.1 s weighted time (982.9 s elapsed time sum, 37.7x parallelism) |
42 | | -839 build steps completed, average of 32.17/s |
43 | | -
|
44 | | -If no gn clean has been done then results will be for the last non-NULL |
45 | | -invocation of ninja. Ideas for future statistics, and implementations are |
46 | | -appreciated. |
47 | | -
|
48 | | -The "weighted" time is the elapsed time of each build step divided by the number |
49 | | -of tasks that were running in parallel. This makes it an excellent approximation |
50 | | -of how "important" a slow step was. A link that is entirely or mostly serialized |
51 | | -will have a weighted time that is the same or similar to its elapsed time. A |
52 | | -compile that runs in parallel with 999 other compiles will have a weighted time |
53 | | -that is tiny.""" |
| 12 | +``` |
| 13 | + Longest build steps for .cpp.o: |
| 14 | + 1.0 weighted s to build ...torch_bindings.cpp.o (12.4 s elapsed time) |
| 15 | + 2.0 weighted s to build ..._attn_c.dir/csrc... (23.5 s elapsed time) |
| 16 | + 2.6 weighted s to build ...torch_bindings.cpp.o (31.5 s elapsed time) |
| 17 | + 3.2 weighted s to build ...torch_bindings.cpp.o (38.5 s elapsed time) |
| 18 | + Longest build steps for .so (linking): |
| 19 | + 0.1 weighted s to build _core_C.abi3.so (0.7 s elapsed time) |
| 20 | + 0.1 weighted s to build _moe_C.abi3.so (1.0 s elapsed time) |
| 21 | + 0.5 weighted s to build ...flash_attn_c.abi3.so (1.1 s elapsed time) |
| 22 | + 6.2 weighted s to build _C.abi3.so (6.2 s elapsed time) |
| 23 | + Longest build steps for .cu.o: |
| 24 | + 15.3 weighted s to build ...machete_mm_... (183.5 s elapsed time) |
| 25 | + 15.3 weighted s to build ...machete_mm_... (183.5 s elapsed time) |
| 26 | + 15.3 weighted s to build ...machete_mm_... (183.6 s elapsed time) |
| 27 | + 15.3 weighted s to build ...machete_mm_... (183.7 s elapsed time) |
| 28 | + 15.5 weighted s to build ...machete_mm_... (185.6 s elapsed time) |
| 29 | + 15.5 weighted s to build ...machete_mm_... (185.9 s elapsed time) |
| 30 | + 15.5 weighted s to build ...machete_mm_... (186.2 s elapsed time) |
| 31 | + 37.4 weighted s to build ...scaled_mm_c3x.cu... (449.0 s elapsed time) |
| 32 | + 43.9 weighted s to build ...scaled_mm_c2x.cu... (527.4 s elapsed time) |
| 33 | + 344.8 weighted s to build ...attention_...cu.o (1087.2 s elapsed time) |
| 34 | + 1110.0 s weighted time (10120.4 s elapsed time sum, 9.1x parallelism) |
| 35 | + 134 build steps completed, average of 0.12/s |
| 36 | +``` |
| 37 | +""" |
54 | 38 |
|
55 | 39 | import argparse |
56 | 40 | import errno |
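The docstring text removed in this diff defines "weighted" time as each step's elapsed time divided by the number of tasks running in parallel. A minimal sketch of that calculation follows, assuming each step is given as a `(start, end)` pair in seconds. The function name `weighted_times` and the input format are hypothetical, chosen for illustration, and this is not necessarily how the script itself computes the value.

```python
from typing import List, Tuple


def weighted_times(steps: List[Tuple[float, float]]) -> List[float]:
    """Return one weighted duration per (start, end) step.

    Hypothetical helper: splits the timeline at every start/end time and
    charges each slice equally to the steps running during that slice.
    """
    # Every distinct time at which some step starts or ends.
    boundaries = sorted({t for start, end in steps for t in (start, end)})
    weighted = [0.0] * len(steps)
    for left, right in zip(boundaries, boundaries[1:]):
        # Steps that are running for the whole slice [left, right).
        running = [i for i, (start, end) in enumerate(steps)
                   if start <= left and end >= right]
        for i in running:
            weighted[i] += (right - left) / len(running)
    return weighted


# Two compiles overlap for 10 s, then a 6 s link runs alone.
example = [(0.0, 10.0), (0.0, 10.0), (10.0, 16.0)]
result = weighted_times(example)
```

Under this definition a fully serialized step keeps its whole elapsed time, while overlapping steps share theirs. Dividing the summed elapsed time by the summed weighted time gives a parallelism figure, consistent with the sample output above (10120.4 / 1110.0 ≈ 9.1).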
|