alias: cu_ins, block id: 10
alias: cu_pipe, block id: 11
alias: tatd, block id: 15
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100
Target: MI100
Command: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: ['cu_ins', 'cu_pipe', 'tatd']

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/8][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:54:57.020671 131391732494144 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.301932 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:54:57.030567 131391732494144 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.241326 131391732494144 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:54:57.372433 131391732494144 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.341866 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.411920 131391732494144 generateRocpd.cpp:582] writing SQL database for process 2385709 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:54:57.413248 131391732494144 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2385709_results.db (UUID=00004319-de77-7e77-a4c2-fea0558e06dd)
   |-> [rocprofiler-sdk] W20260526 16:54:57.499371 131391732494144 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014179 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.500446 131391732494144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001044 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.502585 131391732494144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002110 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.507629 131391732494144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003160 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.529061 131391732494144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.021403 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.531510 131391732494144 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002419 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.531539 131391732494144 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.547599 131391732494144 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016045 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.547626 131391732494144 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.547638 131391732494144 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.547650 131391732494144 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.547862 131391732494144 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000192 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.548247 131391732494144 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.136328 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.554121 131391732494144 simple_timer.cpp:55] [rocprofv3] output generation ::     0.179185 sec
   |-> [rocprofiler-sdk] W20260526 16:54:57.554193 131391732494144 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.181708 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/2385709_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/8][Approximate profiling time left: 19 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:54:59.783380 133711515565888 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306191 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:54:59.793362 133711515565888 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.008599 133711515565888 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:55:00.142556 133711515565888 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.349194 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.177419 133711515565888 generateRocpd.cpp:582] writing SQL database for process 2385719 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:55:00.178803 133711515565888 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2385719_results.db (UUID=00004319-e93d-793d-aa7f-3d0c807767cf)
   |-> [rocprofiler-sdk] W20260526 16:55:00.257396 133711515565888 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.010828 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.258399 133711515565888 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.000975 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.260278 133711515565888 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001855 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.264870 133711515565888 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002830 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.279050 133711515565888 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.014157 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.281177 133711515565888 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002103 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.281203 133711515565888 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000005 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.293791 133711515565888 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.012576 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.293813 133711515565888 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.293823 133711515565888 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.293833 133711515565888 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.294004 133711515565888 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000159 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.294315 133711515565888 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.116897 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.299070 133711515565888 simple_timer.cpp:55] [rocprofv3] output generation ::     0.153808 sec
   |-> [rocprofiler-sdk] W20260526 16:55:00.299166 133711515565888 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.156447 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/2385719_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/8][Approximate profiling time left: 14 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:55:02.520331 138797403320128 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.300327 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:55:02.529813 138797403320128 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:02.743089 138797403320128 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:55:02.875355 138797403320128 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345543 sec
   |-> [rocprofiler-sdk] W20260526 16:55:02.914523 138797403320128 generateRocpd.cpp:582] writing SQL database for process 2385734 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:55:02.915836 138797403320128 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2385734_results.db (UUID=00004319-f3f4-73f4-9463-d2f8640d4ac7)
   |-> [rocprofiler-sdk] W20260526 16:55:03.005009 138797403320128 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013904 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.006133 138797403320128 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001093 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.008304 138797403320128 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002143 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.013380 138797403320128 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003158 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.020878 138797403320128 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.007469 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.023333 138797403320128 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002426 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.023362 138797403320128 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.038964 138797403320128 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015588 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.039001 138797403320128 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.039014 138797403320128 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.039025 138797403320128 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.039228 138797403320128 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000183 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.039617 138797403320128 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.125094 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.045631 138797403320128 simple_timer.cpp:55] [rocprofv3] output generation ::     0.167778 sec
   |-> [rocprofiler-sdk] W20260526 16:55:03.045701 138797403320128 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.170294 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/2385734_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/8][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:55:05.275738 138650003009344 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.298659 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:55:05.285458 138650003009344 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.496703 138650003009344 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:55:05.630798 138650003009344 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345341 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.669550 138650003009344 generateRocpd.cpp:582] writing SQL database for process 2385744 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:55:05.670837 138650003009344 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2385744_results.db (UUID=00004319-feb9-7eb9-a11f-8bdc6a3a6f8d)
   |-> [rocprofiler-sdk] W20260526 16:55:05.756489 138650003009344 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013467 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.757608 138650003009344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001088 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.759741 138650003009344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002105 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.764768 138650003009344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003128 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.769180 138650003009344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.004383 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.771628 138650003009344 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002421 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.771657 138650003009344 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.787755 138650003009344 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016083 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.787785 138650003009344 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.787797 138650003009344 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.787809 138650003009344 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.788035 138650003009344 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000211 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.788465 138650003009344 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.118915 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.794473 138650003009344 simple_timer.cpp:55] [rocprofv3] output generation ::     0.161164 sec
   |-> [rocprofiler-sdk] W20260526 16:55:05.794547 138650003009344 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.163699 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/2385744_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/8][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:55:08.024903 123571614232384 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.297963 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:55:08.034519 123571614232384 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.245664 123571614232384 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:55:08.374868 123571614232384 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340349 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.413606 123571614232384 generateRocpd.cpp:582] writing SQL database for process 2385754 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:55:08.414892 123571614232384 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2385754_results.db (UUID=0000431a-0977-7977-8eab-ea67ec933cf8)
   |-> [rocprofiler-sdk] W20260526 16:55:08.499913 123571614232384 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013780 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.501042 123571614232384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001099 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.503198 123571614232384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002127 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.508174 123571614232384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003103 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.512341 123571614232384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.004139 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.514779 123571614232384 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002413 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.514807 123571614232384 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.530889 123571614232384 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016067 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.530919 123571614232384 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.530932 123571614232384 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.530944 123571614232384 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.531163 123571614232384 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000197 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.531581 123571614232384 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.117976 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.537545 123571614232384 simple_timer.cpp:55] [rocprofv3] output generation ::     0.160193 sec
   |-> [rocprofiler-sdk] W20260526 16:55:08.537623 123571614232384 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.162701 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/2385754_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/8][Approximate profiling time left: 5 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:55:10.762724 140349140254528 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.297423 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:55:10.772435 140349140254528 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:10.983823 140349140254528 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:55:11.113087 140349140254528 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340652 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.151860 140349140254528 generateRocpd.cpp:582] writing SQL database for process 2385765 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:55:11.153162 140349140254528 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2385765_results.db (UUID=0000431a-142a-742a-96c2-2f4fdc525c58)
   |-> [rocprofiler-sdk] W20260526 16:55:11.238095 140349140254528 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013372 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.239217 140349140254528 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001086 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.241278 140349140254528 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002033 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.246122 140349140254528 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002949 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.250495 140349140254528 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.004345 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.252917 140349140254528 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002394 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.252945 140349140254528 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.269004 140349140254528 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016045 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.269031 140349140254528 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.269044 140349140254528 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.269055 140349140254528 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.269267 140349140254528 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000190 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.269625 140349140254528 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.117766 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.275681 140349140254528 simple_timer.cpp:55] [rocprofv3] output generation ::     0.160117 sec
   |-> [rocprofiler-sdk] W20260526 16:55:11.275763 140349140254528 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.162625 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/2385765_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/8][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/perfmon/pmc_perf_6.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:55:13.499878 123159760752448 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.297559 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:55:13.510070 123159760752448 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:13.720926 123159760752448 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:55:13.850847 123159760752448 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340777 sec
   |-> [rocprofiler-sdk] W20260526 16:55:13.889943 123159760752448 generateRocpd.cpp:582] writing SQL database for process 2385775 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:55:13.891232 123159760752448 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2385775_results.db (UUID=0000431a-1eda-7eda-9fde-252a96515b96)
   |-> [rocprofiler-sdk] W20260526 16:55:13.980070 123159760752448 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013835 sec
   |-> [rocprofiler-sdk] W20260526 16:55:13.981219 123159760752448 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001118 sec
   |-> [rocprofiler-sdk] W20260526 16:55:13.983346 123159760752448 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002099 sec
   |-> [rocprofiler-sdk] W20260526 16:55:13.988344 123159760752448 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003115 sec
   |-> [rocprofiler-sdk] W20260526 16:55:13.992742 123159760752448 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.004370 sec
   |-> [rocprofiler-sdk] W20260526 16:55:13.995194 123159760752448 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002424 sec
   |-> [rocprofiler-sdk] W20260526 16:55:13.995223 123159760752448 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:55:14.011156 123159760752448 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015918 sec
   |-> [rocprofiler-sdk] W20260526 16:55:14.011186 123159760752448 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:14.011198 123159760752448 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:14.011211 123159760752448 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:55:14.011418 123159760752448 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000190 sec
   |-> [rocprofiler-sdk] W20260526 16:55:14.011887 123159760752448 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.121944 sec
   |-> [rocprofiler-sdk] W20260526 16:55:14.017923 123159760752448 simple_timer.cpp:55] [rocprofv3] output generation ::     0.164537 sec
   |-> [rocprofiler-sdk] W20260526 16:55:14.018013 123159760752448 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.167117 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/2385775_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/8][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/perfmon/pmc_perf_7.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:55:16.240426 135648437227328 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.297729 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:55:16.250101 135648437227328 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.461041 135648437227328 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:55:16.590854 135648437227328 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340753 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.629804 135648437227328 generateRocpd.cpp:582] writing SQL database for process 2385787 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:55:16.631097 135648437227328 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2385787_results.db (UUID=0000431a-298f-798f-b7a4-6aba975af170)
   |-> [rocprofiler-sdk] W20260526 16:55:16.720553 135648437227328 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013737 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.721683 135648437227328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001100 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.723860 135648437227328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002149 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.728950 135648437227328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003127 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.733366 135648437227328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.004373 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.735797 135648437227328 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002403 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.735826 135648437227328 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.751248 135648437227328 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015408 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.751276 135648437227328 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.751288 135648437227328 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.751300 135648437227328 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.751502 135648437227328 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000182 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.751855 135648437227328 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.122052 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.757675 135648437227328 simple_timer.cpp:55] [rocprofv3] output generation ::     0.164343 sec
   |-> [rocprofiler-sdk] W20260526 16:55:16.757750 135648437227328 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.166845 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_TA/MI100/out/pmc_1/2385787_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
