Following counters might not be supported by rocprof: SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MUL_F16, SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_TRANS_F64, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_FMA_F16, SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_TRANS_F32
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100
Target: MI100
Command: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: ['1']
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/12][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:50:41.863436 132043647180608 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305206 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:50:41.873692 132043647180608 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.086035 132043647180608 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_FMA_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:50:42.217893 132043647180608 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344201 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.257204 132043647180608 generateRocpd.cpp:582] writing SQL database for process 2382994 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:50:42.258472 132043647180608 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2382994_results.db (UUID=00004315-f9be-79be-9e9e-873e8d81c965)
   |-> [rocprofiler-sdk] W20260526 16:50:42.347232 132043647180608 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013960 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.348417 132043647180608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001154 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.351047 132043647180608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002600 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.356193 132043647180608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003154 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.425576 132043647180608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.069352 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.428200 132043647180608 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002596 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.428231 132043647180608 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.444044 132043647180608 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015797 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.444071 132043647180608 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.444086 132043647180608 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.444100 132043647180608 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.444293 132043647180608 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000181 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.444712 132043647180608 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.187509 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.450609 132043647180608 simple_timer.cpp:55] [rocprofv3] output generation ::     0.230141 sec
   |-> [rocprofiler-sdk] W20260526 16:50:42.450695 132043647180608 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.232754 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2382994_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/12][Approximate profiling time left: 33 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:50:44.710742 124608317898560 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.312667 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:50:44.721114 124608317898560 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:44.939448 124608317898560 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_MUL_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:50:45.072943 124608317898560 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.351828 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.105333 124608317898560 generateRocpd.cpp:582] writing SQL database for process 2383004 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:50:45.106371 124608317898560 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2383004_results.db (UUID=00004316-04d6-74d6-b5e0-ef542383101f)
   |-> [rocprofiler-sdk] W20260526 16:50:45.179580 124608317898560 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.010937 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.180519 124608317898560 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.000915 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.182539 124608317898560 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001998 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.186741 124608317898560 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002578 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.235313 124608317898560 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.048550 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.237614 124608317898560 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002278 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.237637 124608317898560 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.250291 124608317898560 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.012640 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.250312 124608317898560 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.250324 124608317898560 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.250338 124608317898560 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.250482 124608317898560 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000134 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.250793 124608317898560 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.145460 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.255229 124608317898560 simple_timer.cpp:55] [rocprofv3] output generation ::     0.179692 sec
   |-> [rocprofiler-sdk] W20260526 16:50:45.255292 124608317898560 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.182195 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2383004_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/12][Approximate profiling time left: 27 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:50:47.521558 137884615663424 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.307085 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:50:47.531669 137884615663424 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:47.743479 137884615663424 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_TRANS_F64
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:50:47.873104 137884615663424 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.341435 sec
   |-> [rocprofiler-sdk] W20260526 16:50:47.912305 137884615663424 generateRocpd.cpp:582] writing SQL database for process 2383014 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:50:47.913584 137884615663424 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2383014_results.db (UUID=00004316-0fd7-7fd7-9cbd-b460f0030aed)
   |-> [rocprofiler-sdk] W20260526 16:50:48.006047 137884615663424 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014613 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.007271 137884615663424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001193 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.009888 137884615663424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002588 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.015106 137884615663424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003156 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.083030 137884615663424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.067895 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.085685 137884615663424 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002624 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.085714 137884615663424 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.101522 137884615663424 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015793 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.101551 137884615663424 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.101563 137884615663424 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.101575 137884615663424 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.101791 137884615663424 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000199 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.102322 137884615663424 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.190018 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.108323 137884615663424 simple_timer.cpp:55] [rocprofv3] output generation ::     0.232743 sec
   |-> [rocprofiler-sdk] W20260526 16:50:48.108420 137884615663424 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.235266 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2383014_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/12][Approximate profiling time left: 23 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:50:50.397861 135970199129920 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304728 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:50:50.407749 135970199129920 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.618291 135970199129920 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:50:50.748278 135970199129920 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340529 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.787819 135970199129920 generateRocpd.cpp:582] writing SQL database for process 2383024 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:50:50.789105 135970199129920 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2383024_results.db (UUID=00004316-1b15-7b15-91a5-d3163db1680f)
   |-> [rocprofiler-sdk] W20260526 16:50:50.880573 135970199129920 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014334 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.881685 135970199129920 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001081 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.884180 135970199129920 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002467 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.889460 135970199129920 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003263 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.961753 135970199129920 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.072264 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.964350 135970199129920 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002566 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.964381 135970199129920 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.980533 135970199129920 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016138 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.980564 135970199129920 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.980576 135970199129920 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.980588 135970199129920 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.980819 135970199129920 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000210 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.981379 135970199129920 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.193560 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.987364 135970199129920 simple_timer.cpp:55] [rocprofv3] output generation ::     0.236586 sec
   |-> [rocprofiler-sdk] W20260526 16:50:50.987459 135970199129920 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.239129 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2383024_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/12][Approximate profiling time left: 20 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:50:53.243692 127561571589952 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304900 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:50:53.253554 127561571589952 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.464629 127561571589952 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:50:53.593981 127561571589952 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340427 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.633199 127561571589952 generateRocpd.cpp:582] writing SQL database for process 2383034 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:50:53.634470 127561571589952 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2383034_results.db (UUID=00004316-2633-7633-9bc2-c657f0cf2f88)
   |-> [rocprofiler-sdk] W20260526 16:50:53.720690 127561571589952 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013926 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.721870 127561571589952 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001150 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.724113 127561571589952 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002215 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.728838 127561571589952 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002878 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.791890 127561571589952 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.063015 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.794607 127561571589952 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002686 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.794636 127561571589952 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.810672 127561571589952 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016021 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.810702 127561571589952 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.810714 127561571589952 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.810726 127561571589952 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.810934 127561571589952 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000195 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.811559 127561571589952 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.178360 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.817521 127561571589952 simple_timer.cpp:55] [rocprofv3] output generation ::     0.221045 sec
   |-> [rocprofiler-sdk] W20260526 16:50:53.817613 127561571589952 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.223583 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2383034_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/12][Approximate profiling time left: 17 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:50:56.077285 127513259503424 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305001 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:50:56.086582 127513259503424 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.301724 127513259503424 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:50:56.430932 127513259503424 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344350 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.470083 127513259503424 generateRocpd.cpp:582] writing SQL database for process 2383044 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:50:56.471355 127513259503424 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2383044_results.db (UUID=00004316-3144-7144-a9d0-18243007c469)
   |-> [rocprofiler-sdk] W20260526 16:50:56.558357 127513259503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013930 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.559447 127513259503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001058 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.561966 127513259503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002491 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.567166 127513259503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003174 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.621184 127513259503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.053990 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.623857 127513259503424 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002642 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.623886 127513259503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.639339 127513259503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015431 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.639373 127513259503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.639395 127513259503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.639412 127513259503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.639606 127513259503424 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000181 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.640046 127513259503424 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.169964 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.645824 127513259503424 simple_timer.cpp:55] [rocprofv3] output generation ::     0.212432 sec
   |-> [rocprofiler-sdk] W20260526 16:50:56.645904 127513259503424 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.214914 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2383044_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/12][Approximate profiling time left: 14 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_6.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:50:58.924811 133245784563520 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.315116 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:50:58.934803 133245784563520 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.149759 133245784563520 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:50:59.280989 133245784563520 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.346186 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.320856 133245784563520 generateRocpd.cpp:582] writing SQL database for process 2383054 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:50:59.322234 133245784563520 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2383054_results.db (UUID=00004316-3c59-7c59-bee6-d8504c6cabbc)
   |-> [rocprofiler-sdk] W20260526 16:50:59.410833 133245784563520 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014598 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.412053 133245784563520 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001189 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.414339 133245784563520 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002256 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.419339 133245784563520 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003108 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.471895 133245784563520 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.052527 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.474633 133245784563520 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002707 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.474662 133245784563520 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.490649 133245784563520 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015972 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.490681 133245784563520 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.490698 133245784563520 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.490720 133245784563520 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.491042 133245784563520 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000301 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.491502 133245784563520 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.170646 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.497504 133245784563520 simple_timer.cpp:55] [rocprofv3] output generation ::     0.214043 sec
   |-> [rocprofiler-sdk] W20260526 16:50:59.497586 133245784563520 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.216544 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2383054_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/12][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:51:01.772883 138931291623232 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.307348 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:51:01.782188 138931291623232 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:01.993484 138931291623232 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:51:02.123942 138931291623232 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.341754 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.163414 138931291623232 generateRocpd.cpp:582] writing SQL database for process 2383064 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:51:02.164726 138931291623232 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2383064_results.db (UUID=00004316-4781-7781-b89a-ef8fa9413215)
   |-> [rocprofiler-sdk] W20260526 16:51:02.254428 138931291623232 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014590 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.255547 138931291623232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001089 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.258032 138931291623232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002456 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.263010 138931291623232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003062 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.357401 138931291623232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.094363 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.360127 138931291623232 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002696 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.360170 138931291623232 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.376372 138931291623232 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016187 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.376407 138931291623232 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.376420 138931291623232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.376431 138931291623232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.376669 138931291623232 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000216 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.377354 138931291623232 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.213941 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.383317 138931291623232 simple_timer.cpp:55] [rocprofv3] output generation ::     0.256858 sec
   |-> [rocprofiler-sdk] W20260526 16:51:02.383415 138931291623232 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.259416 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2383064_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/12][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:51:04.647649 128534075178816 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305431 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:51:04.657920 128534075178816 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:04.869346 128534075178816 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:51:04.999055 128534075178816 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.341136 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.038160 128534075178816 generateRocpd.cpp:582] writing SQL database for process 2383074 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:51:05.039445 128534075178816 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2383074_results.db (UUID=00004316-52be-72be-84bd-de3b5e967fd9)
   |-> [rocprofiler-sdk] W20260526 16:51:05.127455 128534075178816 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.011058 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.128407 128534075178816 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.000927 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.130448 128534075178816 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002019 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.134628 128534075178816 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002550 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.205858 128534075178816 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.071208 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.208214 128534075178816 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002334 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.208237 128534075178816 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.221286 128534075178816 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.013038 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.221307 128534075178816 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.221316 128534075178816 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.221325 128534075178816 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.221476 128534075178816 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000139 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.221834 128534075178816 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.183675 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.226410 128534075178816 simple_timer.cpp:55] [rocprofv3] output generation ::     0.224835 sec
   |-> [rocprofiler-sdk] W20260526 16:51:05.226484 128534075178816 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.227375 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2383074_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/12][Approximate profiling time left: 5 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:51:07.506110 127983698722624 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.313497 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:51:07.514679 127983698722624 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:07.726378 127983698722624 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:51:07.859053 127983698722624 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344375 sec
   |-> [rocprofiler-sdk] W20260526 16:51:07.893848 127983698722624 generateRocpd.cpp:582] writing SQL database for process 2383084 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:51:07.894891 127983698722624 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2383084_results.db (UUID=00004316-5de1-7de1-9642-9b5225900fff)
   |-> [rocprofiler-sdk] W20260526 16:51:07.972023 127983698722624 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.011407 sec
   |-> [rocprofiler-sdk] W20260526 16:51:07.973073 127983698722624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001027 sec
   |-> [rocprofiler-sdk] W20260526 16:51:07.975175 127983698722624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002081 sec
   |-> [rocprofiler-sdk] W20260526 16:51:07.979677 127983698722624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002724 sec
   |-> [rocprofiler-sdk] W20260526 16:51:08.083244 127983698722624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.103545 sec
   |-> [rocprofiler-sdk] W20260526 16:51:08.085572 127983698722624 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002305 sec
   |-> [rocprofiler-sdk] W20260526 16:51:08.085595 127983698722624 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:51:08.098227 127983698722624 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.012622 sec
   |-> [rocprofiler-sdk] W20260526 16:51:08.098253 127983698722624 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:08.098263 127983698722624 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:08.098272 127983698722624 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:51:08.098429 127983698722624 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000145 sec
   |-> [rocprofiler-sdk] W20260526 16:51:08.098809 127983698722624 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.204961 sec
   |-> [rocprofiler-sdk] W20260526 16:51:08.103192 127983698722624 simple_timer.cpp:55] [rocprofv3] output generation ::     0.241565 sec
   |-> [rocprofiler-sdk] W20260526 16:51:08.103288 127983698722624 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.244156 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2383084_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/12][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:51:10.394128 125276424372032 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.310842 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:51:10.404300 125276424372032 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:10.615803 125276424372032 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:51:10.748216 125276424372032 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343916 sec
   |-> [rocprofiler-sdk] W20260526 16:51:10.787347 125276424372032 generateRocpd.cpp:582] writing SQL database for process 2383094 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:51:10.788626 125276424372032 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2383094_results.db (UUID=00004316-692b-792b-bc22-2d7f1c175e89)
   |-> [rocprofiler-sdk] W20260526 16:51:10.882373 125276424372032 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014980 sec
   |-> [rocprofiler-sdk] W20260526 16:51:10.883512 125276424372032 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001109 sec
   |-> [rocprofiler-sdk] W20260526 16:51:10.886086 125276424372032 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002545 sec
   |-> [rocprofiler-sdk] W20260526 16:51:10.891302 125276424372032 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003206 sec
   |-> [rocprofiler-sdk] W20260526 16:51:11.015816 125276424372032 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.124485 sec
   |-> [rocprofiler-sdk] W20260526 16:51:11.018689 125276424372032 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002841 sec
   |-> [rocprofiler-sdk] W20260526 16:51:11.018718 125276424372032 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:51:11.034103 125276424372032 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015370 sec
   |-> [rocprofiler-sdk] W20260526 16:51:11.034131 125276424372032 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:11.034143 125276424372032 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:11.034155 125276424372032 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:51:11.034350 125276424372032 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000177 sec
   |-> [rocprofiler-sdk] W20260526 16:51:11.034767 125276424372032 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.247421 sec
   |-> [rocprofiler-sdk] W20260526 16:51:11.040615 125276424372032 simple_timer.cpp:55] [rocprofv3] output generation ::     0.289907 sec
   |-> [rocprofiler-sdk] W20260526 16:51:11.040718 125276424372032 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.292452 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2383094_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/12][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:51:13.312061 134730667093824 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304782 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:51:13.321789 134730667093824 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.533146 134730667093824 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:51:13.665133 134730667093824 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343344 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.704292 134730667093824 generateRocpd.cpp:582] writing SQL database for process 2383104 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:51:13.705581 134730667093824 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/dl385-20-mi100-3c48/2383104_results.db (UUID=00004316-7497-7497-928c-09270f476284)
   |-> [rocprofiler-sdk] W20260526 16:51:13.798000 134730667093824 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014498 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.799241 134730667093824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001211 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.801899 134730667093824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002629 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.807015 134730667093824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003117 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.883663 134730667093824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.076620 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.886503 134730667093824 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002811 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.886533 134730667093824 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.902948 134730667093824 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016400 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.902987 134730667093824 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.903000 134730667093824 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.903016 134730667093824 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.903232 134730667093824 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000204 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.903754 134730667093824 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.199463 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.910326 134730667093824 simple_timer.cpp:55] [rocprofv3] output generation ::     0.242685 sec
   |-> [rocprofiler-sdk] W20260526 16:51:13.910422 134730667093824 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.245235 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_2/MI100/out/pmc_1/2383104_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
