alias: ins_cache, block id: 13
alias: sl1d, block id: 14
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQC/MI200
Target: MI210
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: ['ins_cache', 'sl1d']

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/3][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/ipblocks_SQC/MI200/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:00:19.045793 130927534497600 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.183544 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:00:19.046437 130927534497600 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.239626 130927534497600 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:00:19.321216 130927534497600 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.274779 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.343411 130927534497600 generateRocpd.cpp:583] writing SQL database for process 2523644 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:00:19.344220 130927534497600 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQC/MI200/out/pmc_1/smc4124-25-mi210-3c48/2523644_results.db (UUID=0001fa75-5d0b-7d0b-ab79-4ab5c203e264)
   |-> [rocprofiler-sdk] W20260526 19:00:19.425639 130927534497600 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007811 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.426761 130927534497600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001105 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.428351 130927534497600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001575 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.438775 130927534497600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008441 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.495258 130927534497600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.056467 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.497549 130927534497600 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002274 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.497567 130927534497600 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.506783 130927534497600 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009208 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.506798 130927534497600 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.506808 130927534497600 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.506816 130927534497600 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.506924 130927534497600 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000103 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.507144 130927534497600 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.163733 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.510257 130927534497600 simple_timer.cpp:55] [rocprofv3] output generation ::     0.187536 sec
   |-> [rocprofiler-sdk] W20260526 19:00:19.510321 130927534497600 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.189054 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQC/MI200/out/pmc_1/2523644_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/3][Approximate profiling time left: 1 second]...
[profiling] Current input file: tests/workloads/ipblocks_SQC/MI200/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:00:21.013307 135797848358720 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.181872 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:00:21.013897 135797848358720 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.206929 135797848358720 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:00:21.288446 135797848358720 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.274548 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.310791 135797848358720 generateRocpd.cpp:583] writing SQL database for process 2523655 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:00:21.311621 135797848358720 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQC/MI200/out/pmc_1/smc4124-25-mi210-3c48/2523655_results.db (UUID=0001fa75-64bc-74bc-86b6-317a3ac82c58)
   |-> [rocprofiler-sdk] W20260526 19:00:21.394809 135797848358720 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007855 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.395943 135797848358720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001115 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.397555 135797848358720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001595 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.408056 135797848358720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008493 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.458575 135797848358720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.050502 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.460821 135797848358720 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002229 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.460840 135797848358720 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.469845 135797848358720 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008992 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.469861 135797848358720 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.469874 135797848358720 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.469885 135797848358720 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.469990 135797848358720 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000096 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.470204 135797848358720 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.159413 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.473178 135797848358720 simple_timer.cpp:55] [rocprofv3] output generation ::     0.183316 sec
   |-> [rocprofiler-sdk] W20260526 19:00:21.473239 135797848358720 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.184735 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQC/MI200/out/pmc_1/2523655_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/3][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SQC/MI200/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:00:22.986895 139947449655104 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.186221 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:00:22.987553 139947449655104 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.180759 139947449655104 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:00:23.261586 139947449655104 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.274033 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.284115 139947449655104 generateRocpd.cpp:583] writing SQL database for process 2523664 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:00:23.284921 139947449655104 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQC/MI200/out/pmc_1/smc4124-25-mi210-3c48/2523664_results.db (UUID=0001fa75-6c6d-7c6d-a511-31370923786b)
   |-> [rocprofiler-sdk] W20260526 19:00:23.367822 139947449655104 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007751 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.369040 139947449655104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001202 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.370747 139947449655104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001692 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.381427 139947449655104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008476 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.438049 139947449655104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.056608 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.440296 139947449655104 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002232 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.440313 139947449655104 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.449067 139947449655104 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008747 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.449081 139947449655104 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.449087 139947449655104 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.449093 139947449655104 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.449195 139947449655104 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000094 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.449415 139947449655104 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.165300 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.452290 139947449655104 simple_timer.cpp:55] [rocprofv3] output generation ::     0.189372 sec
   |-> [rocprofiler-sdk] W20260526 19:00:23.452353 139947449655104 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.190719 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQC/MI200/out/pmc_1/2523664_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
