Dear Michele,

I assume you are running hpc_benchmark.py without any modifications? On my laptop,

    mpirun -np 2 python install/share/doc/nest/examples/pynest/hpc_benchmark.py

executes without problems with NEST 3.6 (and with current master).

Interestingly, for the hpc_benchmark, NEST should never even get to the else block to which the assertion on line 107 in target_table.cpp belongs, since all connections in the network are primary connections. So something seems to go wrong in the communication of information about connections to the presynaptic side.
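To make the reasoning above concrete, here is a minimal toy model in Python. It is a sketch only, not NEST's actual C++ code: the class names, fields, and the idea that a single `is_primary` flag in the received record selects the branch are assumptions made for illustration. It shows how a record that arrives with a wrong primary/secondary flag would reach the else block and trip a bounds check of the same shape as the failing assertion.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTargetData:
    """Hypothetical stand-in for the per-connection record sent between ranks."""
    source_lid: int
    syn_id: int
    is_primary: bool
    buffer_pos: int = 0


@dataclass
class ToyTargetTable:
    """Hypothetical per-thread table; not NEST's real data structure."""
    num_local_nodes: int
    num_secondary_syn_types: int
    targets: list = field(init=False)
    secondary_send_buffer_pos: list = field(init=False)

    def __post_init__(self):
        # One target list per local node.
        self.targets = [[] for _ in range(self.num_local_nodes)]
        # One slot per secondary synapse type per local node.
        # A network with only primary connections has zero such slots.
        self.secondary_send_buffer_pos = [
            [[] for _ in range(self.num_secondary_syn_types)]
            for _ in range(self.num_local_nodes)
        ]

    def add_target(self, data):
        if data.is_primary:
            self.targets[data.source_lid].append(data.syn_id)
        else:
            slots = self.secondary_send_buffer_pos[data.source_lid]
            # Same shape as the failing check: syn_id must index an existing slot.
            if not data.syn_id < len(slots):
                raise AssertionError("syn_id out of range for secondary buffer")
            slots[data.syn_id].append(data.buffer_pos)


def demo():
    # Network with primary connections only, so no secondary slots exist.
    table = ToyTargetTable(num_local_nodes=2, num_secondary_syn_types=0)
    # A correctly received record takes the primary branch.
    table.add_target(ToyTargetData(source_lid=0, syn_id=3, is_primary=True))
    try:
        # The same record with a corrupted flag takes the else branch.
        table.add_target(ToyTargetData(source_lid=0, syn_id=3, is_primary=False))
    except AssertionError:
        return "assertion failed"
    return "ok"
```

In this toy model the check can only fail when a record is treated as secondary although no secondary slots were set up, which is why the failure points at the data received from the other rank rather than at the benchmark script.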

 

Could you try this branch?

    https://github.com/heplesser/nest-simulator/tree/36_nosingle

It makes sure all MPI communication strictly happens in the OpenMP master thread (in NEST 3.6, it may happen inside OpenMP single constructs). This should not make a difference for the hpc_benchmark, since it uses a single thread by default.

Best,
Hans Ekkehard

-- 
Prof. Dr. Hans Ekkehard Plesser
Department of Data Science
Faculty of Science and Technology
Norwegian University of Life Sciences
PO Box 5003, 1432 Aas, Norway

Phone +47 6723 1560
Email hans.ekkehard.plesser@nmbu.no
Home http://arken.nmbu.no/~plesser

From: Michele Martinelli <michele.martinelli@roma1.infn.it>
Date: Monday, 29 January 2024 at 10:35
To: users@nest-simulator.org <users@nest-simulator.org>
Subject: [NEST Users] Assert failed running hpc_benchmark

Dear NEST Users & Developers,

We're currently working on a custom OpenMPI BTL (supporting a custom FPGA-based NIC) at the National Institute for Nuclear Physics in Rome, Italy. We get an error when running hpc_benchmark (which we currently use as a simple validation test) with 2 processes, one on each of 2 hosts. The command we run looks like this:

mpirun -n 2 -H host1:1,host2:1 --bynode --report-bindings -mca btl apelink,self,sm python hpc_benchmark.py

(apelink is our custom BTL component)

but then we see this error:

python: [...]/NEST_with_local_ompi/nest-simulator-3.6/nestkernel/target_table.cpp:107: void nest::TargetTable::add_target(size_t, size_t, const nest::TargetData&): Assertion `syn_id < secondary_send_buffer_pos_[ tid ][ lid ].size()' failed.
[host:23979] *** Process received signal ***
[host:23979] Signal: Aborted (6)
[host:23979] Signal code: (-6)

My guess is that we are transferring something incorrectly (maybe during the initialization/setup phase?), but I'm not sure what the assert expects to have in secondary_send_buffer_pos_[ tid ][ lid ].size() and how this field should be set.

Best,

Michele