Hello,
I have tried to run one of the e-prop tutorials (eprop_supervised_regression_handwriting_bsshslm_2020.py) and ran into problems when activating MPI.
I do not know whether the problem lies in the e-prop C++ implementation or in the Python tutorial code (in some cases, but not all of them, it looks like the latter).
I have two files (hostfile_orig_1 and hostfile_orig_2) that define which nodes of the cluster the program runs on. Their contents are:
hostfile_orig_1:
node0 slots=1
node1 slots=1
hostfile_orig_2:
node0 slots=2
node0 slots=2
The 'slots' key specifies how many MPI processes can run on a particular node.
Depending on the number of processes, the errors differ slightly. In the following examples, I use "total_num_virtual_procs": 2 on line 173 of the tutorial file, unless noted otherwise.
Below I give each command and the resulting output.
Command: mpirun -np 1 -hostfile hostfile_orig_1 python3 eprop_supervised_regression_handwriting_bsshslm_2020.py
Output: With a "serial" execution like this one, everything is OK.
Command: mpirun -np 2 -hostfile hostfile_orig_2 python3 eprop_supervised_regression_handwriting_bsshslm_2020.py
Output:
Traceback (most recent call last):
  File "eprop_supervised_regression_handwriting_bsshslm_2020.py", line 404, in <module>
    nest.GetConnections(nrns_rec[0], nrns_rec[1:3]).set([params_init_optimizer] * 2)
  File "/home/neurobit/local/nest_3.9/lib64/python3.8/site-packages/nest/lib/hl_api_types.py", line 945, in set
    raise TypeError("status dict must be a dict, or a list of dicts of length {}".format(self.__len__()))
TypeError: status dict must be a dict, or a list of dicts of length 1
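One possible reading of this error, not confirmed here: under MPI each rank may see only its local connections, so on this rank the collection has length 1 while line 404 passes a list of length 2. The pure-Python sketch below uses a hypothetical stand-in class (FakeConnections is not part of NEST, and the parameter values are placeholders) that mimics the length check from the traceback. It shows that sizing the list by len(conns) avoids the mismatch.

```python
class FakeConnections:
    """Hypothetical stand-in for a NEST connection collection (not the real API)."""

    def __init__(self, n_local):
        self.n_local = n_local  # number of connections visible on this rank
        self.status = None

    def __len__(self):
        return self.n_local

    def set(self, params):
        # Mimics the length check reported in the traceback above.
        if isinstance(params, dict):
            params = [params] * len(self)
        if not isinstance(params, list) or len(params) != len(self):
            raise TypeError(
                "status dict must be a dict, or a list of dicts of length {}".format(len(self))
            )
        self.status = params


params_init_optimizer = {"example_key": 1.0}  # placeholder, not the tutorial's values

conns = FakeConnections(n_local=1)  # assumption: only 1 of the 2 connections is local

try:
    conns.set([params_init_optimizer] * 2)  # hard-coded length, as on line 404
    hardcoded_ok = True
except TypeError:
    hardcoded_ok = False

conns.set([params_init_optimizer] * len(conns))  # length follows the local count
adaptive_ok = conns.status is not None
```

Whether the same change is appropriate in the tutorial itself depends on how NEST distributes these connections across ranks, which this sketch does not model.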
Command: mpirun -np 2 -hostfile hostfile_orig_2 python3 eprop_supervised_regression_handwriting_bsshslm_2020.py (with line 404 commented out)
Output:
Traceback (most recent call last):
  File "eprop_supervised_regression_handwriting_bsshslm_2020.py", line 545, in <module>
    readout_signal = readout_signal.reshape((n_out, n_iter, batch_size, steps["sequence"]))
ValueError: cannot reshape array of size 364800 into shape (2,200,1,1824)
Command: mpirun -np 2 -hostfile hostfile_orig_1 python3 eprop_supervised_regression_handwriting_bsshslm_2020.py (with line 404 commented out)
Output:
Traceback (most recent call last):
  File "eprop_supervised_regression_handwriting_bsshslm_2020.py", line 545, in <module>
    readout_signal = readout_signal.reshape((n_out, n_iter, batch_size, steps["sequence"]))
ValueError: cannot reshape array of size 364800 into shape (2,200,1,1824)
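For reference, the sizes in this error can be checked by hand. The target shape needs 2 x 200 x 1 x 1824 = 729600 elements, while the array holds 364800, which is exactly half. With two MPI processes this would be consistent with each rank holding only half of the recorded data, though that is an assumption and not something verified here. A minimal check using only the numbers from the error message:

```python
import math

# Shape requested by the reshape call in the traceback.
n_out, n_iter, batch_size, sequence_steps = 2, 200, 1, 1824
expected_size = math.prod((n_out, n_iter, batch_size, sequence_steps))

# Array size reported in the ValueError message.
actual_size = 364800

ratio = expected_size / actual_size
print(expected_size, actual_size, ratio)  # 729600 364800 2.0
```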
Command: mpirun -np 4 -hostfile hostfile_orig_2 python3 eprop_supervised_regression_handwriting_bsshslm_2020.py (with line 404 commented out and "total_num_virtual_procs": 4 on line 173 of the tutorial file)
Output:
Traceback (most recent call last):
  File "eprop_supervised_regression_handwriting_bsshslm_2020.py", line 493, in <module>
    "rec_out": get_weights(nrns_rec, nrns_out),
  File "eprop_supervised_regression_handwriting_bsshslm_2020.py", line 482, in get_weights
    conns["senders"] = np.array(conns["source"]) - np.min(conns["source"])
TypeError: tuple indices must be integers or slices, not str
In this last case, the program stops and hangs.
If you want, I can submit a bug report on GitHub.
Xavier
_______________________________________________
NEST Users mailing list -- users@nest-simulator.org