
CosmoMC segmentation fault

Posted: March 16 2024
by Akhilesh Nautiyal(akhi)
Hello everyone,

I am trying to run CosmoMC on a virtual machine running Ubuntu 22.04 with gfortran 11.4.0. The MPI version is mpiexec (OpenRTE) 4.1.2.
CosmoMC compiles without any errors, but running it produces a segmentation fault. Running cosmomc_debug gives the following output.

Code:

   export OMP_NUM_THREADS=0
   nohup mpiexec -np 4 ./cosmomc_debug test_planck.ini >out.txt &
   
    Number of MPI processes:           4
 file_root:test
 Random seeds: 26005, 25593 rand_inst:   2
 Random seeds: 26105, 25593 rand_inst:   3
 Random seeds: 26205, 25593 rand_inst:   4
 Random seeds: 25905, 25593 rand_inst:   1
 Using clik with likelihood file ./data/clik_14.0/hi_l/plik/plik_rd12_HM_v22b_TTTEEE.clik
----
clik version plc_3.1
  smica
Checking likelihood './data/clik_14.0/hi_l/plik/plik_rd12_HM_v22b_TTTEEE.clik' on test data. got -1172.47 expected -1172.47 (diff -4.34056e-07)
----
   TT from l=0 to l=        2508
   EE from l=0 to l=        2508
   TE from l=0 to l=        2508
----
clik version plc_3.1
  gibbs_gauss b13c8fda-1837-41b5-ae2d-78d6b723fcf1
Checking likelihood './data/clik_14.0/low_l/commander/commander_dx12_v3_2_29.clik' on test data. got -11.6257 expected -11.6257 (diff -1.07424e-09)
----
   TT from l=0 to l=          29
Initializing SimAll
----
clik version plc_3.1
  simall simall_EE_BB_TE
Checking likelihood './data/clik_14.0/low_l/simall/simall_100x143_offlike5_EE_Aplanck_B.clik' on test data. got -197.99 expected -197.99 (diff -4.1778e-08)
----
   EE from l=0 to l=          29
----
clik version plc_3.1
  smica
----
clik version plc_3.1
  smica
Checking likelihood './data/clik_14.0/hi_l/plik/plik_rd12_HM_v22b_TTTEEE.clik' on test data. got -1172.47 expected -1172.47 (diff -4.34056e-07)
----
   TT from l=0 to l=        2508
   EE from l=0 to l=        2508
   TE from l=0 to l=        2508
----
clik version plc_3.1
  gibbs_gauss b13c8fda-1837-41b5-ae2d-78d6b723fcf1
Checking likelihood './data/clik_14.0/low_l/commander/commander_dx12_v3_2_29.clik' on test data. got -11.6257 expected -11.6257 (diff -1.07424e-09)
----
   TT from l=0 to l=          29
Checking likelihood './data/clik_14.0/hi_l/plik/plik_rd12_HM_v22b_TTTEEE.clik' on test data. got -1172.47 expected -1172.47 (diff -4.34056e-07)
----
   TT from l=0 to l=        2508
   EE from l=0 to l=        2508
   TE from l=0 to l=        2508
----
clik version plc_3.1
  gibbs_gauss b13c8fda-1837-41b5-ae2d-78d6b723fcf1
Checking likelihood './data/clik_14.0/low_l/commander/commander_dx12_v3_2_29.clik' on test data. got -11.6257 expected -11.6257 (diff -1.07424e-09)
----
   TT from l=0 to l=          29
Initializing SimAll
Initializing SimAll
----
clik version plc_3.1
  simall simall_EE_BB_TE
----
clik version plc_3.1
  simall simall_EE_BB_TE
Checking likelihood './data/clik_14.0/low_l/simall/simall_100x143_offlike5_EE_Aplanck_B.clik' on test data. got -197.99 expected -197.99 (diff -4.1778e-08)
----
   EE from l=0 to l=          29
Checking likelihood './data/clik_14.0/low_l/simall/simall_100x143_offlike5_EE_Aplanck_B.clik' on test data. got -197.99 expected -197.99 (diff -4.1778e-08)
----
   EE from l=0 to l=          29
----
clik version plc_3.1
  smica
Checking likelihood './data/clik_14.0/hi_l/plik/plik_rd12_HM_v22b_TTTEEE.clik' on test data. got -1172.47 expected -1172.47 (diff -4.34056e-07)
----
   TT from l=0 to l=        2508
   EE from l=0 to l=        2508
   TE from l=0 to l=        2508
 Clik will run with the following nuisance parameters:
 
A_cib_217^@
 cib_index^@
 xi_sz_cib^@
 A_sz^@
 ps_A_100_100^@
 ps_A_143_143^@
 ps_A_143_217^@
 ps_A_217_217^@
 ksz_norm^@
 gal545_A_100^@
 gal545_A_143^@
 gal545_A_143_217^@
 gal545_A_217^@
 galf_EE_A_100^@
 galf_EE_A_100_143^@
 galf_EE_A_100_217^@
 galf_EE_A_143^@
 galf_EE_A_143_217^@
 galf_EE_A_217^@
 galf_EE_index^@
 galf_TE_A_100^@
 galf_TE_A_100_143^@
 galf_TE_A_100_217^@
 galf_TE_A_143^@
 galf_TE_A_143_217^@
 galf_TE_A_217^@
 galf_TE_index^@
 A_cnoise_e2e_100_100_EE^@
 A_cnoise_e2e_143_143_EE^@
 A_cnoise_e2e_217_217_EE^@
 A_sbpx_100_100_TT^@
 A_sbpx_143_143_TT^@
 A_sbpx_143_217_TT^@
 A_sbpx_217_217_TT^@
 A_sbpx_100_100_EE^@
 A_sbpx_100_143_EE^@
 A_sbpx_100_217_EE^@
 A_sbpx_143_143_EE^@
 A_sbpx_143_217_EE^@
 A_sbpx_217_217_EE^@
 calib_100T^@
 calib_217T^@
 calib_100P^@
 calib_143P^@
 calib_217P^@
 A_pol^@
 A_planck^@
 Using clik with likelihood file ./data/clik_14.0/low_l/commander/commander_dx12_v3_2_29.clik
----
clik version plc_3.1
  gibbs_gauss b13c8fda-1837-41b5-ae2d-78d6b723fcf1
Checking likelihood './data/clik_14.0/low_l/commander/commander_dx12_v3_2_29.clik' on test data. got -11.6257 expected -11.6257 (diff -1.07424e-09)
----
   TT from l=0 to l=          29
 Clik will run with the following nuisance parameters:
 A_planck^@
 Using clik with likelihood file ./data/clik_14.0/low_l/simall/simall_100x143_offlike5_EE_Aplanck_B.clik
Initializing SimAll
----
clik version plc_3.1
  simall simall_EE_BB_TE
Checking likelihood './data/clik_14.0/low_l/simall/simall_100x143_offlike5_EE_Aplanck_B.clik' on test data. got -197.99 expected -197.99 (diff -4.1778e-08)
----
   EE from l=0 to l=          29
 Clik will run with the following nuisance parameters:
 A_planck^@
 read jla dataset data/Pantheon/full_long.dataset
 reading WL data set: DES_1YR_final
 read jla dataset data/Pantheon/full_long.dataset
 reading WL data set: DES_1YR_final
 read jla dataset data/Pantheon/full_long.dataset
 reading WL data set: DES_1YR_final
 read jla dataset data/Pantheon/full_long.dataset
 reading BAO data set: 6DF
 reading BAO data set: MGS
 reading BAO data set: DR12BAO
 reading WL data set: DES_1YR_final
 Doing non-linear Pk: T
 Doing CMB lensing: T
 Doing non-linear lensing: T
 TT lmax =  2508
 EE lmax =  2508
 ET lmax =  2508
 BB lmax =  2500
 PP lmax =  2500
 lmax_computed_cl  =  2508
 Computing tensors: F
 max_eta_k         =    14000.0000
 transfer kmax     =    10.1999998
 adding parameters for: smicadx12_Dec5_ftl_mv2_ndclpp_p_teb_consext8
 adding parameters for: 6DF
 adding parameters for: JLA
 adding parameters for: DR12BAO
 adding parameters for: MGS
 adding parameters for: commander_dx12_v3_2_29
 adding parameters for: simall_100x143_offlike5_EE_Aplanck_B
 adding parameters for: BK15_dust
 adding parameters for: plik_rd12_HM_v22b_TTTEEE
 adding parameters for: DES_1YR_final
 Fast divided into            3  blocks
 Block breaks at:           15          35
 54 parameters ( 7 slow ( 0 semi-slow), 47 fast ( 0 semi-fast))
 
 Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x14fb05f72960 in ???
#1  0x14fb05f71ac5 in ???
#2  0x14fb05c1551f in ???
        at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#3  0x55685f96f47e in __fileutils_MOD_writeitemtxt
        at ../FileUtils.f90:1572
#4  0x55685f97029d in __fileutils_MOD_writeinlineitem
        at ../FileUtils.f90:926
#5  0x55685f970360 in __fileutils_MOD_writeinlineitems
        at ../FileUtils.f90:903
#6  0x55685f9702fa in __fileutils_MOD_writeitemstxt
        at ../FileUtils.f90:917
#7  0x55685f55ee56 in __paramnames_MOD_paramnames_writefile
        at /home/user/akhilesh/CosmoMC-master/source/ObjectParamNames.f90:398
#8  0x55685f5924c7 in __baseparameters_MOD_tbaseparameters_outputparamnames
        at /home/user/akhilesh/CosmoMC-master/source/BaseParameters.f90:264
#9  0x55685f77499b in cosmomc
        at /home/user/akhilesh/CosmoMC-master/source/driver.F90:210
#10  0x55685f775f8c in main
        at /home/user/akhilesh/CosmoMC-master/source/driver.F90:3
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 0 on node ubuntu204ltsserver exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------


   
     
I tried with MPICH, but the same error occurred.
I also tried running the command

Code:

ulimit -s unlimited

before running cosmomc, but the error remains.

Can anyone please help me resolve this issue?

Thanks.

Re: CosmoMC segmentation fault

Posted: March 18 2024
by Antony Lewis
Don't know. It seems to be crashing in general-purpose code, which suggests earlier memory corruption (or possibly a compiler bug). If it works without clik, it may be a clik issue.

Re: CosmoMC segmentation fault

Posted: March 18 2024
by Akhilesh Nautiyal(akhi)
Dear Antony,

Thanks for the reply.
The issue remains even when running without the Planck likelihood code and without MPI.
Here is the output.

Code:

./cosmomc test.ini

Code:

file_root:test
 Random seeds:  8583, 29119 rand_inst:   0
 read jla dataset data/Pantheon/full_long.dataset
 reading BAO data set: 6DF
 reading BAO data set: MGS
 reading BAO data set: DR12BAO
 reading WL data set: DES_1YR_final
 Doing non-linear Pk: T
 Doing CMB lensing: T
 Doing non-linear lensing: T
 TT lmax =  2500
 EE lmax =  2500
 ET lmax =  2500
 BB lmax =  2500
 PP lmax =  2500
 lmax_computed_cl  =  2500
 Computing tensors: F
 max_eta_k         =    14000.0000    
 transfer kmax     =    10.1999998    
 adding parameters for: smicadx12_Dec5_ftl_mv2_ndclpp_p_teb_consext8
 adding parameters for: MGS
 adding parameters for: DR12BAO
 adding parameters for: 6DF
 adding parameters for: JLA
 adding parameters for: BK15_dust
 adding parameters for: DES_1YR_final
 Fast divided into            2  blocks
 Block breaks at:           15
 34 parameters ( 7 slow ( 0 semi-slow), 27 fast ( 0 semi-fast))

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x14fe3215a960 in ???
#1  0x14fe32159ac5 in ???
#2  0x14fe31dfd51f in ???
	at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#3  0x5607159ec336 in __fileutils_MOD_writeitemtxt
	at ../FileUtils.f90:1572
#4  0x5607159ed155 in __fileutils_MOD_writeinlineitem
	at ../FileUtils.f90:926
#5  0x5607159ed218 in __fileutils_MOD_writeinlineitems
	at ../FileUtils.f90:903
#6  0x5607159ed1b2 in __fileutils_MOD_writeitemstxt
	at ../FileUtils.f90:917
#7  0x5607155e8d60 in __paramnames_MOD_paramnames_writefile
	at /home/user/akhilesh/CosmoMC-master/source/ObjectParamNames.f90:398
#8  0x56071561c27a in __baseparameters_MOD_tbaseparameters_outputparamnames
	at /home/user/akhilesh/CosmoMC-master/source/BaseParameters.f90:264
#9  0x5607157f1c76 in cosmomc
	at /home/user/akhilesh/CosmoMC-master/source/driver.F90:210
#10  0x5607157f31ac in main
	at /home/user/akhilesh/CosmoMC-master/source/driver.F90:3
Segmentation fault (core dumped)


Re: CosmoMC segmentation fault

Posted: March 19 2024
by Antony Lewis
You'll have to debug which string is causing the issue when writing (or use Cobaya, or try another compiler).
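
One cheap first check on the strings being written, offered only as a sketch: the MPI log in the first post prints ^@ after every clik nuisance parameter name, which is how a NUL byte is usually rendered. The hypothetical helper nul_lines below (the name is not part of CosmoMC) lists the lines of a file that contain NUL bytes. It assumes a POSIX shell with cat -v available. It cannot by itself explain the crash in the run without clik; for that, a debugger stopped at FileUtils.f90:1572 would still be needed.

```shell
# Hypothetical helper, not part of CosmoMC: print "N:line" for every line of
# the file "$1" that contains a NUL byte.
nul_lines() {
    # cat -v renders each NUL byte as the two visible characters ^@;
    # grep -F matches that literal text and -n prefixes the line number.
    # Caveat: a line that already contains a literal "^@" also matches.
    cat -v "$1" | grep -n -F '^@'
}
```

For example, nul_lines out.txt would scan the log written by the mpiexec command in the first post. Any line it reports is a candidate string worth inspecting more closely.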

Re: CosmoMC segmentation fault

Posted: March 20 2024
by Akhilesh Nautiyal(akhi)
Dear Antony,

Thanks for the reply.

I will try that.