
Created Jul 10, 2021 by Iris Chang (@ihchang, Owner)

PyCUDA ERROR

@eslaught @jpblaschke @vinayr When I run the Legion code with -c and -f enabled, it completes without error when N_orientations is small (~10k) and N_batch_size=100. However, as I gradually increase N_orientations, the run sometimes freezes or produces output like the following:

c05n11 starts Orientation Matching.
c05n11 starts Orientation Matching.
c05n11 starts Orientation Matching.
c05n11 starts Orientation Matching.
c05n11 starts Orientation Matching.
c05n11 starts Orientation Matching.
CUDA error at src/memtransfer_wrapper.cu:342 code=2(cudaErrorMemoryAllocation) "cudaMalloc(&d_plan->sortidx, M*sizeof(int))"
CUDA error at src/memtransfer_wrapper.cu:342 code=2(cudaErrorMemoryAllocation) "cudaMalloc(&d_plan->sortidx, M*sizeof(int))"
CUDA error at src/memtransfer_wrapper.cu:342 code=2(cudaErrorMemoryAllocation) "cudaMalloc(&d_plan->sortidx, M*sizeof(int))"
CUDA error at src/memtransfer_wrapper.cu:342 code=2(cudaErrorMemoryAllocation) "cudaMalloc(&d_plan->sortidx, M*sizeof(int))"
CUDA error at src/memtransfer_wrapper.cu:342 code=2(cudaErrorMemoryAllocation) "cudaMalloc(&d_plan->sortidx, M*sizeof(int))"
-------------------------------------------------------------------
PyCUDA ERROR: The context stack was not empty upon module cleanup.
-------------------------------------------------------------------
A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.
-------------------------------------------------------------------
*** Caught a fatal signal (proc 5): SIGABRT(6)
NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment to generate a backtrace.
NOTICE: We recommend linking the debug version of GASNet to assist you in resolving this application issue.
-------------------------------------------------------------------
PyCUDA ERROR: The context stack was not empty upon module cleanup.
-------------------------------------------------------------------
A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.
-------------------------------------------------------------------
*** Caught a fatal signal (proc 1): SIGABRT(6)
NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment to generate a backtrace.
NOTICE: We recommend linking the debug version of GASNet to assist you in resolving this application issue.
-------------------------------------------------------------------
PyCUDA ERROR: The context stack was not empty upon module cleanup.
-------------------------------------------------------------------
A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.
-------------------------------------------------------------------
*** Caught a fatal signal (proc 2): SIGABRT(6)
NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment to generate a backtrace.
NOTICE: We recommend linking the debug version of GASNet to assist you in resolving this application issue.
-------------------------------------------------------------------
PyCUDA ERROR: The context stack was not empty upon module cleanup.
-------------------------------------------------------------------
A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.
-------------------------------------------------------------------
*** Caught a fatal signal (proc 0): SIGABRT(6)
NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment to generate a backtrace.
NOTICE: We recommend linking the debug version of GASNet to assist you in resolving this application issue.
-------------------------------------------------------------------
PyCUDA ERROR: The context stack was not empty upon module cleanup.
-------------------------------------------------------------------
A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.
-------------------------------------------------------------------
*** Caught a fatal signal (proc 3): SIGABRT(6)
NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment to generate a backtrace.
NOTICE: We recommend linking the debug version of GASNet to assist you in resolving this application issue.
CUDA error at src/memtransfer_wrapper.cu:306 code=2(cudaErrorMemoryAllocation) "cudaMalloc(&d_plan->fw, maxbatchsize*nf1*nf2*nf3* sizeof(CUCPX))"
-------------------------------------------------------------------
PyCUDA ERROR: The context stack was not empty upon module cleanup.
-------------------------------------------------------------------
A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.
-------------------------------------------------------------------
*** Caught a fatal signal (proc 4): SIGABRT(6)
NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment to generate a backtrace.
NOTICE: We recommend linking the debug version of GASNet to assist you in resolving this application issue.
ERROR:  One or more process (first noticed rank 2) terminated with signal 6 (core dumped)
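
Both failing calls in the log are cudaMalloc requests whose sizes appear in the error messages: `M*sizeof(int)` for `d_plan->sortidx` and `maxbatchsize*nf1*nf2*nf3*sizeof(CUCPX)` for `d_plan->fw`. To check whether a given problem size could plausibly fit in GPU memory, here is a minimal sketch that evaluates only those two expressions. It assumes `sizeof(int)` is 4 bytes and that `CUCPX` is single-precision complex (8 bytes; pass 16 for double precision). The function name `plan_alloc_bytes` and the example sizes are hypothetical and do not come from the spinifel code. This report does not state how `M` and `nf1..nf3` relate to N_orientations, so those values have to be read from the actual run.

```python
def plan_alloc_bytes(M, nf1, nf2, nf3, maxbatchsize=1, int_bytes=4, cpx_bytes=8):
    """Estimate the bytes requested by the two cudaMalloc calls seen in the log.

    int_bytes and cpx_bytes are assumptions (4-byte int, 8-byte single-precision
    complex); use cpx_bytes=16 if the build uses double precision.
    """
    # cudaMalloc(&d_plan->sortidx, M*sizeof(int))
    sortidx = M * int_bytes
    # cudaMalloc(&d_plan->fw, maxbatchsize*nf1*nf2*nf3*sizeof(CUCPX))
    fw = maxbatchsize * nf1 * nf2 * nf3 * cpx_bytes
    return {"sortidx": sortidx, "fw": fw, "total": sortidx + fw}


def to_gib(n_bytes):
    """Convert a byte count to GiB for easier comparison with GPU memory size."""
    return n_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to illustrate the arithmetic.
    est = plan_alloc_bytes(M=10_000_000, nf1=256, nf2=256, nf3=256)
    for name, n_bytes in est.items():
        print(f"{name}: {to_gib(n_bytes):.3f} GiB")
```

This counts only the two allocations named in the error messages. A real plan makes other allocations as well, and if several processes share one GPU their requests add up, so the result is a lower bound on what each process needs rather than a full memory budget.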
Edited Jul 10, 2021 by Iris Chang