> concurrent copy and execution
susan7
Posted Nov 6 2009, 09:35 PM
Post #1
Hello,

I want to increase the performance of my program by overlapping memory copies with kernel execution, using nstreams streams as follows:

threads = dim3(512, 1);
blocks = dim3(n / (nstreams * threads.x), 1);
memset(a, 255, nbytes);     // set host memory bits to all 1s, for testing correctness
cudaMemset(d_a, 0, nbytes); // set device memory to all 0s, for testing correctness
cudaEventRecord(start_event, 0);
for (int k = 0; k < nreps; k++)
{
    // asynchronously launch nstreams kernels, each operating on its own portion of data
    for (int i = 0; i < nstreams; i++)
        init_array<<<blocks, threads, 0, streams[i]>>>(d_a + i * n / nstreams, d_c, niterations);

    // asynchronously launch nstreams memcopies. Note that the memcopy in stream x will only
    // commence executing when all previous CUDA calls in stream x have completed
    for (int i = 0; i < nstreams; i++)
        cudaMemcpyAsync(a + i * n / nstreams, d_a + i * n / nstreams, nbytes / nstreams, cudaMemcpyDeviceToHost, streams[i]);
}
cudaEventRecord(stop_event, 0);
cudaEventSynchronize(stop_event);
CUDA_SAFE_CALL( cudaEventElapsedTime(&elapsed_time, start_event, stop_event) );
printf("%d streams:\t%.2f (%.2f expected with compute capability 1.1 or later)\n", nstreams, elapsed_time / nreps, time_kernel + time_memcpy / nstreams);


That's for one kernel and one copy. Could anyone tell me how to do it with many copies and many kernels? Should I loop over the copies and then loop over the kernels, and increase the '0' in the execution configuration of kernel2 by 1, or how?



Thank you for your help.
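
For reference, here is a minimal sketch of one way the multi-kernel, multi-copy case could be laid out. It is an illustration under stated assumptions, not a confirmed answer from the thread. The kernels kernel1 and kernel2 and the helper run_pipeline are hypothetical stand-ins, and n is assumed to be divisible by nstreams. In the execution configuration <<<blocks, threads, 0, stream>>>, the third argument is the dynamic shared memory size in bytes and the fourth is the stream, so the '0' stays as it is and only the stream argument selects where the work is queued. Operations issued to the same stream run in issue order, so each chunk's copies and kernels are simply queued on that chunk's stream.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernels standing in for the real work.
__global__ void kernel1(int *data, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) data[idx] += 1;
}

__global__ void kernel2(int *data, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) data[idx] *= 2;
}

// Hypothetical helper: returns the number of wrong elements (0 means success),
// or -1 if a CUDA call failed. Assumes n is divisible by nstreams.
int run_pipeline(int n, int nstreams) {
    const int chunk = n / nstreams;
    const size_t chunk_bytes = chunk * sizeof(int);

    int *h_a = nullptr, *d_a = nullptr;
    // Pinned (page-locked) host memory is required for truly asynchronous copies.
    if (cudaMallocHost(&h_a, n * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_a, n * sizeof(int)) != cudaSuccess) {
        cudaFreeHost(h_a);
        return -1;
    }
    for (int i = 0; i < n; i++) h_a[i] = i;

    cudaStream_t *streams = new cudaStream_t[nstreams];
    for (int i = 0; i < nstreams; i++) cudaStreamCreate(&streams[i]);

    dim3 threads(256);
    dim3 blocks((chunk + threads.x - 1) / threads.x);

    for (int i = 0; i < nstreams; i++) {
        int off = i * chunk;
        // Everything for chunk i goes into streams[i]; within a stream the
        // copy-in, both kernels, and the copy-out execute in issue order.
        cudaMemcpyAsync(d_a + off, h_a + off, chunk_bytes,
                        cudaMemcpyHostToDevice, streams[i]);
        kernel1<<<blocks, threads, 0, streams[i]>>>(d_a + off, chunk);
        kernel2<<<blocks, threads, 0, streams[i]>>>(d_a + off, chunk);
        cudaMemcpyAsync(h_a + off, d_a + off, chunk_bytes,
                        cudaMemcpyDeviceToHost, streams[i]);
    }

    // Wait for all streams to finish before reading the results on the host.
    int errors = (cudaDeviceSynchronize() == cudaSuccess) ? 0 : -1;
    if (errors == 0) {
        for (int i = 0; i < n; i++)
            if (h_a[i] != (i + 1) * 2) errors++;
    }

    for (int i = 0; i < nstreams; i++) cudaStreamDestroy(streams[i]);
    delete[] streams;
    cudaFree(d_a);
    cudaFreeHost(h_a);
    return errors;
}

int main() {
    int errors = run_pipeline(1 << 20, 4);
    printf("mismatches: %d\n", errors);
    return errors == 0 ? 0 : 1;
}
```

Both this per-stream ordering and the loop-over-kernels-then-loop-over-copies ordering used in the code above give correct results. How much overlap each one actually achieves depends on the device, so it is worth timing both on the target hardware.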

