A New Approach for Parallel Region Growing Algorithm in Image Segmentation using MATLAB on GPU Architecture
KRISHNA KATRAGADDA, MANEESH BODDU
IMAGE SEGMENTATION
• Image segmentation is the process of partitioning a digital image into multiple segments.
• Segmentation plays a key role in image analysis.
• It is widely used in the medical field for locating tumors and other pathologies.
SEGMENTATION TECHNIQUES
• Threshold method
• Clustering method
• Histogram-based method
• Edge detection method
• Region growing method
• Watershed transformation method
• Partial differential equation-based methods
GRAPHICS PROCESSING UNIT (GPU)
A GPU is a programmable processor specialized for display functions. It renders images, animations and video for the computer's screen, and it has very high floating-point throughput, so it can accelerate the segmentation process. A GPU performs the math-heavy calculations quickly and frees up the CPU to do other things. Whereas a CPU uses a few cores focused on sequential processing, a GPU can have thousands of smaller cores designed to run many tasks in parallel.
CUDA
CUDA is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on its own GPUs. A CUDA program is a collection of threads running in parallel. At the instruction level, 32 consecutive threads in a thread block make up the minimum unit of execution, which is called a warp. Threads in a single block communicate through shared memory. CUDA consists of a set of C language extensions and a runtime library that provides APIs to control the GPU. The CUDA programming model thus allows programmers to better exploit the parallel power of the GPU for general-purpose computing.
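The block and warp arithmetic described on this slide can be made concrete with a small plain-C sketch. The helper names `blocks_for_pixels` and `warps_per_block` are hypothetical and are not part of CUDA; the warp size of 32 comes from the slide, and any block size passed in is the caller's choice.

```c
#include <assert.h>
#include <stddef.h>

#define WARP_SIZE 32 /* threads per warp, as stated on the slide */

/* Hypothetical helper: number of thread blocks needed so that every pixel
   gets its own thread. Rounds up so a partial last block is still counted. */
size_t blocks_for_pixels(size_t num_pixels, size_t threads_per_block)
{
    assert(threads_per_block > 0);
    return (num_pixels + threads_per_block - 1) / threads_per_block;
}

/* Hypothetical helper: number of warps that make up one block of the
   given size. A block that is not a multiple of 32 still occupies a
   whole warp for its remainder. */
size_t warps_per_block(size_t threads_per_block)
{
    assert(threads_per_block > 0);
    return (threads_per_block + WARP_SIZE - 1) / WARP_SIZE;
}
```

For example, a 256 x 256 image with 256 threads per block would need 256 blocks of 8 warps each under these assumptions.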
MATLAB
MATLAB (matrix laboratory) is a numerical computing environment. It is also widely used for image processing tasks. MATLAB supports parallel programs through integration with a parallel programming model. Here we use CUDA as the programming model and integrate MATLAB with the CUDA environment.
GOAL
To evaluate and compare the performance of serial and parallel region growing segmentation algorithms, where the parallel version takes advantage of the highly parallel architecture of the GPU. This work presents a serial implementation and a parallel GPU implementation of a region growing algorithm. The results suggest that parallel processing improves performance compared with serial processing.
APPROACH
Propose a different parallelization scheme that takes advantage of the highly parallel architecture of the GPU:
• Each pixel is processed by a different thread.
• Two new attributes are used to calculate spatial heterogeneity, to maximize computational efficiency.
• The algorithm is implemented in C and CUDA.
GPU ARCHITECTURE
GPUs are parallel processors that support fine-grained threads. Each multiprocessor contains processor cores, a multi-threaded instruction unit, a number of registers, and shared memory. CUDA is a C-based development environment for GPUs. Threads are organized into thread blocks and executed in groups called warps.
REGION GROWING ALGORITHM
• Convert the image matrix's range to 0 to 1 instead of 0 to 255.
• Calculate the region mean of the image.
• Calculate the homogeneity among pixels and form segments region by region.
• Homogeneity is decided against the given threshold value.
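The three building blocks listed on this slide can be sketched in plain C as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the homogeneity criterion is assumed to be "absolute difference from the region mean is at most the threshold", which the slides do not spell out.

```c
#include <assert.h>
#include <stddef.h>

/* Map 8-bit intensities (0..255) to floats in [0, 1]. */
void normalize_image(const unsigned char *src, float *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = (float)src[i] / 255.0f;
}

/* Fold one more pixel value into a running region mean.
   `count` is the region size before the new pixel is added. */
float update_region_mean(float mean, size_t count, float value)
{
    return (mean * (float)count + value) / (float)(count + 1);
}

/* Assumed homogeneity test: 1 if the pixel lies within `threshold`
   of the region mean, 0 otherwise. */
int is_homogeneous(float value, float region_mean, float threshold)
{
    float d = value - region_mean;
    if (d < 0.0f)
        d = -d; /* absolute difference without needing math.h */
    return d <= threshold;
}
```

Working in the 0 to 1 range keeps the threshold independent of the original bit depth of the image.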
HOMOGENEITY BETWEEN PIXELS
• Compare the initial seed point with its neighboring pixels.
• Pixels that are homogeneous are merged into the segment.
• A list of the dissimilar pixels, which are candidates for forming a different region, is created.
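A serial version of the seed-and-neighbors procedure above might look like the following C sketch. Several details are assumptions, since the slides do not give them: 4-connectivity, a running region mean updated as each pixel joins, and the name `grow_region`. Rejected pixels simply stay 0 in the mask here; the explicit candidate list mentioned on the slide is not built.

```c
#include <assert.h>
#include <stdlib.h>

/* Serial seeded region growing on a w x h image with values in [0, 1].
   On return, mask[i] is 1 for pixels in the region and 0 otherwise.
   Returns the region size, or -1 if the work queue cannot be allocated. */
int grow_region(const float *img, int w, int h,
                int seed_x, int seed_y, float threshold,
                unsigned char *mask)
{
    assert(img && mask && w > 0 && h > 0);
    assert(seed_x >= 0 && seed_x < w && seed_y >= 0 && seed_y < h);

    const int dx[4] = {1, -1, 0, 0}; /* assumed 4-connected neighborhood */
    const int dy[4] = {0, 0, 1, -1};
    int n = w * h;
    int *queue = malloc((size_t)n * sizeof *queue);
    if (!queue)
        return -1;

    for (int i = 0; i < n; ++i)
        mask[i] = 0;

    int head = 0, tail = 0;
    int seed = seed_y * w + seed_x;
    mask[seed] = 1;
    queue[tail++] = seed;
    float mean = img[seed]; /* region mean starts at the seed value */
    int count = 1;

    while (head < tail) {
        int p = queue[head++];
        int px = p % w, py = p / w;
        for (int k = 0; k < 4; ++k) {
            int nx = px + dx[k], ny = py + dy[k];
            if (nx < 0 || nx >= w || ny < 0 || ny >= h)
                continue; /* neighbor lies outside the image */
            int q = ny * w + nx;
            if (mask[q])
                continue; /* already part of the region */
            float d = img[q] - mean;
            if (d < 0.0f)
                d = -d;
            if (d <= threshold) { /* homogeneous: merge into the region */
                mask[q] = 1;
                mean = (mean * (float)count + img[q]) / (float)(count + 1);
                ++count;
                queue[tail++] = q; /* each pixel is queued at most once */
            }
        }
    }
    free(queue);
    return count;
}
```

Because a pixel is marked in the mask at the moment it is queued, the queue never holds more than `w * h` entries.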
PARALLEL ALGORITHM
The parallel algorithm assigns the iterations of the loop in the serial code to multiple threads that work independently. To achieve parallel processing, we integrate MATLAB with CUDA on GPUs.
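The idea of handing each loop iteration to its own thread can be illustrated in plain C, with two nested loops standing in for the GPU's grid of blocks and threads. This is only a CPU emulation under assumptions: the function names are hypothetical, and the per-pixel work shown (a threshold test against a given region mean) is chosen for illustration. In real CUDA the index would come from `blockIdx.x * blockDim.x + threadIdx.x`.

```c
#include <assert.h>
#include <stddef.h>

/* Body of one loop iteration, written the way a kernel body would be:
   it reads and writes only the pixel selected by its own index. */
void homogeneity_thread(size_t block, size_t thread, size_t threads_per_block,
                        const float *img, size_t n,
                        float region_mean, float threshold,
                        unsigned char *homogeneous)
{
    size_t i = block * threads_per_block + thread; /* global pixel index */
    if (i >= n)
        return; /* guard: the last block may be only partly used */
    float d = img[i] - region_mean;
    if (d < 0.0f)
        d = -d;
    homogeneous[i] = (unsigned char)(d <= threshold);
}

/* CPU stand-in for a kernel launch: the two loops enumerate the work
   that the GPU would execute concurrently. */
void launch_homogeneity(const float *img, size_t n, float region_mean,
                        float threshold, size_t threads_per_block,
                        unsigned char *homogeneous)
{
    assert(img && homogeneous && threads_per_block > 0);
    size_t blocks = (n + threads_per_block - 1) / threads_per_block;
    for (size_t b = 0; b < blocks; ++b)
        for (size_t t = 0; t < threads_per_block; ++t)
            homogeneity_thread(b, t, threads_per_block, img, n,
                               region_mean, threshold, homogeneous);
}
```

Since no iteration depends on another iteration's output, the order in which the loops run does not affect the result, which is the property that makes the loop safe to parallelize.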
Steps:
• As in the serial implementation, the matrix range is converted to 0 to 1 instead of 0 to 255.
• Define the GPU kernel launch with <<< number of blocks, threads per block, dynamic shared memory per block, associated stream >>> using CUDA.
• Calculate the region mean.
• The GPU kernel creates two shared variables to store the region mean and the homogeneity of pixels.
• Pixels are segmented based on homogeneity, and the results are stored in shared memory.
• Finally, the segmented image is displayed using MATLAB functions.
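One way these steps could fit together is sketched below in plain C, emulating the GPU on the CPU. It is not the authors' kernel, and the iteration scheme is an assumption: the region mean is recomputed once per sweep (a reduction on a GPU), then every pixel is examined by its own "thread", and sweeps repeat until the region stops growing. The names `sweep_pixel` and `grow_region_sweeps` are hypothetical. In CUDA, each sweep would be a launch of the form `kernel<<<blocks, threads, shared_bytes, stream>>>(...)`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One "thread": decides whether pixel i belongs to the region after this
   sweep. It reads only the previous mask `cur`, so all threads of a sweep
   are independent of each other. */
static unsigned char sweep_pixel(const float *img, int w, int h, int i,
                                 const unsigned char *cur,
                                 float mean, float threshold)
{
    if (cur[i])
        return 1; /* already in the region */
    int x = i % w, y = i / w;
    int touches = (x > 0 && cur[i - 1]) || (x < w - 1 && cur[i + 1]) ||
                  (y > 0 && cur[i - w]) || (y < h - 1 && cur[i + w]);
    if (!touches)
        return 0; /* not adjacent to the region yet */
    float d = img[i] - mean;
    if (d < 0.0f)
        d = -d;
    return (unsigned char)(d <= threshold);
}

/* Grows a region from pixel index `seed` by repeated sweeps.
   Returns the region size, or -1 on allocation failure. */
int grow_region_sweeps(const float *img, int w, int h, int seed,
                       float threshold, unsigned char *mask)
{
    assert(img && mask && w > 0 && h > 0 && seed >= 0 && seed < w * h);
    int n = w * h;
    unsigned char *next = malloc((size_t)n);
    if (!next)
        return -1;
    memset(mask, 0, (size_t)n);
    mask[seed] = 1;
    int count = 1, changed = 1;
    while (changed) {
        /* Region mean: a parallel reduction on a GPU, a plain loop here. */
        float sum = 0.0f;
        for (int i = 0; i < n; ++i)
            if (mask[i])
                sum += img[i];
        float mean = sum / (float)count;
        /* Sweep: each loop iteration corresponds to one GPU thread. */
        for (int i = 0; i < n; ++i)
            next[i] = sweep_pixel(img, w, h, i, mask, mean, threshold);
        int new_count = 0;
        for (int i = 0; i < n; ++i)
            new_count += next[i];
        changed = (new_count != count);
        count = new_count;
        memcpy(mask, next, (size_t)n); /* publish the sweep's results */
    }
    free(next);
    return count;
}
```

Writing results to a second buffer and copying them back after the sweep mimics threads that all read the same snapshot of the mask. Because the mean changes once per sweep rather than once per pixel, the result can differ from a serial implementation that updates the mean after every merge.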
PERFORMANCE ANALYSIS
We consider an MRI brain image for segmentation using both the serial and the parallel algorithm.
The graph shows the performance of the CPU and the GPU in terms of execution time for different seed points chosen for image segmentation. The GPU takes much less execution time than the CPU.
CONCLUSION
The parallel algorithm essentially assigns a separate thread to each image pixel so as to exploit the GPU's support for fine-grained threads and the large number of processing elements available. These performance gains can be obtained with a low investment in hardware, as GPUs with increasing processing power are available on the market at declining prices. The results could be improved further by using a newer NVIDIA graphics card and by selecting seed points automatically rather than manually.
Thank You
