  • DocumentCode
    75946
  • Title
    PSO Efficient Implementation on GPUs Using Low Latency Memory
  • Author
    Silva, Eric H. M. ; Bastos Filho, Carmelo J. A.
  • Author_Institution
    Univ. de Pernambuco (UPE), Recife, Brazil
  • Volume
    13
  • Issue
    5
  • fYear
    2015
  • fDate
    May 2015
  • Firstpage
    1619
  • Lastpage
    1624
  • Abstract
    This paper proposes an efficient implementation of the Particle Swarm Optimization (PSO) algorithm that uses the shared memory available in the Graphics Processing Units (GPUs) of CUDA (Compute Unified Device Architecture) platforms. In our proposal, each dimension of each particle is mapped to a thread, and the threads are executed in parallel within a GPU block. Since a GPU block has a maximum number of threads that can run in parallel, we propose to use multiple sub-swarms. Each sub-swarm is executed in one GPU block, aiming to maximize data alignment and avoid instruction bifurcations. We also propose two communication mechanisms and two topologies that allow the sub-swarms to exchange information and collaborate through the GPU global memory. The results for 8 sub-swarms, each with 32 particles and 32 dimensions, show speedups of up to 100 times and 5 times when compared to the serial implementation and to a state-of-the-art PSO implementation for CUDA, respectively. Our proposal allows one to deploy PSO algorithms in continuous optimization problems that have many input variables. This type of problem is very common in engineering.
  • Keywords
    graphics processing units; parallel architectures; particle swarm optimisation; CUDA platforms; Compute Unified Device Architecture platforms; GPU blocks; GPU global memory; PSO algorithm; continuous optimization problems; graphic processing units; low latency memory; particle swarm optimization; shared memory; Central Processing Unit; Computer architecture; Graphics processing units; Instruction sets; Kernel; Particle swarm optimization; Surges; CUDA; Graphics Processing Units; Parallel Computing; Particle Swarm Optimization; Shared Memory; Swarm intelligence;
  • fLanguage
    English
  • Journal_Title
    Latin America Transactions, IEEE (Revista IEEE America Latina)
  • Publisher
    IEEE
  • ISSN
    1548-0992
  • Type
    jour
  • DOI
    10.1109/TLA.2015.7112023
  • Filename
    7112023
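
The thread mapping described in the abstract (one thread per dimension of each particle, one sub-swarm per GPU block) can be illustrated with a small serial emulation. This is only a sketch under stated assumptions, not the authors' implementation: the sphere objective, the inertia and acceleration coefficients, the initialization range, the per-sub-swarm global-best rule, and every function name are hypothetical choices made for illustration. The flat lists stand in for a block's shared memory, and the inter-sub-swarm communication through global memory is left out because the abstract does not specify it. Note that 32 particles × 32 dimensions gives 1024 threads, which matches the per-block thread limit of many CUDA devices.

```python
import random

SUB_SWARMS = 8    # one sub-swarm per GPU block, as in the reported experiment
PARTICLES = 32    # particles per sub-swarm
DIMENSIONS = 32   # dimensions per particle
# Assumed PSO coefficients (not given in the abstract).
W, C1, C2 = 0.7298, 1.49618, 1.49618


def sphere(x):
    """Assumed benchmark objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def thread_index(particle, dim):
    """Flat index of the emulated thread that owns one dimension of one particle."""
    return particle * DIMENSIONS + dim


def run_sub_swarm(rng, iterations):
    """Emulate one block: flat buffers play the role of shared memory."""
    n = PARTICLES * DIMENSIONS
    pos = [rng.uniform(-5.0, 5.0) for _ in range(n)]
    vel = [0.0] * n
    pbest = pos[:]

    def row(buf, p):
        # Copy of the DIMENSIONS values that belong to particle p.
        return buf[p * DIMENSIONS:(p + 1) * DIMENSIONS]

    pbest_fit = [sphere(row(pos, p)) for p in range(PARTICLES)]
    history = []
    for _ in range(iterations):
        # Best particle of this sub-swarm (assumed global-best rule).
        g = min(range(PARTICLES), key=pbest_fit.__getitem__)
        gbest = row(pbest, g)
        # On a GPU every (p, d) pair below would be handled by its own thread.
        for p in range(PARTICLES):
            for d in range(DIMENSIONS):
                t = thread_index(p, d)
                vel[t] = (W * vel[t]
                          + C1 * rng.random() * (pbest[t] - pos[t])
                          + C2 * rng.random() * (gbest[d] - pos[t]))
                pos[t] += vel[t]
        # Fitness evaluation and personal-best update, one particle at a time.
        for p in range(PARTICLES):
            fit = sphere(row(pos, p))
            if fit < pbest_fit[p]:
                pbest_fit[p] = fit
                pbest[p * DIMENSIONS:(p + 1) * DIMENSIONS] = row(pos, p)
        history.append(min(pbest_fit))
    return history


if __name__ == "__main__":
    # Each sub-swarm runs independently here; a GPU would run them as separate blocks.
    for s in range(SUB_SWARMS):
        best = run_sub_swarm(random.Random(s), 10)[-1]
        print(f"sub-swarm {s}: best fitness after 10 iterations = {best:.4f}")
```

Because the emulated threads share no state other than the sub-swarm's best position, the two inner loops could run fully in parallel, which is the property the abstract's mapping relies on.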