A heterogeneous system based on latent semantic analysis using GPU and multi-CPU

Gabriel A. León-Paredes; Liliana I. Barbosa-Santillán; Juan J. Sánchez-Escobar

doi:10.1155/2017/8131390

Cloud Computing Research Group Smart Cities & High Performance Computing (GIHP4C)

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

© 2017 Gabriel A. León-Paredes et al. Latent Semantic Analysis (LSA) is a method that allows us to automatically index and retrieve information from a set of objects by reducing the term-by-document matrix using the Singular Value Decomposition (SVD) technique. However, LSA has a high computational cost for analyzing large amounts of information. The goals of this work are (i) to improve the execution time of semantic space construction, dimensionality reduction, and information retrieval stages of LSA based on heterogeneous systems and (ii) to evaluate the accuracy and recall of the information retrieval stage. We present a heterogeneous Latent Semantic Analysis (hLSA) system, which has been developed using General-Purpose computing on Graphics Processing Units (GPGPUs) architecture, which can solve large numeric problems faster through the thousands of concurrent threads on multiple CUDA cores of GPUs and multi-CPU architecture, which can solve large text problems faster through a multiprocessing environment. We execute the hLSA system with documents from the PubMed Central (PMC) database. The results of the experiments show that the acceleration reached by the hLSA system for large matrices with one hundred and fifty thousand million values is around eight times faster than the standard LSA version with an accuracy of 88% and a recall of 100%.
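As a rough illustration of the three LSA stages named in the abstract (semantic space construction, dimensionality reduction, and information retrieval), the following is a minimal single-CPU sketch in NumPy. It is not the authors' hLSA implementation and uses neither GPUs nor multiprocessing; the toy term-by-document matrix, the function names (`lsa_fit`, `fold_in_query`, `cosine_scores`), and the choice of k = 2 are all assumptions made for the example.

```python
import numpy as np

# Toy term-by-document matrix (rows = terms, columns = documents).
# The vocabulary and counts are invented purely for illustration.
terms = ["gpu", "cuda", "thread", "protein", "gene"]
A = np.array([
    [1, 1, 0, 0],   # gpu
    [1, 0, 0, 0],   # cuda
    [0, 1, 0, 0],   # thread
    [0, 0, 1, 1],   # protein
    [0, 0, 1, 0],   # gene
], dtype=float)


def lsa_fit(A, k):
    """Dimensionality reduction: keep the k largest singular triplets of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def fold_in_query(q, U_k, s_k):
    """Project a query term vector into the k-dimensional semantic space."""
    return (q @ U_k) / s_k


def cosine_scores(q_k, Vt_k):
    """Cosine similarity between the projected query and every document."""
    docs = Vt_k.T                      # one row per document
    num = docs @ q_k
    den = np.linalg.norm(docs, axis=1) * np.linalg.norm(q_k)
    return num / np.where(den == 0, 1.0, den)


U_k, s_k, Vt_k = lsa_fit(A, k=2)

# Query containing only the term "cuda".
q = np.zeros(len(terms))
q[terms.index("cuda")] = 1.0

scores = cosine_scores(fold_in_query(q, U_k, s_k), Vt_k)
for doc_id in np.argsort(-scores):
    print(f"doc {doc_id}: {scores[doc_id]:.3f}")
```

In this toy setup the query term appears only in document 0, yet document 1 should rank just as highly because it shares the term "gpu" with document 0 in the reduced space, while the two biology documents score near zero. The SVD call is the step whose cost grows quickly with matrix size, which is the cost the abstract says the hLSA system targets.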

Translated title of the contribution	Un sistema heterogéneo basado en el análisis semántico latente mediante GPU y multi-CPU.
Original language	English
Journal	Scientific Programming
Volume	2017
Issue number	8131390
DOIs	https://doi.org/10.1155/2017/8131390
State	Published - 1 Jan 2017

Cite this

@article{4a44e4ba42fb48c99213ff9f3cb2ed34,

title = "A heterogeneous system based on latent semantic analysis using GPU and multi-CPU",

abstract = "{\textcopyright} 2017 Gabriel A. Le{\'o}n-Paredes et al. Latent Semantic Analysis (LSA) is a method that allows us to automatically index and retrieve information from a set of objects by reducing the term-by-document matrix using the Singular Value Decomposition (SVD) technique. However, LSA has a high computational cost for analyzing large amounts of information. The goals of this work are (i) to improve the execution time of semantic space construction, dimensionality reduction, and information retrieval stages of LSA based on heterogeneous systems and (ii) to evaluate the accuracy and recall of the information retrieval stage. We present a heterogeneous Latent Semantic Analysis (hLSA) system, which has been developed using General-Purpose computing on Graphics Processing Units (GPGPUs) architecture, which can solve large numeric problems faster through the thousands of concurrent threads on multiple CUDA cores of GPUs and multi-CPU architecture, which can solve large text problems faster through a multiprocessing environment. We execute the hLSA system with documents from the PubMed Central (PMC) database. The results of the experiments show that the acceleration reached by the hLSA system for large matrices with one hundred and fifty thousand million values is around eight times faster than the standard LSA version with an accuracy of 88% and a recall of 100%.",

author = "Le{\'o}n-Paredes, {Gabriel A.} and Barbosa-Santill{\'a}n, {Liliana I.} and S{\'a}nchez-Escobar, {Juan J.}",

year = "2017",

month = jan,

day = "1",

doi = "10.1155/2017/8131390",

language = "English",

volume = "2017",

journal = "Scientific Programming",

issn = "1058-9244",

publisher = "Hindawi Limited",

number = "8131390",

}

TY - JOUR

T1 - A heterogeneous system based on latent semantic analysis using GPU and multi-CPU

AU - León-Paredes, Gabriel A.

AU - Barbosa-Santillán, Liliana I.

AU - Sánchez-Escobar, Juan J.

PY - 2017/1/1

Y1 - 2017/1/1

N2 - © 2017 Gabriel A. León-Paredes et al. Latent Semantic Analysis (LSA) is a method that allows us to automatically index and retrieve information from a set of objects by reducing the term-by-document matrix using the Singular Value Decomposition (SVD) technique. However, LSA has a high computational cost for analyzing large amounts of information. The goals of this work are (i) to improve the execution time of semantic space construction, dimensionality reduction, and information retrieval stages of LSA based on heterogeneous systems and (ii) to evaluate the accuracy and recall of the information retrieval stage. We present a heterogeneous Latent Semantic Analysis (hLSA) system, which has been developed using General-Purpose computing on Graphics Processing Units (GPGPUs) architecture, which can solve large numeric problems faster through the thousands of concurrent threads on multiple CUDA cores of GPUs and multi-CPU architecture, which can solve large text problems faster through a multiprocessing environment. We execute the hLSA system with documents from the PubMed Central (PMC) database. The results of the experiments show that the acceleration reached by the hLSA system for large matrices with one hundred and fifty thousand million values is around eight times faster than the standard LSA version with an accuracy of 88% and a recall of 100%.

UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85041288678&origin=inward

UR - https://www.scopus.com/inward/citedby.uri?partnerID=HzOxMe3b&scp=85041288678&origin=inward

U2 - 10.1155/2017/8131390

DO - 10.1155/2017/8131390

M3 - Article

SN - 1058-9244

VL - 2017

JO - Scientific Programming

JF - Scientific Programming

IS - 8131390

ER -
