Dynamically and partially reconfigurable hardware architectures for high performance microarray bioinformatics data analysis
View/ Open
Date
29/11/2012Author
Hussain, Hanaa M.
Metadata
Abstract
The field of Bioinformatics and Computational Biology (BCB) is a multidisciplinary field
that has emerged due to the computational demands of current state-of-the-art biotechnology.
BCB deals with the storage, organization, retrieval, and analysis of biological datasets,
which have grown in size and complexity in recent years especially after the completion of
the human genome project. The advent of Microarray technology in the 1990s has resulted in
the new concept of high throughput experiment, which is a biotechnology that measures the
gene expression profiles of thousands of genes simultaneously. As such, Microarray requires
high computational power to extract the biological relevance from its high dimensional data.
Current general purpose processors (GPPs) has been unable to keep-up with the increasing
computational demands of Microarrays and reached a limit in terms of clock speed.
Consequently, Field Programmable Gate Arrays (FPGAs) have been proposed as a low
power viable solution to overcome the computational limitations of GPPs and other methods.
The research presented in this thesis harnesses current state-of-the-art FPGAs and tools to
accelerate some of the most widely used data mining methods used for the analysis of
Microarray data in an effort to investigate the viability of the technology as an efficient, low
power, and economic solution for the analysis of Microarray data. Three widely used
methods have been selected for the FPGA implementations: one is the un-supervised Kmeans
clustering algorithm, while the other two are supervised classification methods,
namely, the K-Nearest Neighbour (K-NN) and Support Vector Machines (SVM). These
methods are thought to benefit from parallel implementation. This thesis presents detailed
designs and implementations of these three BCB applications on FPGA captured in Verilog
HDL, whose performance are compared with equivalent implementations running on GPPs.
In addition to acceleration, the benefits of current dynamic partial reconfiguration (DPR)
capability of modern Xilinx’ FPGAs are investigated with reference to the aforementioned
data mining methods.
Implementing K-means clustering on FPGA using non-DPR design flow has
outperformed equivalent implementations in GPP and GPU in terms of speed-up by two
orders and one order of magnitude, respectively; while being eight times more power
efficient than GPP and four times more than a GPU implementation. As for the energy
efficiency, the FPGA implementation was 615 times more energy efficient than GPPs, and 31 times more than GPUs. Over and above, the FPGA implementation outperformed the
GPP and GPU implementations in terms of speed-up as the dimensionality of the Microarray
data increases. Additionally, the DPR implementations of the K-means clustering have
shown speed-up in partial reconfiguration time of ~5x and 17x over full chip reconfiguration
for single-core and eight-core implementations, respectively.
Two architectures of the K-NN classifier have been implemented on FPGA, namely, A1
and A2. The K-NN implementation based on A1 architecture achieved a speed-up of ~76x
over an equivalent GPP implementation whereas the A2 architecture achieved ~68x speedup.
Furthermore, the FPGA implementation outperformed the equivalent GPP
implementation when the dimensionality of data was increased. In addition, The DPR
implementations of the K-NN classifier have achieved speed-ups in reconfiguration time
between ~4x to 10x over full chip reconfiguration when reconfiguring portion of the
classifier or the complete classifier.
Similar to K-NN, two architectures of the SVM classifier were implemented on FPGA
whereby the former outperformed an equivalent GPP implementation by ~61x and the latter
by ~49x. As for the DPR implementation of the SVM classifier, it has shown a speed-up of
~8x in reconfiguration time when reconfiguring the complete core or when exchanging it
with a K-NN core forming a multi-classifier.
The aforementioned implementations clearly show FPGAs to be an efficacious, efficient
and economic solution for bioinformatics Microarrays data analysis.
Collections
The following license files are associated with this item: