Dissertation / PhD Thesis FZJ-2019-05050

Standards-based Models and Architectures to Automate Scalable and Distributed Data Processing and Analysis



2019

102 pp. Dissertation, University of Iceland, 2019


Abstract: Scientific communities engaging in big data analysis face numerous challenges in managing complex computations and the related data on emerging and distributed computing infrastructures. Large-scale data analysis requires applications with simplified access to multiple resource management systems. Several generic or domain-specific technologies have been developed to exploit diversified computing environments, but because of the heterogeneity of computing and data architectures, they are not capable of enabling real science cases. Scientific gateways and workflows are one such example: they require the management of jobs on multiple kinds of batch systems across heterogeneous supercomputing architectures, as well as access to advanced distributed file systems. To support these requirements, this dissertation presents a unified architectural framework that combines a suitable set of standards with an adequate middleware realisation. The framework manages concurrent access for diversified user communities through consistent and robust computing and data interfaces oriented to current application and infrastructure demands. The investigations reported in this dissertation were mainly motivated by physical and machine-learning models, represented by two scientific case studies: biophysics and Earth sciences. In biophysics, the UltraScan scientific gateway is enhanced to enable the processing of domain-specific data through standards-based job and data management interfaces in HPC environments. The Earth sciences case study automates the processing of machine-learning algorithms (e.g. classification of remote sensing images) using scalable and parallel implementations. As a proof of concept, both case studies are supported through open source implementations in the form of a middleware realisation, client APIs and their integration with state-of-the-art science gateway frameworks.


Note: Dissertation, University of Iceland, 2019

Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 512 - Data-Intensive Science and Federated Computing (POF3-512)

Appears in the scientific report 2019
Database coverage:
OpenAccess


 Record created 2019-10-14, last modified 2021-01-30

