Informatics, Vol. 4, Issue 4 (2017)

A Data Quality Strategy to Enable FAIR, Programmatic Access across Large, Diverse Data Collections for High Performance Data Analysis

Ben Evans, Kelsey Druken, Jingbo Wang, Rui Yang, Clare Richards and Lesley Wyborn

Abstract

To ensure seamless, programmatic access to data for High Performance Computing (HPC) and analysis across multiple research domains, a methodology for standardizing both data and services is vital. At the Australian National Computational Infrastructure (NCI) we have developed a Data Quality Strategy (DQS) that currently provides processes for: (1) Consistency of data structures needed for a High Performance Data (HPD) platform; (2) Quality Control (QC) through compliance with recognized community standards; (3) Benchmarking cases of operational performance tests; and (4) Quality Assurance (QA) of data through demonstrated functionality and performance across common platforms, tools and services. By implementing the NCI DQS, we have seen progressive improvement in the quality and usefulness of the datasets across the different subject domains, and have demonstrated the ease with which modern programmatic methods can be used to access the data, either in situ or via web services, for uses ranging from traditional analysis methods through to emerging machine learning techniques. To help increase data re-usability by broader communities, particularly in high performance environments, the DQS is also used to identify the need for any extensions to the relevant international standards for interoperability and/or programmatic access.
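
The Quality Control step described above, compliance with community standards, can be pictured as a checklist applied to each dataset's metadata. The following is only a minimal sketch under stated assumptions: the attribute names in `REQUIRED_ATTRIBUTES` and the function `check_metadata` are hypothetical and chosen for illustration; they are not the NCI DQS's actual requirements or tooling.

```python
from typing import Dict, List

# Hypothetical checklist: these attribute names are illustrative assumptions,
# not the actual requirements of the NCI DQS.
REQUIRED_ATTRIBUTES: List[str] = ["title", "summary", "Conventions", "license"]


def check_metadata(attributes: Dict[str, str]) -> List[str]:
    """Return the required attributes that are missing or blank."""
    return [
        name
        for name in REQUIRED_ATTRIBUTES
        if not str(attributes.get(name, "")).strip()
    ]


# Example: a dataset with a blank summary and no license attribute.
example = {"title": "Example dataset", "summary": "", "Conventions": "CF-1.6"}
missing = check_metadata(example)
print(missing)  # ['summary', 'license']
```

A check of this kind reports which attributes need attention rather than rejecting the dataset outright, which fits the abstract's description of progressive improvement in dataset quality.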
