CODATA-RDA School of Research Data Science

The ever-accelerating volume and variety of data being generated are having a huge impact on a wide variety of research disciplines, from the sciences to the humanities: the international, collective ability to create, share and analyse vast quantities of data is having a profound, transformative effect. What can justly be called the ‘Data Revolution’ offers many opportunities coupled with significant challenges. High among these is the need to develop the necessary professions and skills. There is a recognised need for individuals with the combination of skills necessary to optimise use of the new data sets. Researchers and research institutions worldwide recognise the need to develop data skills, and short courses, continuing professional development and MOOCs now provide training in data skills and research data management.

The Need for Foundational Data Skills in all Disciplines

This demand stems from the realisation that contemporary research, particularly when addressing the most significant interdisciplinary challenges, cannot be done effectively without a range of skills relating to data. These skills include the principles and practice of Open Science, research data management and curation, the use of a range of data platforms and infrastructures, large-scale analysis, statistics, visualisation and modelling techniques, and software development and annotation, among others. We define the ensemble of these skills as ‘Research Data Science’, that is, the science of research data: how to look after and use the data that is core to your research.

The CODATA-RDA School of Research Data Science has developed a short-course, summer-school-style curriculum that addresses these training requirements. The course partners with Software Carpentry (using the Shell command line and GitHub), Data Carpentry (using R and SQL) and the Digital Curation Centre (research data management and data management plans) and builds on materials developed by these organisations. The programme also includes modules on Open Science, ethics, visualisation, machine learning (recommender systems and artificial neural networks) and research computational infrastructures.

The foundational curriculum and school were piloted in August 2016 at the International Centre for Theoretical Physics (ICTP) in Trieste. In July 2017 the school was repeated, with some refinements, and combined with a set of more advanced or discipline-specific workshops on Extreme Data, the Internet of Things and Bioinformatics. In this way, we are exploring a vision in which a foundational programme, suitable for any research discipline, can eventually be combined with more advanced training in specific skills or disciplines.

An International Network of 'Data Schools'

ICTP Adriatico Guest House, which will host the school

The significant and palpable demand from individuals and institutions means that it is a strategic priority for both CODATA and the Research Data Alliance to build capacity and develop skills, training young researchers in the principles of Research Data Science. The needs of young researchers in Lower and Middle Income Countries (LMICs) are a particular concern: it is important that Open Data and Open Science benefit research in LMICs and that an unequal ability to exploit these developments does not become another lamentable aspect of the ‘digital divide’. On the contrary, it has been argued that the ‘Data Revolution’ may offer a notable opportunity for reducing that divide in a number of respects.

For these reasons, the vision of the CODATA-RDA Schools of Research Data Science is to develop into an international network that makes it easy for partner organisations and institutions to run the schools in a variety of locations. The annual event at the ICTP in Trieste will serve as a motor for building the network and for building expertise and familiarity with the initiative's mission and objectives. The core materials are made available for reuse, and the co-chairs and Working Group team will provide guidance to assist partners in organising the school and in identifying instructors and helpers. The first school to expand this initiative will take place at ICTP-SAIFR (South American Institute for Fundamental Research), São Paulo, Brazil, in December 2017.

 

Mission and Objectives

The CODATA-RDA Schools of Research Data Science will:

  • address the recognised need for Research Data Science skills across all disciplines;
  • follow a recognised and accredited curriculum that addresses foundational data skills required by all researchers;
  • provide a pathway from a broad foundational course through to more advanced and specialised courses or workshops;
  • be reproducible: all materials will be online with Open licences;
  • be scalable: emphasis will be placed on training trainers, building partnerships and developing an international network which makes it easy for schools to be run in many locations.

 

Programme and Vision

 

Future Schools

The next CODATA-RDA School of Research Data Science will be:

 

Past Schools

 

Convenors and Organisers

Partners

 

Working Group Co-Chairs

Sarah Jones, Digital Curation Centre, Scotland: Sarah.Jones [at] glasgow.ac.uk

Ciira Maina, Dedan Kimathi University of Technology, Kenya: ciira.maina [at] dkut.ac.ke

Rob Quick, Indiana University, USA: rquick [at] iu.edu

Hugh Shanahan, Royal Holloway University of London, England: Hugh.Shanahan [at] rhul.ac.uk

 

Previous Co-Chairs

Simon Hodson, CODATA: simon [at] codata.org

Anelda van der Walt, Talarify.

Andrew Harrison, University of Essex.