Dialogue on Open Science and FAIR Data - Kenya

Date: Apr 18, 2019

CODATA (CODATA President Prof Barend Mons & CODATA Executive Director Dr. Simon Hodson) on Monday 16 April 2019 had an excellent meeting with Kenya HE VP William Ruto and his team on #FAIRdata #OpenScience #OpenData #OpenGovernment

Read Prof. Muliaro: The advent of big data heralds huge opportunities

Congratulations to all, and to Kenya, on making progress towards promoting Open Data, to the benefit of all!

Public Lecture: Openness in Data, Science and Governance

Date: Apr 15, 2019

CODATA's Executive Committee member, Professor Muliaro Wafula, will give a public lecture on 'Openness in Data, Science and Governance' at the Jomo Kenyatta University of Agriculture and Technology. CODATA's President, Barend Mons, and CODATA's Executive Director, Simon Hodson, will attend this event.

Location: Jomo Kenyatta University of Agriculture and Technology, ICT Centre of Excellence and Open Data (iCEOD).

Venue: Assembly Hall, JKUAT Main Campus

Time: 1.50 pm – 4.00 pm

Date: Monday, April 15th, 2019

Registration: Entry is free

Download the flyer


Date: Apr 8, 2019


The CODATA 2019 Conference will be held on 19-20 September 2019 in Beijing, China. This year’s conference theme is: Towards next-generation data-driven science: policies, practices and platforms.  The conference will follow a high-level workshop, 17-18 September 2019, on ‘Implementing Open Research Data Policy and Practice’ that will examine such challenges in China and elsewhere in the light of the emergence of data policies and in particular the China State Council’s Notice on ‘Measures for Managing Scientific Data’.

CODATA 2019: Towards next-generation data-driven science: policies, practices and platforms

Science globally is being transformed by new digital technologies.  At the same time, addressing the major global challenges of the age requires the analysis of vast quantities of heterogeneous data from multiple sources.  In response, many countries, regions and scientific domains have developed Research Infrastructures to assist with the management, stewardship and analysis of data.  These developments have been stimulated by Open Science policies and practices, both those developed by funders and those that have emerged from communities.  The FAIR principles and supporting practices seek to accelerate this process and unlock the potential of analysis at scale with machines.  This conference provides a significant opportunity to survey and examine these developments from a global perspective.

The convening organisations are pleased to invite you to contribute to the program by proposing sessions.  The deadline for session proposals is 29 April 2019.

All proposals related to open science and open data, FAIR data, research data management and stewardship, research infrastructures and platforms are welcomed. The following themes are of particular interest:

1. FAIR and Open data policies

  • Open data policies, their implications and implementation
  • FAIR data, its challenges and opportunities
  • Incentives and metrics for data and research contribution
  • Trustworthiness and sustainability for FAIR data, research infrastructures, data repositories
  • Co-operation in research data policies and management at international, national and institutional levels
  • Data policies towards the next-generation open science community 

2. Advanced research infrastructures for Open Science and FAIR data

  • Opportunities and challenges in research infrastructures.
  • Disciplinary technical infrastructure for research data management and data stewardship.
  • The development of Open Science Clouds, Platforms and Commons: a new model for coordination?
  • Successes and models for different aspects of the research data management and stewardship infrastructure
  • Institutional research data management and stewardship status, models and challenges.
  • Data management technologies and interoperability between human and technical processes.
  • Data science education and training.
  • Business models for Open Science, FAIR data, research infrastructures and data stewardship.
  • Key data specifications, RDM protocols and research infrastructures
  • Other technologies and standards that address open data issues

3. Data-driven scientific discovery and decision-making

  • Case studies and exploration of science discoveries based on data-driven research
  • Data-driven decision-making, from data to evidence
  • Disciplinary data applications
  • Data success stories 

4. Data-intensive research for international scientific and global challenges

  • FAIR data, interoperability and data integration in multi-disciplinary research areas.
  • Data-driven practices in support of the United Nations’ Sustainable Development Goals.
  • Data-driven practices in support of the Sendai Framework and disaster risk reduction.
  • Data-driven practices in support of Resilient Cities; biodiversity; climate change adaptation; agriculture; hydrology and other research areas.


  • Sessions: Sessions will be 90 minutes.  Two session formats are suggested: 1) paper sessions, which may include research papers, practice papers or a mixture of these; 2) lightning talks followed by a structured panel discussion.  Paper presentations should be a minimum of 15 minutes; paper sessions should include a maximum of 4-5 papers.
  • Keynote speakers: there will be keynote sessions on the morning of each day. 
  • Plenary panel discussion: the conference will close with a Plenary Panel discussion featuring short presentations.
  • Poster session: there will be a poster session in the late afternoon of the first day.

Important dates:

1 February: Call for Session Proposals Released

22 April: Registration Open

29 April: Deadline for session proposals

13 May: Accepted session proposals notified; call for presentations and posters released

17 June: Deadline for presentation submissions and first round of poster submissions

8 July: Submitters notified of acceptances of full presentations and posters

1 August: Close of second round of poster submissions

18 August: Close of early bird registration

17-18 September: High Level Workshop on ‘Implementing Open Research Data Policy and Practice’

19-20 September: CODATA 2019 Conference ‘Towards next-generation data-driven science: policies, practices and platforms’


  • Convenors: CODATA, CODATA China
  • Supporters: Ministry of Science and Technology (MOST), Chinese Academy of Sciences (CAS), NSFC
  • Local organiser: Computer Network Information Center, CAS; National S&T Infrastructure Center, MOST 

Possible venue (TBC): the Friendship Hotel of Beijing.

The Friendship Hotel of Beijing is one of the largest garden-style hotels in Asia. Located in the heart of ZhongGuanCun Hi-Tech Zone, the Friendship Hotel neighbours many world-famous tourist sites and universities, such as the Summer Palace, Tsinghua University and Peking University.

Address: Zhongguancun South St. Beijing 100873, P.R. China


Tracking the impact of the CODATA/RDA data science schools: the case of the OSG

Date: Apr 3, 2019

The CODATA-RDA Research Data Science Schools provide Early Career Researchers with the opportunity to meet their colleagues and learn relevant Data Science skills. We actively encourage students to use their learning as an opportunity to create new collaborations and generate new research.

One spectacularly successful example of this is Oscar Arbelaez Echeverry from Colombia who, through links made at the schools, was able to run approximately 1.2 million CPU hours [this is akin to having access to a 1,600-core cluster for a month] of Monte Carlo simulations. As a result of accessing the Open Science Grid resource, six publications [1-7] have been generated by his supervisor, in the best journals in the field and for wider audiences. Equipped with the relevant skills from the schools, Oscar has been instrumental in advancing research at his home institution.
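The cluster comparison above can be checked with a few lines of arithmetic. The sketch below is only a sanity check, not part of the original report: it assumes that every core is busy around the clock and that one core running for one hour counts as one CPU hour. The function name `wall_clock_days` is made up for this illustration.

```python
# Sanity check of the "1600 core cluster for a month" comparison.
# Assumptions (not from the original post): 100% utilisation, and
# one core running for one hour equals one CPU hour.
CPU_HOURS = 1_200_000  # approximate figure quoted in the post
CORES = 1_600          # cluster size quoted in the post


def wall_clock_days(cpu_hours: float, cores: int) -> float:
    """Days needed to deliver `cpu_hours` if all `cores` run continuously."""
    hours = cpu_hours / cores  # wall-clock hours on the whole cluster
    return hours / 24          # convert hours to days


days = wall_clock_days(CPU_HOURS, CORES)
print(f"{days:.2f} days")  # prints "31.25 days"
```

Under these assumptions the figure works out to just over 31 days, which agrees with the "for a month" comparison.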

Read the full blog here.

Deadline for Applications for the 2019 Foundation School and Advanced Workshops is approaching: 18 April 2019

Disaster Risk Reduction and Open Data Newsletter: April 2019 Edition

Date: Apr 1, 2019

England could run short of water within 25 years
Sir James Bevan, the Chief Executive of the Environment Agency, recently shared these sentiments at the Waterwise conference in London.

Philippines: EU Copernicus programme provides full, free open data to aid in tackling El Niño
The drought that is currently sweeping the country as a result of El Niño is already hitting Filipino farmers hard.

Protecting the world from the threat of pandemics
Creating mathematical and computational models of infectious diseases like pandemic flu gives government and policy-makers a toolkit to respond to an ever-present threat, says the University of Melbourne.

Read the full newsletter here

Deadline Approaching: Call for Nominations and Applications: Editor-in-Chief, Data Science Journal, Apply by 14 April 2019

Date: Mar 25, 2019

The Data Science Journal is currently accepting nominations and applications for the position of Editor-in-Chief of the journal.

Applications can be made through the Google form.
The deadline for applications is 12 midnight GMT on Sun 14 April.

The Data Science Journal is overseen by CODATA and is a locus for discussions and research around the science of data for the Open Science, FAIR data and Research Data Management communities.  It was founded by CODATA in 2002.  Since the journal moved to the Ubiquity Press platform, and under the direction of Sarah Callaghan, its reach and impact have grown considerably.

The Data Science Journal is concerned with all aspects of the science of data.  It is a peer-reviewed, open access, electronic journal, publishing papers on the management, dissemination, use and reuse of research data and databases across all research domains, including science, technology, the humanities and the arts. The scope of the journal includes descriptions of data systems, their implementations and their publication, applications, infrastructures, software, legal, reproducibility and transparency issues, and the availability and usability of complex datasets, with a particular focus on the principles, policies and practices for open data.
The main roles of the Editor-in-Chief are:
  1. to set the strategic direction of the Journal and develop policies to support the strategy (in collaboration with the editorial board)
  2. to advocate for and promote the journal in general conversation and day-to-day work and by soliciting papers of interest to the community
  3. to manage the editorial process (assisted by editors selected from the editorial board)
More specifically, the responsibilities expected are:
  • select and manage the Editorial Board, in coordination with CODATA and the Ubiquity Press managing editor
  • respond to initial queries about the journal (please see already drafted boilerplate text regarding time to publication, APCs, requests for waivers etc.)
  • perform regular management tasks such as reminding editors to look at the status of their papers and make decisions/invite new reviewers/etc. as appropriate
  • assign papers to other editors according to their interests and workload
  • approve other editors’ decisions regarding papers
  • send papers to be copy-edited/typeset as appropriate
  • adhere to COPE Core Practices
  • shepherd submissions through the publication system, assisting in locating suitable peer reviewers, and making judgements on the quality of the submitted articles according to their area of expertise
  • distribute Calls for Papers as widely as possible and promote the journal whenever possible to gain submissions
  • carry out occasional peer review for the journal
  • let the rest of the editorial board know of conferences/events that could be used as journal promotion opportunities (especially relevant if they will be attending the event)
  • provide editorial advice/opinion to fellow editors on submissions, if required
  • be available to contribute to a yearly editorial board review, should it be required
  • help promote specific publications and press releases
  • consider submitting their own publications to the journal
The previous Editor-in-Chief, Sarah Callaghan, allocated c.10% or 3-4 hours per week to this role.  As is often the case for journal editors, her employer allowed this time (0.1 FTE) as part of her activities.

The Editor-in-Chief of the Data Science Journal is an ex officio member of the CODATA Executive Committee and CODATA provides a small fund to support travel and promotion of the journal.  
The position is not a paid role but an honorarium can be negotiated with excellent candidates for outstanding performance.

Congratulations to Boniface O. Akuku, KALRO - Winner of the Africa Tech CIO Award

Date: Mar 13, 2019

CODATA would like to warmly congratulate Boniface O. Akuku, Director ICT, Kenya Agricultural & Livestock Research Organization (KALRO), on being named Africa Tech CIO of the Year during the recent Africa Tech Week event in Cape Town, South Africa (on 5 March 2019). Boniface is one of the co-chairs of the CODATA Agricultural Data Task Group.

This award signifies very well-deserved recognition of Boniface's skills and tireless efforts in managing the KALRO data resources and promoting the availability of quality data and information resources for farmers and agricultural scientists.

Boniface's work with the Agricultural Data Task Group includes driving the agenda for FAIR agricultural data, developing skills and capacity and engaging stakeholders around data and information services in the East African Region. 

Well done, Boniface!



Disaster Risk Reduction and Open Data Newsletter: March 2019 Edition

Date: Mar 8, 2019

Coastal Flooding forecasting strengthened in Indonesia
Indonesia is making progress in integrating flood forecasting into its meteorological early warning system.

The UN Global Platform for Official Statistics
The group's goal is to show the global statistical community how to combine the latest technologies in order to improve the quality and relevance of official statistics.

GeoMesa - An open source suite
GeoMesa is an open source suite of tools that enables large-scale geospatial querying and analytics on distributed computing systems.

Read the full newsletter here

DEADLINE APPROACHING - KEYNOTES ANNOUNCED: Drexel-CODATA FAIR and Responsible Research Data Management Workshop

Date: Mar 1, 2019

The deadline for presentation proposals for the Drexel-CODATA FAIR and Responsible Research Data Management Workshop is approaching.  Submit your proposal for a presentation by 4 March.
FAIR and Responsible Research Data Management (FAIR-RRDM)
A Drexel Metadata Research Centre and CODATA workshop on knowledge sharing between research communities and research institutions
Sun 31 March and Mon 1 April 2019, Drexel University, Philadelphia, USA, as a co-located pre-event to the 13th RDA Plenary
Call for Papers deadline: 4 March
Places are limited; please register.
The organisers and programme committee are pleased to announce the following keynote speakers:
  • Lisa Federer, National Library of Medicine: FAIR Data at the National Library of Medicine and National Institutes of Health
  • Robert Hanisch, NIST: Let’s Make it Easy to be FAIR
  • Patricia Cruse, DataCite
  • Barend Mons, Professor at Leiden University Medical Center, Ambassador of GO FAIR, President of CODATA

Lisa Federer: FAIR Data at the National Library of Medicine and National Institutes of Health

The National Institutes of Health recently issued a Strategic Plan for Data Science, which outlines five broad goals to enhance the biomedical community’s ability to conduct data-intensive science. Underlying these goals is the need to ensure that data and other research outputs are FAIR – Findable, Accessible, Interoperable, and Reusable. As the world’s largest biomedical library and one of the institutes of the NIH, the National Library of Medicine plays a key role in realizing the goals of the Strategic Plan for Data Science. This talk will describe some of the activities at NIH and NLM to increase the FAIRness of biomedical data and other research objects.

This talk also examines the impacts of FAIR data by exploring how researchers actually reuse shared datasets. Using over 10,000 requests to use data from three NIH repositories, this study considered who requested datasets and how they intended to reuse them. The descriptions of the datasets themselves were also analyzed to determine characteristics of highly used datasets that could be helpful in determining which datasets will be of high value early in the data life cycle. The findings of this study have implications for FAIR data management: better understanding how data are reused in practice will help ensure that they are managed, curated, and made accessible in ways that will maximize their usefulness to the research community.

About Lisa Federer

Lisa Federer is the Data Science and Open Science Librarian at the National Library of Medicine, focusing on developing efforts to support workforce development and enhance capacity in the biomedical research community for data science and open science. She serves on the editorial board of the Journal of the Medical Library Association and was the editor of the Medical Library Association Guide to Data Management for Librarians. She holds an MLIS from the University of California-Los Angeles, as well as graduate certificates in data science and data visualization, and is a PhD candidate in information science at the University of Maryland, focusing on biomedical researchers’ data reuse practices and characteristics of datasets that predict high use. 

Robert Hanisch: Let’s Make it Easy to be FAIR

Those of us with roles in data management all firmly believe in the FAIR principles.  The data arising from publicly funded research should be considered a public good, and the value of that public good can only be realized by the data being FAIR.  In some fields of research, where data has been FAIR long before the acronym was invented, there is clear evidence for data re-use and increased research productivity.  In many fields, however, the situation is less clear and it is therefore challenging to convince researchers that making their data FAIR is worth the effort.  In many cases, as well, the tools and infrastructure needed to automate and sustain FAIRness are lacking, and it is only through making it easy—in fact, the default—to be FAIR that we will be able to confirm our belief that FAIR will both increase productivity and lead to more robust and innovative research.

About Robert Hanisch

Dr. Robert J. Hanisch is the Director of the Office of Data and Informatics, Material Measurement Laboratory, at the National Institute of Standards and Technology in Gaithersburg, Maryland. He is responsible for improving data management and analysis practices and helping to assure compliance with national directives on open data access. Prior to coming to NIST in 2014, Dr. Hanisch was a Senior Scientist at the Space Telescope Science Institute, Baltimore, Maryland, and was the Director of the US Virtual Astronomical Observatory. For more than twenty-five years Dr. Hanisch led efforts in the astronomy community to improve the accessibility and interoperability of data archives and catalogs.

Applications Invited to Participate in the CODATA-RDA Research Data Science Advanced Workshops, Trieste, Italy 2019 - Deadline 18 April 2019

Date: Feb 21, 2019

The CODATA-RDA Research Data Science Summer School and Research Data Science Advanced Workshops will run for their fourth and third year respectively at the International Centre for Theoretical Physics, Trieste, Italy.
The CODATA-RDA Research Data Science Advanced Workshops will take place on 19-23 August 2019.  Apply here!
Deadline: 18/04/2019

During this activity, several applied/thematic workshops on Research Data Science run in parallel.

  • Bioinformatics: This workshop focuses on building Machine Learning workflows using NGS Data. Topics include: Experimental design; Introduction to NGS data analysis; Machine Learning in NGS; and CWL. Participants should be familiar with the UNIX shell and the R programming language.
  • IoT and Big Data Analytics: This workshop presents the real-time analysis of vast amounts of data produced by embedded devices, sensors, appliances and other data-collecting systems, using new processes and tools for collecting, storing and processing IoT big data and event/streaming data. Participants should be familiar with software installation and programming in R, Java or Python. Professionals & corporate entities should apply for this workshop via the ITU Academy.
  • Climate Data Sciences: This workshop will introduce cloud-computing-based data access, processing and visualisation tools for Climate Science, including the Copernicus climate data services platforms and the CMIP Earth System Grid. Participants will work in small project teams and should have a background in Climate Sciences and/or climate modeling.
  • Extreme Sources of Data: This workshop introduces the basics of a cut-and-count particle physics analysis as performed in the ATLAS Collaboration (Large Hadron Collider). Topics to be covered include phenomenological, experimental and data-analysis aspects of the Standard Model; software development and tools for analysis; and reproducible science and sharing. Participants should have taken at least one course on Particle Physics at higher-education level.


Individuals seeking an introduction to Research Data Science can apply to the Summer School that runs immediately before the workshops.
More information is available on the ICTP site.



  • A limited number of grants are available to support the attendance of selected participants from developing countries.
  • Professionals and corporate entities should register/apply for the Advanced Workshop on IoT and Big-Data Analytics via the ITU Academy platform.



About the CODATA-RDA Research Data Science Summer School, 5-16 August 2019

The CODATA-RDA Research Data Science Summer School provides training in the foundational skills of Research Data Science.  Contemporary research – particularly when addressing the most significant, transdisciplinary research challenges – cannot be done effectively without a range of skills relating to data. This includes the principles and practice of Open Science and research data management and curation, the use of a range of data platforms and infrastructures, large scale analysis, statistics, visualisation and modelling techniques, software development and annotation and more. We define ‘Research Data Science’ as the ensemble of these skills.
Find out more about the schools here where you will find links to information about past schools in 2016, 2017 and 2018 held in Trieste and São Paulo.  Watch a video about the Schools of Research Data Science here
The School of Research Data Science has enjoyed remarkable success and clearly responds to a burning need for data skills among Early Career Researchers and others involved in the research process internationally.

 ICTP School Poster for DataTrieste 2019

