Depositing Data ****************************************************************************************** * Why select and appraise? ****************************************************************************************** It is neither possible nor feasible to keep all digital data forever. Long-term preservati of data entails additional costs that would last beyond the duration of the research proje costs related to data retention, with a growing volume of digital data, discovery becomes noise-to-signal ratio increases.  ****************************************************************************************** * Five steps to help you decide what data to keep ****************************************************************************************** DCC: Five steps to decide what data to keep [ URL "http://www.dcc.ac.uk/resources/how-guid decide-what-data-keep"]  (CC-BY)  1. Consider potential reuse purposes - what aims could the data meet?  2. Check for indications that it must be kept considering legal or policy complianc 3. Identify which data should be kept as it may have long-term value   4. Weigh up the costs - which data management costs have already been incurred and and how much more is planned and affordable? Where will the funds to pay these c questions will give you the cost element of your data appraisal and should help advice, e.g., on how to deal with any shortfall in the budget.   5. Complete your data appraisal  - this will list what data must, should or could b reuse purposes. The appraisal should also summarise any actions needed to prepar justification for not keeping it *========================================================================================= * 1. Step 1: Identify purposes that the data could fulfill *========================================================================================= Consider which type of data is suitable for reuse  a.Source data: data collected or created that the research has used  b.Assembled datasets: data extracted or derived from (a)  c.Referenced data: data reworking a subset of (b) above in order to take the analysis furt conclusions  Data collections may also include any other objects necessary to access or interpret the d software, lab protocol, notebook etc.  ? you may need to create multiple data collections as there may be different access permis conditions  Consider which purpose the data could serve beyond the research context  • Verification: enable others to follow the process leading to published findings  • Further analysis: increase opportunities for further analysis of the data  • Building academic reputation: data that is discoverable has greater visibility and can b rates for the published findings  • Community resource development: publish a data resource of value to a known user group  • Further publications: the publication of a data article will contribute to scholarly com • Learning & teaching: embedded data in a learning/teaching engagement to enhance interact • Private use: find the data more easily in the future to exploit for other potential uses *========================================================================================= * 2. Identify the data that must be kept *========================================================================================= Your decision on what data to keep will also depend on legal, regulatory or policy complia will need to check whether there is any institutional or a funder’s policy that defines wh retained - these are mostly the ‘data behind the graph’ - and whether the data should be m have restricted access.  *========================================================================================= * 3. Identify data that should be kept *========================================================================================= As a general rule, the data should be kept if you have already identified a compliance rea answer ‘yes’ to at least one question under any two of the headings (criteria) below.  Is it good enough?  • Description: is there enough information about what the data is, how and why it was coll • Quality: is the data quality good enough in terms of completeness, sample size, accuracy reliability, representativeness, or any other criteria relevant in the domain?  Is there likely to be demand?  • Known users: are there users waiting for this data, or is there past evidence of a deman • Recommendation: does the funder or a learned/professional society recommend sharing data • Integration potential: does the data describe things that fit standardised terms or voca other research domains?  • Reputation: was the data produced by a research group or a project that is highly rated originality, significance and rigour of previous research outputs? Will making the data enhance the group/project’s reputation?  • Appeal: could the data have broad appeal? (e.g., relating to a landmark discovery)  Are the data easily replicable?  • Non-replicable: would reproducing the data be difficult/costly?  Is there likely to be demand?  • Cleared: is the data classified according to its sensitivity and free from privacy/ethic license or copyright terms and conditions that restrict public access and reuse?  • Open format: is the data in a format that does not require license fees or proprietary s to reuse?  • Independent: if any specialist software/hardware is needed to use the data, is that wide field of study and readily available?  Is it the only copy?  • Unique: is this the only and most complete copy of the data?  • At risk: is the data held somewhere that cannot guarantee long-term storage?  *========================================================================================= * 4. Step 4: Weigh up the costs *========================================================================================= Consider the economic case for keeping the data - how much has been spent on staff time, e hardware/software and service charges, and how much still needs to be spent. Is funding av for preparing the data for archiving and to pay any charges for storage and curation beyon period?  *========================================================================================= * 5. Step 5: Complete the data appraisal *========================================================================================= Weigh up the value and any costs still to be incurred, considering long-term aims, the qua identified, the time and money already invested in it and the risks of being unable to pre keep” data for preservation.   ****************************************************************************************** * Useful Resources ****************************************************************************************** Whyte, Angus & Andrew Wilson. 2010. How to Appraise and Select Research Data [ URL "http:/ sites/default/files/documents/How%20to%20Appraise%20and%20Select%20Research%20Data.pdf"] . Digital Curation Centre