Data types and formats

Choose data formats that are durable and non-proprietary (open). Open formats improve the chances of interoperability between different systems and of the data being re-used in the future.

The National Archives of Australia (NAA) Open data and formats guide provides a list of durable open file formats that are suitable for long-term preservation of your data.

File naming conventions

Use file and folder naming conventions to ensure your data is accessible to those working on the project. Choose descriptive, meaningful file names that can be clearly understood.

Name your files according to a convention that previews the content, orders files logically (for example, by a leading date in YYYYMMDD or YYYY-MM-DD form), identifies the responsible party and conveys the work history. Document the chosen convention and ensure it is followed consistently.

File names could include information such as:

  • project or experiment name
  • location/spatial coordinates
  • researcher name/initials
  • date or date range of experiment
  • data type
  • conditions
  • file version number.

File naming examples

 
Without a naming convention    Files with a naming convention
Test_data_2013                 20130503_DOEProject_DesignDocument_Smith_v2-01.docx
Project_Data                   20130709_DOEProject_MasterData_Jones_v1-00.xlsx
Design for project.doc         20130825_DOEProject_Ex1Test1_Data_Gonzalez_v3-03.xlsx
Lab_work_Eric                  20130825_DOEProject_Ex1Test1_Documentation_Gonzalez_v3-03.xlsx
Second_test                    20131002_DOEProject_Ex1Test2_Data_Gonzalez_v1-01.xlsx
Meeting Notes Oct 23           20141023_DOEProject_ProjectMeetingNotes_Kramer_v1-00.docx
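
The named examples above appear to follow the pattern YYYYMMDD_Project_Description_Researcher_vMajor-Minor.ext. As a minimal sketch, assuming that inferred pattern, a short Python script can build names and check existing files against the convention. The function names and the regular expression are illustrative assumptions, not part of any standard.

```python
import re
from datetime import date

# Assumed pattern, inferred from the examples in this guide:
# YYYYMMDD_Project_Description_Researcher_vMAJOR-MINOR.ext
NAME_PATTERN = re.compile(
    r"^(?P<date>\d{8})"
    r"_(?P<project>[A-Za-z0-9]+)"
    r"_(?P<description>[A-Za-z0-9_]+)"
    r"_(?P<researcher>[A-Za-z]+)"
    r"_v(?P<major>\d+)-(?P<minor>\d{2})"
    r"\.(?P<ext>[A-Za-z0-9]+)$"
)


def build_file_name(day, project, description, researcher, major, minor, ext):
    """Assemble a file name from its parts (hypothetical helper)."""
    return (
        f"{day:%Y%m%d}_{project}_{description}_{researcher}"
        f"_v{major}-{minor:02d}.{ext}"
    )


def follows_convention(file_name):
    """Return True if the file name matches the assumed pattern."""
    return NAME_PATTERN.match(file_name) is not None


name = build_file_name(date(2013, 5, 3), "DOEProject", "DesignDocument",
                       "Smith", 2, 1, "docx")
print(name)
print(follows_convention(name))
print(follows_convention("Test_data_2013"))
```

Running a check like follows_convention over a project folder is one way to confirm that the documented convention is being followed consistently.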

Directory structure

Set up a clear directory structure that includes information like the project title, a date, and some type of unique identifier.

Include a README.txt file in the top-level project directory that explains the naming format and any abbreviations or codes used.

Directory structure example

Proposal (or Discovery & Planning)

  • Project_Name/
  • README.txt
  • Proposal/
    • Application
    • Data_Management_Plan
    • Ethics_approval

Data collection

  • Project_Name/
  • README.txt
  • Admin/
  • Dataset/
    • Raw_data/
    • Processed_data/
      • YYYYMMDD_Version
      • YYYYMMDD_Version
    • Metadata/
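
A layout like the data collection example can be created with a short script so that every project starts from the same structure. The sketch below is one possible approach using Python's standard pathlib module. It assumes README.txt and the listed folders sit inside Project_Name/; the function name and the placeholder README text are assumptions to adapt.

```python
from pathlib import Path

# Folders from the data collection example, relative to the project root.
DATA_COLLECTION_LAYOUT = [
    "Admin",
    "Dataset/Raw_data",
    "Dataset/Processed_data",
    "Dataset/Metadata",
]

# Placeholder README content (an assumption): replace with your own convention.
README_TEMPLATE = (
    "Project: {project_name}\n"
    "File naming format: (describe your convention here)\n"
    "Abbreviations and codes: (list them here)\n"
)


def create_project_tree(root, project_name, subfolders=DATA_COLLECTION_LAYOUT):
    """Create the project folders and a README.txt; return the project path."""
    project = Path(root) / project_name
    for sub in subfolders:
        # parents=True creates intermediate folders such as Dataset/;
        # exist_ok=True makes the script safe to run more than once.
        (project / sub).mkdir(parents=True, exist_ok=True)
    readme = project / "README.txt"
    if not readme.exists():  # never overwrite an existing README
        readme.write_text(
            README_TEMPLATE.format(project_name=project_name),
            encoding="utf-8",
        )
    return project
```

For example, create_project_tree(".", "Project_Name") would create the structure in the current working directory and leave any existing README.txt untouched.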

Data analysis

  • Project_Name/
  • README.txt
  • Data_analysis/
    • Data_cleaning/
    • Data_preprocessing/
    • Notebooks/
    • Output/
      • Graphs
      • Tables

Publications

  • Project_Name/
  • README.txt
  • Publications/
    • .tex_files
    • .bib_files

Archive

  • Project_Name/
  • README.txt
  • Archive/
    • Data_collections/
      • License/
      • Metadata/

Version control

Manage the versions of your project's dataset to ensure the integrity and validity of your work. Document a system for tracking versions, updates, and changes made, and ensure it is followed consistently.

Version control can be as simple as appending a version number or date to the end of a file name after each major edit. For example:

  • Journal_v1.0.tex, Journal_v1.2.tex
  • Journal_Oct11.tex, Journal_Dec12.tex
  • Journal_Oct11_David_DRAFT_WithSarahsEdits_NewDiagram.tex
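
The numbered scheme in the first example can be applied mechanically. As a rough sketch, assuming file names end in _vMAJOR.MINOR followed by the extension (as in Journal_v1.0.tex), the hypothetical helper below works out the next file name. It only computes the name; copying the file is left to you.

```python
import re

# Matches a trailing "_v<major>.<minor>" followed by the file extension,
# for example "_v1.0.tex". This suffix format is an assumption.
VERSION_RE = re.compile(
    r"_v(?P<major>\d+)\.(?P<minor>\d+)(?P<ext>\.[A-Za-z0-9]+)$"
)


def next_version_name(file_name, major_edit=False):
    """Return the file name for the next version (hypothetical helper)."""
    match = VERSION_RE.search(file_name)
    if match is None:
        raise ValueError(f"no _vMAJOR.MINOR suffix in {file_name!r}")
    major, minor = int(match["major"]), int(match["minor"])
    if major_edit:
        # A major edit starts a new series: 1.2 becomes 2.0.
        major, minor = major + 1, 0
    else:
        minor += 1
    stem = file_name[:match.start()]
    return f"{stem}_v{major}.{minor}{match['ext']}"


print(next_version_name("Journal_v1.0.tex"))
print(next_version_name("Journal_v1.2.tex", major_edit=True))
```

Saving each edit under a new name, instead of overwriting the previous file, keeps earlier versions available if a change needs to be undone.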

Revision (or version) control software provides access control, a collaborative work environment, synchronization between home, office and laptop computers, and a degree of data safety. Consider this software when you are working with multiple researchers, when you make many edits, or when simple version control becomes unmanageable. Options include:

  • Apache Subversion (SVN) – Centralized client-server revision control system.
  • Bitbucket – Online service with free unlimited repositories for up to 5 users. Uses Git; Mercurial support was removed in 2020.
  • Git – Distributed revision control system. Free, open-source, designed to handle projects of any size, from small to large.
  • GitHub – Online service with unlimited public repositories for an unlimited number of users. Uses Git; Subversion access was retired in 2024.
  • Mercurial – Distributed revision control system. Free, open-source.

