Edgar Y. Walker and Dimitri Yatsenko ran a two-day workshop on DataJoint sponsored by NeuroNex. The workshop was open to the public and targeted researchers with an interest in stronger data organization tools who had no prior experience working with DataJoint. Edgar led the workshop as the primary instructor.
Day one covered introductory topics, with demonstrations and exercises throughout. Day two covered hands-on applications of DataJoint, such as building a data pipeline from scratch for an existing project and extending an existing data pipeline with new analysis tables. Edgar also reviewed best practices for integrating DataJoint with other technologies such as GitHub and Docker.
All workshop materials are available at https://github.com/datajoint/neuronex_workshop_2018, and the videos are available at the following link.
• Introduction - Dimitri Yatsenko presents the background, motivation, and overview of DataJoint.
• Session 0 – Getting Connected - Edgar Y. Walker describes how to set up DataJoint to connect with a database server.
• Session 1 – Getting Started with DataJoint - Edgar Y. Walker presents background concepts of a data pipeline, and shows how to use DataJoint to define a first pipeline with manual tables. He then shows how to insert, query, and delete data in the pipeline.
• Session 2 – Imported and Computed Tables - Edgar Y. Walker shows how to extend a data pipeline with imported tables to import and store data present in external files. He then demonstrates how to define computations and their results as computed tables in the pipeline, taking a simple spike detection algorithm as an example.
• Session 3 – Design Patterns and Complex Query - Edgar Y. Walker revisits design patterns encountered in the example pipelines and discusses pipeline design. He then covers multiple advanced query exercises.
• Session 4 – Case Study 1 - Edgar Y. Walker presents a case study for designing a new data pipeline based on a specific research project.
• Session 5 – Case Study 2 - Edgar Y. Walker presents a case study where the students extend an existing data pipeline with new computations.
• Session 6 – Best Practices for Pipeline Design and Maintenance - Edgar Y. Walker presents various tools and technologies that can be used together with DataJoint to facilitate data sharing and reproducible research. In particular, Edgar covers Git, GitHub and Docker.
• Session 7 – Future Developments, Resources and Workshop Recap - Dimitri Yatsenko discusses the upcoming roadmap for DataJoint and surrounding technology development. Edgar Y. Walker then provides a recap of the workshop and discusses further learning resources.