Exploring unstructured datastores for DIRAC accounting data
Introduction
This is a shared notebook documenting our work on exploring the suitability of unstructured data stores for recording DIRAC accounting data.
Motivation
In the current implementation of the DIRAC software, accounting records (in particular, job execution records) are stored in a relational MySQL database. This database is central to the system and is used not only for storing live data about the current status of the system (which jobs are queued waiting for execution, which jobs are currently running, etc.) but also for storing historical data (which jobs a user has submitted, how much execution time each job took, where each job was executed, etc.). Historical data is exploited for several purposes, for instance for analytics and for generating the activity plots accessible through the DIRAC dashboard.
The purpose of the work documented in this space is to understand if and how unstructured data stores can be used for storing historical data, and in particular the accounting records generated by the DIRAC system.
DIRAC accounting data have several properties that make them attractive for unstructured data stores. In particular, their write-once-read-many access pattern, the amount of data accumulated over time and the use cases for exploiting them seem well suited to this kind of store. The end goal is to explore the possibility of separating the storage system used for live data from the one used for historical data.
People
This work is a collaboration between people from the BES experiment and from the IHEP computing center. Here is the list of contributors:
- ZiYan DENG: BES experiment
- Fabio HERNANDEZ: IHEP computing center and CC-IN2P3 computing center
- Gang ZHANG: master's student
Specific Topics
In this project we need to explore specific topics that get documented in their own sections:
Meetings
We meet every other week, on Thursdays at 2:30pm in Fabio's office. The goal of these meetings is to review progress, identify problems and look for solutions.
Next meeting: Date to be decided by e-mail.
Below are the summaries of the past meetings, in reverse chronological order:
- June 13th, 2013
- May 24th, 2013
- May 2nd, 2013
- April 11th, 2013
- March 14th, 2013
- January 24th, 2013
- January 10th, 2013
- December 20th, 2012
- December 6th, 2012
- November 22nd, 2012
- November 8th, 2012
- October 25th, 2012
- September 27th, 2012
- September 13th, 2012
- August 23rd, 2012
- August 9th, 2012
--
FabioHernandez - 2012-08-16