---++ %MAKETEXT{"Meeting Minutes - 2013-03-14"}%

Attendants:
   * IHEP: Ziyan, Gang, Fabio

Time: 2:30 p.m.

Secretary: Fabio

*Progress Report*

This is the first meeting after spring break. Gang reported on his work since the last meeting. Gang's [[%ATTACHURL%/discussion(3.14).ppt.pptx][presentation is attached]].

   * Gang looked at the way the aggregated job accounting records are stored in DIRAC's MySQL database. A pyramidal model is implemented, with finer granularity for recent records and coarser granularity for older records. Specifically, the records for the jobs executed in the last week are aggregated per hour. The records of the jobs executed in the last month are aggregated in 2-hour buckets. The records for the last 5 months are aggregated in buckets of 1 day. Buckets of 2 days are used for aggregating jobs of 6 months, and jobs older than 6 months are aggregated in buckets of 1 week.
   * Gang performed some tests using Cassandra to store all the fields included in individual job records. This is in contrast with the previous data schema, in which jobs were aggregated using only one grouping criterion (e.g. by site, by user, ...) in 1-day buckets. That schema proved useful for generating plots quickly with a single selection criterion, but it cannot be used for generating plots with several selection criteria because that information is not stored.
   * The Python routines for generating the plots were modified to take this new schema into account. Some plot generation tests were performed, first without additional selection criteria and then using more than one selection criterion. The results of the first type of test are presented in slide 7. The times for generating those plots (275 to 450 seconds) are too long to be usable for the portal. Generating the plots with multiple selection criteria is faster (145 to 170 seconds) but still not fast enough to be considered usable.
   * In the previous meeting we had identified a plot that was generated by the portal and looked very different from the one generated by Gang's tests. Gang looked at it again (see slide 11) using exactly the same data from the LHCb accounting records, and his plot still looks different from the one produced by the DIRAC portal.
   * We then discussed how to make progress and decided to:
      * ask the Cassandra community for guidance, to understand whether there are features of Cassandra or better ways to organize the data that could help improve the performance.
      * evaluate other kinds of data stores that can exploit the fact that the accounting records are structured (i.e. all the records contain roughly the same information). Column-oriented data stores could be an alternative to relational DBMS. Examples of those systems are MonetDB (free, open source), Vertica (commercial) and VoltDB, among others.

*Next meeting*

April 1st, 2:30 pm, Fabio's office

-- Main.FabioHernandez - 2013-03-25
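
As an illustration of the pyramidal aggregation described in the progress report, the mapping from record age to bucket length can be sketched in Python as follows. This is only a sketch and not DIRAC's actual code: the names =BUCKET_TIERS=, =bucket_length= and =bucket_start= are hypothetical, a month is assumed to be 30 days, the 2-day tier is assumed to cover records between 5 and 6 months old, and buckets are assumed to be aligned on the Unix epoch.

```python
from datetime import datetime, timedelta

# (maximum record age, bucket length), finest granularity first.
# Assumption: one "month" is 30 days; the minutes do not define it.
BUCKET_TIERS = [
    (timedelta(days=7), timedelta(hours=1)),    # last week: 1-hour buckets
    (timedelta(days=30), timedelta(hours=2)),   # last month: 2-hour buckets
    (timedelta(days=150), timedelta(days=1)),   # last 5 months: 1-day buckets
    (timedelta(days=180), timedelta(days=2)),   # up to 6 months: 2-day buckets (assumed range)
]
OLDEST_BUCKET = timedelta(weeks=1)              # older than 6 months: 1-week buckets


def bucket_length(age):
    """Return the bucket length used for a record of the given age."""
    for max_age, length in BUCKET_TIERS:
        if age <= max_age:
            return length
    return OLDEST_BUCKET


def bucket_start(end_time, now):
    """Return the start of the bucket that a job ending at end_time falls into.

    Both arguments are naive datetimes in the same time zone.
    Assumption: buckets are aligned on 1970-01-01 00:00:00.
    """
    epoch = datetime(1970, 1, 1)
    length_s = int(bucket_length(now - end_time).total_seconds())
    offset_s = int((end_time - epoch).total_seconds())
    return epoch + timedelta(seconds=offset_s - offset_s % length_s)
```

With this sketch, a job that ended one day before the meeting is assigned to a 1-hour bucket, while a job that ended three weeks earlier is assigned to a 2-hour bucket.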
Topic attachments

| *Attachment* | *History* | *Size* | *Date* | *Who* | *Comment* |
| discussion(3.14).ppt.pptx | r1 | 1175.5 K | 2013-03-25 - 08:57 | FabioHernandez | |
Topic revision: r1 - 2013-03-25 - FabioHernandez
Copyright © 2008-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.