---++ %MAKETEXT{"Meeting Minutes - 2013-05-02"}%

   * Attendants: Ziyan, Gang, Fabio
   * Time: 2:30 p.m.
   * Secretary: Fabio

*Progress Report*

Gang reported on his work since the last meeting. Gang's presentation is [[%ATTACHURL%/discussion(05.02).ppt.pptx][attached]].

   * Gang worked on becoming more familiar with MongoDB v2.4.
   * The data in the MySQL JobType table was inserted into MongoDB. It took about 14GB (compared to 5GB in MySQL). An index was also created for these data.
   * Gang explored the possibilities for indexing the data to support specific queries. The indexes required by the queries need to fit into the RAM of the MongoDB servers. MongoDB uses RAM for storing the whole index and a working set (i.e. a fraction of the data).
   * MongoDB offers some options for configuring the database in a highly available way using replica sets. A member of a replica set can be a primary, a secondary or an arbiter. The members of a replica set replicate the data among themselves and ensure automated failover. The MongoDB client (i.e. the application) knows which server is the primary. When the primary is down, a new primary is elected from among the secondary servers. In a typical installation, a replica set is composed of 3 MongoDB servers. Writes go only through the primary server, but secondary servers can serve read requests.
   * After configuring a 3-member replica set, Gang ran some tests to understand how the process of electing a new primary server works, by explicitly shutting down the primary. After about 10 seconds, a new primary server was elected and the application received the response to its query.
   * For scalability, MongoDB's approach is to create shards. In a sharded configuration, each shard stores a fraction of the data. Usually, each shard is configured as a replica set. A MongoDB cluster is composed of router processes (called mongos), which route the client application's reads and writes to the shards. There are also configuration servers, which are responsible for storing metadata about the cluster. Finally, there are the shards, which actually store the data.
   * Sharding the data is recommended when the dataset approaches the storage capacity of a node, when the working set approaches the maximum amount of RAM, or when there is a large amount of write activity.
   * For testing purposes, Gang configured a MongoDB cluster using 2 physical nodes. The cluster was configured with 2 shards, one on each machine. Each shard is a replica set with 3 copies (all the copies on the same machine).
   * For querying the cluster, MongoDB offers an aggregation framework. Gang used this framework to query the data required for reproducing a plot that shows the CPU efficiency distribution for a particular user. This kind of plot is reported in the DIRAC accounting paper provided by Ricardo, so it is a typical example of the kind of detailed analysis that a DIRAC operator would be able to perform with the job accounting data.

*Next steps*

   * Use the MongoDB cluster query mechanisms to reproduce the plots for the detailed analysis use cases reported in the DIRAC accounting paper, and measure the time needed to produce each one of them.

Next meeting: Friday May 24th, 14:30, Fabio's office

-- Main.FabioHernandez - 2013-05-22
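As an illustration of the aggregation-framework query mentioned in the progress report, here is a minimal sketch of how a CPU efficiency histogram for one user might be expressed. It is not the query Gang actually used. The field names =User=, =CPUTime= and =ExecTime=, the 10% bin width and the helper names are all assumptions made for the example. The pipeline is built as plain Python data, and a pure-Python reference function applies the same binning so that the logic can be checked without a MongoDB server.

```python
from collections import Counter


def cpu_efficiency_pipeline(user, bin_width=10):
    """Build an aggregation pipeline that histograms CPU efficiency (%) for one user.

    Field names (User, CPUTime, ExecTime) are hypothetical placeholders.
    """
    efficiency = {"$divide": [{"$multiply": ["$CPUTime", 100]}, "$ExecTime"]}
    return [
        # Keep this user's jobs only; skip jobs without wall-clock time
        # to avoid dividing by zero.
        {"$match": {"User": user, "ExecTime": {"$gt": 0}}},
        {"$project": {"eff": efficiency}},
        # Floor each efficiency to the lower edge of its bin:
        # eff - (eff mod bin_width).
        {"$project": {"bin": {"$subtract": ["$eff", {"$mod": ["$eff", bin_width]}]}}},
        # Count jobs per bin and return the bins in increasing order.
        {"$group": {"_id": "$bin", "jobs": {"$sum": 1}}},
        {"$sort": {"_id": 1}},
    ]


def reference_histogram(jobs, user, bin_width=10):
    """Pure-Python equivalent of the pipeline, for checking the binning."""
    counts = Counter()
    for job in jobs:
        if job["User"] != user or job["ExecTime"] <= 0:
            continue
        eff = job["CPUTime"] * 100 / job["ExecTime"]
        counts[eff - eff % bin_width] += 1
    return sorted(counts.items())
```

With PyMongo, such a pipeline could be passed to =collection.aggregate(pipeline)=. The stages use only operators that should be available in MongoDB v2.4 (=$match=, =$project=, =$group=, =$sort=).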
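The failover test and the timing measurements planned in the next steps both come down to measuring how long an operation takes to return, possibly after some failed attempts. The helper below is a hypothetical sketch of such a measurement, not the procedure used in the tests. The function name, the retry policy and the default timeout and pause values are assumptions. The operation is passed in as a callable, so the helper does not depend on any MongoDB driver.

```python
import time


def time_until_success(operation, retry_on=(Exception,), timeout=60.0,
                       pause=0.5, clock=time.monotonic, sleep=time.sleep):
    """Call operation() until it succeeds; return (result, elapsed_seconds, attempts).

    Exceptions listed in retry_on trigger a retry until `timeout` seconds
    have elapsed, after which the last exception is re-raised.
    """
    start = clock()
    attempts = 0
    while True:
        attempts += 1
        try:
            result = operation()
            return result, clock() - start, attempts
        except retry_on:
            if clock() - start >= timeout:
                raise
            sleep(pause)
```

When used with PyMongo, =retry_on= could be set to =pymongo.errors.AutoReconnect=, the exception the driver raises when it loses its connection to the primary. The same helper, called with a query that succeeds on the first attempt, gives the time needed to produce the data for one plot.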
Attachments

| *Attachment* | *History* | *Size* | *Date* | *Who* |
| discussion(05.02).ppt.pptx | r1 | 724.3 K | 2013-05-22 - 03:02 | FabioHernandez |
Topic revision: r1 - 2013-05-22 - FabioHernandez