
Meeting Minutes - 2011-01-05

Attendees: Wenjing, Ran, Jie, Fabio

Time: 10h30

Secretary: Fabio

  • We reviewed the activities performed since the previous meeting:
  • Ran studied the gLite User Guide to get familiar with the data management features provided by gLite and the WLCG grid. She also studied Amazon's Dynamo paper. Her next step is to get a certificate and try some of the gLite commands for managing data in the grid.
  • Wenjing started exploring Cassandra and its usage, in particular at Twitter. She also subscribed to the mailing list and noticed that the community behind this tool looks very active and that the developers are responsive to user queries. Her next step is to download and deploy Cassandra in a toy configuration to get more familiar with it, its possibilities and its limitations.
  • Jie continued studying HadoopFS and installed it on 3 virtual machines on his desktop. Hadoop does not provide the high availability features we are looking for: its Name Node, which contains the metadata, is a single point of failure. However, we decided to look more closely at the mechanisms Hadoop uses for partitioning the data into chunks and distributing the load among all the machines in the cluster. Studying this could give us ideas we could reuse if we need to partition the files into smaller chunks to fit the potential limitations of the selected data store.
  • Fabio is reading about Project Voldemort but has not yet had the opportunity to install it. Installing it is his next step.
  • Wenjing will take care of setting up and documenting the testing platform, which will initially be composed of 2 machines provided by IHEP. We agreed to run ScientificLinux 5 on several virtual machines on those 2 physical machines, so that the different people involved can run their initial tests of the products without disturbing each other. We are aware that, because we are using virtual machines, the performance figures we obtain will not be representative of what we would get on real machines. In addition, we will reconsider using Lemote machines once we have selected the data store back-end, as it is very likely that a software porting effort would be required on those machines because of the different architecture (MIPS) and operating system (Debian-like) they use.
  • We agreed to start documenting on the wiki the features of each of the products we study, in order to build a comparison matrix according to a set of criteria that we will define progressively.
  • Open questions:
    • The size of the data items we need to store in the file store ranges from hundreds of Megabytes to a few Gigabytes. It appears that the tools we are considering for implementing the file store back-end were not designed for storing data items of such size, so this may become an issue.
    • What mechanisms do the candidate back-ends implement for authentication and authorization? In the context of this project we want to be able to restrict access to the stored files to authorized individuals and to allow the end user to set permissions on the stored files (or directories).
  • Next meeting: Friday, January 21st 2011, 10h30, Fabio's office
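
To make the chunk-partitioning idea from the HadoopFS item more concrete, here is a minimal Python sketch. It only illustrates the general approach and is not how Hadoop or any of the candidate back-ends does it. It assumes the back-end behaves like a plain key/value map (a dict stands in for it here); the names split_into_chunks, chunk_key, store_file and load_file, as well as the key layout, are hypothetical.

```python
import io
from typing import BinaryIO, Dict, Iterator, Tuple


def split_into_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (index, data) pairs of at most chunk_size bytes each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    index = 0
    while True:
        data = stream.read(chunk_size)
        if not data:  # end of stream
            break
        yield index, data
        index += 1


def chunk_key(file_id: str, index: int) -> str:
    """Build the key under which one chunk is stored (hypothetical layout)."""
    return f"{file_id}/chunk-{index:08d}"


def store_file(store: Dict[str, bytes], file_id: str,
               stream: BinaryIO, chunk_size: int) -> int:
    """Write a file into the store chunk by chunk; return the chunk count."""
    count = 0
    for index, data in split_into_chunks(stream, chunk_size):
        store[chunk_key(file_id, index)] = data
        count += 1
    return count


def load_file(store: Dict[str, bytes], file_id: str, count: int) -> bytes:
    """Reassemble a file from its chunks, in index order."""
    return b"".join(store[chunk_key(file_id, i)] for i in range(count))


# Usage: a dict stands in for the real data store (assumption).
store: Dict[str, bytes] = {}
payload = bytes(range(256)) * 10          # 2560 bytes of sample data
n_chunks = store_file(store, "run42.dat", io.BytesIO(payload), chunk_size=1000)
restored = load_file(store, "run42.dat", n_chunks)
```

In a real deployment the chunk count (and probably a checksum per chunk) would have to be recorded as file metadata in the store as well, and the chunk size would be chosen to fit whatever item-size limits the selected back-end turns out to have.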

-- FabioHernandez - 2011-01-05
