Revision 1 - 2011-01-21 - FabioHernandez

Meeting Minutes - 2011-01-21

Attendees: Wenjing, Ran, Jie, Fabio

Time: 10h30

Secretary: Fabio

  • Testbed Platform
    • The testbed platform is available. It is composed of 2 physical machines, each configured to run 4 virtual machines. The details of the virtual and physical configuration are available in the TestBed page.
    • Reminder: the goal of this testbed is to have a platform for the initial evaluation of the candidate software products for building the FileStore. The products being considered at this time are: Cassandra, OpenStack's Swift, HadoopFS, Project Voldemort.
    • An initial allocation of the available machines has been proposed. It will be adjusted as we learn what is needed to evaluate each product.
    • In order to connect interactively to those machines, each person involved in the project will have an individual non-privileged account with the appropriate 'sudo' privileges to issue commands as super-user whenever required. However, it is highly desirable that routine work is performed under the non-privileged accounts.
    • We all have to provide Jie with our public SSH keys so that he can configure the SSH daemons on those machines.
  • Cassandra [Wenjing]
    • An initial installation and configuration has been performed on the testbed platform. Some tests were performed to get a basic understanding of what Cassandra can do in the context of the FileStore project.
    • The Facebook use-case, which is often cited in the Cassandra documentation, is not really relevant for our particular use-case.
    • One of the problems that we may face is related to the size of the values Cassandra is able to manage: the current understanding is that a BLOB (binary object) cannot be bigger than 64MB. This size is incompatible with our requirement to store files in the range of a few MB to a few GB of data.
    • An alternative was discussed during the meeting: splitting a file into several chunks, all managed by Cassandra. This would require some intelligence on the client side, both when storing a file into the repository (for splitting it into small pieces) and when retrieving it (for collecting the pieces in the right order).
    • In addition, there may be some restrictions related to the protocol used by Cassandra, which is based on Thrift. Thrift is apparently not designed to stream data, which is something we feel we will need if we want to store and retrieve multi-gigabyte files.
    • As an aside, it is noted that some progress has been made in the Gluster file system regarding meta-data management, which now has some replication features.
  • OpenStack's Swift [Ran]
    • Ran has read some documents very carefully and now has a good understanding of the design and possibilities of this product.
    • An installation and initial configuration has been performed but not yet finalized. Some problems were observed that remain to be understood.
    • There are some doubts about the exact role of the Proxy component of Swift.
    • As far as we understand, it is not possible with Swift to create a hierarchy of files and directories, which is desirable in a file store from the end-user point of view. This point needs to be confirmed.
    • Swift uses 2 types of authentication: the details are not yet understood.
  • HadoopFS [Jie]
    • Jie prepared a short presentation with some of the features of HadoopFS relevant for building a FileStore.
    • Unlike Cassandra and Voldemort, HadoopFS is file-oriented by design, so it has several features that are usable out-of-the-box for building a FileStore. The meta-data server, however, is a single point of failure.
    • Jie has an initial installation of it for testing purposes, in particular, as a repository of virtual machine images.
  • Project Voldemort [Fabio]
    • Fabio has spent some time reading the documentation of Project Voldemort.
    • As claimed by the developers, it is an open source Java-based implementation of Amazon's Dynamo.
    • There is noticeably less documentation than for Cassandra or HadoopFS. However, there seems to be a community behind this product and the developers seem very active in correcting bugs in a timely manner.
    • The product is used at LinkedIn, but it is worth noting that several of the developers of Voldemort work there.
    • Next step is to install it on the testbed platform.
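
Illustration of the chunking alternative discussed under Cassandra: the sketch below shows, in Python, what the client-side logic might look like when a file is split into pieces on storing and reassembled in order on retrieval. It is only a minimal sketch under stated assumptions: the key-value store is modelled as a plain dict and no Cassandra client API is used; the function names, the key layout, the manifest record and the 4MB chunk size are all hypothetical choices made for this example, not decisions taken at the meeting.

```python
import hashlib
import io

# Hypothetical chunk size, chosen only to stay well under the 64MB limit noted above.
CHUNK_SIZE = 4 * 1024 * 1024


def store_file(store, name, stream, chunk_size=CHUNK_SIZE):
    """Read `stream` piece by piece and store each piece under its own key.

    `store` is any dict-like object (an assumption standing in for the real
    key-value store). Returns the number of chunks written.
    """
    digest = hashlib.sha256()
    count = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Zero-padded index so that keys sort in the order of the pieces.
        store[f"{name}/chunk/{count:08d}"] = chunk
        digest.update(chunk)
        count += 1
    # The manifest is written last, after all chunks have been stored.
    store[f"{name}/manifest"] = {"chunks": count, "sha256": digest.hexdigest()}
    return count


def retrieve_file(store, name, out):
    """Write the chunks of `name` to `out` in order and verify the checksum."""
    manifest = store[f"{name}/manifest"]
    digest = hashlib.sha256()
    for index in range(manifest["chunks"]):
        chunk = store[f"{name}/chunk/{index:08d}"]
        digest.update(chunk)
        out.write(chunk)
    if digest.hexdigest() != manifest["sha256"]:
        raise ValueError(f"checksum mismatch for {name!r}")
```

Both functions work on file-like objects, so a multi-gigabyte file never has to be held in memory as a whole; only one chunk is in memory at a time. Whether the real store and its protocol allow this pattern efficiently is exactly what remains to be evaluated on the testbed.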

-- FabioHernandez - 2011-01-21

 