
Meeting Minutes - 2011-02-23

Attendees: Wenjing, Ran, Jie, Fabio

Time: 10h30

Secretary: Fabio

  • Testbed Platform
    • The testbed platform is available and accessible. No heavy tests yet, but it will be used to install Cassandra and Swift (see below).
  • Cassandra [Wenjing]
    • Wenjing has been exploring ways to split files (i.e. BLOBs) into chunks to cope with the intrinsic size limit that Cassandra imposes on any single value. Fabio mentioned that the team developing MongoDB (another key-value store) has written a specification named GridFS. It is a convention on how to name the keys of the files and of the chunks composing each BLOB. Although it was proposed for MongoDB, nothing in it is specific to that product and the same ideas could be implemented for other stores. It is worth having a look at it.
    • We agreed to build all the tests and prototypes in Python, so that we can easily share pieces among the members of the team.
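
As an illustration of the chunking idea discussed for Cassandra, here is a minimal sketch in Python. It is not the GridFS specification: the key layout (`<name>:meta`, `<name>:chunk:<n>`), the function names and the tiny chunk size are assumptions made for this example, and a plain dict stands in for the key-value store.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny for illustration (assumed value)


def put_blob(store, name, data, chunk_size=CHUNK_SIZE):
    """Split data into chunks, store each chunk under its own key,
    then store a small descriptor that allows reassembly."""
    count = 0
    for offset in range(0, len(data), chunk_size):
        # Hypothetical key convention: "<name>:chunk:<index>"
        store["%s:chunk:%d" % (name, count)] = data[offset:offset + chunk_size]
        count += 1
    # Hypothetical descriptor key: "<name>:meta"
    store["%s:meta" % name] = {
        "length": len(data),
        "chunk_size": chunk_size,
        "chunks": count,
        "md5": hashlib.md5(data).hexdigest(),
    }


def get_blob(store, name):
    """Fetch the descriptor, concatenate the chunks in order, verify the checksum."""
    meta = store["%s:meta" % name]
    data = b"".join(
        store["%s:chunk:%d" % (name, i)] for i in range(meta["chunks"])
    )
    if hashlib.md5(data).hexdigest() != meta["md5"]:
        raise ValueError("checksum mismatch for %r" % name)
    return data


if __name__ == "__main__":
    store = {}  # stand-in for a real key-value store
    put_blob(store, "report.pdf", b"0123456789")
    assert get_blob(store, "report.pdf") == b"0123456789"
```

Writing the descriptor last means a reader never sees a descriptor whose chunks are missing, which may matter on a store without multi-key transactions.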

  • OpenStack's Swift [Ran]
    • Ran prepared a presentation summarizing her findings on Swift. She has experienced some difficulties with Launchpad, the platform used by the Swift team to host the software.
    • Swift's proxy component is not yet fully understood, in particular its role in the system: would this component be a single point of failure?
    • It is confirmed that Swift does not include features for on-the-fly encryption of the stored data. If we want the files to be encrypted, they must be encrypted before uploading and decrypted after downloading.
    • In addition, Swift constrains how files can be organized in the store: it does not support hierarchies of arbitrary depth, a feature that is typically requested by end users. Swift allows a 3-level hierarchy composed of 'accounts', which contain 'containers', which in turn contain 'objects' (i.e. files).
    • One object stored in Swift (i.e. one file) is limited to 5 GB.
    • The features for controlling access to the files stored in Swift (such as ACLs) need to be explored.
    • Ran will study the replication mechanisms of Swift and the ways to configure them for our purposes.
  • HadoopFS [Jie]
    • No additional work performed on Hadoop since the last meeting.
  • Project Voldemort [Fabio]
    • Fabio has not installed Voldemort as initially intended. However, he spent some time understanding how to model the features necessary to build a file store by using the limited expressiveness of a key-value interface. He used Redis (an in-memory key-value store) and wrote the tests in Python. A detailed description of this work can be found here.
  • Discussion [all]
    • The tests performed by Fabio were discussed in detail. The general impression so far is that none of the stores we are evaluating has all the features desired for building the file store. One way to work around this problem is to separate the metadata (file attributes such as owner, size, creation date, access control, checksum, etc.) from the data itself and to use 2 different key-value stores, one for each. For instance, Cassandra could store the metadata and Swift the actual data.
    • With this design, we would be able to work around the hierarchy constraints imposed by Swift. The hierarchical structure would be exposed by the data stored in Cassandra (see Fabio's tests with Redis), while we would benefit from Swift's features for storing the actual BLOBs (files in our case): for instance, we should not have to develop the logic for splitting a file into chunks for storage and reassembling the chunks for retrieval.
  • Next meeting
    • Tuesday March 22nd, 10h30, Fabio's office
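
Regarding the design raised in the discussion (metadata in one store, BLOBs in another), the following minimal Python sketch shows how the two could fit together. Everything in it is an assumption made for illustration: the `FileStore` class, its method names and the metadata fields are hypothetical, and two plain dicts stand in for the metadata store (e.g. Cassandra) and the data store (e.g. Swift).

```python
import hashlib
import posixpath
import uuid


class FileStore:
    """Hierarchical namespace kept in a metadata store; BLOBs kept flat elsewhere."""

    def __init__(self, meta_store, blob_store):
        self.meta = meta_store    # path -> attributes (stand-in for Cassandra)
        self.blobs = blob_store   # object id -> bytes (stand-in for Swift)
        if "/" not in self.meta:
            self.meta["/"] = {"type": "dir", "children": []}

    def _link(self, path):
        """Register path in its parent directory, creating ancestors as needed."""
        parent = posixpath.dirname(path)
        if parent not in self.meta:
            self.meta[parent] = {"type": "dir", "children": []}
            self._link(parent)
        name = posixpath.basename(path)
        entry = self.meta[parent]
        if name not in entry["children"]:
            entry["children"].append(name)
            self.meta[parent] = entry  # explicit write-back, as a real store would need

    def put(self, path, data, owner):
        """Store the BLOB under an opaque id and its attributes under the path."""
        old = self.meta.get(path)
        object_id = uuid.uuid4().hex
        self.blobs[object_id] = data
        self.meta[path] = {
            "type": "file",
            "object_id": object_id,
            "owner": owner,
            "size": len(data),
            "sha1": hashlib.sha1(data).hexdigest(),
        }
        self._link(path)
        if old is not None and old["type"] == "file":
            del self.blobs[old["object_id"]]  # drop the BLOB replaced by this write

    def get(self, path):
        return self.blobs[self.meta[path]["object_id"]]

    def listdir(self, path):
        return sorted(self.meta[path]["children"])


if __name__ == "__main__":
    fs = FileStore({}, {})
    fs.put("/home/fabio/notes.txt", b"hello", owner="fabio")
    print(fs.listdir("/home/fabio"))
    print(fs.get("/home/fabio/notes.txt"))
```

Because the data store only ever sees opaque object ids, the depth of the directory tree is unconstrained by the 3-level layout of the data store; the sketch assumes absolute, normalized paths and ignores concurrency and access control.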

-- FabioHernandez - 2011-02-25

Topic revision: r2 - 2011-02-25 - FabioHernandez