416DAT

Update of "Project Proposal"

Overview

Artifact ID: 7cd1608148d26dd473a19457106c57856c448525
Page Name: Project Proposal
Date: 2013-02-27 20:22:18
Original User: wtzou
Parent: cbaab95de4f700aee357d04e48fe9172b67f2840
Content

Project Description

Our project is a comparative study of document store database systems. The five systems we will examine are MongoDB, OrientDB, Couchbase, Redis, and Apache Cassandra.

We will use the Yahoo! Cloud Serving Benchmark (YCSB) framework to collect performance data for each system. We will test performance by running common workloads against all of the systems with the YCSB client, an open source workload generator. YCSB already provides a predefined set of "Core" workloads, and because the client is extensible, we will also be able to define new workloads that examine performance from different angles and test various scenarios.
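As a rough illustration of what defining a new workload can look like, the sketch below renders a YCSB-style workload properties file from an operation mix. The property names (recordcount, operationcount, requestdistribution, and the *proportion keys) follow the YCSB core workload files as we understand them; the helper render_workload and the example numbers are hypothetical and are not part of YCSB or of our final configuration.

```python
def render_workload(record_count, operation_count, proportions,
                    distribution="zipfian"):
    """Render a YCSB-style workload properties file as text.

    `proportions` maps operation names (read, update, insert, scan)
    to the fraction of operations of that kind; fractions must sum to 1.
    This helper is hypothetical and only builds a string.
    """
    allowed = {"read", "update", "insert", "scan"}
    unknown = set(proportions) - allowed
    if unknown:
        raise ValueError(f"unknown operations: {sorted(unknown)}")
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"proportions sum to {total}, expected 1.0")

    lines = [
        # Assumption: class name as used by YCSB releases of this period.
        "workload=com.yahoo.ycsb.workloads.CoreWorkload",
        f"recordcount={record_count}",
        f"operationcount={operation_count}",
        f"requestdistribution={distribution}",
    ]
    # Emit every proportion explicitly so omitted operations are set to 0.
    for op in sorted(allowed):
        lines.append(f"{op}proportion={proportions.get(op, 0)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example mix (made up for illustration): read-heavy with a few updates.
    print(render_workload(1000, 10000, {"read": 0.95, "update": 0.05}))
```

Generating the files from one function keeps the record count and operation count identical across every database under test, so that only the operation mix varies between runs.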

By the end of this study, we hope to achieve a better understanding of how document store databases perform in a distributed environment and how different offerings of document stores perform relative to each other under similar conditions. We also expect to gain a deeper appreciation of the challenges behind distributed systems in the context of databases.

Goals

Final Deliverables

Internal Milestones

  1. Get a YCSB instance up and running with the core workload (Feb 11)
  2. Add YCSB client interfaces for any database that does not already have one (Feb 18)
  3. Run the core workloads on our databases locally and analyze the results (Feb 22)
  4. Configure Amazon instances to use the same resources (RAM, HDD/SSD speed, processors, caches); set up snapshots, system images, and schemes (Mar 3)
  5. Set up the databases on the instances (Mar 8)
  6. Collect data (Mar 15)
  7. Finish data analysis and the first report draft (Mar 29)
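For the data collection and analysis milestones, a small parser along the following lines could turn each run's output into numbers we can compare across databases. This is a sketch under assumptions: it expects summary lines of the form "[SECTION], Metric, value", which is how we understand the YCSB client reports its results, and the function name parse_ycsb_output and the sample values are made up for illustration rather than taken from a real run.

```python
def parse_ycsb_output(text):
    """Collect '[SECTION], Metric, value' lines into a nested dict.

    Lines that do not match that three-field shape, or whose value is
    not numeric, are skipped.
    """
    metrics = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3 or not parts[0].startswith("["):
            continue
        section = parts[0].strip("[]")
        try:
            value = float(parts[2])
        except ValueError:
            continue
        metrics.setdefault(section, {})[parts[1]] = value
    return metrics


if __name__ == "__main__":
    # Invented sample in the assumed format; not measured data.
    sample = """\
Loading workload...
[OVERALL], RunTime(ms), 10110
[OVERALL], Throughput(ops/sec), 98.9
[READ], AverageLatency(us), 1500.5
"""
    results = parse_ycsb_output(sample)
    print(results["OVERALL"]["Throughput(ops/sec)"])
```

Keeping the parsed results keyed by section and metric name makes it straightforward to line up the same metric, such as read latency, across all five systems.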

Division Of Work

Wish List (If Time Permits)

Risk Management

Because our goals are modular, we can choose not to run extended workloads or write our own workloads if the core workloads prove too complex to analyze effectively under our time constraints.

Tools, Scripts, Acknowledgements

We will be relying heavily on research papers, documentation, and the YCSB benchmark framework for our report.

Project References

Links To Reference Documents
YCSB Overview
YCSB Research Paper
YCSB Experimental Results
YCSB GitHub Source Code
YCSB - new workloads
Core Workloads
List of wiki pages for YCSB
MongoDB
OrientDB
Couchbase
Redis
Cassandra

Project Progress

Project Description and Progress