TriHUG October Meeting featuring Josh Patterson

Tuesday, October 11, 2011 from 6:30 PM to 9:30 PM (ET)

Durham, NC

Ticket Information

Type             End     Quantity
TriHUG October   Ended   Free

Event Details

Title: Lumberyard: Time Series Indexing at Scale

Abstract: 

As time series data explodes in volume in the genomic, sensor, and
financial realms [1], companies are looking for more effective ways to
store and query this data. To handle this growth in scale, system
builders are turning to the Hadoop, HBase, and NoSQL domain for
components to build on. In this talk we introduce Lumberyard [3], a
system that can potentially (1) store terabytes of time series data and
(2) allow this data to be queried interactively at low latency,
providing real-time access. Lumberyard stores iSAX [4] indexes in
HBase's multi-dimensional sorted map storage system, which gives
Lumberyard the reliability of HDFS along with the low latencies of
HBase. Our approach relies on a multidimensional indexing structure
persisted in that highly available, distributed sorted map. We present
the design of Lumberyard's implementation and illustrate the
differences between an in-memory iSAX index and a persisted,
HBase-backed iSAX index.
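
For readers new to the idea, here is a minimal sketch of how a time series might be reduced to a SAX-style word and turned into a string key suitable for a sorted map. It is not Lumberyard's code. The function names and the dot-separated binary key layout are assumptions made for illustration, and the sketch uses a single fixed cardinality, whereas iSAX allows the cardinality to vary per segment.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit variance."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series has no shape; map it to all zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width segment."""
    n = len(series)
    if n % segments != 0:
        raise ValueError("series length must be a multiple of segments")
    width = n // segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(segments)]


def breakpoints(cardinality):
    """Cut points that split the standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / cardinality) for k in range(1, cardinality)]


def sax_word(series, segments, cardinality):
    """Map each PAA segment mean to an integer symbol in [0, cardinality)."""
    cuts = breakpoints(cardinality)
    return [bisect_right(cuts, v) for v in paa(znormalize(series), segments)]


def row_key(word, cardinality):
    """Hypothetical key layout: fixed-width binary symbols joined by dots."""
    bits = (cardinality - 1).bit_length()
    return ".".join(format(symbol, f"0{bits}b") for symbol in word)


if __name__ == "__main__":
    word = sax_word([1, 2, 3, 4, 5, 6, 7, 8], segments=4, cardinality=4)
    print(word, row_key(word, cardinality=4))
```

Because the symbols are fixed-width, keys that share leading symbols sort next to each other, which is the property a sorted-map store can exploit for range scans. How Lumberyard actually lays out its keys is not described in this abstract.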

 

Sponsored by Cloudera and Bronto Software.

 

More info at www.trihug.org.

 

Bio:

 

Master’s Thesis: self-organizing mesh networks

Published in IAAI-09: TinyTermite: A Secure Routing Algorithm

 

Conceived, built, and led the Hadoop integration for the openPDC
project at TVA (smart grid work). Led a small team that designed
classification techniques for time series and MapReduce. Open source
work at http://openpdc.codeplex.com

 

Now: Sr. Solutions Architect at Cloudera