Published by: wwdt4h on Nov 21, 2010
Copyright:Attribution Non-commercial

Despite the increasing trend towards search becoming a primary way to access and
manage files, file systems do not provide any search functionality. Current file system
hierarchies can only be searched with brute-force methods, such as grep and find. Instead, a
search application that maintains its own index structures outside the file system is often
used [14,55,67]. However, search applications and the file system share the same goal:
organizing and retrieving files. Keeping file search separate from the file system leads to
consistency and efficiency issues, as all file attributes and changes must be replicated in a
separate application, which can limit performance and usability, especially at large scales.

We hypothesize that a more complete, long-term solution is to integrate search
functionality directly into the file system. Doing so eliminates the need to maintain a
secondary search application, allows file changes to be searched in real time, and allows data
organization to correspond to the need for search functionality. However, enabling effective
search within the file system poses a number of challenges. First, there must be a way to
organize file attributes internally so that they can be efficiently searched and updated.
Second, this organization must not significantly degrade performance for normal file system
workloads.

We propose two new file system designs that directly integrate search. The first,
Magellan, is a new metadata architecture for large-scale file systems that organizes the file
system’s metadata so that it can be efficiently searched. Unlike previous work, Magellan does
not use relational databases to enable search. Instead, it uses new query-optimized metadata
layout, indexing, and journaling techniques to provide search functionality and high
performance in a single metadata system. In Magellan, all metadata lookups, including
directory lookups, are handled using a single search structure, eliminating the redundant
index structures that plague existing file systems with search grafted on.
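
The idea of serving directory lookups and attribute searches from one structure can be illustrated with a small sketch. This is not Magellan's actual layout or indexing scheme, which the abstract does not describe. It is a toy in-memory inverted index, and the names FileMeta, MetadataIndex, query, and list_dir are hypothetical, introduced only for illustration. The point it shows is that listing a directory becomes an ordinary query on a "parent" attribute, so no second index is needed.

```python
from collections import defaultdict
from dataclasses import dataclass

# Attributes that receive posting sets in this toy index (an assumption).
INDEXED = ("parent", "owner")


@dataclass(frozen=True)
class FileMeta:
    path: str
    parent: str   # containing directory, stored as a plain attribute
    owner: str
    size: int


class MetadataIndex:
    """Toy single search structure over file metadata (illustrative only)."""

    def __init__(self):
        self._records = {}               # path -> FileMeta
        self._index = defaultdict(set)   # (attribute, value) -> set of paths

    def insert(self, meta):
        self.remove(meta.path)           # drop any stale record first
        self._records[meta.path] = meta
        for key in INDEXED:
            self._index[(key, getattr(meta, key))].add(meta.path)

    def remove(self, path):
        old = self._records.pop(path, None)
        if old is None:
            return
        for key in INDEXED:
            self._index[(key, getattr(old, key))].discard(path)

    def query(self, min_size=0, **equals):
        """Intersect posting sets for equality terms, then filter by size."""
        if equals:
            postings = [self._index.get(item, set()) for item in equals.items()]
            candidates = set.intersection(*postings)
        else:
            candidates = set(self._records)
        return sorted(p for p in candidates if self._records[p].size >= min_size)

    def list_dir(self, directory):
        # A directory lookup is just a query on the "parent" attribute.
        return self.query(parent=directory)


idx = MetadataIndex()
idx.insert(FileMeta("/home/a/x.txt", "/home/a", "alice", 100))
idx.insert(FileMeta("/home/a/y.bin", "/home/a", "bob", 5000))
idx.insert(FileMeta("/home/b/z.log", "/home/b", "alice", 7000))
print(idx.list_dir("/home/a"))
print(idx.query(owner="alice", min_size=1000))
```

In this sketch an update touches one structure, so a directory listing and a search can never disagree about which files exist. A real design would also need on-disk layout and journaling, which the sketch leaves out.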

The second, Copernicus, is a new semantic file system design that provides a
search-based namespace. Unlike previous semantic file systems, which were designed as naming
layers above a traditional file system or general-purpose index, Copernicus uses a dynamic,
graph-based index that stores file attributes and relationships. This graph replaces the
traditional directory hierarchy and allows the construction of dynamic namespaces. The
namespace allows “virtual” directories that correspond to a query, and allows navigation
using inter-file relationships. An evaluation of our Magellan prototype shows that it is
capable of searching millions of files in under a second, while providing metadata
performance that is comparable to, and sometimes better than, other large-scale file systems.
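
The graph-based namespace described for Copernicus can be pictured with a small sketch. This is not the Copernicus index. It assumes files are nodes with attribute dictionaries, relationships are labeled edges, and a virtual directory is a stored predicate that is re-evaluated each time it is listed. The names FileGraph, mkdir_query, listdir, and neighbors are hypothetical. For brevity the listing scans every file, whereas a real design would answer the query from an index.

```python
from collections import defaultdict


class FileGraph:
    """Toy namespace: files as nodes, relationships as labeled edges."""

    def __init__(self):
        self.attrs = {}                  # file name -> attribute dict
        self.edges = defaultdict(set)    # (file name, relation) -> related files
        self.virtual_dirs = {}           # directory name -> predicate on attributes

    def add_file(self, name, **attrs):
        self.attrs[name] = dict(attrs)

    def relate(self, src, relation, dst):
        self.edges[(src, relation)].add(dst)

    def mkdir_query(self, dirname, predicate):
        # A virtual directory stores a query, not a list of entries.
        self.virtual_dirs[dirname] = predicate

    def listdir(self, dirname):
        # Contents are computed at listing time, so they track new files.
        predicate = self.virtual_dirs[dirname]
        return sorted(n for n, a in self.attrs.items() if predicate(a))

    def neighbors(self, name, relation):
        # Navigation follows inter-file relationships instead of paths.
        return sorted(self.edges.get((name, relation), set()))


g = FileGraph()
g.add_file("report.tex", type="tex", owner="alice")
g.add_file("report.pdf", type="pdf", owner="alice")
g.relate("report.pdf", "generated_from", "report.tex")
g.mkdir_query(
    "alice-pdfs",
    lambda a: a.get("owner") == "alice" and a.get("type") == "pdf",
)
print(g.listdir("alice-pdfs"))
g.add_file("slides.pdf", type="pdf", owner="alice")
print(g.listdir("alice-pdfs"))
print(g.neighbors("report.pdf", "generated_from"))
```

The second listing includes slides.pdf without any change to the directory itself, which is the sense in which the namespace is dynamic.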
