DOSR Vision - FDIS

Digital Object Storage and Retrieval (DOSR) Vision
Josh Alspector
June 2008
Approved for Public Release, Distribution Unlimited

Disclaimer: This presentation discusses areas of technology investigation and interest. It does not relate to any existing DARPA program, nor should it be inferred to anticipate a future DARPA program.

02/10/20 Approved for Public Release, Distribution Unlimited

The Mundaneum

In 1910, Belgians Paul Otlet and future Nobel Peace Prize laureate Henri La Fontaine opened the Palais Mondial, later renamed the Mundaneum. The Mundaneum's mission was to collect metadata on every book, journal, and periodical ever published and record it in a card file system that embodied what we would call a faceted classification scheme. By 1934 it contained over 15 million entries. Unique identifiers included embedded links to related documents.

Staff responded to search requests received by post and telegraph and returned hand-copied cards by post. In 1934 Otlet conceived a global network of electric telescopes that would allow people to search and browse through interlinked documents, images, audio and motion picture recordings. He wrote that, from his armchair, everyone will hear, see, participate, will even be able to applaud, give ovations, sing in the chorus, add his cries of participation to those of all the others.

Mundaneum Infrastructure

  • Social Network Feedback
  • Telegraph and postal network
  • Human Search Engine
  • Hyper-linked Card Catalog
  • Documents, Images, Recordings

Fatal Flaw: Scalability

DOSR Vision

  • Create a resilient, distributed, scalable, and secure network of information that does not require a completely trusted or stable network of processing nodes [employ network overlays and advanced cryptographic techniques]
  • Advance the state of the art in automated metadata generation and interoperability [apply machine learning techniques]
  • Automatically get information where it is needed, or may be needed, using less bandwidth and processing [integrate user models, compact information retrieval encodings, and distributed content delivery]
  • Reliably track where information goes and where it came from [encapsulate provenance and audit information in network-maintained virtual objects]
  • Enable secure, resilient information storage, characterization, retrieval, and collaboration across barriers of time, geography, community of interest, technology, and administrative domain

Diagram labels: Videos, E-mail, Images, Web pages, Text files, Spreadsheets; Automated Metadata Generation; User and Data Models

What we can find defines what we can do

Photos courtesy of U.S. Army, U.S.

Hard Problems

  • Automated metadata extraction and generation
      • DoD has many stovepipe systems with limited metadata
      • Automatic extraction of metadata, especially from non-textual information, is an unsolved problem requiring some form of artificial intelligence
      • Email, papers, presentations, forms, and databases do not possess a community-maintained mesh of reciprocal references, so Google-like search, relevance, and ranking algorithms do not work
  • Scalable security for sharable objects
      • Decentralized (for scalability) key distribution systems present security challenges
      • Protection from known cryptographic and corruption attacks is hard; protection from unknown attacks is harder
      • Usable secure sharing (as convenient as email) is needed, or the system won't be used
      • Scalable, revocable group access to synchronized, encrypted, versioned documents is essential
  • Scalable replicated storage and parallel data distribution
      • Globally unique identifiers (GUIDs) for retrieval and update are essential, and must be unbreakable, verifiable, and afford scalable resolution of a retrievable, trackable object
      • How to track fragmented and replicated objects for persistence and provenance
      • Object replication for secure, scalable, high-bandwidth distribution (secure BitTorrent-style)
      • Enhance resiliency and service in network-poor areas
      • Respond adaptively to service degradation for high-demand data and large-scale disruptions
  • Personalization, intelligent agents and user models
      • Intelligent agents needed to locate content near likely users, based on user models
      • User models based on authorization, active input and passive tracking
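The requirement that GUIDs be verifiable can be illustrated with content-derived identifiers. This is only a sketch under a stated assumption: the presentation does not say how DOSR GUIDs would be constructed, and hashing the object's bytes with SHA-256 is one well-known approach chosen here for illustration. The names `make_guid`, `verify_object`, and `fetch_verified` are hypothetical.

```python
import hashlib
import hmac
from typing import Iterable, Tuple


def make_guid(content: bytes) -> str:
    """Derive an identifier from the object's bytes (assumed scheme: SHA-256)."""
    return "sha256-" + hashlib.sha256(content).hexdigest()


def verify_object(guid: str, content: bytes) -> bool:
    """Check that retrieved bytes really are the object the GUID names."""
    return hmac.compare_digest(make_guid(content), guid)


def fetch_verified(guid: str, replicas: Iterable[Tuple[str, bytes]]) -> Tuple[str, bytes]:
    """Return the first replica whose bytes match the GUID.

    `replicas` is a hypothetical iterable of (node_name, content) pairs,
    standing in for untrusted storage nodes.
    """
    for node_name, content in replicas:
        if verify_object(guid, content):
            return node_name, content
    raise LookupError("no replica matched the requested GUID")
```

With an identifier of this kind, a reader does not need to trust the node that served the bytes, only the GUID itself. The trade-off is that every new version of an object gets a new identifier, so a separate mechanism would be needed to map a stable object name to its latest version.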

Key Capabilities

  • Architecture and protocols
      • Protocols for exchanging objects, metadata, and security controls
      • Mobile agents and federated requests for information
      • Retrieve latest version from closest fragments or replica
  • Persistence of digital objects
      • Distribute replicas and coded fragments
      • Global, persistent, verifiable, unique identifiers (GUIDs)
      • Version-controlled, collaborative updates
  • Trust, security and provenance
      • Authorized, authenticated access
      • Decentralized encryption for scalability
      • Verifiable provenance and tracking of all objects
      • Resilience to attacks
      • Decentralized, scalable key distribution
  • Scalability
      • Scale-free architecture
      • Decentralized, peer-to-peer techniques
      • Manage latency, consistency and security as scale grows
      • Scalable resources, storage and participant networks
  • Metadata and search
      • Extract metadata from video, maps, images
      • Relevance feedback
      • Efficient federated search
  • Accessibility and User Models
      • User models include authorization, preferences, location, need-to-know
      • Content finds you without search
      • Information locally available is personally relevant
      • Needed objects migrate to local server for user

Diagram labels: Object 1; Replicas and fragments; Version 1; Version 2 update
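The idea of distributing coded fragments, as opposed to whole replicas, can be sketched with the simplest possible erasure code: split an object into k data fragments and add one XOR parity fragment, so that any single lost fragment can be rebuilt from the rest. This is an illustrative assumption, not the coding scheme the presentation proposes (it names none); a real system would likely use a stronger code that tolerates several losses. The functions `encode` and `decode` are hypothetical names.

```python
from functools import reduce
from typing import List, Optional, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> Tuple[List[bytes], int]:
    """Split data into k equal-size fragments plus one XOR parity fragment.

    Returns the k + 1 fragments and the original length (needed to strip padding).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity], len(data)


def decode(fragments: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the object; at most one fragment may be missing (None)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost fragment")
    repaired = list(fragments)
    if missing:
        present = [f for f in fragments if f is not None]
        # XOR of all surviving fragments equals the missing one.
        repaired[missing[0]] = reduce(xor_bytes, present)
    return b"".join(repaired[:-1])[:length]
```

For example, an object encoded with k = 4 yields five fragments that could be placed on five different nodes; losing any one node still allows full recovery, at a storage overhead of 25 percent rather than the 100 percent a second full replica would cost.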

Interesting Research Ongoing in

  • Automated metadata extraction
  • Decentralized, self-configuring, location and routing
  • Federated search
  • Information retrieval
  • Personalization and user models
  • Proxy re-encryption
  • Scalable security and PKI
  • Search over encrypted indexes
  • Securing resilient peer-to-peer networks

DOSR Workshop will address these areas

Preliminary Schedule

July 15 Talks

8:30 am Opening remarks - DARPA

Architecture
8:45 am Dr. Robert Kahn - keynote address
9:15 am Dr. Peter Lucas - MAYA
9:35 am Dr. Daniel Crichton - NASA
9:55 am Break

Metadata
10:15 am Dr. Ajay Divakaran - Sarnoff Corp.
10:35 am Dr. Randal Burns - JHU
10:55 am Dr. Shmuel Peleg - HU-J
11:15 am Mr. Jason Byassee - Northrop Grumman

Security
11:35 am Dr. James Allan - U. Mass-Amherst
11:55 am Dr. Rafail Ostrovsky - UCLA
12:15 pm Lunch
1:40 pm Dr. Urs Muller - Net-Scale Tech.
2:00 pm Dr. Matt Staker - IBM Research
2:20 pm Dr. Angelos Stavrou - Global InfoTek Inc.
2:40 pm Break

User Models
3:00 pm Dr. Peter Brusilovsky - U. Pittsburgh
3:20 pm Dr. Michael Walfish - UT-Austin
3:40 pm Dr. Rafael Alonso - SET Corp.
4:00 pm Mr. Peter Haglich - Lockheed Martin

July 15 Posters

4:20 pm Break
4:40 pm Poster Session 1
5:20 pm Poster Session 2
6:00 pm Adjourn

July 16 Breakouts

9:00 am Dr. Josh Alspector - DOSR vision and breakout group instructions
9:30 am Breakout group discussions
Noon Lunch
1:30 pm Brief out Group 1
2:00 pm Brief out Group 2
2:30 pm Break
2:50 pm Brief out Group 3
3:20 pm Brief out Group 4
3:45 pm Plenary Session
4:15 pm Adjourn

Levels of Success

  • DoD adopts system internally
  • Portions of system are made available for open-source uses by Apache
  • Legal, medical, and financial records management firms adopt GUIDs, protocols, and system components
  • ISPs and media companies adopt GUIDs, protocols, and system components for subscription services
  • Amazon, Google and iTunes use GUIDs and protocols

Prior Art

  • Coda (CMU)
  • Cooperative File System (MIT)
  • FARSITE (Microsoft)
  • Grid (Argonne National Laboratory)
  • Lustre (now owned by Sun Microsystems)
  • OceanStore (UC Berkeley)
  • PASIS (CMU)
  • Universal Database (Maya Design)
