Unidata THREDDS: Making Distributed Datasets More Available (and Usable) in NSDL
THematic Real-time Environmental Distributed Data Services
Ben Domenico
October 2003
Sponsored by the National Science Foundation
http://www.nsf.gov

Topics
  Traditional Unidata Approach
    Mainly meteorological data
    Subscription system pushes data to user sites
    UPC provides data analysis tools for use on data at user sites
  THREDDS Enhancements
    Broader menu of Earth system data
    Local client access from remote servers
    Less arcane, more accessible tools
    Integration of data and analysis tools into educational modules and digital libraries

Unidata Community Today
  More than 160 institutions
  Includes over 100 academic departments plus government agencies and private sector research groups
  Does not count separate installations, e.g. Spanish weather service IDD, US Weather Service radar data system
  Interdisciplinary from the outset: a 1996 survey showed over 2/3 of institutions had some uses outside meteorology (oceanography, hydrology, climatology, civil engineering, environmental science)

Impact Survey
  Over 21,000 college students per year use Unidata tools and data in classrooms and labs
  Nearly 4,000 women/minority students
  More than 1,800 faculty and research staff
  Over 55,000 K-12 students involved through Unidata-connected university programs
  Informal education: in excess of 1 million hits per day at Unidata-based university web sites
  97% of community report being satisfied or very satisfied

Principal Activities of the Unidata Program Center
  Facilitating Data Access to a broad spectrum of observations & forecasts (in near real time)
  Providing Tools to visualize, analyze, organize, receive, & share data at university sites
  Supporting Faculty who use Unidata systems at colleges & universities (most in the U.S.)
  Building and Advocating for a Community where data, tools, & best practices in education/research are shared

Traditional Unidata Data Types
  Individual observations from weather stations around the globe
  Satellite imagery
  Radar data from 150 NEXRAD radars
  Output from forecast model runs at the National Centers for Environmental Prediction
  Lightning strike data
  Measurements from sensors on commercial aircraft

[Figure: 1 km radar image]
IDD: The Community in Action
  The Internet-based system by which universities acquire huge quantities of weather data in near-real time (i.e. ASAP) typifies Unidata's community orientation.
  The system has no data center: all tasks are performed on the participants' own (small) computers.
  Currently the most used advanced application on the Abilene network (2-3% in terms of packets and bytes transferred)

Internet Data Distribution (IDD) with Multiple Sources (Injecting 17 Gigabytes per Day)
  [Diagram: multiple Source nodes and LDM nodes linked through the Internet]
  Using LDM software for instant data relaying, ~160 institutions cooperate to acquire a wide range of real-time, global, atmospheric & oceanic observations, model outputs, remotely sensed images, and more, in a coordinated community effort.
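
To put the 17 gigabytes per day figure in perspective, it can be converted to an average sustained data rate. The sketch below is only back-of-the-envelope arithmetic: it assumes decimal gigabytes (10^9 bytes) and a feed spread evenly over 24 hours, neither of which the slide states, and real IDD traffic is likely bursty. The function name and parameters are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_mbit_per_s(gigabytes_per_day, bytes_per_gigabyte=10**9):
    """Convert a daily data volume into an average rate in megabits per second.

    Assumes the volume is spread evenly across the day (an idealization).
    """
    bits_per_day = gigabytes_per_day * bytes_per_gigabyte * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


if __name__ == "__main__":
    rate = average_rate_mbit_per_s(17)
    print(f"17 GB/day averages about {rate:.2f} Mbit/s")
```

Under these assumptions the injected volume averages roughly 1.6 Mbit/s, which is consistent with the slide's point that participants can relay the feed on their own small computers.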
Typical Data Handling at a Unidata Site
  [Diagram labels]
  IDD
  Incoming data: forecast model output; satellite imagery; weather station observations; radar data; lightning, aircraft, GPSmet, etc.
  Decoders (one per data type)
  Local data decoded into application specific formats
  Application specific protocols
  Unidata users running local analysis and display tools
Thematic Data Servers (combining IDD push with several forms of pull and DL discovery)
  [Diagram labels]
  Local user applications, e.g., LAS, McIDAS, IDV, VGEE, IDL, MATLAB...
  Discovery: DLESE (Digital Library for Earth-System Education), DL interchange protocol
  Client/server data access protocols, e.g. OPeNDAP, ADDE, WCS, FTP
  Thematic servers, each with an IDD feed: hydrology data, geophysical data, satellite images/imagery...

THREDDS
  THematic Real-time Environmental Distributed Data Services
  Connecting people, documents and data
  [Diagram: People, Documents, Data]

THREDDS Overview
  National Science Digital Library (NSDL) collections project
  Integrating real-time environmental data into
    Online educational materials
    Digital libraries (DLESE, NSDL)
  Two-year grant from NSF Division of Undergraduate Education (DUE)
  Second generation under negotiation
  Led by Unidata Program Center (UPC)

THREDDS Data Providers
  University of Alabama Huntsville (Sara Graves, Rahul Ramachandran, Steve Tanner, Ken Keiser)
  ARM (Atmospheric Radiation Measurement, Chris Klaus)
  CDC, the Climate Diagnostics Center (Roland Schweitzer)
  COLA, Center for Ocean-Land-Atmosphere Studies (Joe Wielgosz)
  University of Florence (Stefano Nativi)
  GMU, George Mason University (Menas Kafatos and Ruixin Yang)
  IRI/LDEO, International Research Institute/Lamont-Doherty Earth Observatory (Benno Blumenthal)
  ESG, the Earth System Grid (Luca Cinquini, NCAR/SCD)
  IRIS DMC, Incorporated Research Institutions for Seismology Data Management Center (Rob Casey)
  NCAR, the National Center for Atmospheric Research (Don Middleton)
  NCDC, the National Climatic Data Center (Ben Watkins)
  NGDC, National Geophysical Data Center (Ted Habermann)
  NOMADS, NOAA Operational Model Archive and Distribution System (Glenn Rutledge, NCDC)
  University of Oklahoma (Kelvin Droegemeier)
  PMEL, the Pacific Marine Environmental Laboratory (Steve Hankin)
  FNMOC, Fleet Numerical Meteorology and Oceanography Center (Phil Sharfstein)
  SSEC, the Space Science and Engineering Center, U. of Wisconsin-Madison (Steve Ackerman, Tom Whittaker)
  Unidata Community ADDE servers (Tom Yoksas, Unidata Program Center)
  CIESIN (Consortium for International Earth Science Information Network, Bob Downs)
  CUAHSI (Consortium of Universities for the Advancement of Hydrologic Science, David Maidment)
  ESIG/NCAR (NCAR Environmental and Societal Impacts Group, Bob Harriss)
  Earthscope (UCAR UNAVCO, Chuck Meertens)
  GEON (GEOphysical Network, Chaitan Baru, UCSD San Diego Supercomputer Center)
  ESRI GIS Community

THREDDS Analysis/Display Tool Builders
  Data Discovery Toolkit and Foundry based on EDMI (Earth Data Multimedia Instrument, New Media Studio, Bruce Caron)
  GDS, GrADS/DODS Server (COLA, Center for Ocean-Land-Atmosphere Studies, Joe Wielgosz)
  IDV, Integrated Data Viewer (Unidata Program Center, Don Murray)
  INGRID (IRI/LDEO, International Research Institute/Lamont-Doherty Earth Observatory, Benno Blumenthal)
  LAS, Live Access Server (PMEL, the Pacific Marine Environmental Laboratory, Steve Hankin)
  VGEE, Virtual Geophysical Exploration Environment (NCAR, DLESE, U. of Illinois, Unidata, many collaborators)
  WXWISE Applets (SSEC, the Space Science and Engineering Center, U. of Wisconsin-Madison, Tom Whittaker)
  ESRI GIS Clients (ESRI, Inc., Jack Dangermond, President)
  OGC Clients (Open GIS Consortium, David Schell, President)
  MyWorld (Northwestern educational GIS Client, Danny Edelson)

THREDDS Interoperability Partners
  ADDE, Abstract Data Distribution Environment (University of Wisconsin-Madison, Tom Yoksas)
  DIMES, DIstributed MEtadata System (George Mason University, Ruixin Yang)
  DODS/OPeNDAP/Aggregation Server, Distributed Oceanographic Data System/Open source Project for a Network Data Access Protocol (University of Rhode Island, Unidata, Ethan Davis)
  DLESE, Digital Library for Earth System Education (Rajul Pandya)
  ESML, Earth System Markup Language (University of Alabama-Huntsville, Rahul Ramachandran)
  ESRI, Environmental Systems Research Institute (various)
  GCMD, Global Change Master Directory (Gene Major)
  OGC and ISO Standards (University of Florence, Stefano Nativi)
  ADL Gazetteer Services (The University of California, Santa Barbara, Linda Hill and Michael Goodchild)
  DLESE Evaluation Services (The University of Colorado CIRES, Susan Buhr)
  DLESE Data Services (Tamara Ledley)
  DLESE Program Center, Digital Library for Earth System Education (Mary Marlino)
  ESRI (Jack Dangermond, President)
  OPeNDAP (The University of Rhode Island, Open source Project for a Network Data Access Protocol, formerly DODS, Peter Cornillon)
  LAITS (Laboratory for Advanced Information Technology and Standards, Liping Di, George Mason University)
  NSDL Evaluation Services (University of Colorado, Tamara Sumner)
  OGC (Open GIS Consortium, David Schell, President)
  SWEET (Semantic Web for Earth and Environmental Terminology, Rob Raskin)

Unidata's Contributions
  A large, (inter)national, active, cooperative academic user community
  Coordination of many disparate contributors (universities, government agencies, digital libraries, commercial vendors, standards bodies)
  Reliable, automated, real-time data systems
  Platform-independent 5D visualization with HTML document integration
  Basic inventory catalog generator and server software
  Client-side catalog access modules

Funding Sources
  Unidata 2003/2008 (NSF Atmospheric Science Division)
  THREDDS NSDL Collections Grant (NSF Division of Undergraduate Education)
  DODS/OPeNDAP (University of Rhode Island subcontract on National Oceanographic Partnership Program Grant and NASA Earth Science Enterprise)
  NWS/COMET Case Studies (NOAA NWS)

The Web
  [Diagram: People, Documents, Data]
  Well-developed connections
    Document references
    Embedded multimedia
    Embedded interactive applets
  Powerful tools
    Google
    Dreamweaver
    Web-site management tools
    Web services

Data Access Technologies
  [Diagram: People, Documents, Data]
  Web-based data interactions with passive GIF images: most analysis work done on remote server
  Traditional Unidata IDD with analysis on local clients
  Combinations with Web browse and FTP delivery for local analysis
  Client/server, e.g., DODS/OPeNDAP
  All lack sophisticated, text-based Web search/discovery tools and coherent integration

THREDDS is the Bottom Line
  [Diagram: People, Documents, Data]
  Associate words of the science with available datasets
  Create compound documents pointing to datasets
  Connect analysis tools to documents and datasets
  Wide range of compound documents
    Lists of datasets available on server with brief description of dataset classes
    Online publications pointing to datasets illustrating concepts
  Massive arsenal of Web and Digital Library search/discovery tools can be applied to compound documents

THREDDS Middleware
  [Diagram: People, Documents, Data, with THREDDS Middleware between them. Labels: Discovery and Publication Tools, Catalog Generation Tools, Discovery and Publication Services, Analysis and Visualization Tools, Data Services, Catalog]

Basic Compound Document: THREDDS Server Inventory Catalog
  Inventory list of datasets on server
  Generated automatically with minimal human input
  Viewed from within analysis and display application
  Can be harvested for inclusion in GCMD, DLESE, NSDL for use by module builders
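
The idea of an inventory catalog that is generated automatically with minimal human input can be illustrated with a short sketch. It assumes the datasets are plain files under a single directory on the server and emits one XML entry per file. The function names, element names, and attributes below are hypothetical and chosen for illustration; they are not the THREDDS catalog schema or the UPC catalog generator.

```python
import os
import xml.etree.ElementTree as ET


def build_inventory_catalog(data_root, catalog_name):
    """Walk data_root and return an XML element listing every file found.

    Element and attribute names are illustrative only, not the real schema.
    """
    catalog = ET.Element("catalog", {"name": catalog_name})
    for dirpath, dirnames, filenames in os.walk(data_root):
        dirnames.sort()  # visit subdirectories in a deterministic order
        for filename in sorted(filenames):
            full_path = os.path.join(dirpath, filename)
            # Path relative to the data root, with forward slashes for URLs
            rel_path = os.path.relpath(full_path, data_root).replace(os.sep, "/")
            ET.SubElement(
                catalog,
                "dataset",
                {
                    "name": filename,
                    "urlPath": rel_path,
                    "sizeBytes": str(os.path.getsize(full_path)),
                },
            )
    return catalog


def catalog_to_string(catalog):
    """Serialize the catalog element to an XML string."""
    return ET.tostring(catalog, encoding="unicode")
```

A server could rerun such a generator on a schedule and publish the result, so that analysis tools and digital library harvesters read the same machine-generated listing.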

Enhanced Metadata Catalog

Compound Publication: Educational Module within Interactive Analysis Tool
  Discovery at DLESE
  Module at DPC
  VGEE tool at Unidata
  Datasets at NCAR
  Lends itself well to Web discovery tools, DL integration
  Can be:
    education module
    online scientific publication

Browser-based Thin Client Access
  LDEO/IRI web site publishes catalog of datasets available on server at UCAR
  Catalog resides and is updated at UCAR
  Browsing of datasets on UCAR server from LDEO server
  Also enables analysis and display of datasets on UCAR server using tools on LDEO server

Future Directions
  Standards-based web services approach to providing both data and metadata
  Integrate GIS clients and servers into THREDDS for access to societal impacts, infrastructure, hydrology data, etc.
  Work with OGC and ISO to incorporate emerging standard access protocols into THREDDS
  Actively participate in future DLESE Data Access Working Group and Data Services workshops to create more compound document educational modules

THREDDS, GIS, DL Interoperability
  [Diagram labels]
  THREDDS Client Applications
  GIS Client Applications
  OGC or proprietary GIS protocols
  OGC or OPeNDAP, ADDE, FTP protocols
  OpenGIS Protocols: WMS, WFS, WCS
  GIS Servers: demographic, infrastructure, societal impacts datasets
  THREDDS Servers: satellite, radar, forecast model output datasets
  Metadata crosswalk (shown for both GIS and THREDDS servers)
  Open Archives Initiative (OAI) Metadata Harvesting
  Digital Library Discovery Systems

Summary
  Universities have used Unidata tools to acquire, analyze, and display real-time atmospheric data for nearly 20 years
  THREDDS, along with related client/server access and display technologies, makes an even broader menu of Earth system data available to a more diverse community of users
  THREDDS technologies enable the creation of compound educational modules and scientific publications with embedded pointers to datasets and tools

More Information
  http://my.unidata.ucar.edu/
  http://www.unidata.ucar.edu/projects/THREDDS/
  [email protected]