The Arctic Boreal Vulnerability Experiment and Big Data

The Arctic Boreal Vulnerability Experiment and Big Data Analytics for Ecosystem Science and Data Management

Stephen D. Ambrose (1), Elizabeth Hoy (2), Peter Griffith (3)
(1) NASA CISTO Climate Model Data Services (CDS)
(2, 3) NASA Carbon Cycle and Ecosystems Office
NASA GSFC, Greenbelt, Maryland

Presentation Outline
- Introduction
- The ABoVE Campaign & Data Management
- NASA's High Performance Compute Capabilities: CISTO, NCCS, CDS and the ADAPT system
- Data Sets for ABoVE
- Analytics Examples
- Summary

Arctic-Boreal Vulnerability Experiment (ABoVE)
above.nasa.gov

- ABoVE is a large-scale NASA-led study of environmental change in arctic and boreal regions and the implications for ecological systems and society.
- Our overarching Science Question is: How vulnerable or resilient are ecosystems and society to environmental change in the arctic and boreal region of western North America?

The Carbon Cycle & Ecosystems Office is responsible for implementation and management of ABoVE

- Science team members work closely with the CCEO and rely upon our guidance for field operations and safety, communications with local and regional stakeholders and authorities, and utilization of ABoVE cyberinfrastructure.
- The ABoVE Science Cloud combines high performance computing with emerging technologies, such as analytics and data management tools for analyzing and processing geographic information, to create an environment specifically designed for large-scale modeling, analysis of remote sensing data, copious disk storage for big data with integrated data management, and integration of core variables from in-situ networks.

The CCE Office will assist the Science Team throughout the Data Management Lifecycle.

[Diagram: Closing the Data Life Cycle. Stages shown: Plan, Collect, Integrate (with Other Data), QA/QC, Discover, Analyze, Publish, Describe, Preserve, Archive, Data Reuse. A "Traditional Project" label marks part of the cycle. Augmented from Rüegg et al. 2014 in Front Ecol Environ.]

The ASC will surround these aspects of the data lifecycle.

[Diagram: Closing the Data Life Cycle, annotated. Labels in the order shown: Science Definition Team; CCE Office; NASA HQ; Experiment Plan; Plan; Call for Proposals; Selection of Funded Proposals/Scientists; Plan for the next Phase of ABoVE; Analytics; Collect; Identify Conventions (CF Metadata, Ameriflux, GTN-P, SensorML, Instrument Vendors, SmartPhone Apps); Other Data; Integrate (ESRI, R, MatLab, Python, IDL); Model and Observational Data; NASA DAACs; NSIDC; QA/QC; Discover; Governance; Analyze; Traditional Project; Publish; Describe; Preserve; CCE Office; Data Reuse. Augmented from Rüegg et al. 2014 in Front Ecol Environ.]

Background: CISTO Conceptual Service Layers

Computational and Information Sciences and Technology Office (CISTO) Service Layers
- Brings together the tools, data storage, and high-performance computing for timely analysis over large-scale data sets, where the data resides, to ultimately produce societal benefits
- Serving the Earth Science Community

NASA Center for Climate Simulation (NCCS)

The NCCS, a CISTO center, provides an integrated high-end computing environment designed to support the specialized requirements of climate and weather modeling.
- High-performance computing, data storage, and networking technologies
- High-speed access to petabytes of Earth Science data
- Collaborative data sharing and publication services
- Advanced Data Analytics Platform (ADAPT): High Performance Science Cloud

Current Primary Customers (NASA Climate Science)
- Global Modeling and Assimilation Office (GMAO)
- Goddard Institute for Space Studies (GISS)

High-Performance Science
http://www.nccs.nasa.gov

Advanced Data Analytics Platform (ADAPT): High Performance Science Cloud

The High Performance Science Cloud is uniquely positioned to provide data processing and analytic services for NASA Science projects.

Adjunct to the NCCS HPC environment
- Lower barrier to entry for scientists
- Customized run-time environments
- Reusable HPC/Discover hardware

Expanded customer base
- Scientist brings their analysis to the data
- Extensible storage; build and expand as needed
- Persistent data services built in virtual machines
- Create purpose-built VMs for specific science projects

Difference from a commodity cloud
- Platform-as-a-Service that comes close to matching HPC levels of performance
- Critical node-to-node communication: high speed, low latency
- Shared, high performance file system
- Management and rapid provisioning of resources

[Figure: High Performance Science Cloud Conceptual Architecture]

System Components/Configuration

- Persistent Data Services: virtual machines or containers deployed for web services; examples include ESGF, GDS, THREDDS, FTP, etc. Configuration: nodes with 128 GB of RAM, 10 GbE, and FDR IB.
- Database: highly available database nodes with solid state disk. Configuration: nodes with 128 GB of RAM, 3.2 TB of Solid State Drive (SSD), 10 GbE, and FDR IB.
- Remote Visualization: enables server-side graphical processing and rendering of data. Configuration: nodes with 128 GB of RAM, 10 GbE, FDR IB, and GPUs.
- High Performance Compute: more than 1,000 cores coupled via high speed InfiniBand networks for elastic or itinerant computing requirements. Configuration: ~100 nodes with 32 to 64 GB of RAM, and FDR IB.
- High-Speed/High-Capacity Storage: petabytes of storage accessible to all the above capabilities over the high speed InfiniBand network. Configuration: storage nodes configured with a total of about 3 PB of raw storage capacity.

NASA Climate Model Data Services: Data Publication and Distribution Services

Data Publication Services
- Web Access: for downloading small files. Protocol: HTTP.
- File Transfer Protocol (FTP): anonymous FTP supporting wget. Protocol: FTP.
- GrADS Data Server (GDS): data subsetting and analysis services. Protocol: OPeNDAP.
- Live Access Server (LAS): data subsetting and analysis services. Protocol: OPeNDAP.
- THREDDS Data Server (TDS): subsetting, analysis, and visualization. Protocol: OPeNDAP.
- Earth System Grid Federation (ESGF): data access to IPCC CMIP data. Protocol: OPeNDAP.
- Web Map Service (WMS): data publication to IPCC CMIP format. Protocol: OPeNDAP.
Additional table columns: Download, Subsetting, 2D Visualization.

ODISEA: Ontology-Driven Interactive Search Environment for ABoVE
- ABoVE's metadata search engine for project and data access
- Built in house; derived from Langley's Atmospheric Science Data Center ODISEES search engine for Earth Science
- Finds and compares variables from heterogeneous data sets

ABoVE Science Cloud
- The solution for the ABoVE computing and storage services requirement is the innovative High Performance Science Cloud
- Partnership between the CCE, CISTO, and NCCS
- Provides compute, storage, data management, and data publication for the ABoVE campaign using the HPSC
- Reduces technical overhead for ABoVE scientists
- Allows scientists to focus on science in an optimized computing environment

The conceptual architecture to support analytics:
- Data analysis platform collocating data, compute, data management, and data services
- Ease of use for scientists; customized run-time environments; agile environment
- Data storage surrounded by a compute cloud
- Large amount of data storage
- High performance compute capabilities
- Very high speed interconnects
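
The OPeNDAP-based subsetting services listed under Data Publication Services let a user request only part of a dataset instead of the whole file. A minimal sketch of how such a request can be formed, assuming the standard DAP2 constraint-expression syntax (`variable[start:stride:stop]` per dimension). The server URL and the variable name `T` are hypothetical placeholders, not actual NCCS endpoints or dataset names.

```python
def dap_subset_url(base_url, variable, index_ranges, response="ascii"):
    """Build a DAP2 request URL that selects a hyperslab of one variable.

    base_url     -- dataset URL on an OPeNDAP server (hypothetical here)
    variable     -- name of the array variable to subset (hypothetical here)
    index_ranges -- one (start, stride, stop) index triple per dimension,
                    with an inclusive stop index as in DAP2
    response     -- DAP2 response suffix, e.g. "ascii", "dods", "dds"
    """
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in index_ranges
    )
    return f"{base_url}.{response}?{variable}{hyperslab}"


# Placeholder dataset: first 12 time steps, all 42 levels, one grid point.
url = dap_subset_url(
    "https://example.org/dods/demo_dataset",  # assumption: not a real server
    "T",                                      # assumption: illustrative name
    [(0, 1, 11), (0, 1, 41), (100, 1, 100), (200, 1, 200)],
)
print(url)
```

Some clients percent-encode the square brackets before sending the request; the sketch leaves them unencoded for readability.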

Staged / Common Data Sets in the ABoVE Science Cloud

Common datasets staged for ABoVE investigators
- Staged and available for direct use
- Individual investigators don't have to locate and download
- Additional datasets can be added
- Data Management services

Preliminary staged datasets
- Landsat, surface reflectance, 123 TB
- MODIS, daily surface reflectance, 57 TB
- NGA High Resolution Imagery, 447 TB
- MERRA, GEOS-5 reanalysis, 89 TB

[Figure: ABoVE Core and Extended Domains]

ABoVE Specific CDS Services: NGA/DigitalGlobe High Resolution Commercial Satellite Imagery
- The National Geospatial-Intelligence Agency (NGA) has licensed all DigitalGlobe 31 cm satellite imagery for US Federal use, i.e., NSF, NASA, and NASA-funded projects.
- DigitalGlobe satellite fleet archive of 4.2 billion km² of data from 2000 to present
- Data from six different satellites: WorldView-1, 2 and 3; Ikonos; Quickbird; and GeoEye-1
- Access to NGA imagery (~3-4/km²) at no cost to NASA

Satellite | Bands | Nadir Panchromatic Resolution (m) | Nadir Multispectral Resolution (m)
Ikonos | Pan, R, G, B, Near IR | 0.82 | 3.2
GeoEye | Pan, R, G, B, Near IR | 0.41 | 1.65
Quickbird | Pan, R, G, B, Near IR | 0.55 | 2.16
WorldView-1 | Panchromatic only | 0.5 | N/A
WorldView-2 | Pan, R, G, B, Near IR 1, Near IR 2, Coastal, Red Edge, Yellow | 0.46 | 1.85
WorldView-3 | Same as WV-2 plus 8 SWIR bands and 12 CAVIS bands | 0.31 | 1.24

[Image: WorldView-3]

ABoVE Science Cloud DigitalGlobe Imagery: ABoVE Study Domain

Note: Imagery is included from all seasons and for years 1999-2015.

Other Sources of Data that will be used in the ABoVE Science Cloud
- Researchers will share their data using ABoVE's cyberinfrastructure and/or partnering networks
- Storage in the ASC will be tailored to meet Science Team needs
- Using ORNL DAAC best practices will facilitate data integration

NASA's Climate Analytics-as-a-Service

[Figure: Application Services]

MERRA AS Functional Use Case: how an external NASA user calculates the global monthly temperature average over 42 layers of the atmosphere for the last 30 years

[Figure: Example Download Times for 80 TB]

Analytics Application Support
- Takes in large amounts of input and creates a small amount of output
- Uses large amounts of distributed observation and model data to generate science
- Analysis applications are typically 100s of lines of code: Python, IDL, Matlab, custom
- Agile environment: users run in their own environments

Example: Decadal water predictions for the high northern latitudes for the past three decades (Mark Carroll)
- Requires 100,000+ Landsat images and about 20 TB of storage

[Image credit: M. Carroll]
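
The MERRA use case and the 80 TB download slide make the same point: the input is large and the answer is small. A minimal sketch of both sides, assuming a regular latitude/longitude grid, cosine-of-latitude area weighting, and synthetic data in place of real reanalysis fields. The function names, grid size, and link rates are illustrative assumptions, not the MERRA AS interface or figures from the slide.

```python
import numpy as np


def global_monthly_mean(temp, lat_deg):
    """Area-weighted global mean for every month and level.

    temp    -- array shaped (months, levels, lat, lon)
    lat_deg -- 1-D array of grid latitudes in degrees
    Returns an array shaped (months, levels).
    """
    # On a regular lat/lon grid, cell area scales with cos(latitude).
    weights = np.cos(np.deg2rad(lat_deg))
    zonal = temp.mean(axis=3)  # average over longitude
    return (zonal * weights).sum(axis=2) / weights.sum()


def transfer_days(size_tb, rate_mbps):
    """Days needed to move size_tb terabytes (10**12 bytes each)
    over a link sustaining rate_mbps megabits per second."""
    bits = size_tb * 1e12 * 8
    return bits / (rate_mbps * 1e6) / 86400


# Synthetic stand-in for temperature: 12 months, 42 levels, coarse grid.
rng = np.random.default_rng(0)
lat = np.linspace(-88.75, 88.75, 72)
temp = 250.0 + rng.normal(0.0, 5.0, size=(12, 42, lat.size, 144))

means = global_monthly_mean(temp, lat)
print(means.shape)  # one value per month and level

# Assumed sustained link rates, for comparison only.
for rate in (100, 1000):
    print(f"80 TB at {rate} Mbps: {transfer_days(80, rate):.1f} days")
```

For 30 years of monthly values on 42 levels, the result is 360 x 42 = 15,120 numbers, while moving 80 TB at an assumed sustained 100 Mbps would take about 74 days. That contrast is why running the analysis next to the data is attractive.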

Projects: Estimated Biomass in South Sahara

Using NGA data to estimate tree and bush biomass over the entire arid and semi-arid zone on the south side of the Sahara

Project Summary:
- Estimate carbon stored in trees and bushes in the arid and semi-arid south Sahara
- Establish a carbon baseline for later research on the expected CO2 fertilization of photosynthesis, which will first be manifested in the arid and semi-arid zones, because bushes and trees will use less water to grow, thus growing more
- Replicate the technique globally (e.g., in the Arctic, where bushes and shrubs are moving into the tundra because of warmer conditions)
- If proven successful, the method can be expanded to all arid and semi-arid areas of the planet

Principal Investigator: Compton J. Tucker, NASA Goddard Space Flight Center

[Image: NGA imagery showing automated tree and shrub recognition; labels: Shadow, Tree Crown]

ABoVE Data Management at CISTO

[Diagram labels: ArcGIS Desktop and Portal; ABoVE Users; Account Setup; User Data Upload; NCCS/ABoVE User Working Directory Space; Curated Peer Reviewed Science Products; ODISEA Data Discovery/Access System; ADAPT Research Data and Product Access; Satellite/Remote Sensing Image Raster Data Provided as Needed; Other Observation Datasets Provided as Needed; ABoVE Curated Data Pool Ready for DAACs; Data Backup Until Data Reaches DAAC; ABoVE P.I. Phase 1 & 2 Products; Pre-ABoVE Curated Products; ArcGIS Server and Portal Access; DAAC Archive; Researchers; Policy Makers; Decision Makers; Applications]

Summary: Technical Infrastructure for ABoVE

- CISTO's 10 Year Commitment to Technical Solutions for Carbon Cycle & Ecosystems
- ADAPT access (ASC), compute, storage, staged data, analytics, and technical support
- Data Management services/System Admin/User Support/Help Desk
- Metadata, catalog, DOI, and science product transition to DAACs
- ESRI ArcGIS Services for GIS (Server, Portal, Desktop)
- On-site access to NASA and NGA satellite imagery
- Data search capability (ODISEA)

The ABoVE Science Cloud is a collaboration that promises to accelerate the pace of new Arctic science for researchers participating in the field campaign. Furthermore, by using the ABoVE Science Cloud as a shared and centralized resource, researchers reduce costs for their proposed work, making proposed research more competitive. (source: CCE Office)

Thank You
Questions?
