XML and Web Services for Astronomers Roy Williams

Roy Williams, California Institute of Technology, [email protected]
Robert Brunner, University of Illinois, [email protected]
ASASS 2002, 13 October 2002, Baltimore

XML and Structured Data
  XML Syntax
  VOTable and other formats
  Transformation, Parsing, Binding

What is Markup?
Markup in a document means extra tags that define the meaning of the text.

Memorandum

From: Antonio Stradivarius
To: Domenico Scarlatti
Date: 13 April 1723

Message: Io bisogno una appartamento acoglienti a Cremona

This markup is HTML
  Memorandum
  From: Antonio Stradivarius
  To: Domenico Scarlatti
  Date: 13 April 1723
  Message: Io bisogno una appartamento acoglienti a Cremona

Structure with XML
Separation of structure from presentation:
  <Memo>
    <From>Antonio Stradivarius</From>
    <To>Domenico Scarlatti</To>
    <Date><Day>13</Day><Month>4</Month><Year>1723</Year></Date>
    <Body>Io bisogno una appartamento acoglienti a Cremona</Body>
  </Memo>

Rendering: the same date (13 4 1723) can be presented as 4/13/23, April 13, 1723, or 13.iv.1723.
Processing: the computer can read the document, for example "Find all memos from April 1723".

Why XML?
  XML is a standard way to represent structured documents, including metadata and data
  Platform neutral / Open
  Vendor supported / Vendor neutral
  Proven -- decades with SGML
  Extensible
  Syntax checking -- Explicit Schema
  Industry convergence
  Web friendly

Why XML?
  Documents and data
  Human readable, editable, mailable
  Can encode many data models
  Can encode programs too
  Many tools
    Parsers in Java, C, C++, Perl, Python, ...
    Browsers and editors
    XML databases
    Style sheets, formatting, transformation

What is Markup?
  Markup is everywhere: LaTeX, PostScript, FITS, ...
  From here we consider only XML dialects: XML, VOTable, HTML, XDF

XML Usage Model
  [Diagram labels: SQL, structured data storage, DB, XML, XSLT, presentation, forms, queries, HTML, interoperability]

Web Services
  GET, POST, SOAP

Service Workflow
  [Diagram labels: Catalog Service, Query Check Service, Users code, Query Estimator, Archive Service, Crossmatch Service, Archive Service, Storage Service]
  SOAP envelopes of XML: VOTable and other VO dialects, AND broadband binary

XML and Structured Data
  XML Syntax
  VOTable and other formats
  Transformation, Parsing, Binding

XML Syntax
  A start tag, content, and an end tag together form an element:
    <From>Antonio Stradivarius</From>
  White space is part of the content -- many applications ignore it
  Element names are case-sensitive: <From> is not <from>

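XML Syntax
  Empty element: <Tag/> is equivalent to <Tag></Tag>
  Note that HTML constructions such as <br> are not proper XML: they should be written <br/>
  Parent-child: an element nested inside another is its child
    <Date><Day>13</Day><Month>4</Month><Year>1723</Year></Date>
  Elements with the same parent are siblings
  One element has no parent: the root or document element

Attributes
  An attribute is a name-value pair inside the start tag.
    <From attribute="value">Antonio Stradivarius</From>
  Don't forget the quotes!
  Name must be unique in element
  Can use an empty element with attributes

Element Names
  Names can contain a-z, A-Z, 0-9, _ - . :
  Colon is reserved for namespaces
  Names cannot contain ` $ ^ % ; < >
  This is good XML: <...>011 33 91 55 46 23 98</...>

Text in XML
  Must escape five symbols

    Symbol   Escape
    <        &lt;
    >        &gt;
    &        &amp;
    "        &quot;
    '        &apos;

  Numeric character references reach any Unicode character:
    &#x3B8; is Greek theta θ
    Fran&#xE7;ois is François, not Francois!
  Escaped text: H &lt; 3 &amp; K &gt; 4 reads as H < 3 & K > 4, and Patrick O&apos;Reilly reads as Patrick O'Reilly
  See http://www.unicode.org
  Bulk escape through CDATA:
    <![CDATA[ H < 3 & K > 4 Patrick O'Reilly ]]>

Other stuff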

  Comments: <!-- ... -->
  Processing Instructions:
    <?myprinter color="purple" ?>
    <?robots ignore="yes" ?>
    <?xml-stylesheet type="text/xsl" href="http://us-vo.org/xml/VOTable-basic.xsl"?>

Well-formed XML
  Every start tag must have a matching end tag
  Elements may nest, but not overlap (<a><b></a></b> is wrong)
  There must be exactly one root element
  Attribute values must be quoted
  An element cannot have 2 attributes of the same name
  No comments inside tags
  No unescaped < or & in element text or attribute text (escaping > as well is the safe habit)
  Etc etc

Validation (DTD/Xschema)
  XML dialects: applications accept particular types of data
    Adobe Illustrator takes Scalable Vector Graphics ML
    VO applications take VOTable
    Browser takes Platform for Privacy Preferences ML
  Validation checks the XML file
    Against a DTD (Document Type Definition)
    Against an Xschema
  Validation is optional
  Checks if Instance is member of Class

DTD
  Inherited from the past, not itself XML
  Example from VOTable.dtd

XSchema
  XML-based document definition
  Elements can be more complex
  Type derivation and inheritance
  Occurrence constraints
    E.g. a marriage has exactly two people
  Simple data types
    For character data and attributes
    string, integer, dateTime, etc.
  Patterns
    E.g. a US phone number is xxx-xxx-xxxx
  Namespaces!

Xschema fragment

Namespaces
  "We took the table and chair dimensions, and wrote them in a table."
    Namespace = mydomain.com/furniture
    Namespace = mydomain.com/word-processing
  This is a URI (NOT a URL). A URI is a unique string. A URL is an address on the Internet.
  FITS keywords have no namespace!

Namespaces
  For reusing document definitions

Xschema Example
  [Slide: a schema (Class) shown beside a Date with the values 13, 4, 1723 (Instance)]

Xschema Example
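The Class/Instance idea can be tried with the validation API that ships with Java. What follows is only a minimal sketch: the schema text, the class name `MemoDateValidator` and the range limits on Day and Month are assumptions made for illustration, not the schema shown on the original slide.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class MemoDateValidator {
    // Hypothetical schema (the "Class"): a Date holds exactly one Day, Month and Year, in that order.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:simpleType name='DayType'><xs:restriction base='xs:int'>"
      + "<xs:minInclusive value='1'/><xs:maxInclusive value='31'/>"
      + "</xs:restriction></xs:simpleType>"
      + "<xs:simpleType name='MonthType'><xs:restriction base='xs:int'>"
      + "<xs:minInclusive value='1'/><xs:maxInclusive value='12'/>"
      + "</xs:restriction></xs:simpleType>"
      + "<xs:element name='Date'><xs:complexType><xs:sequence>"
      + "<xs:element name='Day' type='DayType'/>"
      + "<xs:element name='Month' type='MonthType'/>"
      + "<xs:element name='Year' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns true when the XML text (the "Instance") is a member of the class defined by XSD. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler installed, the first validation error is thrown as a SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<Date><Day>13</Day><Month>4</Month><Year>1723</Year></Date>"));
        System.out.println(isValid("<Date><Day>13</Day><Month>14</Month><Year>1723</Year></Date>"));
    }
}
```

A document can be well-formed and still fail here: the second call in `main` uses a Month of 14, which parses without trouble but is rejected by the schema.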

XML and Structured Data
  XML Syntax
  VOTable and other formats
  Transformation, Parsing, Binding

VOTable
  VOTable = hierarchy of Metadata + Tables
  Metadata = Parameters + Infos + Descriptions + Links + Fields
  Table = list of Fields + Data
  Data = stream of Rows
  Row = list of Cells
  Cell = Primitive, or variable-length list of Primitives, or multidimensional array of Primitives
  Primitive = integer, character, float, floatComplex, etc.

Data in VOTable
  Data expressed in XML
  Or FITS binary table
  Or BINARY format: simple format, can seek, parallelize

VOTable Stream
  STREAM can use different protocols:

Data in VOTable
  Table cell is an array of primitives

  datatype          Meaning              FITS   Bytes
  "boolean"         Logical              "L"    1
  "bit"             Bit                  "X"    *
  "unsignedByte"    Byte (0 to 255)      "B"    1
  "short"           Short Integer        "I"    2
  "int"             Integer              "J"    4
  "long"            Long Integer         "K"    8
  "char"            ASCII Character      "A"    1
  "unicodeChar"     Unicode Character           2
  "float"           Floating point       "E"    4
  "double"          Double               "D"    8
  "floatComplex"    Float Complex        "C"    8
  "doubleComplex"   Double Complex       "M"    16

Metadata in VOTable
  Column header == FIELD
    Has name, ID, unit, accuracy, etc.
    Has datatype, arraysize
    Has UCD

  UCD                   Meaning
  PHOT_INT-MAG_B        Integrated total blue magnitude
  ORBIT_ECCENTRICITY    Orbital eccentricity
  STAT_MEDIAN           Statistics Median Value
  INST_QE               Detector's Quantum Efficiency

VOTable Example

  Description of a parameter: This parameter is designed to store the observer's name
  Description of the table: Some bright stars
  Fragment of a FIELD: datatype="float" precision="F3" width="7"

VOTable Example
  Whitespace separated tokens for array of primitives:
    Procyon   114.827   5.227     4 5 3 4 3 2 1 2 3 3 5 6
    Vega      279.234   38.7828   7 8 6 8 6

VOTable Example
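Since the example markup itself is not shown here, the sketch below uses a small stand-in document to show how FIELD metadata and table cells relate. The VOTable text is heavily trimmed and hypothetical, the UCD names `POS_EQ_RA_MAIN` and `POS_EQ_DEC_MAIN` and the column names are assumptions, and the class `VOTableUcdSketch` is not part of any VOTable library. It finds a column by its UCD with plain DOM calls and splits an array cell on white space.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class VOTableUcdSketch {
    // Hypothetical, heavily trimmed VOTable: four FIELDs describing the columns, two rows of data.
    static final String VOTABLE =
        "<VOTABLE><RESOURCE><TABLE name='Stars'>"
      + "<FIELD name='Star' datatype='char' arraysize='*'/>"
      + "<FIELD name='RA' ucd='POS_EQ_RA_MAIN' unit='deg' datatype='float'/>"
      + "<FIELD name='Dec' ucd='POS_EQ_DEC_MAIN' unit='deg' datatype='float'/>"
      + "<FIELD name='Counts' datatype='int' arraysize='*'/>"
      + "<DATA><TABLEDATA>"
      + "<TR><TD>Procyon</TD><TD>114.827</TD><TD>5.227</TD><TD>4 5 3 4 3 2 1 2 3 3 5 6</TD></TR>"
      + "<TR><TD>Vega</TD><TD>279.234</TD><TD>38.7828</TD><TD>7 8 6 8 6</TD></TR>"
      + "</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the document", e);
        }
    }

    /** Index of the first FIELD whose ucd attribute equals the given UCD, or -1 if there is none. */
    static int columnForUcd(Document doc, String ucd) {
        NodeList fields = doc.getElementsByTagName("FIELD");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            if (ucd.equals(field.getAttribute("ucd"))) {
                return i;
            }
        }
        return -1;
    }

    /** Text of the cell at the given row and column (both zero-based). */
    static String cell(Document doc, int row, int column) {
        Element tr = (Element) doc.getElementsByTagName("TR").item(row);
        return tr.getElementsByTagName("TD").item(column).getTextContent();
    }

    /** Splits an array cell into its whitespace separated integer tokens. */
    static int[] intArrayCell(String text) {
        String[] tokens = text.trim().split("\\s+");
        int[] values = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Integer.parseInt(tokens[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        Document doc = parse(VOTABLE);
        int ra = columnForUcd(doc, "POS_EQ_RA_MAIN");
        int dec = columnForUcd(doc, "POS_EQ_DEC_MAIN");
        System.out.println("Vega: RA=" + cell(doc, 1, ra) + " Dec=" + cell(doc, 1, dec));
    }
}
```

Looking columns up by UCD rather than by name or position is the point of the metadata: two catalogs can call the column "RA" and "alpha" and still be matched.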

XDF (NASA Goddard)
  N-dimensional blocks
  Spatial information
  Scalar, vector fields on grid
  Tables of multidimensional data
  Spectra with their wavelength scales, images with coordinate axes, vector fields with unitDirection, data cubes in complicated spaces, tables with column headers, and series of tables with each table having a unique name

XDF Example
  Surviving fragments of the example: 01-12-99, m/s

XDF Example

AML: Astronomical Markup Language
  Standard exchange format for metadata in astronomy
    astronomical object
    article
    table
    set of tables
    image
    person
    project

AML Example
  UGC 6, MCG+04-01-013
  000309.55 +215736.4
  Seyfert_2, Sc
  0.02226
  1.1 x 0.8
  14.62
  105
  1997ApJS..108..155G, 1997ApJS..108..229H

XML and Structured Data
  XML Syntax
  VOTable and other formats
  Transformation, Parsing, Binding

XPath and XSLT
  XSL: Extensible Stylesheet Language
  XSLT: Extensible Stylesheet Language Transformations
  XML document + XSL stylesheet -> XSLT engine -> Output (HTML, LaTeX, Excel, ...)

XSLT example
  see http://us-vo.org/VOTable for details

XSLT in the browser
  <?xml-stylesheet type="text/xsl" href="http://us-vo.org/xml/VOTable-basic.xsl"?>
  First line of the XML document
  <?xml-stylesheet is a processing instruction
  Works with Netscape 7, and with IE 6 (set security to medium-low)
  see http://us-vo.org/VOTable for details

Building XSLT
  The slide annotates a stylesheet, piece by piece:
    This document is a stylesheet
    When you see this XPath, apply this template
    Copy this text: "The Memo Day is:"
    Then the text of the relevant element
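The four annotations can be tried with the XSLT engine bundled with Java. The stylesheet below is a reconstruction under assumptions, not the one from the slide: the element names (Memo, Date, Day) follow the memo used elsewhere in this talk, and the class name `MemoDayTransform` is hypothetical. The extra template for "/" is an addition that keeps the engine's built-in rules from copying the rest of the memo text into the output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class MemoDayTransform {
    // Assumed memo markup, matching the element names used in the parsing slides.
    static final String MEMO =
        "<Memo><From>Antonio Stradivarius</From><To>Domenico Scarlatti</To>"
      + "<Date><Day>13</Day><Month>4</Month><Year>1723</Year></Date>"
      + "<Body>Io bisogno una appartamento acoglienti a Cremona</Body></Memo>";

    static final String STYLESHEET =
        // "This document is a stylesheet"
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
        // Visit only the Day element, so no other memo text is copied.
      + "<xsl:template match='/'><xsl:apply-templates select='Memo/Date/Day'/></xsl:template>"
        // "When you see this XPath": copy the literal text, then the text of the element.
      + "<xsl:template match='Day'>The Memo Day is: <xsl:value-of select='.'/></xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet text to the XML text and returns the output as a string. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(MEMO, STYLESHEET));
    }
}
```

The same stylesheet could be attached with an xml-stylesheet processing instruction and run in the browser instead; the Java call is just the stand-alone XSLT engine from the diagram.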

XML Parsing with SAX
  SAX: Event-based
  Handlers for StartElement, Text, EndElement, etc.
    startElement  Memo
    startElement  From
    characters    Antonio Stradivarius
    endElement    From
    startElement  Date
    startElement  Day
    characters    13
    ...

XML Parsing with SAX
  try {
      XMLReader parser = XMLReaderFactory.createXMLReader();
      parser.setContentHandler(new myHandler());
      parser.parse("http://musicalmemos.org/strad.xml");
  } catch (SAXParseException e) {
      // document is not well-formed
  } catch (SAXException e) {
      // could not find an XMLReader
  } catch (IOException e) {
      // could not read file from net
  }

XML Parsing with SAX
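To see the event stream without a network connection, the memo can be parsed from a string. This is a sketch under assumptions: the memo markup is reconstructed from the element names in the event list, the class names are hypothetical, and it extends `DefaultHandler` and uses `SAXParserFactory` (both part of standard Java) instead of implementing every `ContentHandler` method and calling `XMLReaderFactory`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class MemoSaxSketch {
    // Assumed memo markup; the real strad.xml is not shown in the talk.
    static final String MEMO =
        "<Memo><From>Antonio Stradivarius</From><To>Domenico Scarlatti</To>"
      + "<Date><Day>13</Day><Month>4</Month><Year>1723</Year></Date>"
      + "<Body>Io bisogno una appartamento acoglienti a Cremona</Body></Memo>";

    /** Records one line per SAX event, in the order the parser reports them. */
    static class EventLogger extends DefaultHandler {
        final List<String> events = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            flushText();
            events.add("startElement " + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            flushText();
            events.add("endElement " + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver one run of text in several pieces, so buffer it.
            text.append(ch, start, length);
        }

        private void flushText() {
            String s = text.toString().trim();
            if (!s.isEmpty()) {
                events.add("characters " + s);
            }
            text.setLength(0);
        }
    }

    /** Parses the XML text and returns the list of recorded events. */
    static List<String> events(String xml) {
        try {
            EventLogger handler = new EventLogger();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the document", e);
        }
    }

    public static void main(String[] args) {
        for (String event : events(MEMO)) {
            System.out.println(event);
        }
    }
}
```

Nothing is kept in memory except what the handler chooses to store, which is why SAX suits documents too large to hold as a tree.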

  public class myHandler implements ContentHandler {
      public void startElement(String uri, String localName, String elementName, Attributes atts) { }
      public void endElement(String uri, String localName, String elementName) { }
      public void characters(char[] text, int start, int length) { }
      // + some other methods
  }

XML Parsing with DOM
  DOM: Document Object Model
  Returns a tree-like Document object with data attached

  Memo
    From: Antonio Stradivarius
    To: Domenico Scarlatti
    Date
      Day: 13
      Month: 4
      Year: 1723
    Body: Io bisogno una appartamento acoglienti a Cremona

Parsing XML with DOM
  DOMParser dp = new DOMParser();
  dp.parse("http://musicalmemos.org/strad.xml");
  Node nd = dp.getDocument().getDocumentElement();
  int count = numberOfNodes(nd);

  public int numberOfNodes(Node nd) {
      int number = 1;
      NodeList nl = nd.getChildNodes();
      for (int i = 0; i < nl.getLength(); i++)
          number += numberOfNodes(nl.item(i));   // count each child subtree
      return number;
  }

XML Binding
  Automatically makes code from DTD/XSchema
  E.g. the Date element generates
    getDay(), setDay()
    getMonth(), setMonth()
    getYear(), setYear()
  Much easier than building it with DOM

XML Binding
  Votable v = votw.getVotable();
  // just get the first resource -- there may be more that we ignore
  Resource r = null;
  if (v.getResourceCount() > 0)
      r = (Resource) v.getResourceAt(0);
  else
      ...
  // just get the first table -- there may be more that we ignore
  Table table = null;
  if (r.getTableCount() > 0)
      table = (Table) r.getTableAt(0);
  else
      ...

XML Binding
  Parsing VOTable
  Finding the RA, dec columns by UCD
    for (int i = 0; i < ...
  Also soon JAXB: java.sun.com/xml/jaxb/

Web Services for Astronomers
  What are Web Services
  Web Service Architecture

  Building Web Services
  The Future of Web Services

What are Web Services?
  Web (from Dictionary.com)
    1. A latticed or woven structure
    2. Something intricately contrived, especially something that ensnares or entangles
    3. A complex, interconnected structure or arrangement
  Shorthand for the World Wide Web

What are Web Services?
  Service (from Dictionary.com)
    1. The performance of work or duties for a superior or as a servant
    2. An act or a variety of work done for others, especially for pay
    3. Assistance; help
  Slang terms not suitable for print.

What are Web Services?
  Web Service
    Distributed Computing Model
    Self-Contained
    Modular Applications
    Platform Independent
    Language Independent
  Or
    An unpaid act of performing intricately contrived work for others that ensnares all?

Hello World
  public class HelloWorld {
      public java.lang.String getMessage() {
          return "Hello World!";
      }
      public static void main(String[] args) {
          HelloWorld hw = new HelloWorld();
          System.out.print(hw.getMessage());
      }
  }

What are Web Services?
  A Service that is accessed via the Web!

Who is in Control?
  W3C (www.w3c.org)
    WSDL
    SOAP/XML Protocol
    Web Service Activity
  OASIS (www.oasis-open.org)
    ebXML
    UDDI
  WS-I (www.ws-i.org)

W3C Web Services Activity
OASIS
WS-I

How is this different?
  RPC Model Exists!
    CORBA
    COM/DCOM
    RMI
  Web Services use XML!!!!!

Practical Examples
  Business to Business
    Inventory Records
    Bill of Lading
    Purchase Orders
  Business to Consumer
    Financial Data
    Spelling/Searching
    Product Listings
    Airline Reservations
  Google
  Amazon
  Multiple Invocations

Practical Benefits
  Programmatic Access
  Platform/Language Independent
  Compose/Distribute

What about Astronomy
  Name Resolution
    NED/SIMBAD
  Models
  Image Access
    virtualsky
  Catalog Access
  Intelligent Archive Queries
  Catalog Joins
    Cross Identification Servers
  Cone Search Profiles
    SDSS EDR Cone Search

Web Service Paradigm
  Service Oriented Programming
  Dynamically Locate Services
  Services are ON the Network
  Services can be coupled
  Multiple Transport Protocols
    HTTP, SMTP, FTP, ...

  Multiple Message Encodings
    SOAP, XML-RPC, XP(?), ...

Web Services for Astronomers
  What are Web Services
  Web Service Architecture
  Building Web Services
  The Future of Web Services

Web Service Architecture
  Three Primary Roles
    1. Service provider
    2. Service requester
    3. Service broker
  [Diagram: Broker, Provider, Requester]

Web Service Architecture
  Framework must support
    1. Publishing a Service
    2. Finding a Service
    3. Binding a Service

Web Service Lifecycle
  1. Service must be created
  2. Service must be published
  3. Service must be easily located
  4. Service must be invoked/called
  5. Service must be unpublished

Service Provider
  Creates the Service
    New Service
    Wrap Legacy Service
    Wrap Other Services
  Publishes the Service
    Registries
    Standard Hierarchies
  Supports the Web Service
  Unpublishes the Service

Service Broker
  Maintains Service Registry
  Simplifies Service Location
    Categorization
    Query Support

Service Requester
  Locates Service
  Invokes Service
    Direct Request
    Indirect Request

The Big Three
  Service Description: WSDL
    The most important; everything else derives from this
  Service Invocation: SOAP
    Dominant Communication Protocol (XML Protocol)
  Service Publication: UDDI
    Being pushed hard, but future not clear (OGSA)

Describing a Service
  Web Services Description Language (WSDL)
    http://www.w3.org/2002/ws/desc/
  XML document that provides the public interface to a Web Service
    Public Methods
    Data Type Information (IN/OUT)
    Transport Protocol Binding Information
    Service Location
  The What, Where, and How!

Invoking a Service
  Simple Object Access Protocol (SOAP)
    Although as of V1.2 SOAP is no longer an acronym
    http://www.w3.org/2000/xp/Group/
  XML protocol for exchanging messages
  Platform/Language Independent
  Different Transport Protocols (General Case)
    HTTP/HTTPR
    SMTP
    FTP
    BEEP

Publishing a Service
  Universal Description, Discovery, and Integration (UDDI)
    http://www.uddi.org (now under OASIS)
  Technical specification for building WSDL document repositories
    Documents can be published
    Documents can be searched
    Formal Hierarchy
  UDDI Registry implements the specification
    IBM, Microsoft, SAP, etc. have public Registries
    astrouddi.org (?)

Hello World (WSDL Style)

WSDL Definitions Element
  <definitions ... xmlns="http://schemas.xmlsoap.org/wsdl/">

WSDL Document Elements
  <types>      The datatypes used by the Web Service
  <message>    The abstract definition of the data being transmitted
  <portType>   The abstract operations that constitute the Web service
  <binding>    The concrete protocol and data format used by the Web service
  <port>       The address for a single communication endpoint
  <service>    An aggregation of related ports

WSDL Types
  Define the datatypes used as arguments to the Web service as well as the return values from a Web service
  Preferably XML Schema
    XSD namespace
  Must handle nillable (Java wrapper classes)
  SOAP

WSDL Types
  Map WSDL (XSD) to language (e.g., Java)

  XSD type           Java type
  xsd:boolean        boolean
  xsd:byte           byte
  xsd:double         double
  xsd:float          float
  xsd:int            int
  xsd:long           long
  xsd:short          short
  xsd:dateTime       java.util.Calendar
  xsd:decimal        java.math.BigDecimal
  xsd:hexBinary      byte[]
  xsd:base64Binary   byte[]
  xsd:QName          javax.xml.namespace.QName
  xsd:integer        java.math.BigInteger
  xsd:string         java.lang.String

WSDL Types
  Recommended approach
    Use elements, not attributes
    Only define types that refer to abstract content of messages (not protocols)
    Array types should extend the SOAP Array type
      Name scheme: ArrayOfXXX
    xsd:anyType used to represent any type

Web service Messages

  Interactions between Web service client and server are called messages
  Message element describes the messages that can be exchanged
  Logical definition of a type of message that may be used by operations listed in the portType element
    Input
    Output
    Fault
  Message Components
    Message must have a local name

Web service Messages
  Components (wsdl:message element)
    Message must have a local name
      Use WSDL Namespace
    Zero or more Part descriptions
      part name
      part type
      Arguments or return parameters
    Should follow XML Schema

Message element
  Future?

WSDL Port Types
  WSDL defines four transmission primitives (or operations) that an endpoint can support
    One-way (input element)
      The endpoint receives a request, but does not send a response.
    Request-response (input then output element)
      The endpoint receives a request, and sends a correlated response.
    Solicit-response (output then input element)
      The endpoint sends a message, and receives a correlated response.
    Notification (output element)
      The endpoint sends a message, but does not receive a response.

WSDL portType
  A portType element defines the interfaces that a Web service exposes.
  Similar to a
    Class
    Module
    Function Library
  The operations are the class/module/library methods.

WSDL Binding
  Defines message format
  For a given portType, defines the protocol for operations and messages
  Requires unique name attribute
  Type attribute is the portType QName

WSDL Services
  A port defines a single endpoint
    The port can then be used for binding
    Multiple ports can reference the same address with different protocols
  A Service consists of one or more ports
  A service defines a single serviceType

    <wsdl:port name="HelloWorld" binding="impl:HelloWorldSoapBinding">

Invoking a Service
  Use SOAP to communicate messages
    SOAP Sender to SOAP Receiver
    Potential SOAP Intermediaries
  Essentially a one-way communication between SOAP nodes
    RPC style
    Document style

SOAP Basics
  Message is wrapped in the Envelope
  Envelope consists of
    Header (optional): used by intermediaries
    Body: contains the actual message
      Document
      Service Call
  Fault Handling
    Child element of Body
    Contains Reason and Code elements

SOAP Basics
  Fault Handling (V1.2)
    Fault element is a child element of Body
      No other elements in the Body
    Contains
      Reason element (mandatory)
      Code element (mandatory)
        Standard list
      Detail element (optional)
      Node element (optional)
      Role element (optional)

SOAP Request (HelloWorld)
  POST /axis/HelloWorld.jws HTTP/1.0
  Content-Type: text/xml; charset=utf-8
  Accept: application/soap+xml, application/dime, multipart/related, text/*
  User-Agent: Axis/1.0
  Host: localhost
  Cache-Control: no-cache
  Pragma: no-cache
  SOAPAction: ""
  Content-Length: 407
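The headers are followed by the XML payload: an Envelope whose Body carries the call. As a rough sketch of that structure, built with plain DOM rather than a SOAP toolkit, the code below constructs a SOAP 1.1 envelope for the `getMessage` operation. The class name `SoapEnvelopeSketch`, the prefixes, and the operation namespace are assumptions for illustration, so the result is not byte-for-byte what Axis sends.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SoapEnvelopeSketch {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";
    // Assumed namespace for the operation; a real service defines its own in the WSDL.
    static final String SERVICE_NS = "http://localhost:8080/axis/HelloWorld.jws";

    /** Builds Envelope > Body > operation, with no Header and no arguments. */
    static Document buildRequest(String operationNs, String operation) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            Element envelope = doc.createElementNS(SOAP_NS, "soapenv:Envelope");
            envelope.setAttributeNS(XMLNS_NS, "xmlns:soapenv", SOAP_NS);
            Element body = doc.createElementNS(SOAP_NS, "soapenv:Body");
            Element call = doc.createElementNS(operationNs, "ns1:" + operation);
            call.setAttributeNS(XMLNS_NS, "xmlns:ns1", operationNs);

            doc.appendChild(envelope);
            envelope.appendChild(body);
            body.appendChild(call);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
    }

    /** Serializes the document to text, without the XML declaration. */
    static String serialize(Document doc) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(buildRequest(SERVICE_NS, "getMessage")));
    }
}
```

In practice the generated stub classes build and send this envelope, so client code never has to; the sketch only makes the wire format visible.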

SOAP Response (HelloWorld)
  HTTP/1.1 200 OK
  Content-Type: text/xml; charset=utf-8
  Connection: close
  Date: Wed, 09 Oct 2002 21:34:47 GMT
  Server: Apache Tomcat/4.0.6 (HTTP/1.1 Connector)
  Set-Cookie: JSESSIONID=8A6802F3136B882A53BC0E8E1E30F8CC;Path=/axis

  Returned message: Hello World!

Web Service Registries
  UDDI
    Currently Dominant
  Public Registries
    IBM, MS, SAP, etc.
  Private Registries

UDDI Functions
  Describe services
  Discover businesses
  Integrate business services
  The MetaData Problem

UDDI Registry
  Business
    Can have multiple Services
  Business Entity
  Service
    Has an associated specification
  Specification Pointers
    Detailed information on service
  Service Types
    Defined by a tModel
    tModel and WSDL

UDDI Registry

UDDI Private Registry
  Some development tools or products provide a private UDDI registry server
    Java WS Developer Pack
    Oracle JDeveloper
    IBM WS Toolkit
    MS VS .NET
  Greater control, no registration!

Web Services for Astronomers
  What are Web Services
  Web Service Architecture
  Building Web Services
  The Future of Web Services

Building Web Services
  Simple demonstration of deploying a Web service
    Use Java (but other options exist: .NET, Perl, Python, etc.)
    Tomcat Server
    AXIS SOAP Server

Installation & Setup
  Install Tomcat
  Deploy Axis web apps into Tomcat webapps directory
  Start Tomcat Server
  Validate AXIS Installation

Web Service Deployment
  Simple Technique (JWS)
    Copy the Java source file containing the method(s) to be exposed to the axis directory
    HelloWorld.java -> HelloWorld.jws
  Complex Technique (WSDD)
    AXIS solution
    Web Service Deployment Descriptor
  Annotations (.NET approach)
    [WebMethod]

Hello World (Java)
  public class HelloWorld {
      public java.lang.String getMessage() {
          return "Hello World!";
      }
  }

Hello World (CSharp)
  using System.Web.Services;
  public class HelloWorld : WebService {
      [WebMethod]
      public string getMessage() {
          return "Hello World!";
      }
  }

View WSDL

Web Service Client
  Generate Client Stub from WSDL
    wsdl2java tool included with AXIS
    >java org.apache.axis.wsdl.WSDL2Java http://localhost:8080/axis/HelloWorld.jws?wsdl
  Generates
    localhost\HelloWorld.java
    localhost\HelloWorldService.java
    localhost\HelloWorldServiceLocator.java
    localhost\HelloWorldSoapBindingStub.java

Utilizing the Stub Classes
  HelloWorldClient.java

  package localhost;
  public class HelloWorldClient {
      public static void main(String[] args) throws Exception {
          // Make a service
          HelloWorldService service = new HelloWorldServiceLocator();
          // Now use the service to get a stub
          HelloWorld port = service.getHelloWorld();
          System.out.println(port.getMessage());
      }
  }

AXIS Extras
  Generate Server Skeleton Stub from WSDL
    >java org.apache.axis.wsdl.WSDL2Java -s http://localhost:8080/axis/HelloWorld.jws?wsdl
  Generates
    localhost\HelloWorldSoapBindingImpl.java
  More arguments for additional functionality
  AXIS TCP Monitor

Web Services for Astronomers
  What are Web Services
  Web Service Architecture
  Building Web Services
  The Future of Web Services

Roadblocks or Speedbumps?
  Reliable Protocol Needed (HTTPR, BEEP)
  Lack of State
  Implementation Inconsistencies
    unsigned
    multipart/structures
  Security!

Reliable Protocols
  HTTPR (Reliable HTTP)
    IBM Initiative
    http://www-106.ibm.com/developerworks/library/ws-phtt/
    Adds Persistence to HTTP
  BEEP (Blocks Extensible Exchange Protocol)
    http://www.ietf.org/rfc/rfc3080.txt
    Connection-oriented
    Asynchronous interactions

DIME (Direct Internet Message Encapsulation)
  General purpose binary message format
  Enables Web services to efficiently handle multiple attachments
    Encrypted messages
    Graphics
    Multimedia content
    General Documents
  DIME Message (application/dime)
    1+ records to deliver payload
    Chunking
  http://www.ietf.org/internet-drafts/draft-nielsen-dime-02.txt

BPEL4WS
  Business Process Execution Language for Web Services
    Implementing executable business processes
    Describing non-executable abstract processes
  Merging of WSFL and XLANG
  Ugliest WS Acronym award
  Define new Web service as a composition of existing Web services
  Simplify Development/Deployment

J2EE Web Services
  Java APIs for XML
    JAX-RPC
    JAXM (SAAJ)
    JAXR
  JSR 109 - Implementing Enterprise Web Services
  JSR 110 - Java APIs for WSDL

Security
  Issues include
    Message Integrity
    Message Confidentiality
    Authentication
  Technologies include
    Secure Sockets Layer (SSL)
    Transport Layer Security (TLS)
    Message Encryption
    Digital Signatures
  But Standards !!!!!

Summary
  Web services provide a powerful programming paradigm
  Mucho Hype
  Looking for Real Applications (NVO)
  Open Grid Services Architecture (OGSA)
