CLOUD COMPUTING IN LIBRARIES Basic concepts and library applications Marshall Breeding Independent Consultant, Author, Founder and Publisher, Library Technology Guides www.librarytechnology.org/ twitter.com/mbreeding 9 Nov, 2012 Library Services in the Cloud Summary
Cloud computing in Libraries: trends related to the adoption of cloud computing technologies for library management and discovery products. Summary Cloud computing is one of the most important technology trends of the times. The phase of client/server computing is fading into obsolescence, replaced by entirely web-based systems, increasingly deployed through SaaS. Libraries and other technology-oriented organizations now have options through infrastructure-as-a-service offerings such as Amazon's Elastic Compute Cloud and Simple Storage Service to ramp up computing capabilities quickly, enjoy free access for smaller projects, and take advantage of usage-based subscription models for larger-scale production projects. Breeding expands on these
topics and provides a basic explanation of cloud computing that focuses on real advantages and disadvantages for libraries. Cloud Computing for Libraries Publication Info:
Volume 11 in The Tech Set Published by Neal-Schuman / ALA TechSource ISBN: 9781555707859 http://www.nealschuman.com/ccl Cloud computing as marketing term Cloud computing used very freely,
tagged to almost any virtualized environment Any arrangement where the library relies on some kind of remote hosting environment for major automation components Includes almost any vendor-hosted product offering Cloud computing characteristics
Web-based Interfaces Externally hosted Pricing: subscription or utility Highly abstracted computing model Provisioned on demand Scaled according to variable needs Elastic consumption of resources can contract and expand according to demand Fundamental technology shift Mainframe computing
Client/Server Cloud Computing http://www.flickr.com/photos/carrick/61952845/ http://soacloudcomputing.blogspot.com/2008/10/cloud-computing.html http://www.javaworld.com/javaworld/jw-10-2001/jw-1019-jxta.html Local Computing
Traditional model Locally owned and managed Shifting from departmental to enterprise Departmental servers co-located in central IT data centers Increasingly virtualized Virtualization The ability for multiple computing images to
simultaneously exist on one physical server Physical hardware partitioned into multiple instances using virtual machine management tools such as VMware Applicable to local, remote, and cloud models Gartner Hype Cycle 2009 Gartner Hype Cycle 2010 Gartner Hype Cycle 2011
Cloud computing layers Mobile Computing Infrastructure-as-a-service Provisioning of Equipment Servers, storage Virtual server provisioning Examples:
Amazon Elastic Compute Cloud (EC2) Amazon Simple Storage Service (S3) Rackspace Cloud (www.rackspacecloud.com/) EMC2 Atmos (www.atmosonline.com/) Web-scale computing http://googleblog.blogspot.com/2012/10/googles-data-centers-inside-look.html Amazon EC2
Amazon Machine Images (AMI) Red Hat Enterprise Linux Debian Fedora Ubuntu Linux Open Solaris Windows Server 2003/2008
Amazon Web Services Console Software-as-a-Service Complete software application, customized for customer use Software delivered through cloud infrastructure, data stored on cloud Example: Salesforce.com, widely used business
infrastructure Multi-tenant: all organizations that use the service share the same instance (codebase, hardware resources, etc.) Often partitioned to separate some groups of subscribers Types of SaaS http://www.samanage.com/blog/2011/08/not-all-saas-offerings-are-created-equal/ Application service provider
Legacy business applications hosted by software vendor Standalone application on discrete or virtualized hardware Staff and public clients accessed via the Internet Same user interfaces and functionality as if installed locally Established as a deployment model in the 1990s Can be implemented through Infrastructure-as-a-Service
Individual instances of legacy system hosted in EC2 ASP vs SaaS From: THINKstrategies: CIO's Guide to Software-as-a-Service Multi-tenant Salesforce: classic multi-tenant Salesforce.com: multi-tenant cloud infrastructure used by organizations across many industries
http://news.cnet.com/8301-30685_3-10400538-264.html Multi-Tenant vs Multi-Instance http://www.zdnet.com/blog/saas/google-apps-vs-office-365-your-choice/1357 Private vs Public http://en.wikipedia.org/wiki/File:Cloud_computing_types.svg Storage-as-a-Service Provisioned, on-demand storage
Bundled with, or separate from, other cloud services Examples: Enterprise: Amazon S3 (Simple Storage Service) Consumer: Dropbox Data as a service General opportunity to move away from library-by-library metadata management to globally
shared workflows Shared knowledge bases E-resource holdings Bibliographic services Linked data applications Key Issues
Data ownership Creative commons license Data portability across competing providers Common Library Examples Cloud computing in action Cloud computing trends for libraries Increased migration away from local
computing toward some form of remote / hosted / virtualized alternative Cloud computing especially attractive to libraries with few technology support personnel Adequate bandwidth will continue to be a limiting factor Operation of a library's Web site Fewer libraries choosing to operate their Web sites on local servers
Simple sites: Web hosting services Intermediate sites: Hosted CMS Drupal consulting firm + hosting service Complex sites Custom programming EC2 or other Infrastructure as a service Mail and Calendaring
Many libraries just use individual accounts on Gmail or similar services A more sophisticated approach uses mail services from Google, Microsoft, or others institutionally Same interface, but e-mail addresses carry the institutional domain name
Google Apps for Businesses Microsoft Exchange Online Free or low-cost for small organizations Professional levels for larger organizations Supplemental services: No advertising Back-up and recovery services Service Level agreement
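A service level agreement usually states availability as a percentage, and it can help to translate that figure into hours before comparing providers. The sketch below is a minimal illustration only: the function name and the percentages are hypothetical and are not taken from any provider's actual agreement.

```python
# Convert an SLA uptime percentage into the downtime it still permits.
# All percentages below are illustrative assumptions, not vendor terms.

HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime permitted over the period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (1 - uptime_percent / 100)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(pct)
        print(f"{pct}% uptime permits about {hours:.2f} hours of downtime per year")
```

Under these assumptions a 99.9% commitment still permits roughly 8.76 hours of outage per year, which is worth weighing against the library's opening hours.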
Document creation and collaboration Google Docs / Google Drive Microsoft Office 365 Zoho.com Concerns / Issues: Documents as official institutional records Backup and recovery process
Private or Subject to FOIA? Data in the cloud Storage as a service Informal / small-scale Dropbox (2GB+) Microsoft SkyDrive (7GB+) Mostly used as supplemental storage and for
sharing Institutional / Larger-scale Local storage still dominant When using cloud storage for institutional data Multiple tiers of backup with SLA DuraCloud, S3, many others Platform-as-a-Service
Virtualized computing environment for deployment of software Application engine, no specific server provisioning Examples: Google App Engine SDKs
for Java, Python Heroku: Ruby platform Amazon Web Services Library-specific platforms Library automation through SaaS Almost all library automation products
offered through hosted options SaaS or ASP? Data as a service SaaS provides opportunity for highly shared data models General opportunity to move away from library-by-library metadata management to globally shared workflows ILS Data Web-scale Index-based Discovery
(2009-present) Search Results Consolidated Index Search: Digital Collections Web Site Content Institutional Repositories Aggregated Content packages E-Journals Reference Sources Pre-built harvesting and indexing Repositories in the cloud
DSpace: institutional repository application Fedora: generalized repository platform DuraSpace: organization now over both DSpace and Fedora DuraCloud: shared, hosted repository platform Pilot since 2009, production in early 2011 www.duraspace.org/duracloud.php Caveats and concerns with
SaaS Libraries must have adequate bandwidth to support access to remote applications without latency Quality of service agreements that guarantee performance and reliability factors Configurability and customizability
limitations Access to APIs Ability to interoperate with 3rd party applications Maintain institutional branding Using cloud computing does not mean giving up your identity Be sure that your services are delivered through
your own URL Most cloud services support domain aliases Accomplished through DNS configuration Implemented by your network administrator Create a CNAME entry that points a subdomain associated with your library at the cloud service: s3.mylibrary.org CNAME s3.amazonaws.com. Cost implications
Total cost of ownership Do all cost components result in increased or decreased expense? Personnel costs: need less technical administration Hardware: server hardware eliminated Software costs: subscription, license, maintenance/support Indirect costs: energy costs associated with power and cooling of servers in data center
IaaS: balance elimination of hardware investments against ongoing usage fees Especially attractive for development and prototyping Personnel Distribution Local Computing
Server Administration Application maintenance Staff client software updates Operational tasks Cloud Computing Application configuration or profiling Operational tasks
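The cost components named under "Cost implications" (personnel, hardware, software, indirect costs) can be totaled side by side to see whether a move to hosted services raises or lowers overall expense. The following is a rough sketch only: the class, the function, and every dollar figure are hypothetical placeholders, and a real comparison would use the library's own budget data.

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    """Annual cost components in one currency (all figures hypothetical)."""
    personnel: float  # technical administration staff time
    hardware: float   # server purchase spread over its service life
    software: float   # subscription, license, maintenance/support
    indirect: float   # energy for power and cooling, facility overhead

    def annual_total(self) -> float:
        """Sum of all four components for one year."""
        return self.personnel + self.hardware + self.software + self.indirect


def total_cost_of_ownership(model: CostModel, years: int) -> float:
    """Multiply the annual total by the number of years considered."""
    return model.annual_total() * years


# Placeholder figures chosen only to show the mechanics of the comparison.
local = CostModel(personnel=30_000, hardware=5_000, software=20_000, indirect=3_000)
hosted = CostModel(personnel=10_000, hardware=0, software=40_000, indirect=0)

years = 5
local_tco = total_cost_of_ownership(local, years)
hosted_tco = total_cost_of_ownership(hosted, years)

if __name__ == "__main__":
    print(f"Local computing over {years} years: {local_tco:,.0f}")
    print(f"Hosted service over {years} years:  {hosted_tco:,.0f}")
```

With these made-up inputs the higher subscription fee is offset by lower personnel, hardware, and energy costs. Different inputs can just as easily reverse the result, which is why each component should be assessed rather than assumed.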
Budget Allocations Local Computing Server Purchase Server Maintenance Application software license Data Center overhead
Energy costs Facility costs Cloud Computing Annual Subscription Measured
Service? Fixed fees Factors Hosting Software Licenses Optional modules Benefits of Cloud Computing Libraries
Elimination of capital expenses for equipment Lower annual costs Redeployment of technical staff to more meaningful activities Providers / Vendors
Higher revenues relative to software-only arrangements Provision of infrastructure at scale with lower unit costs Longer-term relationships with customers Risks and concerns Privacy of data
Ownership of data Policies, regulations, jurisdictions Avoid vendor lock-in Integrity of Data Backups and disaster recovery
Opportunities for increased redundancy Required infrastructure Adequate bandwidth Web-based applications do not necessarily require the highest-performance connectivity Able to function well in remote and rural areas?
Business applications consume less bandwidth than audio or video streaming services Reliable Internet and local network infrastructure Critical paths: Users --> provider Library locations --> provider Not: users --> library Security issues
Most providers implement stronger safeguards beyond the capacity of local institutions Virtual instances equally susceptible to poor security practices as local computing Cloud computing trends for libraries Increased migration away from local
computing toward some form of remote / hosted / virtualized alternative Cloud computing especially attractive to libraries with few technology support personnel Adequate bandwidth will continue to be a limiting factor Relevant trends
No technical limitations on scalability of infrastructure General move toward ever larger implementations of automation infrastructure National infrastructure (beginning with smaller countries) US: Statewide and regional projects Resource sharing opportunities Larger instances of automation systems
or participation in multi-tenant services provide inherent resource sharing capabilities Ever larger repositories of metadata Simpler mechanisms for patron requests of items not in local collections Increased pressure Library automation vendors promoting SaaS offerings Some companies already exclusively SaaS
Software pricing increasingly favorable to SaaS Caveat Critically assess viability of the technology and its appropriateness for your organization Start with low-risk projects before making strategic commitments Questions and Discussion