Data Strategy: Survival Guide for the Information Age

Strategy for Data Governance
Replace with your name & organization
Las Vegas, February 18, 2008
Copyright 2012 Your organization

Outline
- Benefits of a data governance strategy
- Components of a data governance strategy
- Organization, roles and responsibilities
- Impact of a data governance strategy on BI and IT
- How to implement a data governance strategy program

Why you need a data governance strategy
CEO: I would like an accounting of the company's financial assets.
CFO: Uhh, let me see. I think we still have enough money in our bank accounts to cover payroll this month, and, uhh, I'm not sure if there are any outstanding accounts receivable. Uhh, and, hmm, let me think...

Why you need a data governance strategy
CEO: I would like an accounting of the company's information assets.
CIO: Uhh, let me see. I don't really have an inventory of all the data, and I'm not sure what data is in which database, or how much of that data is redundant and inconsistent. I also can't vouch for the quality of the data. Uhh, and, hmm, let me think...

Do these problems exist in your organization?
Replace with your problems

Do these problems exist in your organization?
Room for more problems and issues

Motivations for Data Governance

- SEC audits and risk of losing investors
- Risk of fines and incarceration due to inaccurate regulatory reporting
- Risk of losing customers due to poor data quality
- Loss of productivity due to excessive and uncontrolled redundancy
- Suboptimal business performance

Technology Solutions
- Enterprise Resource Planning (ERP)
- Data Warehousing (DW & BI)
- Customer Relationship Management (CRM)
- Supply Chain Management (SCM)

Data Warehousing

DW Promises
- Data integration
- No more uncontrolled data redundancy
- Consistency of data content
- Improved data quality
- Historical enterprise data
- Unlimited ad-hoc reporting
- Reliable trend analysis reporting
- Business intelligence capabilities

DW Reality
- Stove-pipe data marts and departmental data warehouses
- Continued redundancy, sometimes even increased data redundancy
- Data is still inconsistent among data marts and data warehouses (no central staging area, no reconciliation totals)
- Little improvement to data quality
- Historical data is limited to departmental views
- Limited ad-hoc reporting (too complicated, missing relationships, poor performance)
- Inconsistent trend analysis reports among data marts
- BI capabilities compromised by inconsistent and unreliable key performance indicators (KPIs)

Customer Relationship Management

CRM Promises
- Data integration
- Non-redundant customer data
- Data quality
- Increased customer satisfaction
- Product pricing customization
- Knowledge of customer wallet share

CRM Reality
- More stove-pipe systems
- Continued redundancy, more departmental views, purchased packages not integrated
- Dirty customer data continues
- Decreased customer satisfaction because of poor-quality customer data
- Wrong pricing because of departmental views, still not cross-organizational
- Privacy issues and dirty data led to government regulations

The Lesson?
You cannot keep doing what you have always done and expect the results to be different. Not even with new technology.
"That wouldn't be logical." (Spock, Star Trek)

Data Governance Defined: Consultants
- The execution and enforcement of authority over the management of data assets and the performance of data functions (Robert Seiner)
- The process by which you manage the quality, consistency, usability, security, and availability of your organization's data (Jane Griffin)
- A process and structure for formally managing information as a resource. Ensures the appropriate people representing business processes, data, and technology are involved in the decisions that affect them; includes an escalation and decision path for identifying and resolving issues, implementing changes, and communicating resulting actions (Danette McGilvray)

Data Governance Defined: Clients

- A framework of accountabilities and processes for making decisions and monitoring the execution of data management. (BMO)
- Resolving data issues using a horizontal perspective of the organization and focusing on the major pain points for our business areas. (Sallie Mae)
- Unites people, process, and technology to change the way data assets are acquired, managed, maintained, transformed into information, shared across the company as common knowledge, and consistently leveraged by the business to improve profitability. (Wachovia)

Data Governance Defined: Vendors
- The orchestration of people, process, and technology to enable the leveraging of data as an enterprise asset. It includes policies, procedures, organization, roles, and responsibilities, with associated communication and training required to design, develop, and provide ongoing support for the effort. (SAP)
- An organization-wide commitment to data quality, with data stewardship recognized as an essential business role. (DataFlux)

Data Governance Defined: Other
The execution of authority over the management of data:
- Data quality, including conformance to valid values, uniqueness, non-redundancy, completeness, accuracy, understanding, timeliness, and referential integrity
- Metadata creation and maintenance: information about data, both technical and business
- Master data management (MDM)
- Data integration
- Data categorization for performance, availability, and security

Outline
- Benefits of a data governance strategy
- Components of a data governance strategy
- Organization, roles and responsibilities
- Impact of a data governance strategy on BI and IT
- How to implement a data governance strategy program

Components of a DG strategy

- Data standardization
- Data integration
- Data modeling
- Data quality
- Metadata management
- Security and privacy
- Performance and measurement
- DBMS and product selection
- Business intelligence

Data standardization
- Formal data definitions
- Business data naming standards
- Class words lexicon
- Technical data naming standards
- Common words lexicon
- Data domain standards

Our Situation with Standardization
Insert your standardization status

Formal Data Definitions
- A data definition must reflect the real-world meaning
- A data definition explains the content and meaning of the unique data element
- A data definition must be complete enough to ensure a thorough understanding of the data element
  Example: Well Depth Feet
  Bad definition: The depth of the well in feet
  Good definition: The total depth of the well in feet from the surface of the surrounding ground to the deepest point dug or drilled regardless of the depth of the well casing.
- Data definitions are short and precise (one paragraph) and (optionally) may contain examples
- Data definitions should never contain information about the source or use of the data elements
Source: The DW Challenge by Michael Brackett

Data Naming Standards - Business
- The name of an attribute should be derived from its definition
- Attribute names are always fully spelled out
- Attribute names should have 3 components:
  - Prime word
  - Qualifiers (modifiers)
  - Class word
  Example: Checking Account Monthly Average Balance
- Attribute names should be fully qualified
- Attribute names should always end with an approved class word
- Use only class words from an approved class words lexicon
- Attribute name components should be business terms, not technical terms

Class Words Lexicon (Approved and Published)

Class word     Business data domain
Amount         Dec 9,2
Balance        Dec 13,2
Code           Char 1-5
Count          Small Int
Date           Date
Description    Varchar
Identifier     Integer
Indicator      Char 1
Name           Char 15-40
Number         Integer
Percent        Dec 5,2
Quantity       Small Int
Rate           Dec 6,4
Text           Varchar 250

Data Naming Standards - Technical
- The name of a column is composed of abbreviated attribute name components
- Use only abbreviations from an approved common words lexicon (abbreviations list)
- Column name components should always be abbreviated if an approved abbreviation exists, whether the column name is too long or not
  Example: CHKG_ACCT_MTHLY_AVG_BAL
- When column names are too long, qualifiers should be eliminated starting with the least significant qualifier, then the second least significant qualifier, etc.

Common Words Lexicon (Approved and Published Abbreviations List)

Word                      Abbreviation
Account                   ACCT
Amount                    AMT
Average                   AVG
Balance                   BAL
Checking                  CHKG
Certificate of Deposit    CD
Code                      CDE
Count                     CNT
Date                      DTE
Description               DESC
Identifier                ID
Indicator                 IND
Monthly                   MTHLY
Name                      NM
Number                    NBR
Percent                   PCT
Quantity                  QTY
Rate                      RTE
Savings                   SVG
Text                      TXT

Data Domain Standards

- Every attribute (data element) must be atomic
- Every attribute must be unique (no synonyms, no homonyms)
- Every attribute identifies or describes only one business object (entity) in the real world
- Every attribute must have business metadata (name, definition, business rules, owner, source, etc.)
- Every attribute must have a predefined data domain
- Data domains must be based on EDM data quality rules
- Business metadata and data domains are defined and maintained by business people

Data Standardization Best Practices
- Provide training in data administration principles
- Create formal data definitions
- Create fully qualified business data names
- Apply the data domain standards
- Create and use class words and common words lexicons
- Publish the data standards
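
The naming standards can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the function name `to_column_name` is hypothetical, the lexicons are copied from the slides (single-word entries only), and spelling out components that have no approved abbreviation is an assumption. Eliminating qualifiers from overlong names is not covered.

```python
# Illustrative sketch only: the helper name and the spelled-out fallback are assumptions.
CLASS_WORDS = {
    "Amount", "Balance", "Code", "Count", "Date", "Description", "Identifier",
    "Indicator", "Name", "Number", "Percent", "Quantity", "Rate", "Text",
}

# Single-word entries from the common words lexicon (abbreviations list).
ABBREVIATIONS = {
    "Account": "ACCT", "Amount": "AMT", "Average": "AVG", "Balance": "BAL",
    "Checking": "CHKG", "Code": "CDE", "Count": "CNT", "Date": "DTE",
    "Description": "DESC", "Identifier": "ID", "Indicator": "IND",
    "Monthly": "MTHLY", "Name": "NM", "Number": "NBR", "Percent": "PCT",
    "Quantity": "QTY", "Rate": "RTE", "Savings": "SVG", "Text": "TXT",
}


def to_column_name(attribute_name):
    """Turn a fully spelled-out attribute name into an abbreviated column name."""
    words = attribute_name.split()
    # Business rule from the slides: names must end with an approved class word.
    if not words or words[-1] not in CLASS_WORDS:
        raise ValueError(f"{attribute_name!r} does not end with an approved class word")
    # Abbreviate every component that has an approved abbreviation;
    # components without one are kept spelled out (assumption).
    return "_".join(ABBREVIATIONS.get(word, word.upper()) for word in words)


print(to_column_name("Checking Account Monthly Average Balance"))
```

Run against the example attribute from the slides, the sketch reproduces the column name CHKG_ACCT_MTHLY_AVG_BAL.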

Standardization: What we need to do
Enter your proposed actions

Data Integration
- Look for potential duplicate entities by examining:
  - Entity definitions
  - Semantic intent
  - Entity content
- Ensure that each entity has one unique business identifier
- Put one fact (attribute) in one place (entity) using the normalization rules
- Look for potential duplicate attributes by examining:
  - Attribute definitions
  - Semantic intent
  - Domains
- Capture real world business actions between entities as data relationships (not reporting patterns)

Single Version of The Truth
[Figure: entity-relationship diagram based on normalization rules. Labels recoverable from the diagram: Customer, Existing Customer, Potential Customer, Account, Payment, Method, Product, Product Category, Order, Part, Salesperson, Salaried Salesperson, Commissioned Salesperson, Org Unit, Org Structure, Supplier, Shipment, Warehouse.]

Unstructured data
- Storage and administration
- Enterprise content management systems (ECMS)
- Check-in and check-out functionality
- Retention and archiving
- Backup and recovery
- Secure objects
- Content reusability
- Search and delivery
- Combining structured and unstructured data

Data Integration Best Practices
- Determine data integration benefits and costs
- Create an inventory of all your data
- Use logical data modeling and normalization rules to find and remove synonyms and homonyms
- Use a metadata repository to document the names and definitions of your business data
- Don't forget to integrate unstructured data with structured data

Data Integration: Our Status
Focus on the important data such as customer, supplier, agents, inventory, parts, loans, or whatever it is that runs your business. Include examples of where you are integrated and where not.

Data Integration: This is what we need to do

Enter your integration actions

Data modeling

Logical Data Model
- Business view of data
- Process independent
- Project-specific model
- Business model

Enterprise Data Model
- Business view of data
- Process independent
- Enterprise-wide model
- Enterprise information architecture

Physical Data Model
- Database model
- Database view of data
- Process dependent
- Database-specific model

Data Modeling: Our Situation

Logical Data Model
Captures what an organization is and what it does in terms of:
- Business objects (entities)
- Business data (attributes)
- Business activities (relationships)
- Business rules (metadata)
- Business policies (metadata)
Not tailored for:
- Query or reporting pattern or tool
- Access or storage requirements
- Performance

Process Independence
- Access path independent
- Program independent
- Query / report independent
- Database independent
- Tool independent (OLAP)
- Language independent
- Platform independent

Purpose of Logical Data Modeling
- Facilitate data integration
- Facilitate business analysis
- Facilitate communication among business people
- Improve productivity through reusability
- Focus on data ownership as opposed to system ownership
- Bring data quality problems to the surface
- Separate process logic from data
- Serve as the baseline data architecture for database design

Enterprise Data Model: Single Version of the Truth

[Figure: enterprise data model diagram presenting an integrated 360° business view, supported by common data definitions, domains, and business rules. Labels recoverable from the diagram: Customer, Existing Customer, Potential Customer, Account, Payment, Method, Product, Product Category, Order, Part, Salesperson, Salaried Salesperson, Commissioned Salesperson, Org Unit, Org Structure, Supplier, Shipment, Warehouse.]

Physical Data Model
Database design based on physical attributes:
- Access patterns
- Size of tables
- Number of business users
- Location of business users
- Platform (processor, DBMS)
- OLAP tools
Tailored for:
- Query or reporting pattern or tool
- Access and storage requirements
- Performance

Process Dependent
- Access path dependent
- Program dependent
- Query / report dependent
- Database dependent
- Tool dependent (OLAP)
- Language dependent
- Platform dependent

Purpose of Physical Data Modeling
- Facilitate database design
- Focus on performance
- Architect database structures:

  - Tables
  - Columns
  - Primary keys
  - Foreign keys
  - Referential integrity rules

Data Modeling Best Practices
- Always create a logical business data model; do not just focus on database modeling
- Sell the importance of creating an enterprise information architecture (enterprise data model) to management
- Assign data modeling responsibilities (the enterprise data model should not be created by database designers)
- Create a process to link the physical data models to the enterprise data model

Data Modeling: This is what we need to do
Enter your proposed data modeling actions

Data quality
At what level of DQ maturity is your organization?
[Figure: five-level data quality maturity scale (based on CMM), running from short term to long term: 1 Uncertainty, 2 Awakening, 3 Enlightenment, 4 Wisdom, 5 Certainty. Activities labeled on the figure: discovery by accident, program abends, limited data analysis, correcting source data and programs, data profiling, data cleansing, addressing root causes, enterprise-wide DQ methods & techniques, proactive prevention, continuous process improvements, optimization.]

Data quality costs
Direct Costs of Non-Quality Information: Marketing Campaign

Item                                    Per instance   Number of instances   Number per year   Total cost per year
Time ($60/hour loaded rate):
  Creating redundant occurrence         2.4 min        167,141                1                $  401,138
  Researching correct address           10 min         5,000/mo              12                $  600,000
  Correcting address errors             0.3 min        6,000/mo              12                $   21,600
  Handling complaints from customers    5.5 min        974/yr                 1                $    5,357
  Mail preparation                      0.1 min        393,273                4                $  157,309
Materials, Facilities, Equipment:
  Marketing brochure                    $1.96          393,273                4                $3,083,260
  Postage                               $0.52          393,273                4                $  818,008
  Warehouse storage                     $0.01          393,273                4                $   15,731
  Shipping equipment and maintenance    $5,000/yr      36%                    1                $    1,800
Computing resources:
  CPU transactions                      $0.02/trans    393,273                4                $   31,462
  Data storage                          $0.001/mo      393,273               12                $    4,719
  Data backup                           $0.005/mo      393,273               12                $   23,596
Total Annual Costs                                                                             $5,163,980

Source: Larry English, Improving DW and BI Quality
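
The arithmetic behind the direct-cost table can be checked with a short script. This is a sketch under stated assumptions: each row's annual cost is taken to be cost per instance × number of instances × number per year, labor minutes are priced at the $60/hour loaded rate, and each row is rounded to whole dollars before totaling. The variable and function names are illustrative and not from the source.

```python
from decimal import Decimal, ROUND_HALF_UP

# $60/hour loaded rate expressed per minute.
RATE_PER_MINUTE = Decimal(60) / Decimal(60)


def minutes(value):
    """Convert labor minutes per instance into dollars per instance."""
    return Decimal(value) * RATE_PER_MINUTE


# (item, dollars per instance, number of instances, number per year)
ROWS = [
    ("Creating redundant occurrence", minutes("2.4"), Decimal(167141), 1),
    ("Researching correct address", minutes("10"), Decimal(5000), 12),
    ("Correcting address errors", minutes("0.3"), Decimal(6000), 12),
    ("Handling complaints from customers", minutes("5.5"), Decimal(974), 1),
    ("Mail preparation", minutes("0.1"), Decimal(393273), 4),
    ("Marketing brochure", Decimal("1.96"), Decimal(393273), 4),
    ("Postage", Decimal("0.52"), Decimal(393273), 4),
    ("Warehouse storage", Decimal("0.01"), Decimal(393273), 4),
    # $5,000/yr of which 36% is attributed to the campaign.
    ("Shipping equipment and maintenance", Decimal(5000), Decimal("0.36"), 1),
    ("CPU transactions", Decimal("0.02"), Decimal(393273), 4),
    ("Data storage", Decimal("0.001"), Decimal(393273), 12),
    ("Data backup", Decimal("0.005"), Decimal(393273), 12),
]


def annual_cost(unit_cost, instances, per_year):
    """Annual cost of one row, rounded to whole dollars."""
    raw = unit_cost * instances * per_year
    return int(raw.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


costs = {item: annual_cost(unit, count, freq) for item, unit, count, freq in ROWS}
total = sum(costs.values())

for item, cost in costs.items():
    print(f"{item:<36} ${cost:>10,}")
print(f"{'Total annual costs':<36} ${total:>10,}")
```

Under these assumptions the row values and the $5,163,980 total in the table are reproduced.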

Data quality costs
Information Development Cost Analysis

Category / Basis                               Portfolio      Relative         Average unit       Total dev/maint   % of budget
                                               total number   weight factor*   dev/maint costs    expenses**        expenses
Infrastructure
  Enterprise architected DBs                     200            0.75             $15,000           $ 3,000,000
  Enterprise reusable create/update programs+    300            1.50             $30,000           $ 9,000,000
  Total infrastructure expenses                                                                    $12,000,000       24%
Value
  Total retrieve equivalent pgms+                300            1.00             $20,000           $ 6,000,000
  Total value-adding expenses                                                                      $ 6,000,000       12%
Cost-adding
  Redundant create/update pgms                   500            1.50             $30,000           $15,000,000
  Interface/extract programs                     400            1.00             $20,000           $ 8,000,000
  Redundant database files                       600            0.75             $15,000           $ 9,000,000
  Total cost-adding expenses                   1,500                                               $32,000,000       64%
Lifetime total**                               3,800                                               $50,000,000      100%

Source: Larry English, Improving DW and BI Quality

* Determine relative effort to develop average unit of each category using effort to develop a retrieve program as 1.00
+ For programs that retrieve some data and create/update other data, determine the percent of retrieve-only attributes and percent of create/update attributes (e.g., to retrieve customer data to create an order)
** Based on 3,800 application programs and database files in portfolio and $50 Million in development
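
The unit costs and category shares in the cost analysis are consistent with a simple weighted allocation, sketched below. The allocation rule is an assumption inferred from the figures rather than something the slide states: the $50 million budget is divided by the sum of (count × relative weight) to get the cost of a weight-1.00 unit, and each category's expense is its weighted unit count times that base cost. All names are illustrative.

```python
BUDGET = 50_000_000

# category -> list of (basis, portfolio count, relative weight factor)
PORTFOLIO = {
    "infrastructure": [
        ("Enterprise architected DBs", 200, 0.75),
        ("Enterprise reusable create/update programs", 300, 1.50),
    ],
    "value-adding": [
        ("Retrieve equivalent programs", 300, 1.00),
    ],
    "cost-adding": [
        ("Redundant create/update programs", 500, 1.50),
        ("Interface/extract programs", 400, 1.00),
        ("Redundant database files", 600, 0.75),
    ],
}

# Total effort expressed in "retrieve program equivalents" (weight 1.00).
weighted_units = sum(
    count * weight for rows in PORTFOLIO.values() for _, count, weight in rows
)

# Cost of one weight-1.00 unit under the assumed allocation rule.
base_unit_cost = BUDGET / weighted_units

category_totals = {
    category: sum(count * weight * base_unit_cost for _, count, weight in rows)
    for category, rows in PORTFOLIO.items()
}
category_shares = {
    category: total / BUDGET for category, total in category_totals.items()
}

for category, total in category_totals.items():
    print(f"{category:<15} ${total:>13,.0f}  {category_shares[category]:.0%}")
```

With the table's counts and weights this gives a $20,000 base unit cost and the 24% / 12% / 64% split shown on the slide, which is the point of the exercise: nearly two thirds of the budget goes to cost-adding redundancy.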

Dummy (default) values
Defaults for mandatory fields:
- SSN: 999-99-9999
- Age: 999
- Zip: 99999
- Income: 9,999,999.99
Inability to determine customer profiles
Inability to determine customer demographics

Intelligent dummy values
Defaults with meaning:

Field          Value          Meaning
SSN            888-88-8888    Non-resident alien
Income         999,999.99     Employee
Age            000            Corporate customer
Source Code    FF             Account closed prior to 1991

Inability to write straightforward queries without knowing how to filter data

Missing Values
Operational systems do not always require informational or demographic data:
- Gender
- Ethnicity
- Age
- Income
- Referring Source
Inability to analyze marketing channels
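
A first-pass profile for dummy and missing values might look like the following sketch. The field names, the record layout, and the `classify` and `profile` helpers are hypothetical; the dummy values are the ones listed on the slides, stored as strings for simplicity.

```python
from collections import Counter

# Known dummy defaults per field, taken from the slide examples.
DUMMY_VALUES = {
    "ssn": {"999-99-9999"},
    "age": {"999"},
    "zip": {"99999"},
    "income": {"9,999,999.99"},
}


def classify(field, value):
    """Label a single field value as 'missing', 'dummy', or 'ok'."""
    if value is None or str(value).strip() == "":
        return "missing"
    if str(value).strip() in DUMMY_VALUES.get(field, set()):
        return "dummy"
    return "ok"


def profile(records):
    """Count (field, label) pairs across all records."""
    counts = Counter()
    for record in records:
        for field, value in record.items():
            counts[(field, classify(field, value))] += 1
    return counts


# Made-up sample records for illustration.
records = [
    {"ssn": "999-99-9999", "age": "999", "zip": "94111", "income": ""},
    {"ssn": "123-45-6789", "age": "42", "zip": "99999", "income": None},
]

report = profile(records)
for (field, label), count in sorted(report.items()):
    print(f"{field:<8} {label:<8} {count}")
```

Intelligent dummy values would need an additional lookup that maps each coded default to its business meaning, which only the data owners can supply.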

Multi-purpose fields
ONE field explicitly has MANY meanings:
- Which business unit enters the data
- At what time in history it was entered
- A value in one or more other fields
Example: Appraisal Amount, redefined as Advertised Amount, redefined as Sold Date, redefined as Loan Type Code, redefined as ...
- 25 redefines = 25 attributes!
- Not mutually exclusive!
- Only the value of one is known for each record!
Inability to judge product profitability

Cryptic values (1)
- Often found in "Kitchen Sink" fields
- Usually one byte (if not one bit)
- Highly cryptic (A, B, C, 1, 2, 3, ...)
- Non-intelligent, non-intuitive codes
- Often not mutually exclusive
Inability to empower end users to write their own queries

Cryptic values (2)
ONE field implicitly has MANY meanings:
Master_Cd {A, B, C, D, E, F, G, H, I}
- {A, B, C}: Type of customer
- {D, E, F}: Type of supplier
- {G, H, I}: Regional constraints

Free-form address lines
Unstructured text: no discernible pattern, cannot be parsed
address-line-1: ROSENTHAL, LEVITZ, A
address-line-2: TTORNEYS
address-line-3: 10 MARKET, SAN FRANC
address-line-4: ISCO, CA 95111
Inability to perform market analysis

Contradicting values
Values in one field are inconsistent with values in another related field:
- 1488 Flatbush Avenue, New York, NY 75261 (Texas Zip)
- Type of real property: Single Family Residence; Number of rental units: four (Income property)
Inability to make reliable business decisions
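
Contradicting values can be caught with cross-field rules. The sketch below encodes the two slide examples. It rests on assumptions that are not in the source: the record layout, the rule that a single family residence has at most one rental unit, and a deliberately incomplete table of two-digit ZIP prefixes per state.

```python
# Deliberately incomplete, illustrative ZIP prefixes per state (assumption).
ZIP_PREFIXES = {
    "NY": ("10", "11", "12", "13", "14"),
    "TX": ("75", "76", "77", "78", "79"),
}


def find_contradictions(record):
    """Return a list of cross-field inconsistencies found in one record."""
    problems = []

    # Rule 1: the ZIP code should belong to the stated state.
    prefixes = ZIP_PREFIXES.get(record["state"])
    if prefixes and not record["zip"].startswith(prefixes):
        problems.append("zip does not match state")

    # Rule 2 (assumed): a single family residence has at most one rental unit.
    if (
        record["property_type"] == "Single Family Residence"
        and record["rental_units"] > 1
    ):
        problems.append("rental units inconsistent with property type")

    return problems


# Record combining both contradictions shown on the slide.
suspect = {
    "street": "1488 Flatbush Avenue",
    "city": "New York",
    "state": "NY",
    "zip": "75261",
    "property_type": "Single Family Residence",
    "rental_units": 4,
}

print(find_contradictions(suspect))
```

A rule check like this only flags the contradiction. Deciding which of the two fields is wrong still requires the source system or a data steward.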

Violation of business rules
Business Rule: Adjustable Rate Mortgages must have
- Maximum Interest Rate (Ceiling)
- Minimum Interest Rate (Floor)
Business Rule: A Ceiling is always higher than a Floor
ceiling-interest-rate: 8.25
floor-interest-rate: 14.75
(switched?)
Inability to calculate product profitability

Reused primary key
Little history, if any, stored in operational files:
- primary keys are customarily re-used
- may have a different rollup structure
January 94: branch 501 = San Francisco Main, region 1, area SW
August 97: branch 501 = San Luis Obispo, region 2, area SW
Inability to evaluate organizational performance

Non-unique primary key
Duplicate identification numbers

Multiple customer numbers:
Customer Name        Phone Number    Cust. Number
Philip K. Sherman    818.357.5166    960601
Philip K. Sherman    818.357.7711    960105
Philip K. Sherman    818.357.8911    960003

Multiple employee numbers:
Date            Employee Name    Department    Empl. Number
July 1995       Bob Smith        213 (HR)      21304762
January 1996    Bob Smith        432 (SRV)     43218221
August 1999     Bob Smith        206 (MKT)     20684762

Inability to determine customer relationships
Inability to analyze employee benefits trends

Missing data relationships
Data that should be related to other data in a dependent (parent-child) relationship
Branch, Employee, Benefit: Branch number 0765 does not exist in the BRANCH table
Inability to produce accurate rollups

Inappropriate data relationships
Data that is inadvertently related, but should not be: two entity types with the same key values
Purchaser: Jackie Schmidt 837221
Seller: Robert Black 837221
Inability to determine customer or vendor relationships

Management Support
- Management awareness of importance of data quality
- Cost justification of data quality initiative
- Ongoing commitment
- Finding a business management sponsor

Triage - Prioritization
- Which data to cleanse

- Justification for cleansing
- Ease of cleansing
- Possibility of cleansing
- Political support for cleansing

Cost of Cleansing
- Automatic versus manual
- Tools to perform automatic cleansing
- Effort to support use of tools
- Use of defaults
- Knowledge/experience of those performing manual cleansing

Responsibility for Data Quality
- It's not enough to say that data quality is everyone's responsibility.
- Data Quality Administrator
- Ongoing commitment
- Data ownership responsibility
- Operational versus data warehouse responsibility

Data Quality Best Practices
- Inventory the quality of your data
- Sell the importance of data quality to management
- Assign data quality responsibility
- Triage the cleansing process
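
Taking inventory of data quality usually starts with a handful of rule checks. The sketch below covers three of the problem types described in this section: a business rule violation (ceiling not higher than floor), a missing parent row (an employee pointing at a branch that does not exist), and one customer name carrying several customer numbers. The record layouts, identifiers such as `loan_id`, and helper names are all hypothetical. The values echo the slide examples.

```python
from collections import defaultdict


def rate_rule_violations(loans):
    """Loans whose ceiling rate is not higher than the floor rate."""
    return [
        loan["loan_id"]
        for loan in loans
        if loan["ceiling_rate"] <= loan["floor_rate"]
    ]


def orphaned_rows(children, parents, key):
    """Child rows whose key value has no matching parent row."""
    known = {parent[key] for parent in parents}
    return [child for child in children if child[key] not in known]


def names_with_multiple_numbers(customers):
    """Customer names that appear under more than one customer number."""
    numbers = defaultdict(set)
    for customer in customers:
        numbers[customer["name"]].add(customer["customer_number"])
    return {name: sorted(nums) for name, nums in numbers.items() if len(nums) > 1}


# Sample data for illustration only.
loans = [
    {"loan_id": "L1", "ceiling_rate": 8.25, "floor_rate": 14.75},
    {"loan_id": "L2", "ceiling_rate": 12.50, "floor_rate": 6.00},
]
branches = [{"branch_number": "0501"}]
employees = [
    {"employee": "E1", "branch_number": "0765"},
    {"employee": "E2", "branch_number": "0501"},
]
customers = [
    {"name": "Philip K. Sherman", "customer_number": "960601"},
    {"name": "Philip K. Sherman", "customer_number": "960105"},
    {"name": "Philip K. Sherman", "customer_number": "960003"},
]

print(rate_rule_violations(loans))
print(orphaned_rows(employees, branches, "branch_number"))
print(names_with_multiple_numbers(customers))
```

Matching on exact name is a crude stand-in for real duplicate detection. The counts such checks produce are mainly useful as input to the triage and cost-justification steps.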

Data Quality: Our Status
Enter all the major problems you have or anticipate with data quality, and don't limit yourself to one slide.

Data Quality: What Steps We Should Take to Improve
Enter all the practical steps you should take and prioritize them. Don't limit yourself to one slide.

Metadata Management
[Figure: master metadata at the center, surrounded by three metadata categories and reached through the Users' View, Developers' View, and Administrators' View for navigation, documentation, and administration.
Business Metadata: business names, data definitions, data domains, data relationships, business rules, DQ rules, data integrity rules.
Technical Metadata: tables, columns, keys (primary/foreign), referential integrity rules, indexes, ETL rules, process logic.
Usage Metadata: data lineage, data location, data usage, data volumes, load statistics, error statistics.]

Metadata is everywhere
[Figure: metadata held by technicians and business people (business analysts, data administrator, database administrator, ETL developer, application developer, data mining expert) in word processing files, spreadsheets, CASE tools, DBMS dictionaries, ETL tools, OLAP tools, and data mining tools is moved by a metadata migration process into a metadata repository. The repository supports process documentation and navigation through a Technicians' View (technical metadata) and a Business Person's View (business metadata).]

Metadata as the Keystone
- Single version of the truth
- It's the inventory of information
- Tears down dysfunctional information fiefdoms
- Opportunities for data standardization

Management Support for Metadata

- IT and the Business
- Management understanding of the importance of metadata
- Impact on project schedules
- Long term benefit of metadata
- Importance for operational and data warehouse

Which Metadata to Capture
- Don't boil the ocean
- What metadata is valuable
- Ease and cost of capture
- Political issues relating to capture

Responsibility for Capturing Metadata
- Incentive for capturing
- Management direction
- Automatic and manual

Responsibility for Maintaining Metadata
- Where does Metadata Repository Administration report?
- Why is administration and maintenance important?
- Long-term commitment

How Metadata Is Used
Business:
- Understanding the data
- Understanding the meaning of results
- Avoiding incorrect conclusions
IT:
- Research
- Impact analysis
- Tool interchange

Metadata Best Practices
- Determine which metadata to capture and use
- Determine how the tools will capture and use metadata
- Sell management on the importance of metadata
- Assign metadata responsibility

Metadata: Where are we?
Include anything you have done, including a glossary or business and IT definitions.

Metadata: What Should We Be Doing?
As you enter these actions, consider including responsibility, but make sure you have talked to those people or departments before presenting to management.

Security and privacy
[Figure: network diagram showing where security exists along each connection path. Components: workstation terminals, communication server, remote access, database server, mainframe, LAN file server, Internet access. Connection paths are labeled A through H. Legend: security exists, no security, connection path. Security measures compared across paths A through H: mainframe security package, LAN security package, PC security package, password security, encryption function, DBMS security, generic security package.]

Categorization for Security/Privacy
- Does all data have the same security/privacy requirements?
- Who determines security/privacy requirements of data?
- What are the regulatory requirements for security and privacy?
- Does your organization have a Security Office?
- What authority do they have?

Responsibility For Data Security
- Security Office
- Internal auditors
- Data Owners
- Responsibility for administering
- Testing security and privacy

Mechanism For Establishing Security Procedures
- Security requirements
  - Internal
  - Regulatory
- Tools that implement security
- Communicating security requirements to those who implement

Security Audit
- Validating procedures
- Validating training
- Testing and probing
- Recommending mitigation
- Frequency of audits

Regulatory Issues
- Health Care: HIPAA
- Finance
  - Brokerage: SEC
  - Insurance
- Media: FCC

Security & Privacy Best Practices
- Raise the consciousness of security and privacy requirements
- Connect with your Security Office
- Determine security capabilities of tools
- Assign responsibilities
- Test and validate

Security & Privacy: What exposures do we have?
Hopefully you have talked to your Security Officer and anyone else who is responsible for the security of data.

Security & Privacy: What Steps Do We Need to Take?
Be sure to clear these actions with those responsible for security and privacy.

Performance
- Benchmarking

- Capacity planning
- Designing (optimal schemas)
- Coding (efficient SQL calls)
- Monitoring and measuring
- Tuning
  - Database structures
  - DBMS parameters and OS
  - Communication links
  - Hardware

Categorization for Performance
- How good does response time need to be?
- How does it differ from application to application?
- What is the cost-benefit of excellent response time?
- Were performance considerations included in the architecture?

Categorization for Availability
- Scheduled hours (24 x 7, 18 x 6, ...)
- Availability during scheduled hours
- How does it differ from system to system?
- Is excellent availability cost justified?
- Was availability included in the architecture?

Capacity Planning
- Database size
- Number of users
- Number of transactions
- Number of queries/reports
- Time and day of usage
- Complexity of transactions/queries/reports
- Proactive response to capacity increase

Monitoring/Measuring
- Response time
- Resource utilization (CPU, disk access, network)
- Who is using the system
- When is the system being used
- Chargebacks
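
Availability during scheduled hours is simple to compute once outages are measured. The sketch below assumes availability is defined as the share of scheduled hours without an outage. That definition, the function name, and the sample figures are assumptions, not taken from the slides.

```python
def availability(hours_per_day, days_per_week, weeks, outage_hours):
    """Fraction of scheduled hours during which the system was available."""
    scheduled = hours_per_day * days_per_week * weeks
    if scheduled <= 0:
        raise ValueError("scheduled hours must be positive")
    if not 0 <= outage_hours <= scheduled:
        raise ValueError("outage hours must lie within the scheduled hours")
    return (scheduled - outage_hours) / scheduled


# An 18 x 6 schedule over four weeks is 432 scheduled hours.
four_weeks_18x6 = availability(18, 6, 4, outage_hours=4.32)
print(f"{four_weeks_18x6:.2%}")
```

Outages outside the scheduled window do not count against the figure, which is why the schedule (24 x 7 versus 18 x 6) has to be agreed before an availability target means anything.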

Service Level Agreements
- Response time
- Availability
  - Scheduled hours (hours/day, days/week)
  - Availability during scheduled hours
- Timeliness of data
- Response to problems
- Response to new requests
- Who establishes agreements?
- What's realistic?
- Incentives to meet SLAs

Reporting performance
IT:
- Who needs to take action
- Who needs to see reports/alerts
Business:
- Matching project agreements
- Expectations

Tuning
- Awareness of problems: measurement tools and responsibilities
- Tuning capability of platform, RDBMS, tools
- Responsibility for tuning

Measurement Tools
- Performance
- Usage
- Resource utilization
- Network

Performance & Measurement Best Practices
- Determine what is advantageous to measure
- Assign responsibilities
- Designate tools for measurement
- Report metrics to management

DBMS/Product Selection
- Industrial-strength Enterprise Server
- Mid-range Workgroup Server
- Desktop Remote Client

Relational DBMS
- Which RDBMS is the standard
- Relation to platform
- What applications is it being used for

99 Why standardize the RDBMS? Minimize the number of RDBMSs Less training required More leverage on RDBMS vendor Flexible assignments Fewer interface problems Fewer interface programs Copyright 2012 Your organization 100 Relation to platform

RDBMS performance impacted by platform Platform may dictate (or strongly recommend) RDBMS choice Which decision comes first? Desktop Remote Client Copyright 2012 Your organization Mid-range Workgroup Server Industrial-strength Enterprise Server 101 How DBMS is being used Operational/OLTP Data Warehouse/Business Intelligence OM ODS

EDW Operational Systems DM DW Databases Copyright 2012 Your organization 102 Tools/Utilities Platform dependent DBMS dependent Expensive 33% on the shelf Lots of product duplication Necessary? Copyright 2012 Your organization

Standards for Products
  Who sets standards?
  Are the standards known?
  Are they standards or guidelines?
  Who can give dispensation?

Criteria for Selection
  Need
  Cost
  Vendor
    Support
    Reputation
    Financial stability

Responsibility for Selection
  Technical evaluators
  Strategic architect
  Management

Single Vendor vs Best of Breed
  Single vendor
    Possibly a better relationship
    Leverage
    Not always the best products
    Products should all work together
  Best-of-breed
    Need to integrate yourself
    Finger-pointing when problems arise
    Potential incompatibilities

Deals/Negotiations
  Have someone else negotiate
  Don't let the vendor know you have chosen them before you negotiate
  www.dobetterdeals.com (Joe Auer, ComputerWorld)

Relationship with Vendors
  Partnerships
  Money
  Issues
  Support
  Conferences
  Being a reference

Databases Required by the Application Packages
  Packages do not support all DBMSs
  Packages do not support all DBMSs equally well
  Does the preferred DBMS violate the database standard?
  Are support personnel (DBAs) available?

Impact of Package
  Machine Requirements
  Performance
  Availability

DBMS/Product Selection Best Practices
  Determine real requirements
  Establish software standards
  Make use of existing software whenever possible
  Talk to organizations who are using the products

Business intelligence (BI)
  Provides decision makers a 360° view of their business
  [Figure: BI dashboard (Source: TDWI). Panel labels: Financial Performance Meters, Daily Sales, Market Growth, Alerts, Trends, Forecasts. Alerts shown: regulatory warning, market opportunity, compliance violation. Scorecard columns: trend, metric, actual, target, variance. Metrics listed: same store sales, customer retention, new customers, charge cards issued, 30 day past-due accounts, 60 day past-due accounts, 90 day past-due accounts, merchandise return rate, inventory turnover rate. Values shown (actual, target, variance): $108.0m, $120.0m, -10%; 96%, 95%, +0.9%; 3.8k, 5.0k, -24.0%; 8.5k, 12.0k, -33.3%; 500, 400, +2.0%.]

Goals and Objectives
  Why have a data warehouse?
  Have goals and objectives been identified?
  Have they been communicated?
  Are they measured post-implementation?

Architecture
  Platform
  Tools/products
  How the data flows

DW and BI Tools
  RDBMS
  Data Modeling
  ETL
  Access and Analysis
  Data quality (Cleansing)
  Measurement
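The scorecard on the Business intelligence (BI) slide reports an actual, a target, and a variance for each metric. A minimal sketch of that arithmetic follows, assuming variance means the percentage difference of actual from target (the slide does not state its formula). The function names, the generic metric labels, and the 5% tolerance are hypothetical; the two input pairs are taken from the first and third value rows shown on the slide.

```python
def variance_pct(actual: float, target: float) -> float:
    """Percentage by which actual differs from target (assumed definition)."""
    if target == 0:
        raise ValueError("target must be non-zero")
    return (actual - target) / target * 100.0


def status(variance: float, tolerance_pct: float = 5.0) -> str:
    """Classify a metric; the 5% tolerance is a hypothetical threshold."""
    if variance >= 0:
        return "on or above target"
    if variance >= -tolerance_pct:
        return "slightly below target"
    return "alert"


# Generic labels are used because the slide's row-to-metric mapping is not certain.
rows = [
    ("metric A ($m)", 108.0, 120.0),
    ("metric B (k)", 3.8, 5.0),
]
for name, actual, target in rows:
    v = variance_pct(actual, target)
    print(f"{name}: actual={actual} target={target} variance={v:+.1f}% -> {status(v)}")
```

For metrics where a lower value is better, such as past-due accounts, the classification would need the opposite sign convention.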

Data Mining
  Data farming
    Verification of assumptions
    Results based on known data relationships
    Deductive method
    Yields information that can be proven to be factual
  Data mining
    Discovery of the unknown
    Inferred results from data found in database
    Inductive method
    Yields information that is assumed to be true for some probability

Data Sources for Data Mining
  Operational databases
  DW databases
  [Figure: data flow diagram labeled Orders, Shipments, Account Master, ETL, Enterprise Data Warehouse, Customer DM, Sales DM, Billing, Data Mining Databases, Data Mining Applications]
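The Data Mining slide contrasts a deductive approach ("data farming": verifying an assumption against known data) with an inductive one (inferring a relationship that holds with some probability). The sketch below illustrates the difference on a made-up list of orders. The data, the function names, and the thresholds are all hypothetical, and the simple confidence calculation stands in for whatever mining technique a real tool would use.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orders: each is the set of product categories bought together.
orders = [
    {"printer", "paper"},
    {"printer", "paper", "ink"},
    {"paper"},
    {"printer", "ink"},
    {"printer", "paper", "ink"},
]


def verify_assumption(item: str, min_share: float) -> bool:
    """Deductive: test a stated assumption ("at least min_share of orders
    contain item") directly against the data. The answer is a provable fact."""
    share = sum(item in o for o in orders) / len(orders)
    return share >= min_share


def discover_rules(min_confidence: float) -> list:
    """Inductive: look for unstated relationships "if A then B" and report the
    observed confidence P(B | A). Results are probable, not proven."""
    item_counts = Counter(i for o in orders for i in o)
    pair_counts = Counter(p for o in orders for p in combinations(sorted(o), 2))
    rules = []
    for (a, b), n in pair_counts.items():
        for lhs, rhs in ((a, b), (b, a)):
            confidence = n / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, confidence))
    return sorted(rules)


print(verify_assumption("printer", 0.5))
for lhs, rhs, conf in discover_rules(0.75):
    print(f"{lhs} -> {rhs} (confidence {conf:.2f})")
```

The first call answers a question someone already thought to ask; the second surfaces candidate relationships that nobody stated in advance, each qualified by how often it held in the data.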

Spiral BI/DW Methodologies
  [Figure: spiral of activities around BI/DW Applications. Labels: Business Goals, Assessment & Strategy, Project Plan, Business Opportunity, Data Requirement, Business Analysis, Data Inventory, Application Design, Development, Testing, Implementation, Post-Impl. Review]

Software Release Concept
  Extreme scoping - Larissa Moss
  Refactoring - Kent Beck
  Project ≠ Application
  Reusable & Expanding; feels like prototyping
  [Figure: a BI Application built from successive project releases: First Release, Second Release, Third Release, Fourth Release, Fifth Release, Final Release]

Using the Software Release Approach
  Unstable requirements can be tested and enhanced in small increments
  Scope is very small and manageable
  Technology infrastructure can be tested and proven
  Data volumes (per release) are relatively small
  Project schedules are easier to estimate because the scope is very small
  Development activities can be iteratively refined, honed, and adapted
  Mistakes are less expensive to fix early in the development
  And the quality of the release deliverables (and ultimately the quality of the BI applications) will be higher!
  And the development process will get faster

Software Release Guidelines
  Deliver every three to six months (first release will take longer)
  Strictly control the scope and keep it very small
  Keep expectations realistic
  The enterprise infrastructure must be robust (technical and non-technical)
  Metadata must be an integral part of each release; otherwise, the releases will not be manageable

  Designs, programs, and tools must be flexible

Iterative BI Application Development
  [Figure: six release cycles (Release 1 through Release 6), each repeating the same activities: Business Case Assessment; Planning; Requirements & Data Analysis; Requirements & Application Prototyping; Data Analysis; Application Prototyping; Meta Data Repository Analysis; Meta Data Repository Design; ETL Design; Data Mining; ETL Development; Application Development; Meta Data Repository Development; ETL Testing; Application Testing; Meta Data Repository Testing; Release Implementation; Post-Implementation Review]

Business Intelligence Best Practices
  Set goals and objectives
  Set expectations early and often
  Establish cost justification
  Find a terrific sponsor
  Use a spiral methodology
  Deliver often with software releases

BI & DW: How well are we doing?
  Include applications, departments, number of users, usage, user satisfaction, ROI, management perception, etc.

DW & BI: What are we going to do to make our DW and BI sing?
  This might include training, selling to management and end users, new BI tools, new organizational responsibilities, etc.

Outline
  Benefits of a data governance strategy
  Components of a data governance strategy
  Organization, roles and responsibilities
  Impact of a data governance strategy on BI and IT
  How to implement a data governance strategy program

Organization, roles and responsibilities
  Data owner

  Data steward
  Data strategist
  Strategic architect
  Database administrator/designer
  Data administrator (EIM)
  Metadata administrator (EIM)
  Data quality analyst (EIM)
  Security officer

Data owner
  Assigned to business people (often data originators)
  Typically hold a senior position (directors or managers)
  Have authority to set policies and dictate business rules and security for the data
  Are accountable to the information consumers in the organization

Data steward
  Should be assigned to business people, but could be performed by senior business analysts from IT
  Must know the industry and the organization very well (often people with seniority)
  Requires an enterprise-wide understanding of the data and the business rules
  Have authority to communicate and enforce policies, business rules, and security for the data
  Mediate data disputes among business people and facilitate resolutions

Data strategist
  Understands the strategic business goals
  Knows the government regulations and governmental reporting requirements
  Understands the DBMS platforms and operating systems
  Knows the internal application databases (operational and BI)

  Is aware of future data demands and data volumes
  Creates and maintains the data governance strategy

Strategic architect
  Develops the overall architecture for both operational and BI environments to include:
    Software
    Utilities
    Tools
    Interfaces
  Determines if the BI/DW environment will be one-tier or multi-tier and what the platform components should be
  Participates in architecting databases and data flows

Database administrator/designer
  Understands user requirements and how databases are accessed and updated
  Knows different database design techniques (relational, multi-dimensional) and when to apply them
  Is responsible for the physical aspects of application databases:
    Logical and physical database design
    Partitioning and indexing
    Dataset placement
    Performance and tuning (databases and SQL)
    Backup and recovery
  Maintains the application databases

Data administrator
  Knows the industry and the business processes
  Understands the data and the business rules that are used by those processes
  Has expertise in E/R modeling and knows the normalization rules
  Standardizes and integrates the data (logically) through the enterprise information architecture
  Creates and enforces data naming standards
  Collects and maintains business metadata:
    Data names (fully spelled out business names)
    Data definitions and metrics definitions
    Business rules (data rules and process rules)

Metadata administrator

  Knows industry metadata standards
  Understands DW databases and ETL architectures
  Builds and maintains a metadata repository or administers a purchased MDR product
  Selects and installs metadata integration and access tools
  Integrates and loads metadata from various BI and developer tools (Data Modeling, Data Profiling, DBMS, ETL, OLAP)

Data quality analyst
  Knows the internal application databases and how to extract data from them
  Is familiar with data profiling and data cleansing tools
  Understands the user requirements, the business processes, and the business rules
  Audits operational source data to find and report violations of business rules and other DQ problems
  Participates in writing data cleansing specs
  Identifies root causes for dirty data
  Facilitates negotiations between data originators and information consumers about DQ improvements

Security officer
  Knows the governmental security and privacy regulations (e.g., HIPAA)
  Understands the business requirements for securing the data
  Understands security features and capabilities of the application components (DBMS, BI tools, Web portals)
  Ensures that appropriate security settings are placed on:
    Databases
    BI tools
    Developer tools
    Web portals

Organization: Do we have the right roles and responsibilities?
  Include any responsibilities that overlap and identify any gaps where roles are not being filled.

Organization: What should we be considering?

  Be careful here. You are likely to step on toes. Be sure to vet any proposed changes with the appropriate management.

Outline
  Benefits of a data governance strategy
  Components of a data governance strategy
  Organization, roles and responsibilities
  Impact of a data governance strategy on BI and IT
  How to implement a data governance strategy program

Impact of a data governance strategy on BI and IT
  RELIABLE INFORMATION
    Better and faster decisions
    Increased analyst productivity
    Employee empowerment
    Cost containment
    Cash flow acceleration
    Revenue enhancement
    Fraud reduction
    Demand chain management
    Better customer service
    Lower customer attrition
    Better relationships with suppliers and customers
    Public relations and reputation

Gain Control
  Consistent security implementation
  Understand, define and assign ownership
  Understand, define and assign stewardship
  Minimize redundancy
  Inventory data
  Develop consistent terminology
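Two of the Gain Control items, inventorying data and minimizing redundancy, lend themselves to a simple illustration. The sketch below assumes a hypothetical inventory of (database, table, column) entries and flags data elements that are stored in more than one database. The database names, the column names, and the naive name normalization are invented for the example.

```python
from collections import defaultdict

# Hypothetical inventory entries: (database, table, column).
inventory = [
    ("crm", "customer", "cust_name"),
    ("billing", "account", "CUST_NAME"),
    ("billing", "account", "balance"),
    ("dw", "dim_customer", "cust_name"),
    ("crm", "customer", "phone"),
]


def normalize(column: str) -> str:
    """Naive normalization; real naming standards would need a fuller mapping."""
    return column.strip().lower()


def find_redundancy(entries):
    """Map each normalized column name to the databases that store it,
    keeping only names that appear in more than one database."""
    locations = defaultdict(set)
    for database, _table, column in entries:
        locations[normalize(column)].add(database)
    return {name: sorted(dbs) for name, dbs in locations.items() if len(dbs) > 1}


for name, dbs in find_redundancy(inventory).items():
    print(f"{name} is stored in {len(dbs)} databases: {', '.join(dbs)}")
```

A report like this only lists candidates. Whether a given copy is controlled (for example, a deliberate load into the data warehouse) or uncontrolled redundancy remains a decision for the data owner and data steward.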

Support the IT Strategy
  Provide departments, projects and personnel with guidelines for storing and accessing data
  Minimize the number of RDBMSs
  Establish, disseminate and maintain standards for shared data resources
  Deliver a high level of service
    Performance
    Availability
    Response time
    Responsiveness to user requests

Outline
  Benefits of a data governance strategy
  Components of a data governance strategy
  Organization, roles and responsibilities
  Impact of a data governance strategy on BI and IT
  How to implement a data governance strategy

Incremental Data Governance Strategy Implementation
  Don't get into the details too soon
  Don't be seen as a theorist; your actions must be pragmatic
  Don't lead with long-term deliverables
  Don't commit more than you can deliver
  Avoid unproven technology

Steps to Implement a Data Governance Strategy
  Conduct a data environment assessment
  Establish a target data environment
  Develop an implementation plan
  Sell data governance strategy within the organization
  Evaluate progress and justify your existence
  Revisit the plan

Summary
  Pitch the importance of a data governance strategy to your CIO or CTO
  Ask to either lead the effort or to be a permanent member of the team

Thank you
  Larissa Moss, Method Focus, Inc., [email protected]
  Sid Adelman, Sid Adelman & Associates, [email protected]
  ISBN 0-201-61635-1
  ISBN 0-321-24099-5
  ISBN 0-201-78420-3
  ISBN 0-201-76033-9
