August 01, 2018

Brief about Book Library Data Warehouse System

Prepared by: Sumit

Q1) Identify the business processes of interest to senior management in the industry (domain) allocated to your group.
Major libraries have large collections and high circulation. Managing libraries electronically has resulted in the creation and management of large library databases, which serve the students and teachers who cooperate in an e-learning environment.

Below are some of the business processes of interest to senior management:
  • Variety of Books: Need to better understand what books customers want and are willing to pay for.
  • Fund the Books: Need to adjust costs and cash flow so that the book library can continue to operate.
  • Make Library Reliable: The library must get customers the books they want on time.
  • Book Borrowing
A crucial part of a library is the human intermediary, the librarian. This intermediary connects users to the information they need and can advise them on using the information retrieval systems and working with information.

Q2) List some questions that would be raised by senior management for improving the business process.
There are many questions that can be asked by senior management for improving the above business process.
Some of the questions that will be asked are:
  • When was the item collected?
  • Which librarian registered it?
  • What is the item about?
  • At which branch library was the item registered?
Q3) To address the above-mentioned questions, propose a DW design (schema diagram).
In general, a DW design follows four main steps:
Step 1: Identify the Business Process
Step 2: Declare the Grain
Step 3: Identify the Dimensions
Step 4: Identify the Facts

For our Book Library case, the steps are as follows:
  1. Business Process: Book borrowing is the business process.
  2. Declare the Grain: The second step is to declare the grain of the business process. In the book borrowing process, we declare one transaction issued in the library automation system as the grain; that is, one row records one item borrowed by one patron.
  3. Identify the Dimensions: The third step is to choose the dimensions. Dimensions represent how people describe and inspect the data from the process. The following dimension tables will be used:
    • The Patron-Dimension describes the library patron’s characteristics. The attributes of Patron-Dimension include the name of the patron, gender, occupation, patron type, department, college, and so on.
    • The Item-Dimension describes every item belonging to the library. Its attributes indicate what relates to the item, including call number, title, author, subject, classification, language, location, MARC, collecting source, and so on.
    • The Location-Dimension describes the branch libraries supervised by the city library. Its attributes include the name of the branch library, the name of the district in which it is located, and the name of the region library.
    • The Date-Dimension describes every hour of each day, and its attributes include hour, date, week, month, and year.
  4. Identify the Facts: The fourth step is to identify the facts. In the case of book borrowing, the fact measures the number of books borrowed. In the prior step we declared the grain as a transaction in which a patron borrows an item. Thus, the number of books borrowed in each fact row is equal to one.
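The grain and fact described in these steps can be sketched in Python. This is a minimal illustration only: the `BorrowFact` record, its key fields, the `hour_key` encoding, and the sample values are all hypothetical and not taken from any library automation system. Each row is one borrowing transaction, so `books_borrowed` is always 1 and totals come from summing rows.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BorrowFact:
    """One row per borrowing transaction (the declared grain)."""
    patron_key: int          # hypothetical key into the Patron-Dimension
    item_key: int            # hypothetical key into the Item-Dimension
    location_key: int        # hypothetical key into the Location-Dimension
    date_key: int            # hour-level key into the Date-Dimension
    books_borrowed: int = 1  # the fact: always 1 at this grain


def hour_key(ts: datetime) -> int:
    """Encode a timestamp as YYYYMMDDHH (an assumed hour-level key format)."""
    return int(ts.strftime("%Y%m%d%H"))


# Made-up transactions for illustration.
facts = [
    BorrowFact(1, 101, 10, hour_key(datetime(2018, 8, 1, 9, 15))),
    BorrowFact(1, 102, 10, hour_key(datetime(2018, 8, 1, 9, 40))),
    BorrowFact(2, 103, 11, hour_key(datetime(2018, 8, 1, 14, 5))),
]

# Because the fact is additive, any total is a plain sum over rows.
total_borrowed = sum(f.books_borrowed for f in facts)
borrowed_at_9 = sum(f.books_borrowed for f in facts if f.date_key == 2018080109)
```

Storing the constant 1 may look redundant, but it lets every report use the same summing logic regardless of which dimensions it groups by.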
  • The star schema is perhaps the simplest data warehouse schema.
  • It is called a star schema because the entity-relationship diagram of this schema resembles a star, with points radiating from a central table. 
  • The center of the star consists of a large fact table and the points of the star are the dimension tables.
Star Schema for Library Book Borrowing:
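As a textual stand-in for the diagram, the star schema can be sketched with Python's built-in `sqlite3` module. The table names, column names, key format, and sample rows below are assumptions derived from the dimensions listed above (the Item-Dimension shows only a subset of its attributes); they are not an actual library system schema.

```python
import sqlite3

# Assumed table and column names; one fact table surrounded by four dimensions.
SCHEMA = """
CREATE TABLE patron_dim (
    patron_key  INTEGER PRIMARY KEY,
    name TEXT, gender TEXT, occupation TEXT,
    patron_type TEXT, department TEXT, college TEXT
);
CREATE TABLE item_dim (
    item_key    INTEGER PRIMARY KEY,
    call_number TEXT, title TEXT, author TEXT, subject TEXT,
    classification TEXT, language TEXT
);
CREATE TABLE location_dim (
    location_key INTEGER PRIMARY KEY,
    branch_name TEXT, district TEXT, region_library TEXT
);
CREATE TABLE date_dim (
    date_key INTEGER PRIMARY KEY,  -- assumed format: YYYYMMDDHH
    hour INTEGER, calendar_date TEXT, week INTEGER, month INTEGER, year INTEGER
);
CREATE TABLE borrow_fact (
    patron_key     INTEGER REFERENCES patron_dim(patron_key),
    item_key       INTEGER REFERENCES item_dim(item_key),
    location_key   INTEGER REFERENCES location_dim(location_key),
    date_key       INTEGER REFERENCES date_dim(date_key),
    books_borrowed INTEGER NOT NULL DEFAULT 1
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample rows, one per table.
conn.execute("INSERT INTO patron_dim (patron_key, name, patron_type) VALUES (1, 'Sample Patron', 'student')")
conn.execute("INSERT INTO item_dim (item_key, title, author) VALUES (1, 'Sample Title', 'Sample Author')")
conn.execute("INSERT INTO location_dim VALUES (1, 'Central', 'District A', 'City Library')")
conn.execute("INSERT INTO date_dim VALUES (2018080109, 9, '2018-08-01', 31, 8, 2018)")
conn.execute("INSERT INTO borrow_fact (patron_key, item_key, location_key, date_key) VALUES (1, 1, 1, 2018080109)")

# A typical star join: books borrowed per branch per day.
rows = conn.execute("""
    SELECT l.branch_name, d.calendar_date, SUM(f.books_borrowed)
    FROM borrow_fact f
    JOIN location_dim l ON l.location_key = f.location_key
    JOIN date_dim d ON d.date_key = f.date_key
    GROUP BY l.branch_name, d.calendar_date
""").fetchall()
```

Every query follows the same shape: join the fact table to the dimensions it needs, then group by dimension attributes.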

Q4) List aggregations to improve the DW performance. Justify.
  • Aggregates provide improvements in performance because of the significantly smaller number of records.
  • Aggregates allow quick access to Book Dimension data during reporting. Similar to database indexes, they serve to improve performance.
  • Aggregates are particularly useful in the following cases:
    • Executing and navigating query data causes delays for a group of queries
    • You want to speed up the execution and navigation of a specific query
    • You often use the same attributes in queries
    • You want to speed up reporting with specific hierarchies by adding a level of a specific hierarchy.
  • If the aggregate contains data that is to be evaluated by a query, the query data is read automatically from the aggregate.
  • Example query: Total number of books borrowed during the first week of December 2000 at the Mumbai location.
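A query like the one above can be answered from a pre-computed aggregate instead of the transaction-level fact rows. The sketch below is illustrative only: it assumes the measure is the number of books borrowed, that "first week" means 1 to 7 December, and that a daily aggregate per location is available. The rows and the `borrowed_between` helper are made up.

```python
from collections import Counter
from datetime import date

# Made-up transaction-grain rows: (location, borrow_date); one row per borrowed book.
fact_rows = [
    ("Mumbai", date(2000, 12, 1)),
    ("Mumbai", date(2000, 12, 1)),
    ("Mumbai", date(2000, 12, 5)),
    ("Mumbai", date(2000, 12, 9)),
    ("Pune", date(2000, 12, 2)),
]

# Build the aggregate once: one row per (location, day) instead of per transaction.
daily_agg = Counter(fact_rows)


def borrowed_between(agg, location, start, end):
    """Answer a date-range query from the aggregate, not from the fact rows."""
    return sum(
        count
        for (loc, day), count in agg.items()
        if loc == location and start <= day <= end
    )


first_week = borrowed_between(daily_agg, "Mumbai", date(2000, 12, 1), date(2000, 12, 7))
```

In this toy data the aggregate has four rows against five fact rows. On real circulation data the reduction would be far larger, which is where the performance gain comes from.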

Q5) List and justify any 5 metadata items that will be of interest to various stakeholders.
  • Metadata means "data about data". 
  • Metadata is defined as data that provides information about one or more aspects of other data. It is used to summarize basic information about data, which makes it easier to track and work with specific data.
  • Below are metadata items of interest to various stakeholders:
    • Purpose of the book
    • Time and date of issuing the book
    • Creator or author of the book
    • Location on a computer network where the book was issued.
    • Book quantity
    • Book quality
Types of Metadata:
  • Descriptive metadata is usually used for search and identification of an object. Examples include title, author, topic, keyword, and publisher.
  • Administrative metadata provides information to help manage the source. It refers to technical information, including file type and when and how the file was created.
  • Structural metadata describes how components of an object are organized. An example of structural metadata will be how the pages are ordered to make chapters of a book.
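The three metadata types can be illustrated with a small Python record. This is a sketch under assumptions: the `BookMetadata` class, its field names, the `search` helper, and the sample books are hypothetical and chosen only to mirror the descriptions above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BookMetadata:
    # Descriptive: used for search and identification.
    title: str
    author: str
    keywords: List[str] = field(default_factory=list)
    # Administrative: technical information for managing the source.
    file_type: str = "pdf"
    created_on: str = ""  # ISO date string
    # Structural: how the pages are ordered into chapters.
    chapters: Dict[str, List[int]] = field(default_factory=dict)


def search(records: List[BookMetadata], term: str) -> List[str]:
    """Return titles whose descriptive metadata mentions the term."""
    term = term.lower()
    return [
        r.title
        for r in records
        if term in r.title.lower()
        or term in r.author.lower()
        or any(term in k.lower() for k in r.keywords)
    ]


# Made-up sample records.
records = [
    BookMetadata("Warehouse Basics", "A. Author", ["data warehouse", "schema"],
                 chapters={"Chapter 1": [1, 2, 3], "Chapter 2": [4, 5]}),
    BookMetadata("Library Guide", "B. Writer", ["catalogue"]),
]

hits = search(records, "schema")

# Structural metadata lets the reading order be rebuilt from the chapters.
pages_in_order = [p for pages in records[0].chapters.values() for p in pages]
```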
Following are some key points to be included in metadata:

Definition of data warehouse − It includes the description of the structure of the data warehouse. The description is defined by schema, views, hierarchies, derived data definitions, and data mart locations and contents.

Operational Metadata − It includes currency of data and data lineage. The currency of the data indicates whether the data is active, archived, or purged. The lineage of the data means the history of the migrated data and the changes applied to it.

Business metadata − It includes data ownership information, business definitions, and change policies.
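A warehouse table's operational and business metadata could be recorded along these lines. This is a hedged sketch: the `TableMetadata` class, the `Currency` states (assumed here to be active, archived, and purged), and the sample values are hypothetical, not part of any particular warehouse product.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Currency(Enum):
    """Assumed currency states for warehouse data."""
    ACTIVE = "active"
    ARCHIVED = "archived"
    PURGED = "purged"


@dataclass
class TableMetadata:
    table_name: str
    owner: str                # business metadata: data ownership
    business_definition: str  # business metadata: what the table means
    currency: Currency = Currency.ACTIVE              # operational metadata
    lineage: List[str] = field(default_factory=list)  # operational metadata

    def record_step(self, step: str) -> None:
        """Append one migration or transformation step to the lineage."""
        self.lineage.append(step)


# Made-up example entry for a borrowing fact table.
meta = TableMetadata(
    table_name="borrow_fact",
    owner="Circulation department",
    business_definition="One row per item borrowed by a patron",
)
meta.record_step("extracted from library automation system")
meta.record_step("patron and item codes replaced with dimension keys")
meta.currency = Currency.ARCHIVED
```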