August 14, 2018

Important QnA on Database Management Systems (DBMS) - Comprehensive

Question) Draw an ER diagram that captures the following information:
The Prescriptions-R-X chain of pharmacies has offered to give you a free lifetime supply of medicine if you design its database. Assume the rising cost of health care, you approve. Here’s the information that you gather:
Patients are recognized by an SSN, and their names, addresses, and ages must be logged.
Doctors are known by an SSN. For individual doctor, the name, specialty and field, and years of experience need to be logged.Each pharmaceutical company is recognized by name and has a phone number.
For each drug, the trade name and formula must be noted. Each drug is traded by a certain  pharmaceutical company, and the trade name identifies a drug distinctively from amongst the products of that company. If a pharmaceutical company is removed, you must not keep track of its products any longer.
Each pharmacy has a name, address, and phone number.Every patient has a primary physician. Each doctor has at least one patient.Each pharmacy vends several drugs and has a certain price for each. A drug might be sold at numerous pharmacies, and the value could differ from one pharmacy to other.
Doctors prescribe drugs for patients. A doctor can prescribe one or additional drugs for quite a few patients, and a patient can acquire prescriptions from numerous doctors. Each prescription has a date and a quantity associated with it. You can accept that, if a doctor recommends the identical drug for the same patient more than once, only the last such medicament needs to be kept.
Pharmaceutical companies have long-term contracts with pharmacies. A pharmaceutical company can bond with quite a lot of pharmacies, and a pharmacy can deal with a number of pharmaceutical companies. For every contract/bond or deal, you have to collect  a start date, an end date, and the text of the contract.
Pharmacies appoint a supervisor for each contract. There should constantly be a supervisor for each contract, but the contract supervisor or the managing authority may change over the period of the contract.
Key points for ER diagram:
  • Entities correctly identified.
  • Attributes correctly identified.
  • Primary keys correctly identified.
  • Relationships and cardinality correctly identified.

Question)A Consider the following relational schema and translate the following SQL-query into an expression of the relational algebra.
• Student (snum, sname, major, level, age) 
• Class (name, meets at, room, fid) 
• Enrolled (snum, cname) 
• Faculty (fid, fname, deptid) 
                      SELECT S.sname 
                      FROM Student S 
                      WHERE S.snum 
                      NOT IN (
                      SELECT E.snum 
                      Enrolled E)

                              FROM Class C 
                              WHERE = ’R128’ OR IN 
                      (SELECT E.cname 
                      FROM Enrolled E 
                     GROUP BY E.cname 
                     HAVING COUNT(*) >= 5)

EmpMaster (eid: number, ename: varchar, sal: number, age: number, deptid: number)   
Dept(deptid: number, budgetAmt: number, floor: number, mgreid: number)
Salaries range from Rs.10,000 to Rs.100,000, ages vary from 20 to 80, each department has about five employees on average, there are 10 floors, and budgets vary from RS.10,000 to Rs 100,000. You can assume uniform distribution of values.
Query: Print ename, age, and sal for all employees.
Query: Find the deptids of departments that are on the 10th floor and have a budget of less than Rs25,000     
a) For each of the following queries, which index would you choose to speed up the query? 
b)If your database system does not consider index-only plans (i.e.,data records are always retrieved even if enough information is available in the index entry), how would your answer change? 
Explain briefly.

a)We can build an unclustered hash index on ename, age, sal fields of EmpMaster as then we could do “On index only” test. If our system does not comprise of” index only “plans then we should not make index for this query. Subsequently this query needs us to access all the Emp records, an index won’t help us, and so we can access the records using a filescan.

b)For the second scenario we can create a clustered dense B+tree index on floor, budget fields of Dept, since the records would be well-ordered on these fields then. So when performing this query, the first record with floor=10 must be fetched, and then the other records with floor =10 can be read in ascending order of budget. This plan, which is the best for this query, is not an index-only plan.

Question) Modern disk drives store more sectors on the outer tracks than the inner tracks.
As the rotation speed is persistent, the sequential data transferal rate is also greater on the outer tracks.The seek time and rotational delay are unchanged. Given this information, explain good strategies for placing files with the following kinds of access patterns:
a. Frequent, random accesses to a small file (e.g., catalog relations).
b. Sequential scans of a large file (e.g., selection from a relation with no index).
c. Random accesses to a large file via an index (e.g., selection from a relation via the index).
d. Sequential scans of a small file.

a.Place the file in the middle tracks. Sequential speed is not an issue due to the small size of the file, and the seek time is minimized by placing files in the center.
b. Place the file in the outer tracks. Sequential speed is most important and outer tracks maximize it. 
c.   Place the file and index on the inner tracks. The DBMS will alternately access pages of the index and of the file, and so the two should reside in close proximity to reduce seek times. When we place the file and the index on the inner tracks we save valuable space as well on the faster (outer) tracks for additional files that are retrieved successively.
d. Place small files in the inner half of the disk. A scan of a minor  file is effectually random I/O because the cost is conquered by the cost of the first initial seek to the commencement of the file.

Question) What are checkpoints and why are they important? List the actions taken by the recovery manager during checkpoints? 

Answer) Checkpoint is a type of entry in the log. A checkpoint record is printed into the log from time to time at that point when the system prints out to the database on disk all DBMS buffers that have been reformed. As a result of this, all transactions that have their [commit, T] entries in the log before a checkpoint entry do not need to have their WRITE operations redone in case of a system crash, since all their updates will be recorded in the database on disk during check pointing. Actions taken by the recovery manager during Checkpoint The recovery manager of a DBMS must decide at what intervals to take a checkpoint. The interval may be measured in time say, every m minutes or in the number t of committed transactions since the last checkpoint, where the values of m or t are system parameters.  
Checkpoints taken by recovery manager consists of the following actions:
 1. Suspend execution of transactions temporarily. 
2. Force-write all main memory buffers that have been modified to disk..
3. Write a [checkpoint] record to the log, and force-write the log to disk. 
4. Resume executing transactions. As a consequence of Step 2, a checkpoint record in the log may also include additional information, such as a list of active transaction ids, and the locations (addresses) of the first and most recent (last) records in the log for each active transaction. This can facilitate undoing transaction operations in the event that a transaction must be rolled back.

Question) Consider an empty B+-tree with n=4, insert the following record keys in the specified order: 
80, 25, 69, 14, 58, 3, 47, 91, 36, 81. 

Question)  Consider the following partial Schedule S involving two transactions T1 and T2. Only the read and the write operations have been shown. The read action on data item P is signified by read (P) and the write action on data item P is signified by write (P).  What would the result when the transaction T1 fails instantly after time instance 9.
Schedule S is non-recoverable and cannot ensure transaction atomicity because T2 reads value of 'A' which is written by T1 and T2 is committed before T1. 

Question) Consider two transactions:
                               T1: BEGIN A=A+100, B=B-100 END
                               T2: BEGIN A=1.06*A, B=1.06*B END
• 1st TXN transfers $100 from B’s account to A’s
• 2nd TXN credits both accounts with 6% interest.
Assume at first A and B each have $1000. What are the authorized outcomes of running T1 and T2???

Answer) There is no guarantee that T1 will execute before T2 or vice-versa, if both are submitted together. But, the net effect must be equal to these two transactions running consecutively in some random order.
Legal outcomes: A=1166, B=954 or A=1160, B=960

Question) Suppose you are given a relation R with four attributes ABCD. For each of the following sets of FDs, assuming those are the only dependencies that hold for R.                                                 
(a) Identify the candidate key(s) for R. 
(b) Identify the best normal form that R fulfills (1NF, 2NF, 3NF, or BCNF).
(c) If R is not in BCNF, decompose it into a set of BCNF relations that preserve the dependencies. 
 Perform the above tasks for the following set of functional Dependencies:         
1. C → D, C → A, B → C
2. B → C, D → A
3. ABC → D, D → A
4. A → B, BC → D, A → C
5. AB → C, AB → D, C → A, D → B

1. (a) Candidate keys: B  
   (b) R is in 2NF but not 3NF.
   (c) C → D and C → A both cause violations of BCNF. One way to obtain a (lossless) join preserving decomposition is to decompose R into AC, BC, and CD.

2. (a) Candidate keys: BD  
    (b) R is in 1NF form and not 2NF.
    (c) Equally B → C and D → A cause BCNF desecration or violation. The decomposition: AD, BC, BD (achieved by first decomposing to AD, BCD) is BCNF and lossless and join-preserving.

3. (a) Candidate keys: ABC, BCD  
    (b) R is in 3NF and not BCNF.
    (c) ABCD is not in BCNF form because D → A and D is not a key for it. However when we split up R as AD, BCD we cannot reserve the dependence ABC → D. Hence there is no BCNF decomposition.

4. (a) Candidate keys: A
    (b) R is in 2NF and not 3NF (as FD: BC → D).
    (c) BC → D violates BCNF since BC does not comprise a key. Therefore we divide up R as in: BCD, ABC.

5. (a) Candidate keys: AB, BC, CD, AD
    (b) R is in 3NF and is not BCNF (as FD: C → A). 
    (c) C → A and D → B both cause violations. Hence we decompose into: AC, BCD but this does not preserve AB → C   and AB → D, and BCD is yet still not BCNF because D → B. Therefore we need to decompose more into: AC, BD, CD. However, when we try to revive the lost functional dependencies by adding ABC and ABD,  these relations are not in BCNF form. Therefore, we can say there is no BCNF decomposition.

No comments:

Post a Comment