Thursday, July 31, 2008

RDBMS Concepts - Part 2


Continues....

40. What is Functional Dependency?
A functional dependency, denoted X → Y, between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuples that can form a relation state r of R. The constraint is that for any two tuples t1 and t2 in r, if t1[X] = t2[X] then t1[Y] = t2[Y]. This means the value of the X component of a tuple uniquely determines the value of the Y component.
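A rough Python sketch of the tuple test above. The function name fd_holds and the sample rows are made up for illustration. Note that a check like this can only show that a dependency is violated in one relation state; it cannot prove that the dependency holds for every legal state.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def fd_holds(rows: Iterable[Row], lhs: List[str], rhs: List[str]) -> bool:
    """Return True if lhs -> rhs is not violated by any pair of rows."""
    seen: Dict[Tuple, Tuple] = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and
        # returns the stored value on every later occurrence.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical relation state r of R(emp_id, dept, city).
employees = [
    {"emp_id": 1, "dept": "Sales", "city": "Pune"},
    {"emp_id": 2, "dept": "Sales", "city": "Pune"},
    {"emp_id": 3, "dept": "HR", "city": "Delhi"},
    {"emp_id": 4, "dept": "HR", "city": "Pune"},
]

print(fd_holds(employees, ["emp_id"], ["dept"]))  # True
print(fd_holds(employees, ["dept"], ["city"]))    # False: HR maps to two cities
```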

41. When is a set of functional dependencies F said to be minimal?
Ø Every dependency in F has a single attribute for its right hand side.
Ø We cannot replace any dependency X → A in F with a dependency Y → A, where Y is a proper subset of X, and still have a set of dependencies that is equivalent to F.
Ø We cannot remove any dependency from F and still have a set of dependencies that is equivalent to F.

42. What is Multivalued dependency?
A multivalued dependency, denoted X →→ Y, specified on relation schema R, where X and Y are both subsets of R, specifies the following constraint on any relation r of R: if two tuples t1 and t2 exist in r such that t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the following properties
Ø t3[X] = t4[X] = t1[X] = t2[X]
Ø t3[Y] = t1[Y] and t4[Y] = t2[Y]
Ø t3[Z] = t2[Z] and t4[Z] = t1[Z]
where Z = (R - (X ∪ Y))
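The t3/t4 condition can be checked mechanically. The sketch below is only an illustration: the name mvd_holds and the course/teacher/book rows are assumptions, not part of the original notes. Because it loops over ordered pairs (t1, t2), the tuple t4 is covered when the same pair is visited in the opposite order.

```python
from itertools import product


def mvd_holds(rows, attrs, x, y):
    """Check X ->-> Y on one relation state; rows are dicts, x and y are lists."""
    z = [a for a in attrs if a not in x and a not in y]
    present = {tuple(r[a] for a in attrs) for r in rows}
    for t1, t2 in product(rows, repeat=2):
        if all(t1[a] == t2[a] for a in x):
            # t3 takes its X and Y values from t1 and its Z values from t2.
            t3 = {**{a: t1[a] for a in x + y}, **{a: t2[a] for a in z}}
            if tuple(t3[a] for a in attrs) not in present:
                return False
    return True


attrs = ["course", "teacher", "book"]
offerings = [
    {"course": "DB", "teacher": "Ann", "book": "B1"},
    {"course": "DB", "teacher": "Ann", "book": "B2"},
    {"course": "DB", "teacher": "Bob", "book": "B1"},
    {"course": "DB", "teacher": "Bob", "book": "B2"},
]

print(mvd_holds(offerings, attrs, ["course"], ["teacher"]))       # True
print(mvd_holds(offerings[:-1], attrs, ["course"], ["teacher"]))  # False
```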

43. What is Lossless join property?
It guarantees that no spurious tuples are generated when the relation schemas produced by a decomposition are joined back together.
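To see what a spurious tuple looks like, here is a small sketch that decomposes a relation in two ways and joins the pieces back. The helper names (project, natural_join, as_set) and the sample rows are assumptions made for this example only.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; rows are dicts."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]


def natural_join(left, right):
    """Natural join of two lists of dict rows on their shared attribute names."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]


def as_set(rows):
    """Order-independent form of a list of rows, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


# Hypothetical relation R(emp, dept, city); emp is the key.
r = [
    {"emp": "e1", "dept": "Sales", "city": "Pune"},
    {"emp": "e2", "dept": "Sales", "city": "Delhi"},
]

# Splitting on the non-key attribute dept loses information.
lossy = natural_join(project(r, ["emp", "dept"]), project(r, ["dept", "city"]))
# Splitting on the key emp is lossless.
lossless = natural_join(project(r, ["emp", "dept"]), project(r, ["emp", "city"]))

print(len(as_set(lossy)) - len(as_set(r)))  # 2 spurious tuples
print(as_set(lossless) == as_set(r))        # True
```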

44. What is 1 NF (Normal Form)?
A relation is in 1NF if the domain of every attribute includes only atomic (simple, indivisible) values.

45. What is Full Functional dependency?
A functional dependency X → Y is a full functional dependency if removal of any attribute A from X means that the dependency does not hold any more. 2NF is based on this concept.

46. What is 2NF?
A relation schema R is in 2NF if it is in 1NF and every non-prime attribute A in R is fully functionally dependent on the primary key.

47. What is 3NF?
A relation schema R is in 3NF if it is in 2NF and for every FD X → A either of the following is true
Ø X is a super key of R.
Ø A is a prime attribute of R.
In other words, every non-prime attribute is non-transitively dependent on the primary key.

48. What is BCNF (Boyce-Codd Normal Form)?
A relation schema R is in BCNF if it is in 3NF and satisfies the additional constraint that for every nontrivial FD X → A, X must be a super key of R.
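A BCNF check is usually done with attribute closure: compute everything the left side of an FD determines and see whether that covers the whole schema (the textbook super key test). The sketch below assumes that form of the test; the function names and the student/course/teacher schema are illustrative only.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """True if attrs functionally determines every attribute of schema."""
    return closure(attrs, fds) >= set(schema)


def bcnf_violations(schema, fds):
    """Return the nontrivial FDs whose left side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not is_superkey(lhs, schema, fds)
    ]


# Hypothetical schema that is in 3NF but not in BCNF.
schema = {"student", "course", "teacher"}
fds = [
    ({"student", "course"}, {"teacher"}),
    ({"teacher"}, {"course"}),
]

print(closure({"teacher"}, fds) == {"teacher", "course"})  # True
print(bcnf_violations(schema, fds))  # only teacher -> course is reported
```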

49. What is 4NF?
A relation schema R is said to be in 4NF if for every multivalued dependency X →→ Y that holds over R, one of the following is true
Ø Y is a subset of X, or X ∪ Y = R.
Ø X is a super key.

50. What is 5NF?
A relation schema R is said to be in 5NF if for every join dependency {R1, R2, ..., Rn} that holds over R, one of the following is true
Ø Ri = R for some i.
Ø The join dependency is implied by the set of FDs over R in which the left side is a key of R.

51. What is Domain-Key Normal Form?
A relation is said to be in DKNF if all constraints and dependencies that should hold on the relation can be enforced by simply enforcing the domain constraints and key constraints on the relation.

RDBMS concepts - Part 1


1. What is database?

A database is a logically coherent collection of data with some inherent meaning, representing some aspect of real world and which is designed, built and populated with data for a specific purpose.


2. What is DBMS?

It is a collection of programs that enables user to create and maintain a database. In other words it is general-purpose software that provides the users with the processes of defining, constructing and manipulating the database for various applications.


3. What is a Database system?

The database and the DBMS software together are called a Database system.


4. Advantages of DBMS?

Ø Redundancy is controlled.

Ø Unauthorised access is restricted.

Ø Providing multiple user interfaces.

Ø Enforcing integrity constraints.

Ø Providing backup and recovery.


5. Disadvantages of a File Processing System?

Ø Data redundancy & inconsistency.

Ø Difficulty in accessing data.

Ø Data isolation.

Ø Data integrity problems.

Ø Concurrent access is not possible.

Ø Security Problems.


6. Describe the three levels of data abstraction?

There are three levels of abstraction:

Ø Physical level: The lowest level of abstraction describes how data are stored.

Ø Logical level: The next higher level of abstraction describes what data are stored in the database and what relationships exist among those data.

Ø View level: The highest level of abstraction describes only part of entire database.

7. Define the "integrity rules"

There are two Integrity rules.

Ø Entity Integrity: States that “a primary key cannot have a NULL value”.

Ø Referential Integrity: States that “a foreign key can be either a NULL value or should be a primary key value of the other relation”.
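Both rules can be tried out with Python's built-in sqlite3 module. This is only a sketch under a few assumptions: the dept/emp tables and the helper try_insert are invented for the example, NOT NULL is spelled out on the key columns because SQLite does not always imply it for a PRIMARY KEY, and foreign key checks have to be switched on with a PRAGMA because SQLite leaves them off by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # FK checks are off by default in SQLite
conn.executescript("""
    CREATE TABLE dept (dept_id TEXT PRIMARY KEY NOT NULL, name TEXT);
    CREATE TABLE emp (
        emp_id  TEXT PRIMARY KEY NOT NULL,
        dept_id TEXT REFERENCES dept(dept_id)
    );
    INSERT INTO dept VALUES ('D1', 'Sales');
""")


def try_insert(emp_id, dept_id):
    """Return True if the row is accepted, False if an integrity rule rejects it."""
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?)", (emp_id, dept_id))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert("E1", "D1"))  # True: foreign key matches an existing dept
print(try_insert("E2", None))  # True: a NULL foreign key is allowed
print(try_insert("E3", "D9"))  # False: referential integrity violated
print(try_insert(None, "D1"))  # False: entity integrity violated
```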


8. What is extension and intension?

Extension -

It is the set of tuples present in a table at any instant. This is time dependent.

Intension -

It is a constant value that gives the name, structure of table and the constraints laid on it.


9. What is System R? What are its two major subsystems?

System R was designed and developed over a period of 1974-79 at IBM San Jose Research Center. It is a prototype and its purpose was to demonstrate that it is possible to build a Relational System that can be used in a real life environment to solve real life problems, with performance at least comparable to that of existing system.

Its two subsystems are

Ø Research Storage System (RSS)

Ø Relational Data System (RDS)


10. How is the data structure of System R different from the relational structure?

Unlike in the relational model, in System R:

Ø Domains are not supported

Ø Enforcement of candidate key uniqueness is optional

Ø Enforcement of entity integrity is optional

Ø Referential integrity is not enforced


11. What is Data Independence?

Data independence means that “the application is independent of the storage structure and access strategy of data”. In other words, modifying the schema definition at one level should not affect the schema definition at the next higher level.

Two types of Data Independence:

Ø Physical Data Independence: Modification in physical level should not affect the logical level.

Ø Logical Data Independence: Modification in logical level should not affect the view level.

NOTE: Logical Data Independence is more difficult to achieve


12. What is a view? How is it related to data independence?

A view may be thought of as a virtual table, that is, a table that does not really exist in its own right but is instead derived from one or more underlying base tables. In other words, there is no stored file that directly represents the view; instead, a definition of the view is stored in the data dictionary.

Growth of the base tables (for example, adding new columns) and restructuring of them need not be reflected in views. Thus the view can insulate users from the effects of restructuring and growth in the database. Hence it accounts for logical data independence.
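A quick way to watch this happen is with Python's sqlite3 module. The table, view and column names below are invented for the example, and sqlite_master is simply SQLite's own catalog table, standing in here for the data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    INSERT INTO employee VALUES (1, 'Asha', 'Sales'), (2, 'Ravi', 'HR');
    CREATE VIEW sales_staff AS
        SELECT id, name FROM employee WHERE dept = 'Sales';
""")


def view_columns():
    """Column names that a user of the view sees."""
    cursor = conn.execute("SELECT * FROM sales_staff")
    return [column[0] for column in cursor.description]


columns_before = view_columns()
# The base table grows by one column; the view is not touched.
conn.execute("ALTER TABLE employee ADD COLUMN phone TEXT")
columns_after = view_columns()

# Only the definition of the view is stored, not its rows.
stored_definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = 'sales_staff'"
).fetchone()[0]
rows = conn.execute("SELECT * FROM sales_staff").fetchall()

print(columns_before, columns_after)  # ['id', 'name'] ['id', 'name']
print(rows)                           # [(1, 'Asha')]
```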


13. What is Data Model?

A collection of conceptual tools for describing data, data relationships, data semantics and constraints.


14. What is E-R model?

This data model is based on real world that consists of basic objects called entities and of relationship among these objects. Entities are described in a database by a set of attributes.


15. What is Object Oriented model?

This model is based on a collection of objects. An object contains values stored in instance variables within the object. An object also contains bodies of code that operate on the object. These bodies of code are called methods. Objects that contain the same types of values and the same methods are grouped together into classes.


16. What is an Entity?

It is a 'thing' in the real world with an independent existence.


17. What is an Entity type?

It is a collection (set) of entities that have same attributes.


18. What is an Entity set?

It is a collection of all entities of particular entity type in the database.


19. What is an Extension of entity type?

The collections of entities of a particular entity type are grouped together into an entity set.


20. What is Weak Entity set?

If an entity set does not have sufficient attributes to form a primary key, and its primary key comprises its partial key and the primary key of its parent entity, then it is said to be a Weak Entity set.


21. What is an attribute?

It is a particular property, which describes the entity.


22. What is a Relation Schema and a Relation?

A relation schema, denoted by R(A1, A2, …, An), is made up of the relation name R and the list of attributes Ai that it contains. A relation is defined as a set of tuples. Let r be the relation which contains the set of tuples (t1, t2, t3, ..., tn). Each tuple is an ordered list of n values t = (v1, v2, ..., vn).


23. What is degree of a Relation?

It is the number of attributes of its relation schema.


24. What is Relationship?

It is an association among two or more entities.


25. What is Relationship set?

The collection (or set) of similar relationships.


26. What is Relationship type?

Relationship type defines a set of associations or a relationship set among a given set of entity types.


27. What is degree of Relationship type?

It is the number of entity types participating.


25. What is DDL (Data Definition Language)?

A database schema is specified by a set of definitions expressed in a special language called DDL.


26. What is VDL (View Definition Language)?

It specifies user views and their mappings to the conceptual schema.


27. What is SDL (Storage Definition Language)?

This language is to specify the internal schema. This language may specify the mapping between two schemas.


28. What is Data Storage - Definition Language?

The storage structures and access methods used by the database system are specified by a set of definitions in a special type of DDL called data storage-definition language.


29. What is DML (Data Manipulation Language)?

This is a language that enables users to access or manipulate data as organised by the appropriate data model.

Ø Procedural DML or Low level: DML requires a user to specify what data are needed and how to get those data.

Ø Non-Procedural DML or High level: DML requires a user to specify what data are needed without specifying how to get those data.


31. What is DML Compiler?

It translates DML statements in a query language into low-level instructions that the query evaluation engine can understand.


32. What is Query evaluation engine?

It executes the low-level instructions generated by the DML compiler.


33. What is DDL Interpreter?

It interprets DDL statements and records them in tables containing metadata.


34. What is Record-at-a-time?

The Low level or Procedural DML can specify and retrieve each record from a set of records. This retrieval of one record at a time is said to be Record-at-a-time.


35. What is Set-at-a-time or Set-oriented?

The High level or Non-procedural DML can specify and retrieve many records in a single DML statement. This retrieval of a set of records is said to be Set-at-a-time or Set-oriented.


36. What is Relational Algebra?

It is a procedural query language. It consists of a set of operations that take one or two relations as input and produce a new relation.


37. What is Relational Calculus?

It is an applied predicate calculus specifically tailored for relational databases, proposed by E.F. Codd. Examples of languages based on it are DSL ALPHA and QUEL.


38. How does Tuple-oriented relational calculus differ from domain-oriented relational calculus?

The tuple-oriented calculus uses tuple variables, i.e., variables whose only permitted values are tuples of a given relation. E.g. QUEL

The domain-oriented calculus has domain variables, i.e., variables that range over the underlying domains instead of over relations. E.g. ILL, DEDUCE.

Operating Systems - Basic Stuffs



Following are a few basic questions that cover the essentials of OS:


1. Explain the concept of Reentrancy.

It is a useful, memory-saving technique for multiprogrammed timesharing systems. A Reentrant Procedure is one in which multiple users can share a single copy of a program during the same period. Reentrancy has 2 key aspects: The program code cannot modify itself, and the local data for each user process must be stored separately. Thus, the permanent part is the code, and the temporary part is the pointer back to the calling program and the local variables used by that program. Each execution instance is called an activation. It executes the code in the permanent part, but has its own copy of local variables/parameters. The temporary part associated with each activation is the activation record. Generally, the activation record is kept on the stack.

Note: A reentrant procedure can be interrupted and called by an interrupting program, and still execute correctly on returning to the procedure.


2. Explain Belady's Anomaly.

Also called FIFO anomaly. Usually, on increasing the number of frames allocated to a process' virtual memory, the process execution is faster, because fewer page faults occur. Sometimes, the reverse happens, i.e., the execution time increases even when more frames are allocated to the process. This is Belady's Anomaly. This is true for certain page reference patterns.
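The anomaly is easy to reproduce with a small FIFO simulation. The function name fifo_page_faults is an assumption for this sketch; the reference string used is the one commonly quoted in textbooks for this effect.

```python
from collections import deque


def fifo_page_faults(references, frame_count):
    """Count page faults for a reference string under FIFO replacement."""
    frames = deque()  # oldest resident page is at the left
    resident = set()
    faults = 0
    for page in references:
        if page in resident:
            continue  # hit: FIFO order is not updated on a hit
        faults += 1
        if len(frames) == frame_count:
            resident.remove(frames.popleft())  # evict the oldest page
        frames.append(page)
        resident.add(page)
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_page_faults(refs, 3))  # 9
print(fifo_page_faults(refs, 4))  # 10: more frames, yet more faults
```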


3. What is a binary semaphore? What is its use?

A binary semaphore is one that takes only 0 and 1 as values. Binary semaphores are used to implement mutual exclusion and to synchronize concurrent processes.
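As a small illustration of the mutual exclusion use, the sketch below protects a shared counter with Python's threading.BoundedSemaphore initialised to 1, so its value can only be 0 or 1. The worker function, thread count and iteration count are arbitrary choices for the example.

```python
import threading

mutex = threading.BoundedSemaphore(1)  # value is 1 (free) or 0 (held)
counter = 0


def worker(iterations):
    """Increment the shared counter inside the critical section."""
    global counter
    for _ in range(iterations):
        with mutex:       # wait (P): blocks while the value is 0
            counter += 1  # critical section
        # leaving the with-block performs signal (V)


threads = [threading.Thread(target=worker, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```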


4. What is thrashing?

It is a phenomenon in virtual memory schemes when the processor spends most of its time swapping pages, rather than executing instructions. This is due to an inordinate number of page faults.


5. List the Coffman's conditions that lead to a deadlock.

Ø Mutual Exclusion: Only one process may use a critical resource at a time.

Ø Hold & Wait: A process may be allocated some resources while waiting for others.

Ø No Pre-emption: No resource can be forcibly removed from a process holding it.

Ø Circular Wait: A closed chain of processes exists such that each process holds at least one resource needed by another process in the chain.



6. What are short-, long- and medium-term scheduling?

Long term scheduler determines which programs are admitted to the system for processing. It controls the degree of multiprogramming. Once admitted, a job becomes a process.

Medium term scheduling is part of the swapping function. This relates to processes that are in a blocked or suspended state. They are swapped out of real-memory until they are ready to execute. The swapping-in decision is based on memory-management criteria.

Short term scheduler, also known as the dispatcher, executes most frequently, and makes the finest-grained decision of which process should execute next. This scheduler is invoked whenever an event occurs. It may lead to interruption of one process by preemption.


7. What are turnaround time and response time?

Turnaround time is the interval between the submission of a job and its completion. Response time is the interval between submission of a request, and the first response to that request.


8. What are the typical elements of a process image?

Ø User data: Modifiable part of user space. May include program data, user stack area, and programs that may be modified.

Ø User program: The instructions to be executed.

Ø System Stack: Each process has one or more LIFO stacks associated with it. Used to store parameters and calling addresses for procedure and system calls.

Ø Process control Block (PCB): Info needed by the OS to control processes.


9. What is the Translation Lookaside Buffer (TLB)?

In a cached system, the base addresses of the last few referenced pages are maintained in registers called the TLB, which aids in faster lookup. The TLB contains those page-table entries that have been most recently used. Normally, each virtual memory reference causes 2 physical memory accesses: one to fetch the appropriate page-table entry, and one to fetch the desired data. With a TLB in between, this is reduced to just one physical memory access in the case of a TLB hit.


10. What is the resident set and working set of a process?

Resident set is that portion of the process image that is actually in real-memory at a particular instant. Working set is that subset of resident set that is actually needed for execution. (Relate this to the variable-window size method for swapping techniques.)


11. When is a system in safe state?

The set of dispatchable processes is in a safe state if there exists at least one temporal order in which all processes can be run to completion without resulting in a deadlock.
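One common way to test for a safe state is the safety check from the banker's algorithm: repeatedly pick any process whose remaining need fits in the currently free resources, let it finish, and reclaim what it held. The sketch below assumes that approach; the function name and the sample numbers are illustrative only.

```python
def find_safe_sequence(available, allocation, need):
    """Return one order in which all processes can finish, or None if unsafe."""
    work = list(available)
    finished = [False] * len(allocation)
    order = []
    progressed = True
    while progressed:
        progressed = False
        for i, (held, wanted) in enumerate(zip(allocation, need)):
            if not finished[i] and all(n <= w for n, w in zip(wanted, work)):
                # Process i can run to completion and release what it holds.
                work = [w + h for w, h in zip(work, held)]
                finished[i] = True
                order.append(i)
                progressed = True
    return order if all(finished) else None


# Hypothetical snapshot: 5 processes, 3 resource types.
allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
need = [[7, 4, 3], [1, 2, 2], [6, 0, 0], [0, 1, 1], [4, 3, 1]]

print(find_safe_sequence([3, 3, 2], allocation, need))  # [1, 3, 4, 0, 2]
print(find_safe_sequence([0, 0, 0], allocation, need))  # None
```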


12. What is cycle stealing?

We encounter cycle stealing in the context of Direct Memory Access (DMA). Either the DMA controller can use the data bus when the CPU does not need it, or it may force the CPU to temporarily suspend operation. The latter technique is called cycle stealing. Note that cycle stealing can be done only at specific break points in an instruction cycle.


13. What is meant by arm-stickiness?

If one or a few processes have a high access rate to data on one track of a storage disk, then they may monopolize the device by repeated requests to that track. This generally happens with most common device scheduling algorithms (LIFO, SSTF, C-SCAN, etc). High-density multisurface disks are more likely to be affected by this than low density ones.


14. What are the stipulations of C2 level security?

C2 level security provides for:

Ø Discretionary Access Control
Ø Identification and Authentication
Ø Auditing
Ø Resource reuse


15. What is busy waiting?

The repeated execution of a loop of code while waiting for an event to occur is called busy-waiting. The CPU is not engaged in any real productive activity during this period, and the process does not progress toward completion.


16. Explain the popular multiprocessor thread-scheduling strategies.

Ø Load Sharing: Processes are not assigned to a particular processor. A global queue of threads is maintained. Each processor, when idle, selects a thread from this queue. Note that load balancing refers to a scheme where work is allocated to processors on a more permanent basis.

Ø Gang Scheduling: A set of related threads is scheduled to run on a set of processors at the same time, on a 1-to-1 basis. Closely related threads / processes may be scheduled this way to reduce synchronization blocking, and minimize process switching. Group scheduling predated this strategy.

Ø Dedicated processor assignment: Provides implicit scheduling defined by assignment of threads to processors. For the duration of program execution, each program is allocated a set of processors equal in number to the number of threads in the program. Processors are chosen from the available pool.

Ø Dynamic scheduling: The number of threads in a program can be altered during the course of execution.


17. When does the condition 'rendezvous' arise?

In message passing, it is the condition in which both the sender and the receiver are blocked until the message is delivered.


18. What is a trap and trapdoor?

Trapdoor is a secret undocumented entry point into a program used to grant access without normal methods of access authentication. A trap is a software interrupt, usually the result of an error condition.


19. What are local and global page replacements?

Local replacement means that an incoming page is brought in only to the relevant process' address space. Global replacement policy allows any page frame from any process to be replaced. The latter is applicable to variable partitions model only.


20. Define latency, transfer and seek time with respect to disk I/O.

Seek time is the time required to move the disk arm to the required track. Rotational delay or latency is the time it takes for the beginning of the required sector to reach the head. Sum of seek time (if any) and latency is the access time. Time taken to actually transfer a span of data is transfer time.


21. Describe the Buddy system of memory allocation.

Free memory is maintained in linked lists, each holding blocks of one size, and every block is of size 2^k for some k. When a process requires memory and no free block of the right size is available, a block of the next higher order is chosen and broken into two. Note that the two such pieces differ in address only in their kth bit. Such pieces are called buddies. When any used block is freed, the OS checks to see if its buddy is also free. If so, the two are rejoined, and the merged block is put back into the original free-block linked list.
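Because buddies differ in a single address bit, finding a block's buddy is one XOR. The sketch below assumes offsets measured from the start of the managed region and counts bits from 0; the function names are made up for the example.

```python
def buddy_of(offset, k):
    """Offset of the buddy of the block of size 2**k that starts at offset."""
    block_size = 1 << k
    if offset % block_size:
        raise ValueError("offset must be aligned to the block size")
    return offset ^ block_size  # flip bit k


def split(offset, k):
    """Split a block of size 2**k into its two buddies of size 2**(k - 1)."""
    half = 1 << (k - 1)
    return offset, offset + half


left, right = split(0, 5)   # a 32-unit block becomes two 16-unit buddies
print(left, right)          # 0 16
print(buddy_of(left, 4))    # 16
print(buddy_of(right, 4))   # 0
print(buddy_of(32, 4))      # 48
```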


22. What is time-stamping?

It is a technique proposed by Lamport, used to order events in a distributed system without the use of physical clocks. This scheme is intended to order events consisting of the transmission of messages. Each system 'i' in the network maintains a counter Ci. Every time a system transmits a message, it increments its counter by 1 and attaches the time-stamp Ti to the message. When a message is received, the receiving system 'j' sets its counter Cj to 1 more than the maximum of its current value and the incoming time-stamp Ti. At each site, the ordering of messages is determined by the following rules: For messages x from site i and y from site j, x precedes y if one of the following conditions holds: (a) Ti < Tj, or (b) Ti = Tj and i < j.
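The counter rules can be sketched in a few lines of Python. The class name Site and the message sequence are assumptions for the example. A time-stamp is kept as a (counter, site number) pair, so ordinary tuple comparison orders messages by time-stamp first and breaks ties by site number.

```python
class Site:
    """One system in the network, holding its own logical counter."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.counter = 0

    def send(self):
        """Increment the counter and return the time-stamp for the message."""
        self.counter += 1
        return (self.counter, self.site_id)

    def receive(self, stamp):
        """Advance the counter past both the local and the incoming value."""
        incoming, _sender = stamp
        self.counter = 1 + max(self.counter, incoming)


def precedes(x, y):
    """x precedes y if its time-stamp is smaller, ties broken by site number."""
    return x < y


s1, s2 = Site(1), Site(2)
m1 = s1.send()   # (1, 1)
s2.receive(m1)   # s2's counter becomes 2
m2 = s2.send()   # (3, 2)
m3 = s1.send()   # (2, 1)

print(sorted([m2, m3, m1]))      # [(1, 1), (2, 1), (3, 2)]
print(precedes((1, 1), (1, 3)))  # True: equal time-stamps, lower site first
```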


23. How are the wait/signal operations for monitor different from those for semaphores?

If a process in a monitor signals and no task is waiting on the condition variable, the signal is lost. This allows easier program design. In semaphores, on the other hand, every operation affects the value of the semaphore, so the wait and signal operations should be perfectly balanced in the program.



24. In the context of memory management, what are placement and replacement algorithms?

Placement algorithms determine where in available real-memory to load a program. Common methods are first-fit, next-fit, best-fit. Replacement algorithms are used when memory is full, and one process (or part of a process) needs to be swapped out to accommodate a new program. The replacement algorithm determines which are the partitions to be swapped out.


25. In loading programs into memory, what is the difference between load-time dynamic linking and run-time dynamic linking?

For load-time dynamic linking: Load module to be loaded is read into memory. Any reference to a target external module causes that module to be loaded and the references are updated to a relative address from the start base address of the application module.

With run-time dynamic linking: Some of the linking is postponed until actual reference during execution. Then the correct module is loaded and linked.


26. What are demand- and pre-paging?

With demand paging, a page is brought into memory only when a location on that page is actually referenced during execution. With pre-paging, pages other than the one demanded by a page fault are brought in. The selection of such pages is done based on common access patterns, especially for secondary memory devices.


27. Paging is a memory management function, while multiprogramming is a processor management function. Are the two interdependent?

Yes.


28. What is page cannibalizing?

Page swapping or page replacements are called page cannibalizing.


29. What has triggered the need for multitasking in PCs?

Ø Increased speed and memory capacity of microprocessors, together with the support for virtual memory
Ø Growth of client-server computing


30. What are the four layers that Windows NT has in order to achieve independence?

Ø Hardware abstraction layer
Ø Kernel
Ø Subsystems
Ø System Services.


31. What is SMP?

To achieve maximum efficiency and reliability a mode of operation known as symmetric multiprocessing is used. In essence, with SMP any process or threads can be assigned to any processor.


32. What are the key object oriented concepts used by Windows NT?

Ø Encapsulation

Ø Object class and instance


33. Is Windows NT a full blown object oriented operating system? Give reasons.

No, Windows NT is not, because it is not implemented in an object oriented language, its data structures reside within one executive component and are not represented as objects, and it does not support object oriented capabilities.


34. What is a drawback of MVT?

It does not have the features like

Ø ability to support multiple processors
Ø virtual storage
Ø source level debugging


35. What is process spawning?

When the OS at the explicit request of another process creates a process, this action is called process spawning.


36. How many jobs can be run concurrently on MVT?

15 jobs


37. List out some reasons for process termination.

Ø Normal completion
Ø Time limit exceeded
Ø Memory unavailable
Ø Bounds violation
Ø Protection error
Ø Arithmetic error
Ø Time overrun
Ø I/O failure
Ø Invalid instruction
Ø Privileged instruction
Ø Data misuse
Ø Operator or OS intervention
Ø Parent termination.


38. What are the reasons for process suspension?

Ø swapping

Ø interactive user request

Ø timing

Ø parent process request


39. What is process migration?

It is the transfer of a sufficient amount of the state of a process from one machine to another for the process to execute on the target machine.


40. What is mutant?

In Windows NT a mutant provides kernel mode or user mode mutual exclusion with the notion of ownership.


41. What is an idle thread?

The special thread a dispatcher will execute when no ready thread is found.


42. What is FtDisk?

It is a fault tolerance disk driver for Windows NT.


43. What are the possible states a thread can have?

Ø Ready
Ø Standby
Ø Running
Ø Waiting
Ø Transition
Ø Terminated.


44. What are rings in Windows NT?

Windows NT uses a protection mechanism called rings, provided by the processor, to implement separation between the user mode and kernel mode.


45. What is Executive in Windows NT?

In Windows NT, executive refers to the operating system code that runs in kernel mode.


46. What are the sub-components of I/O manager in Windows NT?

Ø Network redirector/ Server
Ø Cache manager.
Ø File systems
Ø Network driver
Ø Device driver


47. What are DDKs? Name an operating system that includes this feature.

DDKs are device driver kits, which are equivalent to SDKs for writing device drivers. Windows NT includes DDKs.


48. What level of security does Windows NT meet?

C2 level security.