Articles On Testing

Welcome to http://www.articlesontesting.com !!!


Tips for Database Query Optimization

  1. Read queries are frequently very inefficient. Consider writing a stored procedure for complex Read queries.
  2. Ensure that the SQL Server instance has 'Optimize for Ad Hoc Workloads' enabled. This will store a plan stub in memory the first time a query is passed, rather than storing a full plan, which can help with memory management.
  3. SELECT * is not always a good idea: move only the data you really need, and only when you really need it, to avoid network, disk, and memory contention on your server.
  4. Keep transactions as short as possible and never use them unnecessarily. The longer a lock is held, the more likely it is that another user will be blocked. Never hold a transaction open after control is passed back to the application - use optimistic locking instead.
  5. For small sets of data that are infrequently updated, such as lookup values, build a method of caching them in memory on your application server rather than constantly querying them in the database.
  6. When processing within a transaction, leave the updates until last if possible, to minimize the need for exclusive locks.
  7. Cursors within SQL Server can cause severe performance bottlenecks.
  8. The WHILE loop within SQL Server is just as bad as a cursor.
  9. Ensure your variables and parameters are the same data types as the columns. An implicit or explicit conversion can lead to table scans and slow performance.
  10. A function on columns in the WHERE clause or JOIN criteria means that SQL Server can't use indexes appropriately and will lead to table scans and slow performance (see the sketch after this list).
  11. Use DISTINCT, ORDER BY, and UNION carefully; each can force an expensive sort.
  12. Table variables do not have any statistics within SQL Server. This makes them useful in situations where a statement-level recompile can slow performance, but that lack of statistics makes them very inefficient where you need to do searches or joins. Use table variables only where appropriate.
  13. Multi-statement user-defined functions work through table variables, which don't work well in situations where statistics are required. Avoid them if a join or filtering is required.
  14. One of the most abused query hints is NOLOCK. This can lead to extra or missing rows in data sets. Instead of using NOLOCK, consider using snapshot isolation, such as enabling READ_COMMITTED_SNAPSHOT.
  15. Avoid creating stored procedures that have a wide range of data supplied to them as parameters. These are compiled to use just one query plan.
  16. Try not to interleave data definition language with your data manipulation language queries within SQL Server. This can lead to recompiles, which hurt performance.
  17. Temporary tables have statistics which get updated as data is inserted into them. As these updates occur, you can get recompiles. Where possible, substitute table variables to avoid this issue.
  18. If possible, avoid NULL values in your database. If not, use the appropriate IS NULL and IS NOT NULL code.
  19. A view is meant to mask or modify how tables are presented to the end user. These are fine constructs. But when you start joining one view to another or nesting views within views, performance will suffer. Refer only to tables within a view.
  20. Use extended events to monitor the queries in your system in order to identify slow-running queries.
  21. You get exactly one clustered index on a table. Ensure you have it in the right place. First choice is the most frequently accessed column, which may or may not be the primary key. Second choice is a column that structures the storage in a way that helps performance. This is a must for partitioning data.
  22. Clustered indexes work well on columns that are used a lot for 'range' WHERE clauses such as BETWEEN and LIKE, and that are frequently used in ORDER BY or GROUP BY clauses.
  23. If clustered indexes are narrow (involve few columns), less storage is needed for the non-clustered indexes on that table.
  24. Avoid using a column whose values are frequently updated in a clustered index.
  25. Keep your indexes as narrow as possible. This means reducing the number and size of the columns used in the index key. This helps make the index more efficient.
  26. Always index your foreign key columns if you are likely to delete rows from the referenced table. This avoids a table scan.
  27. A clustered index on a GUID can lead to serious fragmentation of the index due to the random nature of the GUID. You can use the function NEWSEQUENTIALID() to generate a GUID that will not lead to as much fragmentation.
  28. Performance is enhanced when indexes are placed on columns used in WHERE, JOIN, ORDER BY, GROUP BY, and TOP.
  29. A unique index performs faster than a non-unique index, even with the same values.
  30. Data normalization is a performance tuning technique as well as a storage mechanism.
  31. Referential integrity constraints such as foreign keys actually help performance, because the optimizer can recognize these enforced constraints and make better choices for joins and other data access.
  32. Make sure your database doesn't hold 'historic' data that is no longer used. Archive it out, either into a special 'archive' database, a reporting OLAP database, or to file. Large tables mean longer table scans and deeper indexes. This in turn can mean that locks are held for longer. Admin tasks such as statistics updates, DBCC checks, and index builds take longer, as do backups.
  33. Separate out the reporting functions from the OLTP production functions. OLTP databases usually have short transactions with a lot of updates, whereas reporting databases, such as OLAP and data warehouse systems, have longer data-heavy queries. If possible, put them on different servers.
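
The point in tip 10 about functions on WHERE columns is easiest to see in code. A minimal sketch, assuming a hypothetical dbo.Orders table with an index on OrderDate:

-- Non-sargable: wrapping the indexed column in a function hides it
-- from the index on OrderDate and typically forces a scan:
SELECT OrderID
FROM dbo.Orders
WHERE YEAR(OrderDate) = 2024;

-- Sargable rewrite: compare the bare column against a range so the
-- optimizer can seek on the OrderDate index instead:
SELECT OrderID
FROM dbo.Orders
WHERE OrderDate >= '20240101'
  AND OrderDate <  '20250101';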

MS SQL Server installation testing

Installation testing is perhaps one of the most time-consuming testing categories I have come across in my professional career. An installation keeps running for anything between 20 and 40 minutes, and during this time frame the only thing you can do is bird watching. And if you do not have enough birds in your locality, passing the time becomes tough.
However, a workaround does exist: during the course of installation you can proceed with some negative testing activities. Negative testing here means the parameter validations performed by passing wrong parameters, such as an invalid port, an invalid DB instance name, an invalid collation, and so on.
Types of installations:
There are numerous database installation types that can be tested; the major ones are outlined below:

1. Standalone installation:
  This installation type is the one we generally perform when installing MS SQL Server on our personal machines. In an enterprise installation, however, it needs to be validated against the port range within which the installation must be able to commence. Basic details such as the database instance name can be validated directly from the SQL Server setup file. We need to take utmost care with the administrator group account configured for the database instance being created; apart from that, the service account used to run the database instance also needs to be validated. All these validations can be performed directly from the Configuration Manager, where the details of the installed database instance are available. Several other validations of importance are the security settings, for example whether the error logs capture the expected types of events: failed logins only, or both successful and failed logins. Do not forget to validate connectivity to the database instance using the port number supplied at the time of installation.
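
That connectivity check can be scripted rather than done by hand. A minimal sketch, assuming sqlcmd is available; the host name and port are placeholders:

-- From any client machine, force a TCP connection on the port chosen
-- during installation:
--   sqlcmd -S tcp:MYHOST,1433 -E -Q "SELECT @@SERVERNAME"

-- From inside the instance, confirm the TCP port actually in use:
SELECT local_net_address, local_tcp_port
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;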


2. Cluster Installation
 This installation type has some complicated aspects associated with it. There are restrictions on the IP range that can be used for this installation. Generally the best approach is to use a static IP to consume the service, so get a static IP configured by the lab team on the domain you are testing in. In my case I used some six virtual machines to perform my installation activities. The way I approached it was to block three machines just for testing the cluster installation and the add-node-to-cluster installation, with the Windows failover cluster manager installed on any one of the three machines. Do remember to have DTC running on the machine. Again, from a validation perspective, assure that the properties of the installed database instance are as expected, with the service being consumed by the intended account. For a cluster installation you need a cluster network name created, which you can always do with perfect ease, and you pass this as a parameter during the course of the cluster installation on the machine.


3. Add node to a cluster installation:
 Adding a node to a cluster is nothing but creating an MS SQL Server database instance with the same name as one already existing within the Windows failover cluster manager, so I simply consumed the database instance created as part of the cluster installation. In other words, a prerequisite for add node is that a cluster installation has been performed beforehand: first to keep the testing approach optimized, and second to have the test bed created for the add-node functionality testing to proceed. The various things we need to keep at the back of our mind: use the cluster network name used during the cluster installation; the database instance will be the same as the one used during the cluster installation; and the add node has to be done on the same network where the cluster network has been configured, that is, the machine must be part of the same network. As already mentioned, I have three machines: on one machine the cluster installation is done, and on the remaining two I perform the add-node installation. Once I have performed the add-node installation on the second machine, it is time for validation, which is nothing but going to the main cluster machine, where the Windows failover cluster manager is installed, and moving the instance service to a different node in the cluster. In my case I would move the service from node 1 to node 2, so that the node 2 database instance starts running and the node 1 database instance stops. Do verify that, just after the add-node installation completes and before this move-node activity, the database service on the added node is installed in the Stopped state. When we move the service from node 1 to node 2, it is the database service on node 1 that gets stopped and the database service on node 2 that gets started. Similarly, add a node on the third machine as well, and keep moving the service from one node to another among the three cluster machines.
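
Which node currently owns the instance can also be confirmed from T-SQL rather than only from the failover cluster manager. A minimal sketch to run against the clustered instance after each move:

-- The physical node name changes after the service is moved:
SELECT SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS current_node,
       SERVERPROPERTY('IsClustered') AS is_clustered;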


4. Enrolment of a Standalone instance to a CMS/MDW server :
5. Security groups enrolment :
6. Access provisioning :
7. Memory range provided to the database instance installed




 To Be Contd.

How to create test data using MS SQL Server

What is test data creation in testing Business Intelligence application ?
What is the relevance of test data creation ?
How is test effectiveness measured in terms of the amount of test data we create to do BI testing ?

There are many other questions which can be religiously and profusely asked by test enthusiasts, but most of us may not be in a position to answer all of them without touching on the basic point that no testing can be performed on a BI-based application without a proper test data generator in place. And when I say proper, it means a lot: even with so many tools available, we still need the ability to code and get this test data preparation done as per our needs.

Manual creation of test data may not be a good way of performing testing activities in the BI domain. The simple reason behind this is application performance.
Application performance deteriorates when we have a huge chunk of data in place and a simple job is run to load data from one source to another as per the business model's needs.
For example, we might need to load data from some SAP systems into a simple flat file. There can be numerous such occasions when we need to load data from typical line-of-business applications into separate systems, in this case just a flat file.

So, coming to the main point: how do we create test data?
Suppose we need to create a thousand records within an Excel sheet. We may use Excel tricks like dragging, or Ctrl + D, and so on, but to what extent?
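
Not very far. Past a few thousand rows, a set-based T-SQL generator scales much better. A minimal sketch, with hypothetical column and table names; the cross-joined digits CTE below produces exactly one million sequential numbers:

-- Build 1,000,000 rows of shaped test data in a single statement:
WITH digits AS (
    SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS t(d)
),
numbers AS (
    SELECT a.d*100000 + b.d*10000 + c.d*1000 + e.d*100 + f.d*10 + g.d + 1 AS n
    FROM digits a CROSS JOIN digits b CROSS JOIN digits c
         CROSS JOIN digits e CROSS JOIN digits f CROSS JOIN digits g
)
SELECT n                                   AS RecordId,
       'Customer_' + CAST(n AS varchar(7)) AS CustomerName,
       DATEADD(DAY, n % 365, '20240101')   AS OrderDate
INTO dbo.TestData   -- materialize the generated rows
FROM numbers;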

It has always been a known fact that coding is the only way engineering excellence can be achieved, especially in the software world; there is absolutely no other choice if you are really interested in bringing the best quality product in place.

Very recently I came across a small challenge in one of my activities: I had to create a couple of million test records in a specific format to suit the testing of the application under test. Let us take a sample case to make things easy to understand.
            I have a simple job which, when run, transforms record sets from a flat file into a table. So basically the source file is my test data in the current context, and when I run the job, the records from the source file are loaded into a table within a particular database. By the way, I have been using the term 'job' here.
Has it got anything to do with the daily office-going job? Jokes apart, I know people from the Business Intelligence domain are pretty accustomed to the term, but for others I would just like to add some detail. A job is basically a sequence of coded steps which can be configured using MS SQL Server or many other tools. This combination of steps results in a set of activities as needed by the development architecture. In general, it is ETL activities that I shall be discussing in this article.
Just to give some insight on the business requirement front: I have MS SQL Server on my machine, on which the job has been configured; I have a source file that holds some record sets, placed in a folder on the same machine. Now I just run the job. After some time the job completes with a success message, implying that it has successfully extracted data from the source file and loaded the valid data into the table within the database.
That is as simple as moving out of one home into another, just to make office commuting easy in day-to-day life. I hope I did not break your rhythm.
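
Incidentally, the flat-file-to-table load the job performs can be approximated directly in T-SQL for quick experiments. A minimal sketch; the path, table name, and file layout are hypothetical, and the real pipeline would be the configured SSIS/ETL job:

-- Load a comma-separated source file into a target table, skipping a
-- header row:
BULK INSERT dbo.TargetTable
FROM 'C:\TestData\source_file.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);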

By the way, I was just looking at the title of this blog post, "How to create test data using MS SQL Server", and wondering whether my post so far has done justice to the topic. Yes, of course: it creates the base on which we can now understand the relevance of test data creation and the importance of coding skills in creating huge chunks of data in a matter of seconds.

But I will continue some time later; for now, feeling sleepy !!!






What is Business Intelligence Testing


What is a BI software application all about ?
What is BI testing all about ?
How does BI application ease out the business process in the current world scenario ?
What types of industry enterprises look forward to developing a BI application ?
What are the challenges one faces while testing a BI application ?
How much of an impact can a BI application make to an organization within its peer group ?

So many questions strike one's mind when we approach a software application in the Business Intelligence domain from a testing perspective.

What is a BI software application all about ?
 
Let me just list out the terms of BI testing.
Source data, target data, data warehouse, data mart, reports, ETL - Extract Transform Load, ETL logic - incremental or full, integration services, reporting services, analysis services, database engine, cubes, measures, measure groups, dimensions, facts, reports testing - drill down, drill through, build deployment. I mean, there are just so many of them.

How does the chain look? Maybe something like this:
Source -> Staging -> Datawarehouse -> Cube -> Reports

Source : Multiple data sources such as MS SQL Server, Oracle, MySQL, SAP, Teradata, DB2, Sybase. It can be any combination of the listed as well as unlisted ones, for example something as simple as flat text files. These multiple data sources can be the input to an application that might help the end user take future strategic decisions.

Staging : An intermediate state of the source data that acts as an input to the data warehouse, into which data is loaded from the source by the ETL logic using integration services. It is more or less the source data in its original form, with those tables discarded that are not needed for report generation within the application.

Datawarehouse : The final entity developed within the database engine, created in the sequence mentioned above. It is here that the final objects get created, and based on these objects the cubes are created for analysis purposes. The object types into which the complete data is organized are called dimensions and facts. They are just like simple tables, but the attributes within each dimension and fact have specific associations among themselves to represent realistic facts and associated information as per the business requirement.

Cube : This is the analysis services object within the complete BI application development phase. As the name suggests, it is not just the two-dimensional tables that hold relevance in a typical RDBMS; it is a multi-dimensional data modeling technique wherein we analyze a data set in more than two dimensions. The facts associated with an application are analyzed according to their association with various parameters, which in BI terms are called measures or measure groups. A measure group is a combination of more than one related measure. Thus we get greater insights into how a specific aspect of business decision making is impacted by variations in various parameters.

Reports : Reports are nothing but the cube data represented on a user-friendly interface, with the option to parameterize the reports as per the business needs. In simple terms, it is the end product that gets developed: the data we look for in the cubes is available there for viewing. We can drill down and drill through them based on the scenario we need. For example, we can by default see a report for a specific fiscal year showing how much in sales has materialized, and then drill down to the quarterly basis, then the monthly basis, then the weekly basis, and finally the daily basis. Similarly, drill-through also applies to the reports, and the data can be visualized as per needs. Authorization and authentication is another feature that has its role to play in reporting services, but the authorization on the cubes overrides the privileges granted on the reports.

I hope by now we have gained some insight into what these applications look like and how they differ from any other typical web application. It is more of an application that delves into data, data modeling, and management techniques.

What is BI testing all about ?
 BI testing is pretty different from testing any other application, in more ways than we can list. Only domain expertise can make the tester's life easy, given the otherwise unfamiliar nature of such applications. There is no way one can wait until the reports get developed to have a UI to start testing with !!! And that is what makes it even more interesting and challenging.

So what do we do ?
When do we do ?
How do we do ?

Do we wait till the last phase of the development cycle, when the reports get deployed onto a web application, before we start testing? Or do we have scope for starting our testing activities well within the initial development cycle, or maybe even before the full-flow development activities start?
There is scope for automation as well, loads of scope for performance testing, and of course manual testing has to be the core of any testing activity.
 Things really start once the integration services packages get into shape, where the data in its raw format gets extracted, transformed, and loaded into the new environment. That is what we also call ETL execution for short. Test case authoring can be started once the ETL packages get developed, because that is where a tester's activity gets eased. This is also the area of high automation ROI, where the ETL-related test cases can be automated using the various database automation techniques.
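
A first-pass ETL check of this kind is easy to automate with plain SQL. A minimal sketch, assuming hypothetical staging and warehouse table names; row counts should reconcile after each load before any column-level validation begins:

-- Compare row counts between the staging table and the loaded
-- fact table after an ETL run (names are illustrative):
SELECT (SELECT COUNT(*) FROM staging.Sales) AS staging_rows,
       (SELECT COUNT(*) FROM dw.FactSales)  AS warehouse_rows;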
 Once integration services are in place, we target the analysis services section, where the cube testing starts; at its base lie the dimensions, measures, and measure groups. These cubes have the raw data organized in such an effectively designed and interrelated manner that real-time data analysis becomes possible. These cubes form the basis of the reporting services layer, which is the base of the UI testing.
 UI testing, or reporting services testing, is more about the filters and the various combinations of filters that can be applied to a set of interrelated measures and measure groups for the various facts, as per the dimensions supplied. Within the same lies the validation of the drill-down and drill-through capabilities of the reports, by just right clicks and left clicks on their graphs and the various axes. The terms that matter during data validation for the complete application are the dimension and fact tables; these are the actual tables that form the basis of the cube data.

 As far as data quality is concerned, the major target area is the integration services packages, that is, the ETL packages and the logic with which the data gets extracted, transformed, and loaded into the target systems. It is here that we design the data loading logic that shall ensure that junk data, or data that will in no way be used for the reports being generated, does not load into the staging system. This is simply because the data mart is the next stage within the complete BI application development flow, and for the data mart to deal with only relevant data, the staging environment must hold only the needed data with its quality intact. Then stored procedures come into play to arrange the data from the staging tables into the data mart tables, and two different categories of tables get created, namely the dimensions and the facts. The data arrangement logic has to be validated in these tables, with the major target area being the stored procedures. These also get created during the ETL execution phase, that is, the integration services phase. Once the mart is finalized, the cubes need to be tested, and that testing no longer involves SQL query execution but MDX queries. Here the measures and measure groups, and their data for the various interrelated entities, come up; this is done by browsing the cubes within the analysis services engine. Finally, this data becomes the base for showing the relevant reports in the UI as per the business users' requirements.

There we end up with a BI application in place, ready to be delivered to the business users. Post-production defects are unavoidable in these types of applications. The major cause is that the application is developed offshore, at a location far away from the end users; in most cases data resembling their real-time data cannot be provided, due to various data security issues. However, that is the nature of these applications, so do learn to live with that as a challenge.

Common Sql Queries Interview Questions 4

Common SQL Queries Interview Questions : While we have quite a good number of DBMS products in the industry, they all follow some standardization to bring a sense of uniformity to the technology. This is the basic reason why, in spite of so many DBMS products such as Oracle, SQL Server, DB2, Sybase, and the well-known open source MySQL, they all use the same language to perform the various database-related operations: SQL, often pronounced "sequel", which stands for Structured Query Language. In the post below, and in quite a number of other database testing related posts, you will get ample exposure to how SQL queries make our database testing possible.

10. SQL CREATE VIEW Syntax

CREATE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE condition

11. In a table, how can one find all of the employees whose first name does not start with 'M' or 'A'?
SELECT EmployeeID, FirstName, LastName, HireDate, City FROM Employees
WHERE (FirstName NOT LIKE 'M%') AND (FirstName NOT LIKE 'A%')

12. Find the wine(s) sold for the highest price.
Sells(bar, wine, price)

SELECT wine
FROM Sells
WHERE price >= ALL(
SELECT price
FROM Sells
);


13. Find the wines that are the unique wine by their manufacturer.
wines(name, manf)
SELECT name
FROM wines w1
WHERE NOT EXISTS(
SELECT *
FROM wines
WHERE manf = w1.manf AND
name <> w1.name
);

14. Find the name and manufacturer of wines that James likes.
wines (name, manf) Likes(drinker, wine)
SELECT *
FROM wines
WHERE name IN
(SELECT wine
FROM Likes
WHERE drinker = 'James'
);

15. Find the bars that serve Mike at the same price that James's Bar
charges for Bille.
Sells(bar, wine, price)

SELECT bar
FROM Sells
WHERE wine = 'Mike' AND
price =
(SELECT price
FROM Sells
WHERE bar = 'James''s Bar' AND
wine = 'Bille'
);

16.Find pairs of wines by the same manufacturer.
Wines(name, manf)
SELECT w1.name, w2.name
FROM Wines w1, Wines w2
WHERE w1.manf = w2.manf AND
w1.name < w2.name;


17.Find the wines that the frequenters of James’s Bar
like.
Likes(drinker, wine)
Frequents(drinker, bar)
SELECT wine
FROM Frequents, Likes
WHERE bar = 'James''s Bar' AND
Frequents.drinker = Likes.drinker;

18.Find drinkers whose phone has exchange 555.
Drinkers(name, addr, phone)
SELECT name
FROM Drinkers
WHERE phone LIKE '%555-____';


19.Find the price James's Bar charges for Bille.
Sells(bar, wine, price)
SELECT price
FROM Sells
WHERE bar = 'James''s Bar' AND wine = 'Bille';

20. How to find out the 10th highest salary in SQL query?
Table - Tbl_Test_Salary 
Column - int_salary

select min(int_salary)
from Tbl_Test_Salary
where int_salary in(select top 10 int_salary from Tbl_Test_Salary order by int_salary desc)



Common Sql Queries Interview Questions - 3

Common SQL Queries Interview Questions :


1. Determine the name, sex, and age of the oldest student.
SELECT Name, Gender, (CURRENT_DATE-Dtnaiss)/365 AS Age
FROM Student
WHERE (CURRENT_DATE-Dtnaiss) /365 =
( SELECT MAX(( CURRENT_DATE-Dtnaiss) /365) FROM Student);



2. Display the marks of the student number 1 which are equal to the marks of the student number 2.
SELECT Note
FROM NOTES
WHERE Num=1 AND Note IN
(SELECT Note FROM NOTES WHERE Num=2);


3. Finding the names of everybody who works in the same department as a person called James
SELECT name FROM emp WHERE dept_no =
(SELECT dept_no FROM emp WHERE name = 'James')

or as a join statement, like this:
SELECT e1.name FROM emp e1, emp e2
WHERE e1.dept_no = e2.dept_no AND e2.name = 'James'



4. The SQL statement to find the departments that have employees with a salary higher than the average employee salary
 SELECT name FROM dept
WHERE id IN
     (
      SELECT dept_id FROM emp
      WHERE sal >
          (SELECT AVG(sal) FROM emp)
     )

5. Write the SQL to use a sub query which will not return any rows - when just the table structure is required and not any of the data.

CREATE TABLE new_table AS
SELECT * from table_orig WHERE 1=0;
The sub query returns no data but does return the column names and data types to the 'create table' statement.
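
The CREATE TABLE ... AS SELECT form above is Oracle syntax; in SQL Server the same empty-copy trick is usually written with SELECT ... INTO:

SELECT *
INTO new_table
FROM table_orig
WHERE 1 = 0;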

6. How do you find the Second highest Salary?

SELECT MAX(SALARY) FROM EMPLOYEE WHERE SALARY NOT IN (SELECT MAX(SALARY) FROM EMPLOYEE)



7. Finding duplicates in a table
SELECT name, COUNT(name) AS NumOccurrences FROM users
GROUP BY name HAVING (COUNT(name) > 1)

You could also use this technique to find rows that occur exactly once:
SELECT name FROM users GROUP BY name HAVING (COUNT (name) = 1) 

8. While deleting a row from the database, I need to check based upon the name of the employee and the employee number. How do I give two conditions in an SQL query?

 "DELETE FROM Employees WHERE Emp_Name='" & Txt_Name.Text.Trim & "' 
AND Emp_Number= '" & Txt_Empno.Text.Trim & "'"
 
 9. Common SQL Syntax used in database interaction

9a.   Select Statement
SELECT "column_name" FROM "table_name"

9b.    Distinct
SELECT DISTINCT "column_name" FROM "table_name"

9c.    Where
SELECT "column_name" FROM "table_name" WHERE "condition"

9d.   And/Or
SELECT "column_name" FROM "table_name" WHERE "simple condition" {[AND|OR] "simple condition"}+

9e.   In
SELECT "column_name" FROM "table_name" WHERE "column_name" IN ('value1', 'value2', ...)

9f.    Between
SELECT "column_name" FROM "table_name" WHERE "column_name" BETWEEN 'value1' AND 'value2'

9g.   Like
SELECT "column_name" FROM "table_name" WHERE "column_name" LIKE {PATTERN}

9h.   Order By
SELECT "column_name" FROM "table_name" [WHERE "condition"] ORDER BY "column_name" [ASC, DESC]

9i.    Count
SELECT COUNT("column_name") FROM "table_name"

9j.    Group By
SELECT "column_name1", SUM("column_name2") FROM "table_name" GROUP BY "column_name1"

9k.   Having
SELECT "column_name1", SUM("column_name2") FROM "table_name" GROUP BY "column_name1" HAVING (arithematic function condition)

9l.   Create Table Statement
CREATE TABLE "table_name" ("column 1" "data_type_for_column_1","column 2" "data_type_for_column_2",…)

9m.  Drop Table Statement
DROP TABLE "table_name"

9n.   Truncate Table Statement
TRUNCATE TABLE "table_name"

9o.   Insert Into Statement
INSERT INTO "table_name" ("column1", "column2", ...) VALUES ("value1", "value2", ...)

9p.   Update Statement
UPDATE "table_name" SET "column_1" = [new value] WHERE {condition}

9q.   Delete From Statement
DELETE FROM "table_name" WHERE {condition}









Common SQL Queries Interview Questions 1 :

Common SQL Queries Interview Questions :


What is the system function to get the current user's details such as userid etc. ?
USER
USER_ID
USER_NAME
CURRENT_USER
SUSER_SID
HOST_NAME
SYSTEM_USER
SESSION_USER

What is Stored Procedure?
A stored procedure is a named group of SQL statements that have been previously created and stored in the server database. Stored procedures accept input parameters so that a single procedure can be used over the network by several clients using different input data. And when the procedure is modified, all clients automatically get the new version. Stored procedures reduce network traffic and improve performance. Stored procedures can be used to help ensure the integrity of the database.

What is an extended stored procedure? Can you instantiate a COM object by using T-SQL?
An extended stored procedure is a function within a DLL (written in a programming language like C, C++ using Open Data Services (ODS) API) that can be called from T-SQL, just the way we call normal stored procedures using the EXEC statement. See books online to learn how to create extended stored procedures and how to add them to SQL Server. You can instantiate a COM (written in languages like VB, VC++) object from T-SQL by using sp_OACreate stored procedure.

What is Trigger?
A trigger is a SQL procedure that initiates an action when an event (INSERT, DELETE or UPDATE) occurs. Triggers are stored in and managed by the DBMS. Triggers are used to maintain the referential integrity of data by changing the data in a systematic fashion. A trigger cannot be called or executed; the DBMS automatically fires the trigger as a result of a data modification to the associated table. Triggers can be viewed as similar to stored procedures in that both consist of procedural logic that is stored at the database level. Stored procedures, however, are not event-driven and are not attached to a specific table as triggers are. Stored procedures are explicitly executed by invoking a CALL to the procedure while triggers are implicitly executed. In addition, triggers can also execute stored procedures.
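
A minimal trigger sketch, using a hypothetical audit pattern (table and trigger names are illustrative, not from the original post):

CREATE TRIGGER trg_Orders_Audit
ON dbo.Orders
AFTER INSERT
AS
BEGIN
    -- copy the newly inserted rows into an audit table
    INSERT INTO dbo.OrdersAudit (OrderID, AuditDate)
    SELECT OrderID, GETDATE() FROM inserted;
END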

What is Nested Trigger?
A trigger can also contain INSERT, UPDATE and DELETE logic within itself, so when the trigger is fired because of data modification it can also cause another data modification, thereby firing another trigger. A trigger that contains data modification logic within itself is called a nested trigger.

What is a Linked Server?
A linked server is a concept in SQL Server by which we can add another SQL Server to a group and query both SQL Server databases using T-SQL statements. With a linked server, you can create very clean, easy-to-follow SQL statements that allow remote data to be retrieved, joined, and combined with local data. The stored procedures sp_addlinkedserver and sp_addlinkedsrvlogin are used to add a new linked server.

What is Cursor?
A cursor is a database object used by applications to manipulate data in a set on a row-by-row basis, instead of the typical SQL commands that operate on all the rows in the set at one time.
In order to work with a cursor we need to perform some steps in the following order:
  1. Declare cursor
  2. Open cursor
  3. Fetch row from the cursor
  4. Process fetched row
  5. Close cursor
  6. Deallocate cursor
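
The six steps above map one-to-one onto T-SQL. A minimal sketch with a hypothetical table:

DECLARE @name sysname;

DECLARE emp_cursor CURSOR FOR           -- 1. Declare cursor
    SELECT name FROM dbo.Employees;

OPEN emp_cursor;                        -- 2. Open cursor
FETCH NEXT FROM emp_cursor INTO @name;  -- 3. Fetch row from the cursor

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name;                        -- 4. Process fetched row
    FETCH NEXT FROM emp_cursor INTO @name;
END

CLOSE emp_cursor;                       -- 5. Close cursor
DEALLOCATE emp_cursor;                  -- 6. Deallocate cursor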
What is sub-query? Explain properties of sub-query?
Sub-queries are often referred to as sub-selects, as they allow a SELECT statement to be executed arbitrarily within the body of another SQL statement. A sub-query is executed by enclosing it in a set of parentheses. Sub-queries are generally used to return a single row as an atomic value, though they may be used to compare values against multiple rows with the IN keyword.
A sub-query is a SELECT statement that is nested within another T-SQL statement. A sub-query SELECT statement, if executed independently of the T-SQL statement in which it is nested, will return a result set. This means a sub-query SELECT statement can stand alone and is not dependent on the statement in which it is nested. A sub-query SELECT statement can return any number of values, and can be found in the column list of a SELECT statement, or in the FROM, GROUP BY, HAVING, and/or ORDER BY clauses of a T-SQL statement. A sub-query can also be used as a parameter to a function call. Basically, a sub-query can be used anywhere an expression can be used.
What are different Types of Join?
  1. Cross Join A cross join that does not have a WHERE clause produces the Cartesian product of the tables involved in the join. The size of a Cartesian product result set is the number of rows in the first table multiplied by the number of rows in the second table. The common example is when a company wants to combine each product with a pricing table to analyze each product at each price.
  2. Inner Join A join that displays only the rows that have a match in both joined tables is known as inner Join. This is the default type of join in the Query and View Designer.
  3. Outer Join A join that includes rows even if they do not have related rows in the joined table is an Outer Join. You can create three different outer join to specify the unmatched rows to be included:
    1. Left Outer Join: In Left Outer Join all rows in the first-named table i.e. "left" table, which appears leftmost in the JOIN clause are included. Unmatched rows in the right table do not appear.
    2. Right Outer Join: In Right Outer Join all rows in the second-named table i.e. "right" table, which appears rightmost in the JOIN clause are included. Unmatched rows in the left table are not included.
    3. Full Outer Join: In Full Outer Join all rows in all joined tables are included, whether they are matched or not.
  4. Self Join This is a particular case when one table joins to itself, with one or two aliases to avoid confusion. A self join can be of any type, as long as the joined tables are the same. A self join is rather unique in that it involves a relationship with only one table. The common example is when a company has a hierarchical reporting structure whereby one member of staff reports to another. A self join can be an outer join or an inner join (see the sketch after this list).
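
A minimal sketch of the self join described in point 4, using a hypothetical staff table aliased twice:

SELECT e.name AS employee, m.name AS manager
FROM dbo.Staff e
LEFT OUTER JOIN dbo.Staff m
    ON e.manager_id = m.staff_id;   -- each staff row joined to its manager row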

What is User Defined Functions? What kind of User-Defined Functions can be created?
User-Defined Functions allow you to define your own T-SQL functions that can accept zero or more parameters and return a single scalar data value or a table data type.
Different Kinds of User-Defined Functions created are:
  1. Scalar User-Defined Function A Scalar user-defined function returns one of the scalar data types. Text, ntext, image and timestamp data types are not supported. These are the type of user-defined functions that most developers are used to in other programming languages. You pass in 0 to many parameters and you get a return value.
  2. Inline Table-Value User-Defined Function An inline table-value user-defined function returns a table data type and is an exceptional alternative to a view, as the user-defined function can pass parameters into a T-SQL SELECT command and in essence provide us with a parameterized, non-updateable view of the underlying tables (see the sketch after this list).
  3. Multi-statement Table-Value User-Defined Function A multi-statement table-value user-defined function returns a table and is also an exceptional alternative to a view, as the function can support multiple T-SQL statements to build the final result, where the view is limited to a single SELECT statement. Also, the ability to pass parameters into a T-SQL SELECT command, or a group of them, gives us the capability to in essence create a parameterized, non-updateable view of the data in the underlying tables. Within the CREATE FUNCTION command you must define the table structure that is being returned. After creating this type of user-defined function, it can be used in the FROM clause of a T-SQL command, unlike the behavior found when using a stored procedure, which can also return record sets.
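
A minimal inline table-valued function sketch (schema and names are hypothetical), showing the parameterized-view behavior described above:

CREATE FUNCTION dbo.fn_OrdersByCustomer (@CustomerId int)
RETURNS TABLE
AS
RETURN
(
    SELECT OrderID, OrderDate, Amount
    FROM dbo.Orders
    WHERE CustomerID = @CustomerId
);

-- Used in a FROM clause like a parameterized, non-updateable view:
-- SELECT * FROM dbo.fn_OrdersByCustomer(42);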
What is Identity?
Identity (or AutoNumber) is a column that automatically generates numeric values. A start and increment value can be set, but most DBAs leave these at 1. A GUID column also generates numbers; the value of this cannot be controlled. Identity/GUID columns do not need to be indexed.

What is Data-Warehousing?
A data warehouse is usually described by four characteristics:
  1. Subject-oriented, meaning that the data in the database is organized so that all the data elements relating to the same real-world event or object are linked together;
  2. Time-variant, meaning that the changes to the data in the database are tracked and recorded so that reports can be produced showing changes over time;
  3. Non-volatile, meaning that data in the database is never over-written or deleted, once committed, the data is static, read-only, but retained for future reporting.
  4. Integrated, meaning that the database contains data from most or all of an organization's operational applications, and that this data is made consistent.
How do you implement one-to-one, one-to-many and many-to-many relationships while designing tables?
One-to-One relationship can be implemented as a single table and rarely as two tables with primary and foreign key relationships. One-to-Many relationships are implemented by splitting the data into two tables with primary key and foreign key relationships. Many-to-Many relationships are implemented using a junction table with the keys from both the tables forming the composite primary key of the junction table.
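
A minimal many-to-many sketch, with a hypothetical junction table whose composite primary key is formed from the keys of both sides:

CREATE TABLE dbo.StudentCourse (
    StudentID int NOT NULL REFERENCES dbo.Student (StudentID),
    CourseID  int NOT NULL REFERENCES dbo.Course (CourseID),
    PRIMARY KEY (StudentID, CourseID)   -- composite key from both tables
);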
What are the different isolation levels ?
An isolation level determines the degree of isolation of data between concurrent transactions. The default SQL Server isolation level is Read Committed. Here are the other isolation levels (in the ascending order of isolation): Read Uncommitted, Read Committed, Repeatable Read, Serializable.
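
Switching levels is a one-line session setting. A minimal sketch (the table is hypothetical):

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
    -- reads here hold their shared locks until commit, preventing
    -- non-repeatable reads within the transaction
    SELECT COUNT(*) FROM dbo.Orders;
COMMIT TRANSACTION;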
What would happen if you create an index on each column of a table ?
If you create an index on each column of a table, it improves the query performance, as the query optimizer can choose from all the existing indexes to come up with an efficient execution plan. At the same time, data modification operations (such as INSERT, UPDATE, DELETE) will become slow, as every time data changes in the table, all the indexes need to be updated. Another disadvantage is that indexes need disk space; the more indexes you have, the more disk space is used.

What is Lock escalation?
Lock escalation is the process of converting many fine-grain locks into fewer coarse-grain locks, reducing system overhead. When a transaction exceeds its escalation threshold, automatic  escalation of  row locks and page locks into table locks happens.
When a transaction requests rows from a table, SQL Server automatically acquires locks on those rows affected and places higher-level intent locks on the pages and table, or index, which contain those rows. When the number of locks held by the transaction exceeds its threshold, a  stronger lock is acquired, and all page and row level locks held by the transaction on the table are released, reducing lock overhead.
Lock escalation thresholds are determined dynamically by SQL Server and do not require configuration.
What is a live lock?
A live lock is one, where a request for an exclusive lock is repeatedly denied because a series of overlapping shared locks keeps interfering. SQL Server detects the situation after four denials and refuses further shared locks. A live lock also occurs when read transactions monopolize a table or page, forcing a write transaction to wait indefinitely.

What is blocking and how would you troubleshoot it?
Blocking happens when one connection from an application holds a lock and a second connection requires a conflicting lock type. This forces the second connection to wait, blocked on the first.

Common SQL Queries Interview Questions :

What is the difference between Oracle, SQL and SQL Server ?
  • Oracle is an RDBMS product.
  • SQL is the Structured Query Language used to query relational databases.
  • SQL Server is another RDBMS, provided by Microsoft.
Why do you need indexing? Where is it stored, and what do you mean by a schema object? For what purpose do we use a view?
  • We can't create an index on an index. An index is stored in the user_indexes table. Every object that has been created on a schema is a schema object, like a table or a view. If we want to share particular data with various users, we have to use a virtual table over the base table; that is a view.
  • Indexing is used for faster search, to retrieve data faster from various tables. A schema contains a set of tables; basically, a schema means a logical separation of the database. A view is created for faster retrieval of data. It is a customized virtual table; we can create a single view of multiple tables. The only drawback is that a view needs to be refreshed to retrieve updated data.

Difference between Stored Procedure and Trigger?
  • We can call a stored procedure explicitly.
  • A trigger is automatically invoked when the action defined in the trigger occurs; for example, a trigger created AFTER INSERT on a table is invoked after we insert something into that table.
  • A stored procedure can't be made inactive, but a trigger can be.
  • Triggers are used to initiate a particular activity after certain conditions are fulfilled. They need to be defined, and can be enabled and disabled according to need.
What is the advantage of using triggers in your PL/SQL?
A trigger is a database object directly associated with a particular table. It fires whenever a specific statement or type of statement is issued against that table. The types of statements are insert, update, delete and query statements. Basically, a trigger is a set of SQL statements. A trigger is a solution to the restrictions of a constraint. For instance: 1. A database constraint cannot use pseudo columns as criteria, where a trigger can. 2. A database constraint cannot refer to old and new values for a row, where a trigger can.
Triggers are fired implicitly on the tables/views on which they are created. There are various advantages of using a trigger. Some of them are:
  • Suppose we need to validate a DML statement (insert/update/delete) that modifies a table; we can write a trigger on the table that gets fired implicitly whenever the DML statement is executed on that table.
  • Another reason for using triggers can be the automatic updating of one or more tables whenever a DML/DDL statement is executed for the table on which the trigger is created.
  • Triggers can be used to enforce constraints. For example: insert/update/delete statements should not be allowed on a particular table after office hours. For enforcing this constraint, triggers should be used.
  • Triggers can be used to publish information about database events to subscribers. A database event can be a system event, like database startup or shutdown, or a user event, like user logon or logoff.
What is the difference between UNION and UNION ALL?
  • UNION will remove the duplicate rows from the result set while UNION ALL doesn't.

What is the difference between TRUNCATE and DELETE commands?
  • Both will result in deleting all the rows in the table. A TRUNCATE call cannot be rolled back, as it is a DDL command, and all memory space for that table is released back to the server; TRUNCATE is much faster. A DELETE call, on the other hand, is a DML command and can be rolled back.

Which system table contains information on constraints on all the tables created ?
USER_CONSTRAINTS is the system table that contains information on the constraints on all the tables created.

Explain Normalization ?
Normalization means reducing redundancy and maintaining data integrity. There are four common normal forms: first, second, third, and fourth normal form.

How to find out the database name from the SQL*Plus command prompt?
Select * from global_name;
This will give the database name to which you are currently connected.

What is the difference between SQL and SQL Server ? SQL Server is an RDBMS, just like Oracle or DB2, from Microsoft, whereas Structured Query Language (SQL), pronounced "sequel", is a language that provides an interface to relational database systems. It was developed by IBM in the 1970s for use in System R. SQL is a de facto standard, as well as an ISO and ANSI standard. SQL is used to perform various operations on an RDBMS.

What is difference between Co-related sub query and nested sub query?
Correlated subquery runs once for each row selected by the outer query. It contains a reference to a value from the row selected by the outer query.
Nested subquery runs only once for the entire nesting (outer) query. It does not contain any reference to the outer query row.
For example,
Correlated Subquery:
select e1.empname, e1.basicsal, e1.deptno from emp e1 where e1.basicsal = (select max(basicsal) from emp e2 where e2.deptno = e1.deptno)
Nested Subquery:
select empname, basicsal, deptno from emp where (deptno, basicsal) in (select deptno, max(basicsal) from emp group by deptno)

What operator performs pattern matching?
The pattern matching operator is LIKE, and it is used with two wildcard characters:
1. % matches zero or more characters, and
2. _ (underscore) matches exactly one character.

What is a clustered index and a non-clustered index ?
Clustered Index: A clustered index is a special type of index that reorders the way records in the table are physically stored; therefore a table may have only one clustered index.
Non-Clustered Index: A non-clustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf nodes of a non-clustered index do not consist of the data pages; instead, the leaf nodes contain index rows.
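
Hedged sketches of creating each type (table, column, and index names are hypothetical):

CREATE CLUSTERED INDEX IX_Orders_OrderDate
    ON dbo.Orders (OrderDate);      -- reorders the physical row storage

CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID);     -- separate structure of index rows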

What is the difference between a "where" clause and a "having" clause?
"Where" is a kind of restriction statement. You use where clause to restrict all the data from DB. Where clause is using before result retrieving. But Having clause is using after retrieving the data. Having clause is a kind of filtering command.

What structure can you implement for the database to speed up table reads?
Following the rules of DB tuning, we have to:
1] properly use indexes (different types of indexes);
2] properly locate different DB objects across different tablespaces, files, and so on;
3] create a special space (tablespace) to locate some of the data with special data types (for example CLOB, LOB, and so on).

What are the tradeoffs with having indexes?
1. Faster selects, slower updates.
2. Extra storage space to store indexes. Updates are slower because in addition to updating the table you have to update the index.

What is "normalization"? "Denormalization"? Why do you sometimes want to denormalize?
 Normalizing data means eliminating redundant information from a table and organizing the data so that future changes to the table are easier.
 Denormalization means allowing redundancy in a table. The main benefit of denormalization is improved performance with simplified data retrieval and manipulation. This is done by reduction in the number of joins needed for data processing.

What is a "constraint"?
A constraint allows you to apply simple referential integrity checks to a table. There are four primary types of constraints:
PRIMARY/UNIQUE - enforces uniqueness of a particular table column. By default a primary key creates a clustered index on the column, whereas unique creates a non-clustered index by default. Another major difference is that a primary key doesn't allow NULLs, but a unique key allows one NULL only.
DEFAULT - specifies a default value for a column in case an insert operation does not provide one.
FOREIGN KEY - validates that every value in a column exists in a column of another table.
CHECK - checks that every value stored in a column is in some specified list.
Each type of constraint performs a specific type of action. NOT NULL is one more constraint, which does not allow values in the specific column to be NULL; it is also the only constraint which is not a table-level constraint.

What types of index data structures can you have?
An index helps to search values in tables faster. The three most commonly used index types are:
B-Tree: builds a tree of possible values with a list of row IDs that have the leaf value. Needs a lot of space and is the default index type for most databases.
Bitmap: a string of bits for each possible value of the column, with one bit for each row. Needs only a little space and is very fast (however, the domain of values cannot be large, e.g. SEX(m,f), degree(BS,MS,PHD)).
Hash: a hashing algorithm is used to assign a set of characters to represent a text string, such as a composite of keys or partial keys, and compresses the underlying data. Takes longer to build and is supported by relatively few databases.

Why can a "group by" or "order by" clause be expensive to process?
Processing of "group by" or "order by" clause often requires creation of Temporary tables to process the results of the query. Which depending of the result set can be very expensive.

What is a SQL view?
 An output of a query can be stored as a view. A view acts like a small table which meets our criterion. A view is a pre-compiled SQL query which is used to select data from one or more tables. A view is like a table but it doesn't physically take any space. A view is a good way to present data in a particular format if you use that query quite often. A view can also be used to restrict users from accessing the tables directly.

What is GROUP BY?
The GROUP BY keyword was added to SQL because aggregate functions (like SUM) return the aggregate of all column values every time they are called; without the GROUP BY functionality, finding the sum for each individual group of column values was not possible.

What are defaults? Is there a column to which a default can't be bound?
A default is a value that will be used by a column, if no value is supplied to that column while inserting data. IDENTITY columns and timestamp columns can't have defaults bound to them.

Explain different isolation levels
An isolation level determines the degree of isolation of data between concurrent transactions. The default SQL Server isolation level is Read Committed. Here are the other isolation levels (in the ascending order of isolation):
Read Uncommitted,
Read Committed,
Repeatable Read,
Serializable

What type of Index will get created after executing the below statement?
CREATE INDEX myIndex ON myTable(myColumn)
Non-clustered index. By default a clustered index gets created on the primary key, unless specified otherwise.
What's the maximum size of a row?
8060 bytes.