Articles On Testing

Welcome!!!



How do you test a database split?

What is a database split?
When do we really come across such a business scenario in an enterprise?
What holds paramount importance in planning and strategizing the testing activities for such an application?
How do the test-related timelines impact the overall estimation in such a project?
Which areas offer scope for optimization in the test estimation of such a project?

If these are your areas of interest, I hope this post meets your expectations.

A database split is nothing but the separation of a single database into two child databases, much the same way amoeba fission occurs. The risk involved may be as high as ad infinitum in case legal aspects are involved.
So the best approach is to validate the data that has landed in the child databases: only the relevant data should get loaded into each of the respective child databases.
Bring in database test automation for validation of the data content, using the rowcount and datacheck techniques we discussed in a previous article, listed as under.
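In the meantime, here is the flavour of the rowcount check as a minimal sketch, assuming hypothetical ParentDB and ChildDB1 databases and a Customer table whose 'EU' rows are routed to the first child:

-- ParentDB, ChildDB1, Customer and the Region filter are illustrative assumptions
SELECT COUNT(*) FROM ParentDB.dbo.Customer WHERE Region = 'EU'   -- rows meant for child 1
EXCEPT
SELECT COUNT(*) FROM ChildDB1.dbo.Customer;
-- an empty resultset means the child received exactly the rows it should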

By following the approach mentioned in that article you can cut test execution timelines heavily. Scripting the queries is a one-time activity, and you can reuse the scripts in all subsequent cycles. Script preparation is also a parallel activity and can be done during the development phase itself. There is no need to maintain a test case document either, as the DbUnit test scripts are themselves self-explanatory. However, in some cases, depending on the customer's expectations, comments can be provided in the DbUnit test scripts themselves.
Do you know how long the scripts take to run?
One script runs in a matter of milliseconds.
I ran close to ten database split test runs in five minutes, and the results were logged into TFS.
This included validation of close to 450 tables, with query counts in the range of 1,275 odd!!!
And all of it was done in minutes. Validating that many tables by manually executing these queries in SQL Server Management Studio would have taken close to four man-days.

Enjoy this smart test execution technique by implementing the DbUnit tests of Visual Studio 2010; in the 2012 version the feature is labelled separately as the SQL Server unit test, and you need to install SSDT along with the Visual Studio 2012 IDE.

You will really have an easy time handling such a project with this approach; without it, the time taken for verification is just way too high to be feasible in today's industry.

Do post your queries and I shall help you get them answered here asap.

How to create test data using MS Sql Server

What is test data creation when testing a Business Intelligence application?
What is the relevance of test data creation?
How is test effectiveness measured in terms of the amount of test data we create for BI testing?

There are many other questions which test enthusiasts can ask religiously and profusely, but most of us may not be in a position to answer all of them without touching on the basic point: no testing can be performed on a BI application without a proper test data generator in place. And when I say proper, it means a lot; even with so many tools available, we still need the ability to code and get this test data preparation done as per our needs.

Manual creation of test data may not be a good way of performing the testing activities in the BI domain. The simple reason behind this conjecture is application performance.
An application's performance deteriorates when we have huge chunks of data in place and a simple job is run to load data from one source to another as per the business model's needs.
For example, we might need to load data from SAP systems into a simple flat file. There can be numerous such occasions when we need to load data from typical line-of-business applications into separate systems, in this case just a flat file.

So, coming to the main point: how do we create test data?
Suppose we need to create a thousand records within an Excel sheet. We may utilise Excel tricks like dragging, or Ctrl+D etc., but to what extent?

It has always been a known fact that coding is the only way engineering excellence can be achieved, especially in the software world; there is absolutely no other choice if you are really interested in bringing the best quality product in place.

Very recently I came across a small challenge in one of my activities. I had to create a couple of million test records in a specific format to suit the testing of the application under test. Let us take a sample case to make things easy to understand.
            I have a simple job which, when run, transforms record sets from a flat file into a table. So basically the source file is my test data in the current context, and when I run the job, the records from the source file get loaded into a table within a particular database. By the way, I have been using the term 'job' here.
Has it got anything to do with the daily office-going job? Jokes apart, I know people from the Business Intelligence domain are pretty accustomed to the term, but for others I would just like to add some detail. A job is basically a sequence of coded steps which can be configured using MS SQL Server or many other tools. This combination of steps carries out a set of activities as needed by the development architecture. In this article I shall be discussing jobs in the context of ETL activities.
So, just to give some insight on the business requirement front: I have MS SQL Server on my machine, on which the job has been configured; I have a source file with some record sets in it, placed in a folder on the same machine. Now all I do is run the job. After some time the job completes with a success message, implying that it has successfully extracted the data from the source file and loaded the valid data into the table within the database.
That is as simple as moving out of one home into another, just to make office commuting easier in day-to-day life. I hope I did not break your rhythm.

By the way, I was just looking at the title of this blog post, "How to create test data using MS Sql Server", and wondering whether the post so far has done any justice to the topic. Yes, of course: it creates a base on which we can now understand the relevance of test data creation and the importance of coding skills in creating huge chunks of data in a matter of seconds.
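To give a flavour of it before I stop, here is a minimal T-SQL sketch of the idea; the table and column names are illustrative assumptions. A numbers (tally) CTE built from a cross join can materialize a million formatted records in seconds, which can then be exported to a flat file via BCP or SSIS:

-- generate 1,000,000 illustrative employee records using a tally (numbers) CTE
;WITH Numbers AS (
    SELECT TOP (1000000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects a CROSS JOIN sys.all_objects b
)
SELECT n                                    AS EmployeeID,
       'Employee_' + CAST(n AS varchar(10)) AS EmployeeName,
       20000 + (n % 80000)                  AS Salary,
       DATEADD(DAY, -(n % 3650), GETDATE()) AS HireDate
INTO dbo.TestData                           -- materialized table; export with BCP/SSIS
FROM Numbers;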

But I will continue some time later; for now, feeling sleepy!!!

ETL testing in a Business Intelligence application

What is ETL testing all about?
                   ETL is the extract, transform and load operation performed on the source data in order to build a strong base for developing an application that serves the decision-making teams in large enterprise systems.

Such is the importance of a Business Intelligence (BI) application that at times big enterprises end up developing a complete application for just a single user to access and analyze the reports available in it. Developing it involves huge effort and cost, not only due to the huge chunks of data and their testing but also due to the degree of complexity associated with developing the logic to achieve it. Reports, however, are not solely dependent on the ETL but on several other logically inter-related objects, and on the access and authorization rules implemented at the cube level as well as the UI level. Someone who is an expert at BI application development might still not be sufficient to develop such an application: the end user's requirements need to be documented, loads of analysis of the nature of the data needs to be done to frame a volley of questions for the end user, and clarification documents need to be tracked right throughout the project development life cycle. This is because, in these systems, a bug that gets discovered at a very late stage generally has a very high cost associated with fixing it.

Let's not get out of context; let's try to understand the approach to, and priority of, testing just the ETL logic within the application under test. For a Business Intelligence application developed using Microsoft technologies we have Microsoft SQL Server in place, and the ETL can be developed using the SSIS (SQL Server Integration Services) feature available therein.
Using the SSIS feature, packages can be developed which hold the ETL logic, based on which the source data is filtered as per the requirements traced out for the application.

Once the ETL packages get developed, testing them becomes an uphill task due to the huge amount of data in the source environment which gets extracted, transformed and loaded into the destination, or rather a sort of pre-destination (staging) environment. Just imagine verifying each and every record set in the source against the target environment. Common sense might suggest that the data load is based on SQL queries which will in any case go fine, but the main target area is verifying the logic that decides which data must be ignored when loading into the target environment. There might be cases wherein we have duplicate records in the source and we must not load both records, simply because that would create a high level of discrepancy when we browse the reports in the end product. Running the ETL packages loads the data from the source into the staging environment; depending on the context and nature of the application, the same data gets loaded into the data mart as well, based on the transformation logic applied and on the nature of the load, which may be a full load (truncate and load) or an incremental load (only the additional data gets loaded into the environment).

When we have the uphill task of validating such huge chunks of data, we take the help of database automation tools that help us verify each and every record. There are many tools available in the industry, and at times we can create tools ourselves using Excel and macro programming, but what I prefer is to utilize the DB unit test project feature available within the Visual Studio IDE. The next step is to build up the logic that will help us validate each record set in the source against the target. The general approach considered sufficient is a two-stage SQL query verification: one, the row counts of the source and target data environments must match on applying the filters documented by the client; and two, the data itself must match. We genuinely find an empty resultset as the return value sufficient for both checks. All we need to do is apply the EXCEPT keyword between the two queries and add the corresponding test condition from within the added DB unit test file.
Just browse through the screenshots of the working setup below; they should definitely make things easy for SSIS testers.


Just hit Cancel on the above dialog box. This is actually the database configuration file creation, which we can create directly by adding an app.config file, as under. To get it done we can add a new item into the project by right-clicking the project, clicking Add New Item, and then selecting Application Configuration File as shown:

The content of the same will be something as under (the section type name and the assembly version 10.0.0.0 correspond to the Visual Studio 2010 database unit testing assemblies):

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <configSections>
    <section name="DatabaseUnitTesting"
             type="Microsoft.Data.Schema.UnitTesting.Configuration.DatabaseUnitTestingSection, Microsoft.Data.Schema.UnitTesting, Version=10.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" />
  </configSections>
  <DatabaseUnitTesting>
    <DataGeneration ClearDatabase="true" />
    <ExecutionContext Provider="System.Data.SqlClient"
                      ConnectionString="Data Source=DB_Server_Name;Initial Catalog=Master;Integrated Security=True;Pooling=False"
                      CommandTimeout="220" />
  </DatabaseUnitTesting>
</configuration>

Once this has been set up, we can go ahead with the validation logic, which we do as under.
Here we have just renamed the method from DatabaseTest to a more relevant one, Employee_RowCount. Similarly, by clicking the plus icon, we add another test method to the same DB unit test class file (Employee) to validate the data content, as under.

So what is it that I have done in this first level of verification, as in the image above? It is simple: I have just written the query to fetch the row counts and utilized the EXCEPT keyword. Now, if the counts from the two queries match, then thanks to the EXCEPT in place we expect an "Empty ResultSet" as the return value of the complete query execution. Hence, in the lower section of the image you can see that I have removed the default test condition that got added and added a new condition, namely Empty ResultSet. We are hence ready with one validation, the row count.
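In essence, the query body behind Employee_RowCount looks like this (SourceDB and StagingDB are placeholder database names):

SELECT COUNT(*) AS RowCnt FROM SourceDB.dbo.Employee
EXCEPT
SELECT COUNT(*) AS RowCnt FROM StagingDB.dbo.Employee;
-- matching counts yield the empty resultset that the test condition asserts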

The second level of verification is the data match. We add a new method via the plus icon, rename it Employee_DataCheck, provide the two queries with the EXCEPT keyword in between, and the rest is as was done above to get the empty resultset as the return value for the query execution. This will look as under:
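In essence (again with placeholder database names), the Employee_DataCheck query body is:

SELECT * FROM SourceDB.dbo.Employee
EXCEPT
SELECT * FROM StagingDB.dbo.Employee;
-- any row present in the source but missing or altered in staging shows up here;
-- an empty resultset means the data matches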
As an experience tip: we at times have issues with data validation, especially for string datatype attributes. Check for collation conflicts, which create such issues, and specify the collation explicitly to sort them out.

A third level of verification that adds quality to the testing of the SSIS packages and ETL execution is verification of the schema of the database objects in the source against the destination environment. This provides added quality, as it helps us verify whether the data will be loaded with the same precision values or not.
The general query that fetches the schema details of any database table is as under (SourceDB and StagingDB are placeholder database names):

select COLUMN_NAME collate Latin1_General_CI_AI,
       DATA_TYPE collate Latin1_General_CI_AI,
       CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION, DATETIME_PRECISION
from SourceDB.INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME = 'Employee'
EXCEPT
select COLUMN_NAME collate Latin1_General_CI_AI,
       DATA_TYPE collate Latin1_General_CI_AI,
       CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION, DATETIME_PRECISION
from StagingDB.INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME = 'Employee'
and COLUMN_NAME not in ('columns not to be validated, especially auto-generated ID columns')

The code above validates the column names across the two environments. Do keep in mind that certain columns get auto-generated, especially in the staging environment (a staging ID and so on), so do exclude them from verification, as has been addressed above. The columns get verified on the attributes we provide, namely data type, size (character maximum length), numeric precision and datetime precision. We could easily have validated another aspect, IS_NULLABLE, but the general approach in Business Intelligence (BI) is that if we have some column to validate the ID of the complete record set, we tend to ignore the IS_NULLABLE check, and the same is reflected in the code above.

Thus we have successfully automated the ETL testing using three levels of verification, namely rowcount, datacheck and schemacheck. This provides the team with a higher level of confidence in the quality of the ETL package testing, as data verification has been done at a rigorous level rather than in a sampling manner.

So we have explored ETL testing and ways to achieve a high degree of quality for the same.

What is Business Intelligence Testing

What is a BI software application all about?
What is BI testing all about?
How does a BI application ease the business process in the current world scenario?
What type of industry enterprise looks forward to developing a BI application?
What are the challenges one faces while testing a BI application?
How much of an impact can a BI application make on an organization within its peer group?

So many questions strike at the back of one's mind when we start with a Business Intelligence domain software application from a testing perspective.

What is a BI software application all about ?
Let me just list out the terms of BI testing.
Source data, target data, data warehouse, data mart, reports, ETL (Extract Transform Load), ETL logic - incremental or full, Integration Services, Reporting Services, Analysis Services, database engine, cubes, measures, measure groups, dimensions, facts, reports testing - drill down, drill through - and build deployment. I mean, there are just so many of them.

What does the chain look like? Maybe something like this:
Source  -> Staging -> Datawarehouse -> Cube -> Reports

Source: Multiple data sources such as MS SQL Server, Oracle, MySQL, SAP, Teradata, DB2, Sybase. It can be any combination of the listed as well as unlisted ones, for example something as simple as flat text files. These multiple data sources can be the input to an application that might help the end user take future strategic decisions.

Staging: An intermediary state of the source data that acts as an input to the datawarehouse; data is loaded into it from the source by the ETL logic using Integration Services. It is more or less the source data in its original form, with those tables discarded that are not needed for report generation within the application.

Datawarehouse: The final entity developed within the database engine, created in the sequence mentioned above. It is here that the final objects get created, and based on these objects the cubes are created for analysis purposes. The object types into which the complete data is organized are called dimensions and facts. They are just like simple tables, but the attributes within each dimension and fact have specific associations among themselves that capture realistic facts and associated information as per the business requirement.

Cube: This is the Analysis Services object within the complete BI application development phase. As the name suggests, it is not just the two-dimensional tables that hold relevance in a typical RDBMS; it is indeed a multi-dimensional data modeling technique wherein we analyze a data set in more than just two dimensions. The facts associated with an application are analyzed as per the associations they have with various parameters, which in BI terms are called measures or measure groups. A measure group is actually a combination of more than one related measure. Thus we get greater insight into how a specific aspect of business decision making is impacted by variations in various parameters.

Reports: Reports are nothing but the cube data represented on a user-friendly interface, with the option to parameterise the reports as per the business needs. In simple terms, they are the end product that gets developed, and the data we look for in the cubes is available for viewing in them. We can drill down and drill through them based on the scenario we need. For example, we can by default see in a report how much in sales has materialized for a specific fiscal year, and then drill down to the quarter, the month, the week and finally the day. Similarly, drill-through also gets applied to the reports, and the data can be visualized as per need. Authorization and authentication is another feature that has its role to play in Reporting Services, but the authorization on the cubes overrides the privileges granted on the reports.

I hope by now we have gained some insight into what these applications resemble and how they differ from any other typical web application. It is more an application that delves into data, data modeling and management techniques.

What is BI testing all about?
 BI testing is pretty different from testing any other application, in more ways than we can list. Only domain expertise can make the tester's life easy, given the otherwise unfamiliar nature of such applications. There is no way one can wait till the reports get developed so that we have a UI to start testing with!!! And that is what makes it even more interesting and challenging.

So what do we do ?
When do we do ?
How do we do ?

Do we wait till the last phase of the development cycle, when the reports get deployed onto a web application, and then start testing? Or do we have scope to start our testing activities well within the initial development cycle, or maybe even before the full-flow development activities start?
There is scope for automation as well, loads of scope for performance testing, and of course manual testing has to be the core of any testing activity.
 Things really start once the Integration Services packages get into shape, wherein the data in raw format gets extracted, transformed and loaded into the new environment; that is what we also call ETL execution for short. Test case authoring can start once the ETL packages get developed, because that is where a tester's activity gets eased. This is also the area of high automation ROI, wherein the ETL-related test cases can be automated using the various database automation techniques.
 Once Integration Services are in place we target the Analysis Services section, wherein the cube testing starts; at its base lie the dimensions, measures and measure groups. These cubes have the raw data organized in such an effectively designed and inter-related manner that real-time data analysis is possible. These cubes form the basis of the Reporting Services layer, which is the base of the UI testing.
 UI testing, or Reporting Services testing, is more about the filters and the various combinations of filters that can be applied to a set of inter-related measures and measure groups for the various facts, as per the dimensions supplied. Within the same lies the validation of the drill-down and drill-through capabilities of the reports, by right-clicking and left-clicking on their graphs and the various axes. The terms that matter during data validation for the complete application are the dimension and fact tables; these are the actual tables that form the basis of the cube data.

 As far as data quality is concerned, the major target area is the Integration Services packages, that is, the ETL packages and the logic with which the data gets extracted, transformed and loaded into the target systems. It is here that we design the data loading logic, which must ensure that junk data, or data that will in no way be utilized for the reports being generated, does not get loaded into the staging system. This is simply because the data mart is the next stage within the complete BI application development flow, and for the data mart to deal only with relevant data, the staging environment must hold only the needed data, with its quality intact. Then stored procedures come into play to arrange the data from the staging tables into the data mart tables, and two different categories of data tables get created, namely the dimensions and the facts. The data arrangement logic has to be validated in these tables, with the major target area being the stored procedures. These also get created during the ETL execution phase, that is, the Integration Services phase. Once the mart is finalized, the cubes need to be tested; this testing involves not SQL queries any more but MDX queries. Here the measures and measure groups, and their data for the various inter-related entities, come up. It is done by browsing the cubes within the Analysis Services engine. Finally this data becomes the base for showing the relevant reports in the UI as per the business users' requirements.

There we end up with a BI application in place, ready to be delivered to the business users. Post-production defects are unavoidable in these types of applications. The major cause is that the application is developed offshore, at a location far away from the end users; in most cases this results in the provided data not resembling the real production data, due to various data security issues. However, that is the nature of these applications, so do learn to live with it as a challenge.

Common Sql Queries Interview Questions 4

Common SQL Queries Interview Questions: While we have quite a good number of DBMS products in the industry, they all follow some standardization that brings a sense of uniformity to the technology. This is the basic reason why, in spite of so many DBMS products such as Oracle, SQL Server, DB2, Sybase and the well-known open source MySQL, they all speak the same language for performing the various database-related operations: SQL, pronounced "sequel", which stands for Structured Query Language. In the post below, and in quite a number of other database testing related posts, you will get ample exposure to how SQL queries make our database testing possible.


10. What is the general syntax to create a view?
CREATE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE condition;

11. In a table, how can one find all of the employees whose first name does not start with 'M' or 'A'?
SELECT EmployeeID, FirstName, LastName, HireDate, City FROM Employees
WHERE (FirstName NOT LIKE 'M%') AND (FirstName NOT LIKE 'A%')

12. Find the wine(s) sold for the highest price.
Sells(bar, wine, price)

SELECT wine
FROM Sells
WHERE price >= ALL(
    SELECT price
    FROM Sells);

13. Find the wines that are the unique wine by their manufacturer.
wines(name, manf)

SELECT name
FROM wines w1
WHERE NOT EXISTS (
    SELECT *
    FROM wines
    WHERE manf = w1.manf
      AND name <> w1.name);

14. Find the name and manufacturer of wines that James likes.
wines(name, manf), Likes(drinker, wine)

SELECT name, manf
FROM wines
WHERE name IN
    (SELECT wine
     FROM Likes
     WHERE drinker = 'James');

15. Find the bars that serve Mike at the same price James's Bar charges for Bille.
Sells(bar, wine, price)

SELECT bar
FROM Sells
WHERE wine = 'Mike' AND
      price = (SELECT price
               FROM Sells
               WHERE bar = 'James''s Bar' AND
                     wine = 'Bille');

16. Find pairs of wines by the same manufacturer.
Wines(name, manf)

SELECT w1.name, w2.name
FROM Wines w1, Wines w2
WHERE w1.manf = w2.manf AND
      w1.name < w2.name;  -- '<' avoids self-pairs and mirror-image duplicates

17. Find the wines that the frequenters of James's Bar like.
Likes(drinker, wine)
Frequents(drinker, bar)

SELECT wine
FROM Frequents, Likes
WHERE bar = 'James''s Bar' AND
      Frequents.drinker = Likes.drinker;

18. Find drinkers whose phone has exchange 555.
Drinkers(name, addr, phone)

SELECT name
FROM Drinkers
WHERE phone LIKE '%555-____';  -- four '_' wildcards match the last four digits

19. Find the price James's Bar charges for Bille.
Sells(bar, wine, price)
SELECT price
FROM Sells
WHERE bar = 'James''s Bar' AND wine = 'Bille';

20. How do you find the 10th highest salary in a SQL query?
Table - Tbl_Test_Salary
Column - int_salary

select min(int_salary)
from Tbl_Test_Salary
where int_salary in (select top 10 int_salary
                     from Tbl_Test_Salary
                     order by int_salary desc)

The TOP 10 must be taken in descending order; the minimum of those ten values is the 10th highest salary (assuming distinct salary values).

Browse for more Database Testing Related Posts 

Common Sql Queries Interview Questions 4
For more articles on Manual Testing Log On to Manual Testing Articles

Common Sql Queries Interview Questions - 3

Common SQL Queries Interview Questions:

1. Determine the name, sex and age of the oldest student.
SELECT Name, Gender, (CURRENT_DATE-Dtnaiss)/365 AS Age
FROM Student
WHERE (CURRENT_DATE-Dtnaiss) /365 =
( SELECT MAX(( CURRENT_DATE-Dtnaiss) /365) FROM Student);

2. Display the marks of student number 1 which are equal to the marks of student number 2.
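No answer was given here; a possible sketch, assuming a hypothetical table Marks(StudentNo, Mark):
SELECT Mark
FROM Marks
WHERE StudentNo = 1
AND Mark IN (SELECT Mark FROM Marks WHERE StudentNo = 2);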

3. Find the names of everybody who works in the same department as a person called James.
SELECT name FROM emp WHERE dept_no =
(SELECT dept_no FROM emp WHERE name = 'James')

or as a join statement, like this:
SELECT e1.name FROM emp e1, emp e2
WHERE e1.dept_no = e2.dept_no AND e2.name = 'James'

4. The SQL statement to find the departments that have employees with a salary higher than the average employee salary:
SELECT name FROM dept
WHERE dept_id IN
     (SELECT dept_id FROM emp
      WHERE sal >
          (SELECT avg(sal) FROM emp))

5. Write the SQL to use a sub-query which will not return any rows - when just the table structure is required and not any of the data.

SELECT * FROM table_orig WHERE 1=0;
The sub-query returns no data but does return the column names and data types to the surrounding 'create table' (or SELECT INTO) statement.

6. How do you find the second highest salary?
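No answer was given here; a classic sketch, reusing the emp(sal) schema from the other questions in this post:
SELECT MAX(sal)
FROM emp
WHERE sal < (SELECT MAX(sal) FROM emp);
-- the highest salary below the overall maximum is the second highest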


7. Finding duplicates in a table
SELECT name, COUNT(name) AS NumOccurrences FROM users
GROUP BY name HAVING (COUNT(name) > 1)

You could also use this technique to find rows that occur exactly once:
SELECT name FROM users GROUP BY name HAVING (COUNT(name) = 1)

8. While deleting a row from the database, I need to check based upon the name of the employee and the employee number. How do I give two conditions in an SQL query?

 "DELETE FROM Employees WHERE Emp_Name='" & Txt_Name.Text.Trim & "'
AND Emp_Number='" & Txt_Empno.Text.Trim & "'"
 9. Common SQL Syntax used in database interaction

9a.   Select Statement
SELECT "column_name" FROM "table_name"

9b.    Distinct
SELECT DISTINCT "column_name" FROM "table_name"

9c.    Where
SELECT "column_name" FROM "table_name" WHERE "condition"

9d.   And/Or
SELECT "column_name" FROM "table_name" WHERE "simple condition" {[AND|OR] "simple condition"}+

9e.   In
SELECT "column_name" FROM "table_name" WHERE "column_name" IN ('value1', 'value2', ...)

9f.    Between
SELECT "column_name" FROM "table_name" WHERE "column_name" BETWEEN 'value1' AND 'value2'

9g.   Like
SELECT "column_name" FROM "table_name" WHERE "column_name" LIKE {PATTERN}

9h.   Order By
SELECT "column_name" FROM "table_name" [WHERE "condition"] ORDER BY "column_name" [ASC, DESC]

9i.    Count
SELECT COUNT("column_name") FROM "table_name"

9j.    Group By
SELECT "column_name1", SUM("column_name2") FROM "table_name" GROUP BY "column_name1"

9k.   Having
SELECT "column_name1", SUM("column_name2") FROM "table_name" GROUP BY "column_name1" HAVING (arithematic function condition)

9l.   Create Table Statement
CREATE TABLE "table_name" ("column 1" "data_type_for_column_1","column 2" "data_type_for_column_2",…)

9m.  Drop Table Statement
DROP TABLE "table_name"

9n.   Truncate Table Statement
TRUNCATE TABLE "table_name"

9o.   Insert Into Statement
INSERT INTO "table_name" ("column1", "column2", ...) VALUES ("value1", "value2", ...)

9p.   Update Statement
UPDATE "table_name" SET "column_1" = [new value] WHERE {condition}

9q.   Delete From Statement
DELETE FROM "table_name" WHERE {condition}

Browse for more Database Testing Related Posts 

Common Sql Queries Interview Questions 4

For more articles on Manual Testing Log On to Manual Testing Articles

Common SQL Queries Interview Questions 1:

Common SQL Queries Interview Questions :

What is the system function to get the current user's details, such as the user id etc.?
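No answer was given here; in SQL Server, for instance, the built-in functions SYSTEM_USER, CURRENT_USER and USER_ID() return the current login and database user details:
SELECT SYSTEM_USER AS LoginName,
       CURRENT_USER AS DatabaseUser,
       USER_ID()    AS DatabaseUserId;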

What is Stored Procedure?
A stored procedure is a named group of SQL statements that have been previously created and stored in the server database. Stored procedures accept input parameters so that a single procedure can be used over the network by several clients using different input data. And when the procedure is modified, all clients automatically get the new version. Stored procedures reduce network traffic and improve performance. Stored procedures can be used to help ensure the integrity of the database.

What is an extended stored procedure? Can you instantiate a COM object by using T-SQL?
An extended stored procedure is a function within a DLL (written in a programming language like C, C++ using Open Data Services (ODS) API) that can be called from T-SQL, just the way we call normal stored procedures using the EXEC statement. See books online to learn how to create extended stored procedures and how to add them to SQL Server. You can instantiate a COM (written in languages like VB, VC++) object from T-SQL by using sp_OACreate stored procedure.

What is Trigger?
A trigger is a SQL procedure that initiates an action when an event (INSERT, DELETE or UPDATE) occurs. Triggers are stored in and managed by the DBMS. Triggers are used to maintain the referential integrity of data by changing the data in a systematic fashion. A trigger cannot be called or executed; the DBMS automatically fires the trigger as a result of a data modification to the associated table. Triggers can be viewed as similar to stored procedures in that both consist of procedural logic that is stored at the database level. Stored procedures, however, are not event-driven and are not attached to a specific table as triggers are. Stored procedures are explicitly executed by invoking a CALL to the procedure while triggers are implicitly executed. In addition, triggers can also execute stored procedures.

What is Nested Trigger?
A trigger can also contain INSERT, UPDATE and DELETE logic within itself, so when the trigger is fired because of data modification it can also cause another data modification, thereby firing another trigger. A trigger that contains data modification logic within itself is called a nested trigger.
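A minimal T-SQL sketch (emp and emp_audit are illustrative tables):

CREATE TRIGGER trg_emp_insert
ON emp
AFTER INSERT
AS
BEGIN
    -- log every inserted row; this INSERT could itself fire a trigger
    -- defined on emp_audit, which is exactly how triggers become nested
    INSERT INTO emp_audit (emp_id, action_date)
    SELECT emp_id, GETDATE() FROM inserted;
END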

What is a Linked Server?
Linked Servers is a concept in SQL Server by which we can add other SQL Servers to a group and query both SQL Server databases using T-SQL statements. With a linked server, you can create very clean, easy-to-follow SQL statements that allow remote data to be retrieved, joined and combined with local data. The stored procedures sp_addlinkedserver and sp_addlinkedsrvlogin are used to add a new linked server.

What is Cursor?
A cursor is a database object used by applications to manipulate data in a set on a row-by-row basis, instead of the typical SQL commands that operate on all the rows in the set at one time.
In order to work with a cursor we need to perform some steps in the following order:
  1. Declare cursor
  2. Open cursor
  3. Fetch row from the cursor
  4. Process fetched row
  5. Close cursor
  6. Deallocate cursor
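A minimal T-SQL sketch of those steps (emp is an illustrative table):

DECLARE @name varchar(50);
DECLARE emp_cursor CURSOR FOR          -- 1. declare
    SELECT name FROM emp;
OPEN emp_cursor;                       -- 2. open
FETCH NEXT FROM emp_cursor INTO @name; -- 3. fetch
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name;                       -- 4. process
    FETCH NEXT FROM emp_cursor INTO @name;
END
CLOSE emp_cursor;                      -- 5. close
DEALLOCATE emp_cursor;                 -- 6. deallocate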
What is a sub-query? Explain the properties of a sub-query.
Sub-queries are often referred to as sub-selects, as they allow a SELECT statement to be executed arbitrarily within the body of another SQL statement. A sub-query is executed by enclosing it in a set of parentheses. Sub-queries are generally used to return a single row as an atomic value, though they may be used to compare values against multiple rows with the IN keyword.
A sub-query is a SELECT statement that is nested within another T-SQL statement. A sub-query SELECT statement, if executed independently of the T-SQL statement in which it is nested, will return a result set, meaning a sub-query SELECT statement can stand alone and is not dependent on the statement in which it is nested. A sub-query SELECT statement can return any number of values and can be found in the column list of a SELECT statement, or in a FROM, GROUP BY, HAVING, and/or ORDER BY clause of a T-SQL statement. A sub-query can also be used as a parameter to a function call. Basically a sub-query can be used anywhere an expression can be used.
What are different Types of Join?
  1. Cross Join A cross join that does not have a WHERE clause produces the Cartesian product of the tables involved in the join. The size of a Cartesian product result set is the number of rows in the first table multiplied by the number of rows in the second table. The common example is when a company wants to combine each product with a pricing table to analyze each product at each price.
  2. Inner Join A join that displays only the rows that have a match in both joined tables is known as inner Join. This is the default type of join in the Query and View Designer.
  3. Outer Join A join that includes rows even if they do not have related rows in the joined table is an Outer Join. You can create three different outer join to specify the unmatched rows to be included:
    1. Left Outer Join: In Left Outer Join all rows in the first-named table i.e. "left" table, which appears leftmost in the JOIN clause are included. Unmatched rows in the right table do not appear.
    2. Right Outer Join: In Right Outer Join all rows in the second-named table i.e. "right" table, which appears rightmost in the JOIN clause are included. Unmatched rows in the left table are not included.
    3. Full Outer Join: In Full Outer Join all rows in all joined tables are included, whether they are matched or not.
  4. Self Join This is a particular case when one table joins to itself, with one or two aliases to avoid confusion. A self join can be of any type, as long as the joined tables are the same. A self join is rather unique in that it involves a relationship with only one table. The common example is when a company has a hierarchical reporting structure whereby one member of staff reports to another (see the sketch below). A self join can be an outer join or an inner join.
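A sketch of that reporting-structure self join, assuming an illustrative emp(emp_id, name, manager_id) table:

SELECT e.name AS employee, m.name AS manager
FROM emp e
INNER JOIN emp m ON e.manager_id = m.emp_id;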

What are User-Defined Functions? What kinds of User-Defined Functions can be created?
User-Defined Functions allow you to define your own T-SQL functions that can accept zero or more parameters and return a single scalar data value or a table data type.
Different Kinds of User-Defined Functions created are:
  1. Scalar User-Defined Function A Scalar user-defined function returns one of the scalar data types. Text, ntext, image and timestamp data types are not supported. These are the type of user-defined functions that most developers are used to in other programming languages. You pass in 0 to many parameters and you get a return value.
  2. Inline Table-Value User-Defined Function An Inline Table-Value user-defined function returns a table data type and is an exceptional alternative to a view as the user-defined function can pass parameters into a T-SQL select command and in essence provide us with a parameterized, non-updateable view of the underlying tables.
  3. Multi-statement Table-Value User-Defined Function A Multi-Statement Table-Value user-defined function returns a table and is also an exceptional alternative to a view, as the function can support multiple T-SQL statements to build the final result, where the view is limited to a single SELECT statement. Also, the ability to pass parameters into a T-SQL select command, or a group of them, gives us the capability to, in essence, create a parameterized, non-updateable view of the data in the underlying tables. Within the create function command you must define the table structure that is being returned. After creating this type of user-defined function, it can be used in the FROM clause of a T-SQL command, unlike the behavior found when using a stored procedure, which can also return record sets.
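A minimal sketch of a scalar user-defined function (the names are illustrative):

CREATE FUNCTION dbo.FullName (@first varchar(50), @last varchar(50))
RETURNS varchar(101)
AS
BEGIN
    RETURN @first + ' ' + @last;   -- returns a single scalar value
END
-- usage: SELECT dbo.FullName(FirstName, LastName) FROM Employees;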
What is Identity?
Identity (or AutoNumber) is a column that automatically generates numeric values. A start and increment value can be set, but most DBAs leave these at 1. A GUID column also generates numbers, but its value cannot be controlled. Identity/GUID columns do not need to be indexed.
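For example, in SQL Server (emp is an illustrative table):

CREATE TABLE emp (
    emp_id int IDENTITY(1,1) PRIMARY KEY,  -- seed 1, increment 1
    name   varchar(50)
);
-- emp_id is generated automatically on every INSERT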

What is Data-Warehousing?
A data warehouse is typically described by four characteristics:
  1. Subject-oriented, meaning that the data in the database is organized so that all the data elements relating to the same real-world event or object are linked together;
  2. Time-variant, meaning that the changes to the data in the database are tracked and recorded so that reports can be produced showing changes over time;
  3. Non-volatile, meaning that data in the database is never over-written or deleted, once committed, the data is static, read-only, but retained for future reporting.
  4. Integrated, meaning that the database contains data from most or all of an organization's operational applications, and that this data is made consistent.
How do you implement one-to-one, one-to-many and many-to-many relationships while designing tables?
One-to-One relationship can be implemented as a single table and rarely as two tables with primary and foreign key relationships. One-to-Many relationships are implemented by splitting the data into two tables with primary key and foreign key relationships. Many-to-Many relationships are implemented using a junction table with the keys from both the tables forming the composite primary key of the junction table.
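A sketch of the many-to-many case, where the junction table's composite primary key is formed from the two foreign keys (table names are illustrative):

CREATE TABLE Student (student_id int PRIMARY KEY, name varchar(50));
CREATE TABLE Course  (course_id int PRIMARY KEY, title varchar(50));
CREATE TABLE Enrollment (
    student_id int REFERENCES Student(student_id),
    course_id  int REFERENCES Course(course_id),
    PRIMARY KEY (student_id, course_id)    -- composite key of the junction table
);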
What are the different isolation levels?
An isolation level determines the degree of isolation of data between concurrent transactions. The default SQL Server isolation level is Read Committed. Here are the other isolation levels (in the ascending order of isolation): Read Uncommitted, Read Committed, Repeatable Read, Serializable.
What would happen if you create an index on each column of a table?
If you create an index on each column of a table, it improves the query performance, as the query optimizer of the database engine can choose from all the existing indexes to come up with an efficient execution plan. At the same time, data modification operations (such as INSERT, UPDATE, DELETE) will become slow, as every time data changes in the table, all the indexes need to be updated. Another disadvantage is that indexes need disk space; the more indexes you have, the more disk space is used.

What is Lock escalation?
Lock escalation is the process of converting many fine-grain locks into fewer coarse-grain locks, reducing system overhead. When a transaction exceeds its escalation threshold, row locks and page locks are automatically escalated into table locks.
When a transaction requests rows from a table, SQL Server automatically acquires locks on the rows affected and places higher-level intent locks on the pages and the table, or index, which contain those rows. When the number of locks held by the transaction exceeds its threshold, a stronger lock is acquired, and all page and row level locks held by the transaction on the table are released, reducing lock overhead.
Lock escalation thresholds are determined dynamically by SQL Server and do not require configuration.
What is a live lock?
A live lock is one where a request for an exclusive lock is repeatedly denied because a series of overlapping shared locks keeps interfering. SQL Server detects the situation after four denials and refuses further shared locks. A live lock also occurs when read transactions monopolize a table or page, forcing a write transaction to wait indefinitely.

What is blocking and how would you troubleshoot it?
Blocking happens when one connection from an application holds a lock and a second connection requires a conflicting lock type. This forces the second connection to wait, blocked on the first.

Common SQL Queries Interview Questions:

What is the difference between Oracle, SQL and SQL Server?
  • Oracle is an RDBMS product.
  • SQL is the Structured Query Language.
  • SQL Server is another RDBMS product, provided by Microsoft.
Why do you need indexing? Where is it stored, and what do you mean by a schema object? For what purpose do we use views?
·         We can't create an index on an index. An index is stored in the user_indexes table. Every object that has been created in a schema is a schema object, like a table, view etc. If we want to share particular data with various users, we have to use a virtual table over the base table; that is a view.
·         Indexing is used for faster searches, to retrieve data faster from various tables. A schema contains a set of tables; basically a schema means a logical separation of the database. A view is created for faster retrieval of data; it is a customized virtual table, and we can create a single view of multiple tables. The only drawback is that a view needs to be refreshed to retrieve updated data.

Difference between Stored Procedure and Trigger?
  • We can call a stored procedure explicitly,
  • but a trigger is automatically invoked when the action defined in the trigger occurs,
    e.g. create trigger after insert on a table:
  • this trigger is invoked after we insert something into that table.
  • A stored procedure can't be made inactive, but a trigger can be.
  • Triggers are used to initiate a particular activity after fulfilling certain conditions; they need to be defined once and can be enabled and disabled according to need.
What is the advantage of using a trigger in your PL/SQL?
A trigger is a database object directly associated with a particular table. It fires whenever a specific statement or type of statement is issued against that table. The types of statements are insert, update, delete and query statements. Basically, a trigger is a set of SQL statements. A trigger is a solution to the restrictions of a constraint. For instance: 1. A database constraint cannot use pseudo columns as criteria, where a trigger can. 2. A database constraint cannot refer to old and new values for a row, where a trigger can.
Triggers are fired implicitly on the tables/views on which they are created. There are various advantages of using a trigger. Some of them are:
  • Suppose we need to validate a DML statement (insert/update/delete) that modifies a table; we can write a trigger on the table that gets fired implicitly whenever a DML statement is executed on that table.
  • Another reason for using triggers can be the automatic update of one or more tables whenever a DML/DDL statement is executed on the table on which the trigger is created.
  • Triggers can be used to enforce constraints. For example: insert/update/delete statements should not be allowed on a particular table after office hours. For enforcing this constraint, triggers should be used.
  • Triggers can be used to publish information about database events to subscribers. A database event can be a system event, like database startup or shutdown, or a user event, like a user logging in or logging off.
What is the difference between UNION and UNION ALL?
·         UNION will remove the duplicate rows from the result set while UNION ALL doesn't.
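For example (customers and suppliers are illustrative tables):

SELECT city FROM customers
UNION            -- duplicates removed
SELECT city FROM suppliers;

SELECT city FROM customers
UNION ALL        -- duplicates retained
SELECT city FROM suppliers;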

What is the difference between the TRUNCATE and DELETE commands?
·         Both will result in deleting all the rows in the table. A TRUNCATE call cannot be rolled back as it is a DDL command, and all memory space for that table is released back to the server; TRUNCATE is much faster. A DELETE call, on the other hand, is a DML command and can be rolled back.

Which system table contains information on the constraints on all the tables created?
In SQL Server, the sysconstraints system table (exposed more portably through the INFORMATION_SCHEMA.TABLE_CONSTRAINTS view) contains information on the constraints on all the tables created.

Explain Normalization?
Normalization means removing redundancy and maintaining stability. There are four common normal forms: first normal form, second normal form, third normal form and fourth normal form.

How to find out the database name from the SQL*Plus command prompt?
Select * from global_name;
This will give the database name which you are currently connected to.

What is the difference between SQL and SQL Server? SQL Server is an RDBMS, just like Oracle or DB2, from Microsoft.
Structured Query Language (SQL), pronounced "sequel", is a language that provides an interface to relational database systems. It was developed by IBM in the 1970s for use in System R. SQL is a de facto standard, as well as an ISO and ANSI standard. SQL is used to perform various operations on RDBMSs.

What is the difference between a correlated sub-query and a nested sub-query?
Correlated subquery runs once for each row selected by the outer query. It contains a reference to a value from the row selected by the outer query.
Nested subquery runs only once for the entire nesting (outer) query. It does not contain any reference to the outer query row.
For example,
Correlated Subquery:
select e1.empname, e1.basicsal, e1.deptno from emp e1 where e1.basicsal = (select max(basicsal) from emp e2 where e2.deptno = e1.deptno)
Nested Subquery:
select empname, basicsal, deptno from emp where (deptno, basicsal) in (select deptno, max(basicsal) from emp group by deptno)

The pattern matching operator is LIKE, and it is used with two wildcard characters:
1. % matches zero or more characters, and
2. _ (underscore) matches exactly one character.
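For example, against the emp table used elsewhere in this post:

SELECT name FROM emp WHERE name LIKE 'J%';   -- names starting with J
SELECT name FROM emp WHERE name LIKE '_a%';  -- names whose second letter is a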

What are clustered and non-clustered indexes?
Clustered Index: A clustered index is a special type of index that reorders the way records in the table are physically stored; therefore a table may have only one clustered index. Non-Clustered Index: A non-clustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf nodes of a non-clustered index do not consist of the data pages; instead the leaf nodes contain index rows.

What is the difference between a "where" clause and a "having" clause?
"Where" is a kind of restriction statement. You use where clause to restrict all the data from DB. Where clause is using before result retrieving. But Having clause is using after retrieving the data. Having clause is a kind of filtering command.

What structures can you implement for the database to speed up table reads?
Following the rules of DB tuning we have to:
1] properly use indexes (different types of indexes)
2] properly locate different DB objects across different tablespaces, files and so on
3] create a special space (tablespace) to locate some of the data with special datatypes (for example CLOB, LOB and …)

What are the tradeoffs with having indexes?
1. Faster selects, slower updates.
2. Extra storage space to store indexes.
Updates are slower because, in addition to updating the table, you have to update the index.

What is "normalization"? "Denormalization"? Why do you sometimes want to denormalize?
 Normalizing data means eliminating redundant information from a table and organizing the data so that future changes to the table are easier.
 Denormalization means allowing redundancy in a table. The main benefit of denormalization is improved performance with simplified data retrieval and manipulation. This is done by reduction in the number of joins needed for data processing.

What is a "constraint"?
A constraint allows you to apply simple referential integrity checks to a table. There are four primary types of constraints
PRIMARY/UNIQUE - enforces uniqueness of a particular table column. But by default primary key creates a clustered index on the column, where are unique creates a non-clustered index by default. Another major difference is that, primary key doesn't allow NULLs, but unique key allows one NULL only DEFAULT - specifies a default value for a column in case an insert operation does not provide one. FOREIGN KEY - validates that every value in a column exists in a column of another table. CHECK - checks that every value stored in a column is in some specified list. Each type of constraint performs a specific type of action. Default is not a constraint. NOT NULL is one more constraint which does not allow values in the specific column to be null. And also it the only constraint which is not a table level constraint.

What types of index data structures can you have?
An index helps to search values in tables faster. The three most commonly used index types are:
B-Tree: builds a tree of possible values with a list of row IDs that have the leaf value. Needs a lot of space and is the default index type for most databases.
Bitmap: a string of bits for each possible value of the column, with one bit for each row in each bit string. Needs only a little space and is very fast. (However, the domain of values cannot be large, e.g. SEX(m,f), degree(BS,MS,PHD).)
Hash: a hashing algorithm is used to assign a set of characters to represent a text string, such as a composite of keys or partial keys, and compresses the underlying data. Takes longer to build and is supported by relatively few databases.

Why can a "group by" or "order by" clause be expensive to process?
Processing of "group by" or "order by" clause often requires creation of Temporary tables to process the results of the query. Which depending of the result set can be very expensive.

What is a SQL view?
 The output of a query can be stored as a view. A view acts like a small table which meets our criterion. A view is a pre-compiled SQL query which is used to select data from one or more tables. A view is like a table, but it doesn't physically take any space. A view is a good way to present data in a particular format if you use that query quite often. Views can also be used to restrict users from accessing the tables directly.

What is GROUP BY?
The GROUP BY keyword has been added to SQL because aggregate functions (like SUM) return the aggregate of all column values every time they are called. Without the GROUP BY functionality, finding the sum for each individual group of column values was not possible.

What are defaults? Is there a column to which a default can't be bound?
A default is a value that will be used by a column, if no value is supplied to that column while inserting data. IDENTITY columns and timestamp columns can't have defaults bound to them.


What type of Index will get created after executing the below statement?
CREATE INDEX myIndex ON myTable(myColumn)
Non-clustered index. By default a clustered index gets created on the primary key, unless specified otherwise.
What's the maximum size of a row?
8060 bytes.