
Database object naming conventions

Last updated: May 17th '01

There are many different naming conventions for database objects, and none of them is wrong; it's largely a matter of the personal preference of the person who designed the convention. However, in an organization, one person (or a group) defines the database naming conventions, standardizes them, and others follow them whether they like it or not.

My database object naming conventions:

I came up with a naming convention that is a mixture of my own ideas and the views of SQL experts like Joe Celko! This article references Microsoft SQL Server databases in some examples, but the conventions can be applied generically to other RDBMSs like Oracle and Sybase too. So, here's my preferred naming convention for:

Tables
Views
Stored procedures
User defined functions
Triggers
Indexes
Columns
User defined data types
Primary keys
Foreign keys
Default and Check constraints
Variables

Tables:

Tables represent the instances of an entity. For example, you store all your customer information in a table. Here, 'customer' is an entity and all the rows in the customers table represent the instances of the entity 'customer'. So, why not name your table after the entity it represents: 'Customer'? Since the table stores 'multiple instances' of customers, make your table name a plural word.

So, name your customer table as 'Customers'.


Name your order storage table as 'Orders'.
Name your error messages table as 'ErrorMessages'.

This is a more natural way of naming tables, compared to approaches that name tables tblCustomers or tbl_Orders. Further, when you look at your queries, it's obvious that a particular name refers to a table, as table names always follow the FROM clause of the SELECT statement.

If your database deals with different logical functions and you want to group your tables according to the
logical group they belong to, it won't hurt prefixing your table name with a two or three character prefix that
can identify the group.

For example, if your database has tables that store information about the Sales and Human Resources departments, you could name all the tables related to the Sales department as shown below:
SL_NewLeads
SL_Territories
SL_TerritoriesManagers

You could name all the tables related to the Human Resources department as shown below:

HR_Candidates
HR_PremierInstitutes
HR_InterviewSchedules

This kind of naming convention makes sure all the related tables are grouped together when you list all your tables in alphabetical order. However, if your database deals with only one logical group of tables, you need not use this naming convention.

Note that sometimes you end up vertically partitioning a table into two or more tables, though these partitions effectively represent the same entity. In this case, append a word that best identifies the partition to the entity name.

Views:

A view is nothing but a table, for any application that is accessing it. So, the same naming convention defined above for tables applies to views as well, but not always. Here are some exceptions:

1) Views do not always represent a single entity. A view can be a combination of two tables based on a join condition, thus effectively representing two entities. In this case, consider combining the names of both the base tables. Here's an example:

If there is a view combining the two tables 'Customers' and 'Addresses', name the view 'CustomersAddresses'. The same naming convention can be used for junction tables that link two base tables in a many-to-many relationship. The most popular example is the 'TitleAuthor' table from the 'Pubs' database of SQL Server.

2) Views can summarize data from existing base tables in the form of reports. You can see this type of view in the 'Northwind' database that ships with SQL Server 7.0 and above. Here's the convention that database follows (and I prefer it):

'Product Sales for 1997'


'Summary of Sales by Quarter'
'Summary of Sales by Year'

However, try to stay away from spaces within object names.

Stored procedures:

Stored procedures always do some work for you; they are action oriented. So, let their names describe the work they do, using a verb.

This is how I would name a stored procedure that fetches the customer details given the customer identification number: 'GetCustomerDetails'. Similarly, you could name a procedure that inserts new customer information 'InsertCustomerInfo'. Here are some more names based on the same convention: 'WriteAuditRecord', 'ArchiveTransactions', 'AuthorizeUser' etc.
As explained above in the case of tables, you could use a prefix to group stored procedures too, depending upon the logical group they belong to. For example, all stored procedures that deal with 'Order processing' could be prefixed with ORD_ as shown below:

ORD_InsertOrder
ORD_InsertOrderDetails
ORD_ValidateOrder

If you are using Microsoft SQL Server, never prefix your stored procedures with 'sp_', unless you are storing the procedure in the master database. If you call a stored procedure prefixed with sp_, SQL Server always looks for it in the master database first. Only after checking the master database (if the procedure is not found there) does it search the current database.

I do not agree with the approach of prefixing stored procedure names with 'sproc_' just to make it obvious that the object is a stored procedure. Any database developer/DBA can identify stored procedures, as the procedure names are always preceded by the EXEC or EXECUTE keyword.

User defined functions:

In Microsoft SQL Server 2000, User Defined Functions (UDFs) are very similar to stored procedures, except that UDFs can be used in SELECT statements. So, the naming conventions discussed above for stored procedures apply to UDFs as well.
You could even use a prefix to logically group your UDFs. For example, you could name all your string
manipulation UDFs as shown below:

str_MakeProperCase
str_ParseString

Triggers:

Though triggers are a special kind of stored procedure, it wouldn't make sense to follow the same naming convention as we do for stored procedures.

While naming triggers we have to extend the stored procedure naming convention in two ways:

Triggers always depend on a base table and can't exist on their own. So, it's better to link the base table's name with the trigger name
Triggers are associated with one or more of the following operations: Insert, Update, Delete. So, the name of the trigger should reflect its nature

So, here's how I would name the insert, update and delete triggers on the titles table:

titles_instrg
titles_updtrg
titles_deltrg

Microsoft SQL Server 7.0 started allowing more than one trigger per action per table. So, you could have 2 insert triggers, 3 update triggers and 4 delete triggers, if you want to! In SQL Server 7.0 you can't control the order in which these triggers fire, but you have some control over the firing order in SQL Server 2000. Coming back to the point, if you have 2 insert triggers on the titles table, use the following naming convention to distinguish the triggers:

titles_ValidateData_instrg
titles_MakeAuditEntries_instrg
The same naming convention can be used with update and delete triggers.

If you have a single trigger for more than one action (the same trigger for insert and update, or update and delete, or any such combination), use the words 'ins', 'upd' and 'del' together in the name of the trigger. Here's an example: if you have a single trigger for both insert and update on the titles table, name the trigger
titles_InsUpdtrg
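
For example, here's how the combined insert/update trigger above might be declared. This is a minimal sketch against the Pubs sample database; the validation logic is just a placeholder:

CREATE TRIGGER titles_InsUpdtrg
ON titles
FOR INSERT, UPDATE
AS
BEGIN
-- Placeholder validation: reject negative prices
IF EXISTS (SELECT * FROM inserted WHERE price < 0)
BEGIN
RAISERROR('Invalid price.', 16, 1)
ROLLBACK TRANSACTION
END
END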

Indexes:

Just like triggers, indexes can't exist on their own; they are dependent on the underlying base tables. So, again, it makes sense to include the name of the table and of the column(s) on which the index is built in the index name. Further, indexes can be of two types, clustered and nonclustered, and either type can be unique or non-unique. So, the naming convention should take care of the index types too.

My index naming convention is:


Table name + Column name(s) + Unique/Non-uniqueness + Clustered/Non-clustered

For example, I would name the unique, clustered index on the TitleID column of Titles table as shown below:

Titles_TitleID_U_Cidx

I would name the unique, nonclustered index on the PubID column of Publishers table as shown below:

Publishers_PubID_U_Nidx

Here's how I would name a non-unique, non-clustered index on the OrderID column of the OrderDetails table:

OrderDetails_OrderID_NU_Nidx

Indexes can be composite too, meaning an index can be built on more than one column. In this case, just concatenate the column names together, just the way we did with junction tables and views above. So, here's how I would name a composite, unique, clustered index on the OrderID and OrderDetailID columns of the OrderDetails table:

OrderDetails_OrderIDOrderDetailID_U_Cidx

Sure, these index names look long and ugly, but who is complaining? You'll never need to reference these
index names in code, unless you are creating/dropping/rebuilding the indexes. So, it's not a pain, but it's a very
useful naming convention.
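
For example, here's how a couple of the indexes named above might be created (a sketch; the table and column names are the hypothetical ones used in this section):

CREATE UNIQUE CLUSTERED INDEX Titles_TitleID_U_Cidx
ON Titles (TitleID)

CREATE UNIQUE CLUSTERED INDEX OrderDetails_OrderIDOrderDetailID_U_Cidx
ON OrderDetails (OrderID, OrderDetailID)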

Columns:

Columns are attributes of an entity; that is, columns describe the properties of an entity. So, let the column names be meaningful and natural.

Here's the simplest way of naming the columns of the Customers table:

CustomerID
CustomerFirstName
CustomerAddress

As shown above, it's a good idea to prefix the column names with the entity they represent.

Here's another idea. Decide on a standard two to four character code for each table in your database and make sure it's unique in the database. For example, 'Cust' for the Customers table, 'Ord' for the Orders table, 'OrdD' for the OrderDetails table, 'Adt' for audit tables etc. Use this table code to prefix all the column names in that table. The advantage of this convention is that in multi-table queries involving complex joins, you don't have to worry about ambiguous column names, and don't have to use table aliases to prefix the columns. It also makes your queries more readable.

If you have to name the columns in a junction/mapping table, concatenate the table codes of mapped tables, or
come up with a new code for that combination of tables.

So, here's how the CustomerID column would appear in Customers table:

Cust_CustomerID

The same CustomerID column appears in the Orders table too, but in the Orders table, here's how it's named:

Ord_CustomerID

Some naming conventions even go to the extent of prefixing the column name with its data type. But I don't like this approach, as I feel the DBA or the developer dealing with these columns should be familiar with the data types of the columns.

User defined data types:

User defined data types are just a wrapper around the base types provided by the database management system. They are used to maintain consistency of data types across different tables for the same attribute. For example, if the CustomerID column appears in half a dozen tables, you must use the same data type for all the occurrences of the CustomerID column. This is where user defined data types come in handy. Just create a user defined data type for CustomerID and use it as the data type for all the occurrences of the CustomerID column.

So, the simplest way of naming these user defined data types would be: column name + '_type'. So, I would name the CustomerID type as:

CustomerID_type
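
Here's a sketch of how this type could be created and used (sp_addtype is the SQL Server 7.0/2000 way of creating user defined data types; the Customers table is just for illustration):

-- Create the user defined data type once per database
EXEC sp_addtype 'CustomerID_type', 'int', 'NOT NULL'
GO

-- Use it for every occurrence of the CustomerID attribute
CREATE TABLE Customers
(
CustomerID CustomerID_type PRIMARY KEY,
CustomerFirstName varchar(20)
)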

Primary keys:

A primary key is the column (or combination of columns) that can uniquely identify each row in a table. So, just use 'pk_' + table name + '_' + column name for naming primary keys.

Here's how I would name the primary key on the CustomerID column of Customers table:

pk_Customers_CustomerID

Consider concatenating the column names in case of composite primary keys.
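
Here's a sketch showing how such a primary key could be created on the hypothetical Customers table:

ALTER TABLE Customers
ADD CONSTRAINT pk_Customers_CustomerID
PRIMARY KEY (CustomerID)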

Foreign keys:

Foreign keys are used to represent the relationships between related tables. So, a foreign key can be considered a link between the 'column of a referencing table' and the 'primary key column of the referenced table'.

I prefer the following naming convention for foreign keys:


'fk_' + referencing table + referencing column + '_' + referenced table + referenced column

Based on the above convention, I would name the foreign key that references the CustomerID column of the Customers table from the Orders table's CustomerID column as:

fk_OrdersCustomerID_CustomersCustomerID

Foreign keys can be composite too; in that case, consider concatenating the column names of the referencing and referenced tables while naming the foreign key. This might make the name of the foreign key lengthy, but you shouldn't be worried about that, as you will never reference this name from your code, except while creating/dropping these constraints.
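
Here's a sketch of the above foreign key being created (assuming the Orders and Customers tables from earlier):

ALTER TABLE Orders
ADD CONSTRAINT fk_OrdersCustomerID_CustomersCustomerID
FOREIGN KEY (CustomerID)
REFERENCES Customers (CustomerID)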

Default and Check constraints:

Use the name of the column to which these default/check constraints are bound, prefixed with 'def_' and 'chk_' respectively for default and check constraints.

I would name the default constraint for the OrderDate column def_OrderDate and the check constraint for the OrderDate column chk_OrderDate.
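
Here's a sketch showing both constraints being created on a hypothetical Orders table:

-- Default constraint: new orders default to the current date
ALTER TABLE Orders
ADD CONSTRAINT def_OrderDate DEFAULT GETDATE() FOR OrderDate

-- Check constraint: order dates cannot be in the future
ALTER TABLE Orders
ADD CONSTRAINT chk_OrderDate CHECK (OrderDate <= GETDATE())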

Variables:

For variables that store the contents of columns, you could use the same naming convention that we used for column names.

Here are some general rules I follow:

I personally don't like complicated, long names for tables or other database objects. I like to keep it simple
I prefer to use 'mixed case' names instead of using underscores to separate the two words of a name. However, when you use mixed case names, your developers should be consistent with case throughout their code, on case sensitive SQL Servers
I use underscores only between the prefix/suffix and the actual object name. That is, I never break the name of an object with underscores
I prefer not to use spaces within the name of database objects, as spaces confuse front-end data
access tools and applications. If you must use spaces within the name of a database object, make sure
you surround the name with square brackets (in Microsoft SQL Server) as shown here: [Order
Details]

I make sure I'm not using any reserved words for naming my database objects, as that can lead to some
unpredictable situations. To get a list of reserved words for Microsoft SQL Server, search Books Online for
'Reserved keywords'
Database coding conventions, best
practices, programming guidelines
Last updated: January 28th '02

Databases are the heart and soul of many recent enterprise applications, and it is essential to pay special attention to database programming. I've seen many occasions where database programming is overlooked, on the assumption that it's easy and can be done by anyone. This is wrong. For a better performing database you need a real DBA and a specialist database programmer, be it Microsoft SQL Server, Oracle, Sybase, DB2 or whatever! If you don't use database specialists during your development cycle, the database often ends up becoming the performance bottleneck. I decided to write this article to put together some of the database programming best practices, so that my fellow DBAs and database developers can benefit!

Here are some of the programming guidelines and best practices, keeping quality, performance and maintainability in mind. This list may not be complete at this moment, and will be constantly updated. Btw, special thanks to Tibor Karaszi (SQL Server MVP) and Linda (lindawie) for taking the time to read this article and provide suggestions.

Decide upon a database naming convention, standardize it across your organization and be consistent
in following it. It helps make your code more readable and understandable. Click here to see the
database object naming convention that I follow.

Do not depend on undocumented functionality. The reasons being:


- You will not get support from Microsoft when something goes wrong with your undocumented code
- Undocumented functionality is not guaranteed to exist (or behave the same) in a future release or service pack, thereby breaking your code

Try not to use system tables directly. System table structures may change in a future release. Wherever possible, use the sp_help* stored procedures or INFORMATION_SCHEMA views. There will be situations where you cannot avoid accessing system tables, though!

Make sure you normalize your data to at least third normal form. At the same time, do not compromise on query performance. A little bit of denormalization helps queries perform faster.

Write comments in your stored procedures, triggers and SQL batches generously, whenever
something is not very obvious. This helps other programmers understand your code clearly. Don't
worry about the length of the comments, as it won't impact the performance, unlike interpreted
languages like ASP 2.0.

Do not use SELECT * in your queries. Always write the required column names after the SELECT
statement, like SELECT CustomerID, CustomerFirstName, City. This technique results in less disk
IO and less network traffic and hence better performance.

Try to avoid server side cursors as much as possible. Always stick to a 'set based approach' instead of a 'procedural approach' for accessing/manipulating data. Cursors can be easily avoided by SELECT statements in many cases. If a cursor is unavoidable, use a simple WHILE loop instead, to loop through the table. I personally tested and concluded that a WHILE loop is faster than a cursor most of the time. But for a WHILE loop to replace a cursor you need a column (primary key or unique key) to identify each row uniquely, and I personally believe every table must have a primary or unique key.
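
Here's a minimal sketch of the technique, assuming a hypothetical Orders table with an OrderID primary key:

DECLARE @OrderID int

-- Start with the lowest key value
SELECT @OrderID = MIN(OrderID) FROM Orders

WHILE @OrderID IS NOT NULL
BEGIN
-- Process the current row here
PRINT 'Processing order ' + CAST(@OrderID AS varchar(10))

-- Move on to the next key value
SELECT @OrderID = MIN(OrderID) FROM Orders WHERE OrderID > @OrderID
END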
Avoid creating temporary tables while processing data, as much as possible, as creating a temporary table means more disk IO. Consider using advanced SQL, views, the table variables of SQL Server 2000, or derived tables instead of temporary tables. Keep in mind that, in some cases, using a temporary table performs better than a highly complicated query.
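
For instance, here's a sketch of a table variable standing in for a temporary table (the Orders table and column names are made up for illustration):

DECLARE @OrderCounts TABLE
(
CustomerID int,
OrderCount int
)

INSERT INTO @OrderCounts (CustomerID, OrderCount)
SELECT CustomerID, COUNT(*)
FROM Orders
GROUP BY CustomerID

SELECT CustomerID, OrderCount FROM @OrderCounts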

Try to avoid wildcard characters at the beginning of a word while searching using the LIKE keyword, as that results in an index scan, which defeats the purpose of having an index. The first of the following statements results in an index scan, while the second statement results in an index seek:

1. SELECT LocationID FROM Locations WHERE Specialities LIKE '%pples'


2. SELECT LocationID FROM Locations WHERE Specialities LIKE 'A%s'

Also avoid searching with not equals operators (<> and NOT) as they result in table and index scans.
If you must do heavy text-based searches, consider using the Full-Text search feature of SQL Server
for better performance.

Use 'Derived tables' wherever possible, as they perform better. Consider the following query to find
the second highest salary from Employees table:

SELECT MIN(Salary)
FROM Employees
WHERE EmpID IN
(
SELECT TOP 2 EmpID
FROM Employees
ORDER BY Salary Desc
)

The same query can be re-written using a derived table as shown below, and it performs twice as fast
as the above query:

SELECT MIN(Salary)
FROM
(
SELECT TOP 2 Salary
FROM Employees
ORDER BY Salary Desc
) AS A

This is just an example, the results might differ in different scenarios depending upon the database
design, indexes, volume of data etc. So, test all the possible ways a query could be written and go
with the efficient one. With some practice and understanding of 'how SQL Server optimizer works',
you will be able to come up with the best possible queries without this trial and error method.

While designing your database, design it keeping 'performance' in mind. You can't really tune for performance later, when your database is in production, as that involves rebuilding tables/indexes and rewriting queries. Use the graphical execution plan in Query Analyzer or the SHOWPLAN_TEXT or SHOWPLAN_ALL commands to analyze your queries. Make sure your queries do 'index seeks' instead of 'index scans' or 'table scans'. A table scan or an index scan is a very bad thing and should be avoided where possible (sometimes, when the table is too small or when the whole table needs to be processed, the optimizer will choose a table or index scan).

Prefix table names with owner names, as this improves readability and avoids any unnecessary confusion. Microsoft SQL Server Books Online even states that qualifying table names with owner names helps in execution plan reuse.
Use SET NOCOUNT ON at the beginning of your SQL batches, stored procedures and triggers in production environments, as this suppresses messages like '(1 row(s) affected)' after executing INSERT, UPDATE, DELETE and SELECT statements. This in turn improves the performance of the stored procedures by reducing network traffic.
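
Here's a sketch of a typical procedure skeleton following this advice (the procedure and table names are the hypothetical ones used earlier in this article):

CREATE PROC GetCustomerDetails
@CustomerID int
AS
BEGIN
SET NOCOUNT ON

SELECT CustomerID, CustomerFirstName
FROM Customers
WHERE CustomerID = @CustomerID
END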

Use the more readable ANSI-standard join clauses instead of the old style joins. With ANSI joins, the WHERE clause is used only for filtering data, whereas with the older style joins, the WHERE clause handles both the join conditions and the filtering. The first of the following two queries shows an old style join, while the second one shows the new ANSI join syntax:

SELECT a.au_id, t.title


FROM titles t, authors a, titleauthor ta
WHERE
a.au_id = ta.au_id AND
ta.title_id = t.title_id AND
t.title LIKE '%Computer%'

SELECT a.au_id, t.title


FROM authors a
INNER JOIN
titleauthor ta
ON
a.au_id = ta.au_id
INNER JOIN
titles t
ON
ta.title_id = t.title_id
WHERE t.title LIKE '%Computer%'

Be aware that the old style *= and =* left and right outer join syntax may not be supported in a future
release of SQL Server, so you are better off adopting the ANSI standard outer join syntax.

Do not prefix your stored procedure names with 'sp_'. The prefix sp_ is reserved for system stored procedures that ship with SQL Server. Whenever SQL Server encounters a procedure name starting with sp_, it first tries to locate the procedure in the master database, then looks for any qualifiers (database, owner) provided, then tries dbo as the owner. So you can really save time in locating the stored procedure by avoiding the sp_ prefix. But there is an exception! When creating general purpose stored procedures that are called from all your databases, go ahead and prefix those stored procedure names with sp_ and create them in the master database.

Views are generally used to show specific data to specific users based on their interest. Views are also used to restrict access to the base tables by granting permissions only on the views. Yet another significant use of views is that they simplify your queries. Incorporate your frequently required, complicated joins and calculations into a view, so that you don't have to repeat those joins/calculations in all your queries; instead, just select from the view.

Use 'User Defined Datatypes', if a particular column repeats in a lot of your tables, so that the
datatype of that column is consistent across all your tables.

Do not let your front-end applications query/manipulate the data directly using SELECT or
INSERT/UPDATE/DELETE statements. Instead, create stored procedures, and let your
applications access these stored procedures. This keeps the data access clean and consistent across all
the modules of your application, at the same time centralizing the business logic within the database.
Try not to use the text and ntext datatypes for storing large textual data. The 'text' datatype has some inherent problems associated with it. You cannot directly write or update text data using INSERT or UPDATE statements (you have to use special statements like READTEXT, WRITETEXT and UPDATETEXT). There are also a lot of bugs associated with replicating tables containing text columns. So, if you don't have to store more than 8 KB of text, use the char(8000) or varchar(8000) datatypes.

If you have a choice, do not store binary files, image files (binary large objects or BLOBs) etc. inside the database. Instead, store the path to the binary/image file in the database and use it as a pointer to the actual binary file. Retrieving and manipulating these large binary files is better performed outside the database; after all, a database is not meant for storing files.

Use the char data type for a column only when the column is non-nullable. If a char column is nullable, it is treated as a fixed length column in SQL Server 7.0+. So, a char(100), when NULL, will eat up 100 bytes, resulting in space wastage. So, use varchar(100) in this situation. Of course, variable length columns do have a little processing overhead over fixed length columns. Carefully choose between char and varchar depending upon the length of the data you are going to store.

Avoid dynamic SQL statements as much as possible. Dynamic SQL tends to be slower than static SQL, as SQL Server must generate an execution plan at runtime every time. IF and CASE statements come in handy to avoid dynamic SQL. Another major disadvantage of using dynamic SQL is that it requires users to have direct access permissions on all accessed objects, like tables and views. Generally, users are given access to the stored procedures which reference the tables, but not directly to the tables. In this case, dynamic SQL will not work. Consider the following scenario, where a user named 'dSQLuser' is added to the pubs database and is granted access to a procedure named 'dSQLproc', but not to any other tables in the pubs database. The procedure dSQLproc executes a direct SELECT on the titles table and that works. The second statement runs the same SELECT on the titles table, using dynamic SQL, and it fails with the following error:

Server: Msg 229, Level 14, State 5, Line 1


SELECT permission denied on object 'titles', database 'pubs', owner 'dbo'.

To reproduce the above problem, use the following commands:

sp_addlogin 'dSQLuser'
GO
sp_defaultdb 'dSQLuser', 'pubs'
USE pubs
GO
sp_adduser 'dSQLuser', 'dSQLuser'
GO
CREATE PROC dSQLProc
AS
BEGIN
SELECT * FROM titles WHERE title_id = 'BU1032' --This works
DECLARE @str CHAR(100)
SET @str = 'SELECT * FROM titles WHERE title_id = ''BU1032'''
EXEC (@str) --This fails
END
GO
GRANT EXEC ON dSQLProc TO dSQLuser
GO
Now login to the pubs database using the login dSQLuser and execute the procedure dSQLproc to
see the problem.

Consider the following drawbacks before using the IDENTITY property for generating primary keys. IDENTITY is very much SQL Server specific, and you will have problems if you want to support different database backends for your application. IDENTITY columns have other inherent problems too: they run out of numbers one day or another; numbers can't be reused automatically after deleting rows; and replication and IDENTITY columns don't always get along well. So, consider coming up with an algorithm to generate a primary key in the front-end or from within the inserting stored procedure. There could be issues with generating your own primary keys too, like concurrency while generating the key, or running out of values. So, consider both options and go with the one that suits you well.

Minimize the usage of NULLs, as they often confuse the front-end applications, unless the
applications are coded intelligently to eliminate NULLs or convert the NULLs into some other form.
Any expression that deals with NULL results in a NULL output. ISNULL and COALESCE functions
are helpful in dealing with NULL values. Here's an example that explains the problem:

Consider the following table, Customers, which stores the names of the customers; the middle name can be NULL.

CREATE TABLE Customers


(
FirstName varchar(20),
MiddleName varchar(20),
LastName varchar(20)
)

Now insert a customer into the table whose name is Tony Blair, without a middle name:

INSERT INTO Customers


(FirstName, MiddleName, LastName)
VALUES ('Tony',NULL,'Blair')

The following SELECT statement returns NULL, instead of the customer name:

SELECT FirstName + ' ' + MiddleName + ' ' + LastName FROM Customers

To avoid this problem, use ISNULL as shown below:

SELECT FirstName + ' ' + ISNULL(MiddleName + ' ','') + LastName


FROM Customers

Use Unicode datatypes, like nchar, nvarchar and ntext, if your database is going to store not just plain English characters but a variety of characters used all over the world. Use these datatypes only when they are absolutely needed, as they need twice as much space as non-Unicode datatypes.

Always use a column list in your INSERT statements. This helps in avoiding problems when the
table structure changes (like adding a column). Here's an example which shows the problem.

Consider the following table:

CREATE TABLE EuropeanCountries


(
CountryID int PRIMARY KEY,
CountryName varchar(25)
)

Here's an INSERT statement without a column list, which works perfectly:

INSERT INTO EuropeanCountries


VALUES (1, 'Ireland')

Now, let's add a new column to this table:

ALTER TABLE EuropeanCountries


ADD EuroSupport bit

Now run the above INSERT statement. You get the following error from SQL Server:

Server: Msg 213, Level 16, State 4, Line 1


Insert Error: Column name or number of supplied values does not match table definition.

This problem can be avoided by writing an INSERT statement with a column list as shown below:

INSERT INTO EuropeanCountries


(CountryID, CountryName)
VALUES (1, 'England')

Perform all your referential integrity checks and data validations using constraints (foreign key and check constraints) instead of triggers, as constraints are faster. Use triggers only for auditing, custom tasks and validations that cannot be performed using constraints. Constraints save you time as well, as you don't have to write code for these validations; the RDBMS does all the work for you.

Always access tables in the same order in all your stored procedures and triggers, consistently. This helps avoid deadlocks. Other things to keep in mind to avoid deadlocks are: keep your transactions as short as possible; touch as little data as possible during a transaction; never wait for user input in the middle of a transaction; and do not use higher level locking hints or restrictive isolation levels unless they are absolutely needed. Make your front-end applications deadlock-intelligent, that is, these applications should be able to resubmit the transaction in case the previous transaction fails with error 1205. In your applications, process all the results returned by SQL Server immediately, so that the locks on the processed rows are released and there's no blocking.

Offload tasks like string manipulations, concatenations, row numbering, case conversions, type
conversions etc. to the front-end applications, if these operations are going to consume more CPU
cycles on the database server (It's okay to do simple string manipulations on the database end
though). Also try to do basic validations in the front-end itself during data entry. This saves
unnecessary network roundtrips.

If back-end portability is your concern, stay away from bit manipulations with T-SQL, as this is very
much RDBMS specific. Further, using bitmaps to represent different states of a particular entity
conflicts with the normalization rules.

Consider adding a @Debug parameter to your stored procedures. This can be of the bit data type. When a 1 is passed for this parameter, print all the intermediate results and variable contents using SELECT or PRINT statements; when 0 is passed, do not print debug information. This helps in quick debugging of stored procedures, as you don't have to add and remove these PRINT/SELECT statements before and after troubleshooting problems.
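
Here's a minimal sketch of the idea (the procedure and the Transactions table are made up for illustration):

CREATE PROC ArchiveTransactions
@CutoffDate datetime,
@Debug bit = 0
AS
BEGIN
DECLARE @RowsArchived int

IF @Debug = 1
PRINT 'Archiving transactions older than ' + CONVERT(varchar(30), @CutoffDate, 120)

-- Placeholder for the real archiving logic
DELETE FROM Transactions
WHERE TranDate < @CutoffDate

SET @RowsArchived = @@ROWCOUNT

IF @Debug = 1
PRINT 'Rows archived: ' + CAST(@RowsArchived AS varchar(10))
END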
Do not call functions repeatedly within your stored procedures, triggers, functions and batches. For example, you might need the length of a string variable in many places of your procedure; don't call the LEN function whenever it's needed. Instead, call the LEN function once, and store the result in a variable for later use.

Make sure your stored procedures always return a value indicating the status. Standardize on the
return values of stored procedures for success and failures. The RETURN statement is meant for
returning the execution status only, but not data. If you need to return data, use OUTPUT parameters.

If your stored procedure always returns a single row resultset, consider returning the resultset using
OUTPUT parameters instead of a SELECT statement, as ADO handles output parameters faster than
resultsets returned by SELECT statements.

Always check the global variable @@ERROR immediately after executing a data manipulation statement (like INSERT/UPDATE/DELETE), so that you can rollback the transaction in case of an error (@@ERROR will be greater than 0 in case of an error). This is important because, by default, SQL Server will not rollback all the previous changes within a transaction if a particular statement fails. This behavior can be changed by executing SET XACT_ABORT ON. The @@ROWCOUNT variable also plays an important role in determining how many rows were affected by a previous data manipulation (or retrieval) statement, and based on that you could choose to commit or rollback a particular transaction.
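
Here's a minimal sketch of the pattern (the Orders update is made up for illustration):

BEGIN TRANSACTION

UPDATE Orders
SET OrderStatus = 5
WHERE OrderID = 100

-- Check for errors immediately after the data manipulation statement
IF @@ERROR <> 0
BEGIN
ROLLBACK TRANSACTION
RETURN
END

COMMIT TRANSACTION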

To make SQL Statements more readable, start each clause on a new line and indent when needed.
Following is an example:

SELECT title_id, title


FROM titles
WHERE title LIKE 'Computing%' OR
title LIKE 'Gardening%'

Though we survived Y2K, always store 4 digit years in dates (especially when using char or int datatype columns), instead of 2 digit years, to avoid confusion and problems. This is not a problem with datetime columns, as the century is stored even if you specify a 2 digit year. But it's always a good practice to specify 4 digit years even with datetime datatype columns.

In your queries and other SQL statements, always represent dates in the yyyy/mm/dd format. This format will always be interpreted correctly, no matter what the default date format on the SQL Server is. This also prevents the following error while working with dates:

Server: Msg 242, Level 16, State 3, Line 2


The conversion of a char data type to a datetime data type resulted
in an out-of-range datetime value.

As is true with any other programming language, do not use GOTO or use it sparingly. Excessive
usage of GOTO can lead to hard-to-read-and-understand code.

Do not forget to enforce unique constraints on your alternate keys.

Always be consistent with the usage of case in your code. On a case insensitive server, your code might work fine, but it will fail on a case sensitive SQL Server if your code is not consistent in case. For example, if you create a table in a SQL Server or database that has a case-sensitive or binary sort order, all references to the table must use the same case that was specified in the CREATE TABLE statement. If you name the table 'MyTable' in the CREATE TABLE statement and use 'mytable' in the SELECT statement, you get an 'object not found' or 'invalid object name' error.
Though T-SQL has no concept of constants (like the ones in the C language), variables can serve the same purpose. Using variables instead of constant values within your SQL statements improves the readability and maintainability of your code. Consider the following example:

UPDATE dbo.Orders
SET OrderStatus = 5
WHERE OrdDate < '2001/10/25'

The same update statement can be re-written in a more readable form as shown below:

DECLARE @ORDER_PENDING int


SET @ORDER_PENDING = 5

UPDATE dbo.Orders
SET OrderStatus = @ORDER_PENDING
WHERE OrdDate < '2001/10/25'

Do not use column numbers in the ORDER BY clause, as it impairs the readability of the SQL statement. Further, changing the order of columns in the SELECT list has no impact on the ORDER BY when the columns are referred to by names instead of numbers. Consider the following example, in which the second query is more readable than the first one:

SELECT OrderID, OrderDate


FROM Orders
ORDER BY 2

SELECT OrderID, OrderDate


FROM Orders
ORDER BY OrderDate

Well, this is all for now folks. I'll keep updating this page as and when I have something new to add. I
welcome your feedback on this, so feel free to email me. Happy database programming!
Overview of SQL Server security model
and security best practices
Last updated: May 20th '03

This article discusses the security model of Microsoft SQL Server 7.0/2000 and security best practices to help
you secure your data. Special thanks to my friend Divya Kalra for her valuable input and content review.

Security is a major concern for modern systems/network/database administrators. It is natural for an administrator to worry about hackers and external attacks while implementing security. But there is more to it. It is essential to first implement security within the organization, to make sure the right people have access to the right data. Without these security measures in place, you might find someone destroying your valuable data, selling your company's secrets to your competitors, or invading the privacy of others. Primarily, a security plan must identify which users in the organization can see which data and perform which activities in the database.

SQL Server security model


To be able to access data from a database, a user must pass through two stages of authentication, one at the SQL Server level and the other at the database level. These two stages are implemented using login names and user accounts respectively. A valid login is required to connect to SQL Server and a valid user account is required to access a database.

Login: A valid login name is required to connect to an SQL Server instance. A login could be:

A Windows NT/2000 login that has been granted access to SQL Server

An SQL Server login that is maintained within SQL Server

These login names are maintained within the master database. So, it is essential to backup the master database
after adding new logins to SQL Server.

User: A valid user account within a database is required to access that database. User accounts are specific to
a database. All permissions and ownership of objects in the database are controlled by the user account. SQL
Server logins are associated with these user accounts. A login can have associated users in different databases,
but only one user per database.

During a new connection request, SQL Server verifies the login name supplied, to make sure that login is authorized to access SQL Server. This verification is called authentication. SQL Server supports two authentication modes:

Windows authentication mode: With Windows authentication, you do not have to specify a login name and password to connect to SQL Server. Instead, your access to SQL Server is controlled by the Windows NT/2000 account (or the group to which your account belongs) that you used to log in to the Windows operating system on the client computer/workstation. A DBA must first specify to SQL Server all the Microsoft Windows NT/2000 accounts or groups that can connect to SQL Server

Mixed mode: Mixed mode allows users to connect using Windows authentication or SQL Server authentication. Your DBA must first create valid SQL Server login accounts and passwords. These are not related to your Microsoft Windows NT/2000 accounts. With this authentication mode, you must supply the SQL Server login name and password when you connect to SQL Server. If you do not specify a SQL Server login name and password, or if you request Windows authentication, you will be authenticated using Windows authentication.

The point to note is that, whichever mode you configure your SQL Server to use, you can always log in using Windows authentication.

Windows authentication is the recommended security mode, as it is more secure and you don't have to send login names and passwords over the network. You should avoid mixed mode, unless you have a non-Windows NT/2000 environment, your SQL Server is installed on Windows 95/98, or you need it for backward compatibility with your existing applications.

SQL Server's authentication mode can be changed using Enterprise Manager (Right click on the server name
and click on Properties. Go to the Security tab).

Authentication mode can also be changed using SQL DMO object model.

Here is a list of helpful stored procedures for managing logins and users:

sp_addlogin: Creates a new login that allows users to connect to SQL Server using SQL Server authentication
sp_grantlogin: Allows a Windows NT/2000 user account or group to connect to SQL Server using Windows authentication
sp_droplogin: Drops an SQL Server login
sp_revokelogin: Drops a Windows NT/2000 login/group from SQL Server
sp_denylogin: Prevents a Windows NT/2000 login/group from connecting to SQL Server
sp_password: Adds or changes the password for an SQL Server login
sp_helplogins: Provides information about logins and their associated users in each database
sp_defaultdb: Changes the default database for a login
sp_grantdbaccess: Adds an associated user account in the current database for an SQL Server login or Windows NT/2000 login
sp_revokedbaccess: Drops a user account from the current database
sp_helpuser: Reports information about the SQL Server users and roles in the current database

Now let's talk about controlling access to objects within the database and managing permissions. Apart from
managing permissions at the individual database user level, SQL Server 7.0/2000 implements permissions
using roles. A role is nothing but a group to which individual logins/users can be added, so that the
permissions can be applied to the group, instead of applying the permissions to all the individual logins/users.
There are three types of roles in SQL Server 7.0/2000:

Fixed server roles

Fixed database roles

Application roles

Fixed server roles: These are server-wide roles. Logins can be added to these roles to gain the associated
administrative permissions of the role. Fixed server roles cannot be altered and new server roles cannot be
created. Here are the fixed server roles and their associated permissions in SQL Server 2000:
sysadmin: Can perform any activity in SQL Server
serveradmin: Can set server-wide configuration options, shut down the server
setupadmin: Can manage linked servers and startup procedures
securityadmin: Can manage logins and CREATE DATABASE permissions, also read error logs and change passwords
processadmin: Can manage processes running in SQL Server
dbcreator: Can create, alter, and drop databases
diskadmin: Can manage disk files
bulkadmin: Can execute BULK INSERT statements

Here is a list of stored procedures that are helpful in managing fixed server roles:

sp_addsrvrolemember: Adds a login as a member of a fixed server role
sp_dropsrvrolemember: Removes an SQL Server login, Windows user or group from a fixed server role
sp_helpsrvrole: Returns a list of the fixed server roles
sp_helpsrvrolemember: Returns information about the members of fixed server roles
sp_srvrolepermission: Returns the permissions applied to a fixed server role

Fixed database roles: Each database has a set of fixed database roles, to which database users can be added.
These fixed database roles are unique within the database. While the permissions of fixed database roles
cannot be altered, new database roles can be created. Here are the fixed database roles and their associated
permissions in SQL Server 2000:

db_owner: Has all permissions in the database
db_accessadmin: Can add or remove user IDs
db_securityadmin: Can manage all permissions, object ownerships, roles and role memberships
db_ddladmin: Can issue ALL DDL, but cannot issue GRANT, REVOKE, or DENY statements
db_backupoperator: Can issue DBCC, CHECKPOINT, and BACKUP statements
db_datareader: Can select all data from any user table in the database
db_datawriter: Can modify any data in any user table in the database
db_denydatareader: Cannot select any data from any user table in the database
db_denydatawriter: Cannot modify any data in any user table in the database

Here is a list of stored procedures that are helpful in managing fixed database roles:

sp_addrole: Creates a new database role in the current database
sp_addrolemember: Adds a user to an existing database role in the current database
sp_dbfixedrolepermission: Displays permissions for each fixed database role
sp_droprole: Removes a database role from the current database
sp_helpdbfixedrole: Returns a list of fixed database roles
sp_helprole: Returns information about the roles in the current database
sp_helprolemember: Returns information about the members of a role in the current database
sp_droprolemember: Removes users from the specified role in the current database

Application roles: Application roles are another way of implementing permissions. These are quite different from server and database roles. After you create an application role and assign the required permissions to it, the client application activates the role at run-time to get the permissions associated with that role. Application roles simplify the job of DBAs, as they don't have to worry about managing permissions at the individual user level. All they need to do is create an application role and assign permissions to it. The application that connects to the database activates the application role and inherits the permissions associated with that role. Here are the characteristics of application roles:

There are no built-in application roles

Application roles contain no members

Application roles need to be activated at run-time, by the application, using a password

Application roles override standard permissions. For example, after activating the application role,
the application will lose all the permissions associated with the login/user account used while
connecting to SQL Server and gain the permissions associated with the application role

Application roles are database specific. After activating an application role in a database, if that
application wants to run a cross-database transaction, the other database must have a guest user
account enabled

Here are the stored procedures that are required to manage application roles:

sp_addapprole: Adds an application role in the current database
sp_approlepassword: Changes the password of an application role in the current database
sp_dropapprole: Drops an application role from the current database
sp_setapprole: Activates the permissions associated with an application role in the current database
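
For example, here's a sketch of the typical application role workflow (the role name, password and table are made up for illustration):

-- DBA: create the application role and grant it permissions (one time setup)
EXEC sp_addapprole 'OrderEntryApp', 'StrongPassword1'
GRANT SELECT, INSERT ON Orders TO OrderEntryApp
GO

-- Client application: activate the role at run-time, after connecting
EXEC sp_setapprole 'OrderEntryApp', 'StrongPassword1'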

Now that we discussed different kinds of roles, let's talk about granting/revoking permissions to/from
database users and database roles and application roles. The following T-SQL commands are used to manage
permissions at the user and role level.

GRANT: Grants the specific permission (Like SELECT, DELETE etc.) to the specified user or role
in the current database

REVOKE: Removes a previously granted or denied permission from a user or role in the current
database

DENY: Denies a specific permission to the specified user or role in the current database

Using the above commands, permissions can be granted/denied/revoked to users/roles on all database objects. You can manage permissions at a level as granular as the column level.
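
For example (a sketch using made-up table, user and role names):

GRANT SELECT, INSERT ON Orders TO OrderEntryRole
DENY DELETE ON Orders TO OrderEntryRole
REVOKE INSERT ON Orders FROM OrderEntryRole

-- Column level permission: User1 can read only the CustomerFirstName column
GRANT SELECT ON Customers (CustomerFirstName) TO User1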

Note: There is no way to manage permissions at the row level. That is, in a given table, you can't grant
SELECT permission on a specific row to User1 and deny SELECT permission on another row to User2. This
kind of security can be implemented by using views and stored procedures effectively. Click here to read
about row level security implementation in SQL Server databases. Just an FYI, Oracle has a feature called
"Virtual Private Databases" (VPD) that allows DBAs to configure permissions at row level.

SQL Server security best practices


Here is an ideal implementation of security in a Windows NT/2000 environment with SQL Server 7.0/2000
database server:

Configure SQL Server to use Windows authentication mode

Depending upon the data access needs of your domain users, group them into different global groups
in the domain

Consolidate these global groups from all the trusted domains into the Windows NT/2000 local
groups in your SQL Server computer

The Windows NT/2000 local groups are then granted access to log into the SQL Server

Add these Windows NT/2000 local groups to the required fixed server roles in SQL Server

Associate these local group logins with individual user accounts in the databases and grant them the
required permissions using the database roles

Create custom database roles if required, for finer control over permissions

Here is a security checklist and some standard security practices and tips:

Restrict physical access to the SQL Server computer. Always lock the server while not in use.

Make sure all the file and disk shares on the SQL Server computer are read-only. In case you have read-write shares, make sure only the right people have access to them.

Use the NTFS file system as it provides advanced security and recovery features.

Prefer Windows authentication to mixed mode. If mixed mode authentication is inevitable for backward compatibility reasons, make sure you have complex passwords for sa and all other SQL Server logins. It is recommended to have mixed case passwords with a few numbers and/or special characters, to counter dictionary based password guessing tools and user identity spoofing by hackers.

Rename the Windows NT/2000 Administrator account on the SQL Server computer to discourage
hackers from guessing the administrator password.

In a website environment, keep your databases on a different computer than the one running the web
service. In other words, keep your SQL Server off the Internet, for security reasons.

Keep yourself up-to-date with the information on latest service packs and security patches released
by Microsoft. Carefully evaluate the service packs and patches before applying them on the
production SQL Server. Bookmark this page for the latest in the security area from Microsoft:
http://www.microsoft.com/security/
If it is appropriate for your environment, hide the SQL Server service from appearing in the server
enumeration box in Query Analyzer, using the /HIDDEN:YES switch of NET CONFIG SERVER
command.

Enable login auditing at the Operating System and SQL Server level. Examine the audit for login
failure events and look for trends to detect any possible intrusion.

If it fits your budget, use Intrusion Detection Systems (IDS), especially on high-risk online database
servers. IDS can constantly analyze the inbound network traffic, look for trends and detect Denial of
Service (DoS) attacks and port scans. IDS can be configured to alert the administrators upon
detecting a particular trend.

Disable the guest user account of Windows. Drop the guest user from production databases using sp_dropuser.

Do not let your applications query and manipulate your database directly using
SELECT/INSERT/UPDATE/DELETE statements. Wrap these commands within stored procedures
and let your applications call these stored procedures. This helps centralize business logic within the
database, at the same time hides the internal database structure from client applications.

Let your users query views instead of giving them access to the underlying base tables.

Discourage applications from executing dynamic SQL statements. To execute a dynamic SQL
statement, users need explicit permissions on the underlying tables. This defeats the purpose of
restricting access to base tables using stored procedures and views.

Don't let applications accept SQL commands from users and execute them against the database. This
could be dangerous (known as SQL injection), as a skilled user can input commands that can destroy
the data or gain unauthorized access to sensitive information.

Take advantage of the fixed server and database roles by assigning users to the appropriate roles.
You could also create custom database roles that suit your needs.

Carefully choose the members of the sysadmin role, as the members of the sysadmin role can do
anything in the SQL Server. Note that, by default, the Windows NT/2000 local administrators group
is a part of the sysadmin fixed server role.

Constantly monitor error logs and event logs for security related alerts and errors.

SQL Server error logs can reveal a great deal of information about your server. So, secure your error
logs by using NTFS permissions.

Secure your registry by restricting access to the SQL Server specific registry keys like
HKEY_LOCAL_MACHINE\Software\Microsoft\MSSQLServer.

If your databases contain sensitive information, consider encrypting the sensitive pieces (like credit
card numbers and Social Security Numbers (SSN)). There are undocumented encryption functions in
SQL Server, but I wouldn't recommend those. If you have the right skills available in your
organization, develop your own encryption/decryption modules using Crypto API or other
encryption libraries.

If you are running SQL Server 7.0, you could use the encryption capabilities of the Multi-Protocol
net library for encrypted data exchange between the client and SQL Server. SQL Server 2000
supports encryption over all protocols using Secure Socket Layer (SSL). See SQL Server 7.0 and
2000 Books Online (BOL) for more information on this topic. Please note that, enabling encryption
is always a tradeoff between security and performance, because of the additional overhead of
encryption and decryption.

Prevent unauthorized access to linked servers by deleting the linked server entries that are no longer
needed. Pay special attention to the login mapping between the local and remote servers. Use logins
with the bare minimum privileges for configuring linked servers.

DBAs generally tend to run the SQL Server service using a domain administrator account. That is asking for trouble. A malicious SQL Server user could take advantage of these domain admin privileges. Most of the time, a local administrator account is more than enough for the SQL Server service.

DBAs also tend to drop system stored procedures like xp_cmdshell and all the OLE automation
stored procedures (sp_OACreate and the likes). Instead of dropping these procedures, deny
EXECUTE permission on them to specific users/roles. Dropping these procedures would break some
of the SQL Server functionality.

Be prompt in dropping the SQL Server logins of employees leaving the organization. Especially, in
the case of a layoff, drop the logins of those poor souls ASAP as they could do anything to your data
out of frustration.

When using mixed mode authentication, consider customizing the system stored procedure
sp_password, to prevent users from using simple and easy-to-guess passwords.

To setup secure data replication over Internet or Wide Area Networks (WAN), implement Virtual
Private Networks (VPN) . Securing the snapshot folder is important too, as the snapshot agent
exports data and object scripts from published databases to this folder in the form of text files. Only
the replication agents should have access to the snapshot folder.

It is good to have a tool like Lumigent Log Explorer handy, for a closer look at the transaction log to
see who is doing what in the database.

Do not save passwords in your .udl files, as the password gets stored in clear text.

If your database code is proprietary, encrypt the definition of stored procedures, triggers, views and
user defined functions using the WITH ENCRYPTION clause. dbLockdown is a tool that automates
the insertion of the WITH ENCRYPTION clause and handles all the archiving of encrypted database
objects so that they can be restored again in a single click. Click here to find out more information
about this product.

In database development environments, use a source code control system like Visual SourceSafe (VSS) or Rational ClearCase. Control access to source code by creating users in VSS and giving permissions by project. Reserve the 'destroy permanently' permission for the VSS administrator only.
After project completion, lock your VSS database or leave your developers with just read-only
access.

Store the data files generated by DTS or BCP in a secure folder/share and delete these files once you
are done.

Install anti-virus software on the SQL Server computer, but exclude your database folders from
regular scans. Keep your anti-virus signature files up to date.

SQL Server 2000 allows you to specify a password for backups. If a backup is created with a
password, you must provide that password to restore from that backup. This discourages
unauthorized access to backup files.
Windows 2000 introduced Encrypted File System (EFS) that allows you to encrypt individual files
and folders on an NTFS partition. Use this feature to encrypt your SQL Server database files. You
must encrypt the files using the service account of SQL Server. When you want to change the service
account of SQL Server, you must decrypt the files, change the service account and encrypt the files
again with the new service account.

The above points pretty much cover my security check list. Feel free to email me your comments and
suggestions. Be sure to check back once in a while, as I will be constantly updating this page.
What are federated databases?
Last updated: December 2nd '01

What are federated database servers?

"Federated database servers" is a feature introduced in SQL Server 2000. A federation is a group of SQL
Servers that cooperate to share the processing load of a system. Federated database servers let you scale out a
set of servers to support the processing needs of large systems and websites.

Scaling out is the process of increasing the processing power of a system by adding one or more additional
computers, or nodes, instead of beefing up the hardware of a single computer (Scaling up).

Federated database servers can be implemented using "Distributed Partitioned Views" (DPV). You can
partition tables horizontally across several servers, and define a distributed partitioned view on one server,
covering all member server partitions. This view makes it appear as if a full copy of the original table is stored
on one server.

SQL Server 7.0 supports partitioned views too, but SQL Server 2000 introduced the following
enhancements that allow the views to scale out and form federations of database servers:

In SQL Server 2000, partitioned views are updateable

The SQL Server 2000 query optimizer supports new optimizations that minimize the amount of
distributed data that has to be transferred. The distributed execution plans generated by SQL Server
2000 result in good performance for a larger set of queries than the plans generated by SQL Server
7.0

Why federated database servers and when to scale out?

When websites and applications generate processing loads that exceed the capacity of large individual
servers, scaling out is the best option for increasing the processing capacity of the system. When the
server for a particular application is at its maximum potential and is no longer able to meet user
demands, you should consider scaling out.

According to Microsoft, a federation of servers running SQL Server 2000 is capable of supporting the growth
requirements of any Web site, or of the largest enterprise systems.

How are we able to gain performance by scaling out?

Distributed processing of data, which means individual member servers of the federation are
working with smaller subsets of data, instead of the complete data
More CPUs working in parallel
Parallel disk I/O
Availability of more RAM, as each member server is working with a smaller subset of data

How to create distributed partitioned views?

Creation of partitioned views is explained in SQL Server Books Online, in the page titled "Creating a
Partitioned View". The rules for creating updateable partitioned views are also explained in this page.
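
In outline, the pattern looks like this (server, database, table names and range values below are
hypothetical): each member server holds one horizontal partition, enforced by a CHECK constraint on
the partitioning key, and the view concatenates the member tables with UNION ALL across linked servers:

-- On Server1 (Server2 holds an identical Customers_2 table
-- whose CHECK constraint covers CustomerID 50001 and above):
CREATE TABLE Customers_1 (
    CustomerID int NOT NULL
        CHECK (CustomerID BETWEEN 1 AND 50000),
    CustomerName varchar(40) NOT NULL,
    CONSTRAINT PK_Customers_1 PRIMARY KEY (CustomerID)
)

-- On the server where the view is defined (Server2 is a linked server):
CREATE VIEW Customers
AS
SELECT * FROM Sales.dbo.Customers_1
UNION ALL
SELECT * FROM Server2.Sales.dbo.Customers_2

The CHECK constraints on the partitioning key are what allow the query optimizer to route each query
to only the relevant member servers.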

What is the impact of partitioned views on my front-end applications?

Instead of accessing the base tables directly, your front-end applications or stored procedures will access the
partitioned view. SQL Server takes care of getting the right data from the right servers, transparently to your
application. Even in OLTP scenarios, SQL Server manages INSERT, UPDATE and DELETE commands
transparently and sends them to the right partition.

What other things should I consider before implementing federated database servers?

Availability: Make sure all table partitions spread across different servers are accessible all the time,
or else you will not be able to use the partitioned view.
Backup/Restore: If transactional consistency across the partitions is not a concern, you can back up
and restore some or all of your partitions individually. If you must achieve transactional consistency,
perform coordinated backup and restore operations. This means that you back up all your partitions
simultaneously, and if you ever have to restore them, you must restore all partitions to the same
point in time.
Security: Since you have your data partitioned across multiple database servers, you have to follow
consistent security practices across all your servers.

What is the performance gain I can expect by scaling out?

There is no easy way to calculate the performance gain without actually implementing and testing the
scenario, as performance depends on a lot of other factors too. But in general, in a simulated DSS system,
I achieved a performance gain of 25 to 30% by scaling out to just two SQL Servers.
Evaluation of Federated Database Servers
and Distributed Partitioned Views of SQL
Server 2000
Last updated: March 26th '01

Okay, there are no high-end systems, no quad-processor boxes, no RAID, and no dedicated network. It's an
evaluation scenario in its simplest form.

I horizontally partitioned a huge table (5 million rows, not so huge anyway!) across two database servers,
with 2.5 million rows on each server. The idea is to compare the response times in the following two scenarios:

Having the table on one server (5 million rows)

Having the table spread across two servers (2.5 million rows on each server)

Here is the hardware configuration of both the servers:

Server1:
OS: Microsoft Windows 2000 Advanced Server with SP1
System model: DELL OptiPlex GX1
Processor: Pentium III, 500MHz
RAM: 128 MB

Server2:
OS: Microsoft Windows 2000 Advanced Server with SP1
System model: DELL OptiPlex GX1
Processor: Pentium III, 500MHz
RAM: 128 MB

Here is the table structure:

CREATE TABLE [dbo].[tblPriceDefn] (
    [ConfigID] [numeric](19, 0) NOT NULL ,
    [PriceListID] [varchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
    [ValidityDate] [datetime] NOT NULL ,
    [Price] [money] NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[tblPriceDefn] WITH NOCHECK ADD
    CONSTRAINT [PK_PriceDefn] PRIMARY KEY CLUSTERED
    (
        [ConfigID],
        [PriceListID],
        [ValidityDate]
    ) ON [PRIMARY]

Horizontal partitioning of the data:

For this evaluation I took the tblPriceDefn table, which has 5 million rows. I divided the table into
two equal halves depending on the value of the PriceListID column, stored the first half on the first server
(Server1) and the second half on the second server (Server2). Then I created a distributed partitioned view
named View1 on the first server (Server1), which combines both halves of the table tblPriceDefn. Here is
the view definition:

CREATE VIEW View1
AS
SELECT * FROM [Server1].test.dbo.table1
UNION ALL
SELECT * FROM [Server2\inst1].test.dbo.table1

Performance testing:
The following queries were executed on the first server locally, against both the base table and the
partitioned view, and the response times were recorded. Each query was executed 5 times and the
average response time is reported.

Query 1:

--On the base table:
SELECT
PriceListID,
COUNT(*) [# of rows]
FROM
ArenaDBRel2.dbo.tblPriceDefn
GROUP BY
PriceListID
ORDER BY
PriceListID

Average response time on the base table with 5 million rows: 59 seconds

--On the partitioned view:
SELECT
PriceListID,
COUNT(*) [# of rows]
FROM
test.dbo.View1
GROUP BY
PriceListID
ORDER BY
PriceListID

Average response time on the partitioned view: 41 seconds

A gain of 18 seconds, which is a 30% improvement in performance


Query 2:

--On the base table:
SELECT
DISTINCT PriceListID
FROM
ArenaDBRel2.dbo.tblPriceDefn

Average response time on the base table with 5 million rows: 45 seconds

--On the partitioned view:
SELECT
DISTINCT PriceListID
FROM
test.dbo.View1

Average response time on the partitioned view: 32 seconds

A gain of 13 seconds, which is a 28% improvement in performance

Query 3:

--On the base table:
SELECT
PriceListID,
ConfigID,
Price
FROM
ArenaDBRel2.dbo.tblPriceDefn
WHERE
PriceListID = 'M05-05'
OR
PriceListID = 'M10-01'

Average response time on the base table with 5 million rows: 59 seconds

--On the partitioned view:
SELECT
PriceListID,
ConfigID,
Price
FROM
test.dbo.View1
WHERE
PriceListID = 'M05-05'
OR
PriceListID = 'M10-01'

Average response time on the partitioned view: 44 seconds

A gain of 15 seconds, which is a 25% improvement in performance
Conclusion:
The federated database servers feature of SQL Server 2000 is a very useful one, and we can expect to see
gains in query response times. As I mentioned earlier, this is a very simple evaluation of the feature. I am
going to try it out with more concurrent connections and more complicated queries, and will post more
information on this site. Be sure to check back after a while! Please let me know if you have any suggestions
regarding this page!
Database design principles
1. Introduction
2. Store only the necessary information
3. Ask only for what you need, and be explicit
4. Normalize your table structures
5. Select the appropriate data type
6. Use indexes appropriately
7. Use REPLACE queries
8. Use temporary tables
9. Use a recent version of MySQL
10. Final considerations

Introduction
One of the crucial steps in building an application that works with a database is, without a doubt, the
design of the database itself. If the tables are not defined properly, we can run into a lot of headaches
when executing queries against the database while trying to obtain some piece of information.

It does not matter whether our database has only 20 records or a few thousand: it is important to make
sure it is correctly designed, so that it remains efficient and usable over time.

In this article we will mention some basic principles of database design and discuss some rules to follow
when creating databases. Depending on the requirements of the database, the design can get somewhat
complex, but with a few simple rules in mind it will be much easier to create a perfect database for our
next project.

Building large applications on MySQL is easy with tools like Apache, Perl, PHP, and Python. Making sure
they are fast, however, takes more than insight. MySQL has a well-deserved reputation for being a very
fast database server that is also very easy to set up and use, and in recent years its popularity has grown
notably because it is used on countless websites that need a database. Even so, few of us users know
much more than how to create a database and write a few queries against it.

After reading this article we should be able to understand some techniques that will help us design
MySQL databases for building better applications. We will assume a basic knowledge of SQL and of
MySQL, but we will not assume much experience with either.

Store only the necessary information

It seems like common sense, but many people tend to take the "kitchen sink" approach to database
design. We often think of everything we might ever want stored in a database and then design the
database to hold all of it. We have to be realistic about our needs and decide what information is really
necessary. Frequently we can generate some data on the fly without having to store it in a database
table; in these cases it also makes sense to do so from the application development point of view.
For example, a products table for an online catalog might contain the names, descriptions, sizes,
weights and prices of various products. Besides the price, we might want to store the taxes and
shipping costs associated with each product. But there is really no need to do this. First, both the taxes
and the shipping costs can be calculated on the fly (either by our application or by MySQL). Second, if
the taxes or the shipping costs changed, we would have to write the queries needed to update those
values in every single product record.

Sometimes we think that adding fields to database tables once they have been created is too difficult,
so we feel driven to define as many columns as possible up front. This is simply a misconception, since
in MySQL we can use the ALTER TABLE command to modify the definition of a table at any moment
to fit our changing needs.

For example, if at some point we realize that we need to add a popularity column to our productos
table (perhaps we want our customers to rate the products in our catalog), we could do the following:

ALTER TABLE productos ADD popularidad INTEGER;

Ask only for what you need, and be explicit

Much like "store only what you need", this may seem like plain common sense, yet it is not considered
very often. Why? Because when an application is under development the requirements tend to change,
so many of the queries end up looking like this:

SELECT * FROM algunaTabla;

Retrieving all the columns of a table is simply the most convenient thing to do when we are not sure
which fields we need. However, as tables grow and change, this can turn into a performance problem.
In the long run it is much better to spend a little extra time after our initial development and decide
exactly what we need in our queries. Specifically, it is much better to name the columns explicitly:

SELECT nombre, precio, descripcion FROM productos;

This is related to a point that has more to do with code maintenance than with performance. Most
programming languages (Perl, Python, PHP, Java, etc.) let us access query results both by field name
and by numeric position. In the example above, we can access field 0 or the field nombre and get the
same results.

In the long run it is better to use column names rather than their numeric positions. Why? Because the
relative positions of columns, whether in a table or in a query result, can change. For example, they
can change in a table as the result of an ALTER TABLE, or they can change in a query because
someone rewrites the query and forgets to update the application logic accordingly.

Of course, we still have to be careful when we rename columns! But if we use names instead of
numeric positions, we can use our editor's search-and-replace capability to find the code that has to
change whenever a column name changes.
Normalize your table structures
If we have never heard of "data normalization" before, there is no need to fear. While normalization
can seem like a complex subject, we can benefit greatly just by understanding its most elementary
concepts.

One of the easiest ways to understand this is to think of our tables as spreadsheets. For example, if we
wanted to keep track of our CD collection in a spreadsheet, we might design something like the table
shown below.

+------------+-------------+--------------+ .. +--------------+
| album | track1 | track2 | | track10 |
+------------+-------------+--------------+ .. +--------------+
| Antrologia | Tarzan Boy | Life is life | .. | Square rooms |
| | (Baltimora) | (Opus) | .. | (Al Corley) |
+------------+-------------+--------------+ .. +--------------+

This looks reasonable. The problem, however, is that the number of tracks on a CD varies quite a bit.
With this approach we would need a really large spreadsheet to hold all the data, up to 20 tracks in the
worst cases. That is definitely not good.

One of the goals of a normalized table structure is to minimize the number of "empty cells". In the CD
table we are discussing, we would have many of them if we allowed CDs with 20 or more tracks.
Whenever a list of fields can expand "to the right", as in this CD example, it is a hint that we need to
split the data into two or more tables that we can later access together to get the data we need.

Many people who are new to relational database systems do not really know what the "relational" in
RDBMS (Relational Database Management System) means. In simple terms, similar groups of
information are stored in separate tables that can later be "joined" (related) based on the data they
have in common. Unfortunately this sounds rather academic and vague, but our CD database gives us
a concrete situation in which to see how to normalize data.

Realizing that each CD entry has a fixed set of attributes (title, artist, year, genre) and a variable set of
attributes (the number of tracks) gives us an idea of how to split the data into multiple tables that we
can then relate to one another. We can create one table containing the list of all albums and their fixed
attributes, and another containing the list of all the tracks on those albums. This way, instead of
thinking horizontally (as with the spreadsheet), we think vertically and end up with a table structure
like the one shown below.

CREATE TABLE album (
    id_album INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
    titulo VARCHAR(80) NOT NULL );

CREATE TABLE pista (
    id_album INTEGER NOT NULL,
    numero INTEGER NOT NULL,
    titulo VARCHAR(80) NOT NULL );

The album identifier (id_album) is what relates the individual tracks to a given album. The id_album
field in the pista table matches the id_album field in the album table, so to obtain the list of all the
tracks of a given album we could run a query like this:
SELECT pista.numero, pista.titulo FROM album, pista
WHERE album.titulo = "El titulo del album"
AND album.id_album = pista.id_album

This structure is both flexible and efficient. The flexibility lies in the fact that we can add data to the
system later without having to rewrite what we already have. For example, if we wanted to add
information about the artists of each album, all we have to do is create an artista table related to the
album table in the same way the pista table is (see the sketch below). We therefore do not have to
modify the structure of our current tables, just add the one that is missing.
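
A minimal sketch of such a table (the column set is hypothetical; the point is that it relates to album
through id_album, just as pista does):

CREATE TABLE artista (
    id_album INTEGER NOT NULL,
    nombre VARCHAR(80) NOT NULL );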

The efficiency refers to the fact that we have no duplicated data, nor large amounts of "empty cells".
MySQL therefore does not have to store more data than necessary, nor waste resources scanning
empty areas of our tables.

The main goal of database design is to produce tables that model the records in which we will store
our information. It is important that this information be stored without redundancy, so that data
retrieval is quick and efficient. Through normalization we try to avoid certain defects that lead to a
bad design and to less efficient processing of the data.

We could say that these are the main goals of normalization:

Control the redundancy of the information.
Avoid loss of information.
Be able to represent all of the information.
Maintain the consistency of the data.

If we are newcomers to the world of relational databases we might think that normalization gives our
data a strange appearance; however, it is what allows MySQL to be very efficient at storing and
retrieving data from the tables, and it gives us the flexibility to grow and scale our applications
without having to restructure the database at every turn.

Select the appropriate data type

Once all the tables and columns the database needs have been identified, we must determine the data
type of each field. There are three main categories that apply to practically any database application:

Text
Numbers
Date and time

Each of these has its own variants, so choosing the right data type not only determines the kind of
information that can be stored in each field, it also affects the overall performance of the database.

Here are some tips that will help us choose an appropriate data type for our tables; a short sketch
applying them follows.

Identify whether a column should be of text, numeric or date type.

This is usually a very easy step. Mostly-numeric values such as postal codes or monetary amounts
should be treated as text fields if we decide to keep their punctuation, but we will get better results
if we store them as numbers and solve the formatting question some other way.

Choose the most appropriate subtype for each column.

Fixed-length fields (such as CHAR) are generally faster than variable-length ones (such as
VARCHAR), although they take up more disk space.

The size of each field should be restricted to the minimum that still fits the largest possible entry.
For example, if the value in an integer column is always under a thousand, the best choice is to
configure the column as a three-digit unsigned SMALLINT (which allows exactly 999 distinct
values).

Configure the maximum length for text and numeric columns, as well as other attributes.

We may each have different preferences, but the most important factor is always to fit the
information of each field as tightly as possible, instead of always reaching for generic (and
inefficient) TEXT and INT types.
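
Here is the minimal sketch just mentioned, applied to a hypothetical table: a fixed-length CHAR for a
code whose size never varies, a bounded VARCHAR for the name, an unsigned SMALLINT for small
quantities, and DECIMAL for exact monetary amounts:

CREATE TABLE productos_ejemplo (
    codigo CHAR(8) NOT NULL,                 -- fixed length: codes are always 8 characters
    nombre VARCHAR(80) NOT NULL,             -- variable length, with an explicit bound
    existencias SMALLINT UNSIGNED NOT NULL,  -- stock counts are small, non-negative numbers
    precio DECIMAL(8,2) NOT NULL             -- exact type for monetary values
);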

Use indexes appropriately

Indexes are a special mechanism databases use to improve their overall performance. By defining
indexes on the columns of a table, we tell MySQL to pay special attention to those columns.

MySQL allows up to 32 indexes per table, and each index can span up to 16 columns. Although a
multi-column index may not seem obviously useful at first sight, it is actually very useful for frequent
searches over the same set of columns.

Since indexes make queries run faster, we may be tempted to index every column in our tables. What
we have to understand, however, is that using indexes has a price. Every time we run an INSERT,
UPDATE, REPLACE, or DELETE on a table, MySQL has to update every index on that table to reflect
the changes to the data.

So, how do we decide whether to use an index or not? The answer is "it depends". In simple terms, it
depends on what kind of queries we run and how often we run them, although it really depends on
many other things as well.

The reason to have an index on a column is to let MySQL execute searches as fast as possible (and
avoid full table scans). We can think of an index as containing one entry per unique value in the
column. In the index, MySQL has to account for every duplicated value, and those duplicates decrease
the efficiency and usefulness of the index.

So before indexing a column we should consider what percentage of its entries are duplicates. If the
percentage is too high, we will probably not see any improvement from an index.

Another thing to consider is how often the indexes will be used. MySQL can use an index on a
particular column only if that column appears in the WHERE clause of a query. If we very rarely use a
column in a WHERE clause, it probably makes little sense to index it. It is likely more efficient to
suffer a full table scan on the rare occasions the column is used in a query than to update the index
every time the data in the table changes.

When in doubt, there is no alternative but to test. We can always run some tests against the data in
our tables with and without indexes to see which gets us results faster. The only thing to keep in mind
is that the tests should be as realistic as possible.
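
A minimal sketch of such a test (the index, table and query are hypothetical): create the index, then
use EXPLAIN to check whether MySQL actually chooses it for the query we care about, and compare
timings with and without it:

-- Index a column that appears frequently in WHERE clauses
CREATE INDEX idx_precio ON productos (precio);

-- EXPLAIN reports whether idx_precio is used or a full table scan is performed
EXPLAIN SELECT nombre, precio FROM productos WHERE precio < 100;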

Use REPLACE queries

There are times when we want to insert a record unless it is already present in the table; if the record
already exists, what we want is to update its data. Instead of writing code that implements this logic,
and having to execute several queries, the best option is to use MySQL's REPLACE statement.
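
REPLACE works like INSERT, except that when the new row has the same value as an existing row
for a PRIMARY KEY or UNIQUE index, the old row is replaced. A minimal sketch, assuming a
hypothetical productos table whose primary key is id_producto:

REPLACE INTO productos (id_producto, nombre, precio)
VALUES (7, 'Lámpara de escritorio', 129.99);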

Use temporary tables

When we are working with very large tables, it often happens that we occasionally need to run some
queries over a small subset of a large amount of data. Instead of running those queries against the
whole table and making MySQL find the few records we need every single time, it can be much faster
to select those records into a temporary table and then run our queries against that table.

Creating a temporary table is as simple as adding the word TEMPORARY to a typical CREATE
TABLE statement (a concrete example follows the template below):

CREATE TEMPORARY TABLE tabla_temp
(
    campo1 tipoDato,
    campo2 tipoDato,
    ...
);
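
For instance, a hypothetical sketch that materializes a small subset of a large pedidos table and then
queries it:

CREATE TEMPORARY TABLE pedidos_recientes
SELECT * FROM pedidos WHERE fecha >= '2001-01-01';

SELECT COUNT(*) FROM pedidos_recientes;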

A temporary table exists for as long as the connection to MySQL lasts. When the connection is closed,
MySQL automatically removes the table and frees the space it used. We can of course drop the table
ourselves while we are still connected. If a table named tabla_temp already exists in our database when
we create a temporary table with the same name, the temporary table hides the non-temporary one.

MySQL also lets us specify that a temporary table be created in memory, by declaring it of type HEAP:

CREATE TEMPORARY TABLE tabla_temp
(
    campo1 tipoDato,
    campo2 tipoDato,
    ...
) TYPE = HEAP;

Since HEAP tables are stored in memory, queries against them run much faster than against
non-temporary, on-disk tables. However, HEAP tables are slightly different from normal tables and
have some limitations of their own.

As with the previous suggestions, the only thing left is to test whether our queries run faster with
temporary tables than against the table holding the large amount of data. If the data is well indexed,
temporary tables may not be of much use to us.

Use a recent version of MySQL

The recommendation is simple and concrete: whenever it is within our control, we should use the
most recent version of MySQL available. Besides frequently including many improvements, newer
versions keep getting more stable and faster. This way, while taking advantage of the new features
added to MySQL, we will also see significant gains in the efficiency of our database server.

Final considerations
The last step in database design is to adopt certain naming conventions. Although MySQL is very
flexible about how we name databases, tables and columns, here are some rules that are convenient
to observe:

Use alphanumeric characters.
Limit names to fewer than 64 characters (this is a MySQL restriction).
Use the underscore (_) to separate words.
Use lowercase words (this is more a personal preference than a rule).
Table names should be plural and column names singular (again a personal preference).
Use the letters ID in primary and foreign key columns.
In a table, place the primary key first, followed by the foreign keys.
Field names should be descriptive of their content.
Field names should be unique across tables, with the exception of the keys.

Many of the points above are personal preferences rather than rules that must be complied with, and
consequently many of them can be skipped. What matters most, however, is that the naming scheme
used in our databases be coherent and consistent, in order to minimize the chance of errors when
building a database application.

Data type (synonyms) / Size / Description

BINARY (VARBINARY, BINARY VARYING, BIT VARYING)
Size: 1 byte per character.
Any type of data can be stored in a field of this type. The data is not translated (for example, to
text); how the data is entered in a binary field dictates how it will appear when displayed.

BIT (BOOLEAN, LOGICAL, LOGICAL1, YES/NO)
Size: 1 byte.
Yes and No values, and fields that contain only one of two values.

TINYINT (INTEGER1, BYTE)
Size: 1 byte.
An integer between 0 and 255.

COUNTER (AUTOINCREMENT)
Used for counter fields whose value is automatically incremented whenever a new record is created.

MONEY (CURRENCY)
Size: 8 bytes.
A scaled integer between -922,337,203,685,477.5808 and 922,337,203,685,477.5807.

DATETIME (DATE, TIME)
Size: 8 bytes.
A date or time value between the years 100 and 9999.

UNIQUEIDENTIFIER (GUID)
Size: 128 bits.
A unique identification number used with remote procedure calls.

DECIMAL (NUMERIC, DEC)
Size: 17 bytes.
An exact numeric data type holding values from 10^28 - 1 through -10^28 - 1. Both the precision
(1-28) and the scale (0 through the defined precision) can be defined. The default precision and
scale are 18 and 0, respectively.

REAL (SINGLE, FLOAT4, IEEESINGLE)
Size: 4 bytes.
A single-precision floating-point value ranging from -3.402823E38 to -1.401298E-45 for negative
values, from 1.401298E-45 to 3.402823E38 for positive values, and 0.

FLOAT (DOUBLE, FLOAT8, IEEEDOUBLE, NUMBER)
Size: 8 bytes.
A double-precision floating-point value ranging from -1.79769313486232E308 to
-4.94065645841247E-324 for negative values, from 4.94065645841247E-324 to
1.79769313486232E308 for positive values, and 0.

SMALLINT (SHORT, INTEGER2)
Size: 2 bytes.
A short integer between -32,768 and 32,767.

INTEGER (LONG, INT, INTEGER4)
Size: 4 bytes.
A long integer between -2,147,483,648 and 2,147,483,647.

IMAGE (LONGBINARY, GENERAL, OLEOBJECT)
Size: as required.
From zero to a maximum of 2.14 gigabytes. Used for OLE objects.

TEXT (LONGTEXT, LONGCHAR, MEMO, NOTE, NTEXT)
Size: 2 bytes per character (see the notes).
From zero to a maximum of 2.14 gigabytes.

CHAR (TEXT(n), ALPHANUMERIC, CHARACTER, STRING, VARCHAR, CHARACTER VARYING,
NCHAR, NATIONAL CHARACTER, NATIONAL CHAR, NATIONAL CHARACTER VARYING,
NATIONAL CHAR VARYING)
Size: 2 bytes per character (see the notes).
From zero to 255 characters.
