SQL Server | Designing New Databases

"The desire to create is one of the deepest yearnings of the human soul." 

-- Dieter F. Uchtdorf

 

Before we jump into the designing part, lets take a quick look at how the SQL Server databases are formed.

Database Files

Ever database has one Primary data file, which by default has an .MDF extension.

In addition, ever database can also have secondary database files, which have .NDF extension.


 

All database files are grouped into FileGroups. FileGroup is a logical unit that simplifies database administration. 

Imagine houses in an area. We recognize each house with it's own number but we also include a lot og groups in the address. Sector is a Filegroup. Pin-code is a Filegroup.



This is how it looks like in database.

There are 5 disks with 4 data files and 1 transaction log file.

You can add multiple data files inside a filegroup. This lets you spread the load across different storage drives. This could help you improve I/O performance.

Filegroup1 - green - contains the MDF file on drive M:

Filegroup2 - yellow - contains one NDF file on drive N:

Filegroup3 - red - contains two NDF files across drives O: and P:


Log File - Due it's sequential nature, it performs best if it's kept on a separate physical disk. These are not part of any filegroup.

Now when you create some new tables:

CREATE TABLE dbo.Customers

    col1..
    col2..
) ON [yellow];

CREATE TABLE dbo.Articles
    col1..
    col2..
) ON [yellow];

CREATE TABLE dbo.Orders
    col1..
    col2..
) ON [red];

 

The tables will be created as follows:


Raid Models


Raid 0

Minimum Disks = 2

Increased Performance 🙂

Maximum Capacity 🙂

No Fault Tolerance 😡

Advantages: extremely high disk read and write efficiency, there is no verification, does not take up too much CPU resources, and the design, use and configuration are relatively simple
Disadvantages: no redundancy, cannot be used in environments with high data security requirements

Raid 0 - Stripping Data into multiple drives





Raid 1

Minimum Disks = 2. Only 1 disk failure is handled.

Reduced Performance 😐 

50% Capacity 😡

Fault Tolerance 🙂 - 1 disk crash

Advantages: It has 100% data redundancy, provides the highest data security guarantee, theoretically can achieve 2 times the reading efficiency, and the design and use are relatively simple
Disadvantages: high overhead, space utilization is only 50%, little improvement in write performance

RAID1

 

 

Raid 5 (Parity)

Requires 3 or more disks. Only 1 disk failure is handled.

Parity Data  🙂

Fault Tolerance 🙂

Good Performance 😐

Reduced Capacity 😐

Advantages: relatively high read performance, medium write performance, distributed access to check information, avoiding write operation bottlenecks
Disadvantages: The controller design is complicated, and the disk reconstruction process is more complicated

 


Raid 1 + 0

Requires 4 or more disks. Only 1 disk failure is handled within each sub array.

Increased Performance 🙂

50% Capacity 😡

Fault Tolerance 🙂

Advantages: high read performance, better write performance, good data security, allowing N disk failures at the same time
Disadvantages: The space utilization is only 50%, and the overhead is large. It is not allowed to damage the above Disk0 and Disk2 or Disk1 and Disk3 together.

 

Best Practices:


 

RAID 1 : often chosen to store operating systems, binaries, index file groups and database transaction log files. Critical to log and index performance is fault tolerance and a good write speed.

RAID 1 or 10 : Since log file are written sequentially and are read only for recovery operations, these are the recommended RAID for Log Files. If your RAID 1 is at a 100% usage, choose RAID 10 for better performance.

RAID is 5 or 10 : For data files with random access and read heavy data volumes, striping is important. 

RAID 10 : For data files which require good write performance. Use a battery backed caching RAID controller for better write performance

RAID 1 or 10 : For tempdb files with good read/write performance RAID 1 or 10 is recommended. Although tempdb contains temporary data and DBA’s often go in for RAID 0 for tempdb, keep in mind that SQL Server requires tempdb to carry out many of its activities. So if you want your system to be always available, rethink RAID 0.

RAID 5 : Putting log files or tempdb on a RAID 5 array is not recommended, since RAID 5 does not perform well for write operations

 

More coming soon..

Good Luck on your journey.

 

 

Comments

Popular posts from this blog

Index Clean-Up Scripts

forgot sa password and no logins are added

The SQL Server DBA’s Guide to Teradata