Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

I have a Category table like below :

enter image description here

I have a recursive relationship on this Category table. I use soft delete in my project. I have 2000 rows in Category table. When I want to show data in a UI dropdown it takes about 3 minutes.

Which columns need an index to improve query speed?

Table definition:

 CREATE TABLE [dbo].[Category](
    [CategoryId] [uniqueidentifier] NOT NULL,
    [CategoryCode] [int] IDENTITY(1,1) NOT NULL,
    [CategoryName] [nvarchar](400) NOT NULL,
    [ParentId] [uniqueidentifier] NULL,
    [DisplayOrder] [int] NOT NULL,
    [Description] [nvarchar](max) NULL,
    [IsActive] [bit] NOT NULL,
    [IsShowOnMenu] [bit] NOT NULL,
    [AttachmentId] [uniqueidentifier] NULL,
    [Depth] [int] NOT NULL,
    [Path] [nvarchar](max) NULL,
    [IsDeleted] [bit] NOT NULL,

 CONSTRAINT [PK_dbo.Category] PRIMARY KEY CLUSTERED 
(
    [CategoryId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
)

My query is in C#. I used C# statment to generate show all category in dropdown. My query is:

exec sp_executesql N'SELECT 
[Project1].[DisplayOrder] AS [DisplayOrder], 
[Project1].[CategoryId] AS [CategoryId], 
[Project1].[CategoryName] AS [CategoryName]
FROM ( SELECT 
    [Extent1].[CategoryId] AS [CategoryId], 
    [Extent1].[CategoryName] AS [CategoryName], 
    [Extent1].[DisplayOrder] AS [DisplayOrder]
    FROM [dbo].[Category] AS [Extent1]
    WHERE (([Extent1].[IsDeleted] = @DynamicFilterParam_1) OR (@DynamicFilterParam_2 IS NOT NULL)) AND ([Extent1].[ParentId] = @p__linq__0)
)  AS [Project1]
ORDER BY [Project1].[DisplayOrder] ASC',N'@DynamicFilterParam_1 bit,@DynamicFilterParam_2 bit,@p__linq__0 uniqueidentifier',@DynamicFilterParam_1=0,@DynamicFilterParam_2=NULL,@p__linq__0='8d8dd739-a132-e3ef-5443-159ecf8adc44'

Above query runs for each item, because I need to get all children.

share|improve this question

This question has an open bounty worth +100 reputation from Paul White ending in 4 days.

This question has not received enough attention.

3  
I would expect a composite index on ParentId and CategoryId would be most beneficial to a recursive query, especially if clustered or with included columns to cover the query and perhaps filtered. Even with no useful index, 3 minutes seem very high, even with many levels of recursion. Provide the DDL and query like Michael requested. – Dan Guzman May 28 at 12:39
2  
I think you should try to refactor your code on the client side so you only have to fetch the 2000 rows once. Having the data client side an doing loops an lookups or whatever is needed to build the structure you want will be faster than hitting the db with that query 2000 times. – Mikael Eriksson May 29 at 6:11
1  
As @Mikael Eriksson said, you are not doing recursion inside SQL server itself, but rather from C# by executing this query repeatedly (2000 times). Based on the above, indexes will not solve your problem (you may get an improvement of, say, 25% but you will not get the "orders of magnitude" improvement that you are looking for). You need to change the overall design, by either using caching on C# and building the tree in C# or generating a tree in the database and returning that to C#. – Alex 2 days ago
    
@KarInter can you add the actual plan xml (use pastebin and link it here) please ? – Kin 2 days ago

There are a number of ways the design of your model can change to speed up this kind of query.

a) Materialized path

You basically store a path for each node in the tree. Retrieving the sub tree is something like:

select ...
from category
where path like :current_path || '%';

When adding a category somewhere in the tree, the new path becomes the path of the parent + parent.

b) nested sets

For each category you store an upper and a lower bound. Retrieving the sub tree is something like:

select ...
from category
where upper_bound < :current_upper_bound
  and lower_bound > :current_lower_bound;

Maintaining the tree is a bit of a challenge. Inserting a category somewhere may cause a renumbering of large parts of the table. If you're tree is dynamic and changes a lot you should probably look for a different solution.

c) Transitive closure

The idea is to keep a separate table holding the transitive closure of the tree. Say:

CREATE TABLE dbo.Category_Closure (
    CategoryId ... NOT NULL,
    AncestorId ... NOT NULL,
        PRIMARY KEY (CategoryId, AncestorId)
);

Retrieving the sub tree is something like:

select c.*
from category c
join category_closure cc
    on c.CategoryId = cc.CategoryId
where cc.AncestorId = :current_category_id;

When adding a category somewhere in the tree, you also add the ancestors of the new parent + the parent together with the new category in the closure table. Using triggers to maintain the closure table is an common approach.

There's a lot of materials written regarding this. I suggest you google for:

Nested sets + sql
Materialized Path + sql
Transitive Closure + sql

I have some notes on Transitive Closure and how to maintain it via triggers at:

http://dustbite.se/tree/

that may be of interest

share|improve this answer

Try this index for some noticeable improvement:

CREATE NONCLUSTERED INDEX [CatIndex1] ON [dbo].[Category]
(
    [ParentID] ASC
)
INCLUDE (   [CategoryID],
    [CategoryName],
    [DisplayORder]) 

I am interested in the time reduction. Please let me know your findings.

p.s. consider sorting only once with the ORDER BY if you're running it multiple times. Perhaps put data into a C# DataTable or SQL temp table and sort at the very end of all retrieval operations.

share|improve this answer
    
Is adding CategoryID in the include unnecessary since CategoryID is the PK? – dfundako 2 days ago
1  
not according to my findings BUT try it both ways--with and without the Category ID. The include bundles the data in a smaller package rather than traveling the index tree. Either way, it's a tiny index relatively speaking. – Sting 2 days ago
2  
@dfundako not because it's primary but because it's the clustered index – Tom V 2 days ago

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.