Feb 13
Sharepoint 2003 performance issue with document libraries
Situation:
I’ve been put in charge of an application running on the Sharepoint 2003 plateform. The application composed of many webparts and stores its data directly within lists and document libraries. The website is split by language, one subsite by language. For example:
- /root/English
- /root/French
- /root/Spanish
- …
The purpose of this application is to store and search files with custom metadata attached to it. So we find a customized document library in each language subsite.
The problem:
Overall, there is about 1 000 000 records in those libraries (the biggest library is english with about 400 000 records) and searching through this using custom CAML queries is a pain!
At some point in the application we need to know in which language a document is available. For this we query each document libraries using a filter on a metadata that contains our unique identifier.
<where>
<fieldref name="MyUniqueID">
<value type="Number">00000</value>
</where>
The tracing revealed an increasing time of processing. We currently have 7 languages and the processing time goes like that:
- 1st language: less than 1 second
- 2nd language: about 2 seconds
- 3rd language: about 5 seconds
- 4th language: about 10 seconds
- 5th language: about 20 seconds
- 6th language: about 20 seconds
- 7th language: about 20 seconds
Work Around:
First of all, the design of the application is not the best but that’s not the purpose of this post.
Some would say, get that data out of Sharepoint! Ya, that’s what we’re doing (for many reasons…) but in the meantime, the system must continue to be up and running.
Metadata is dynamic (it varies from list to list) and Sharepoint is not optimized to search using this. But Sharepoint recognize when you search using its internal identifier and it goes very fast. So if we could map our identifier to the sharepoint internal identier, it should be faster. I found out that doing this 2 steps operation is actually faster (in my scenario it reduced the total from 77 seconds to 2 seconds average):
- Run an ugly SQL statement directly on the Sharepoint database to find the Sharepoint internal identifier based on our own unique identifier.
- Query Sharepoint API (using CAML) with Sharepoint internal identifier.
the SQL statement looks like this:
SELECT tp_ID FROM UserData WHERE tp_ListID = '...' AND Float1 = ...
Float1 stores our unique identifier. I determine it at run-time by looking at the Fields property (not really sure by heart of the property name…). It contains the Sharepoint column Name / SQL column name mappings.
To get the list identifier (a GUID), just take the ID property of the SPList object.
That’s about it… I could talk about it for hours but I think I said the important points.
Additional notes:
Maybe the following hotfix would have correct the problem:
http://support.microsoft.com/?id=900929
See this explanation from Keith Richie’s blog:
http://blogs.msdn.com/krichie/archive/2006/07/20/673197.aspx
No comments