Overcoming the List View Threshold in SharePoint CAML queries
From: https://www.codeproject.com/articles/1076854/overcoming-the-list-view-threshold-in-sharepoint-c
Introduction
When your CAML queries start to hit the list view threshold, you'll think it will never work. It can work, but it's tough. This article brings together the tips and tricks for building CAML queries that I've gathered over the past year or so.
When using large lists in SharePoint, you will undoubtedly encounter the List View Threshold. This is a fixed limit of 5000 rows which can be returned in a single view. Now, that's a vast oversimplification - in reality there are ways to avoid seeing this limit. In this article, I will focus on methods of handling this limit in your CAML query code. Specifically, I will be using C# and the Client-Side Object Model (CSOM), although the JavaScript Object Model will be exactly the same and most of the issues are also relevant in the Server Object Model.
Do not confuse the list view threshold (5000) with the capacity limit of a list, which is 30 million items, or 50,000 items with unique permissions.
History
First, a quick history of the 5000 item limit. It is enforced in SharePoint 2010, 2013 and 2016, as well as SharePoint Online (Office 365). You can change the limit in your on-premises environment, but that's not recommended, so I'm not even going to say how. You could change your limit from 5,000 to 20,000, for example, but what happens when your list grows to 20,000 items? You will be better served by changing your schema and writing queries to address this limit, using the techniques in this article.
Underlying a SharePoint list is an SQL Server table. When you perform a CAML query, the query results in an SQL query against SQL Server. Now, in SQL Server, locking items during a query execution is a small performance hit. When you lock a large enough number of items, the lock is escalated to the *entire* table - which, as you can imagine, causes a general performance hit with other queries against that table. So, SharePoint prevents this from happening by enforcing a threshold of 5000 items returned in a single query. This way, as developers, we're forced to improve our schema and querying skills to avoid this situation.
In SharePoint 2016, this problem is mitigated slightly in a few ways:
- List View Auto-Indexing
This causes columns to be indexed automatically if SharePoint detects that it would result in a performance improvement.
- Allows retrospective creation of indices
In SP2013, you cannot add an index to a column of a list containing more than 5000 items. In SP2016, this will be allowed.
- Smarter list-view-threshold violation detection
It will more reliably detect when a query should be throttled.
- Improving default Document Library views
The out-of-the-box document library view will no longer sort folders first, avoiding a potential list view threshold error.
We can see from the above points that some progress has been made in managing large lists. However, the list view threshold remains - so from a querying perspective nothing has changed.
For more information, see Bill Baer's blog post on the topic: http://blogs.technet.com/b/wbaer/archive/2015/08/27/navigating-list-view-thresholds-in-sharepoint-server-2016-it-preview.aspx
SharePoint UI
If you have more than 5000 items in a list you'll get a warning in the list settings - "The number of items in this list exceeds the list view threshold". This means that many UI functions will no longer work, and your custom views will probably no longer function.
The list above has about three quarters of a million items, and is a test list for Repstor custodian - so this proves that yes, you can use large lists with some smart querying!
Column indexing
Sorting will no longer work except on indexed columns. Unfortunately, you can't even add an index to a column while the list contains more than 5000 items, so if your list may grow to this size, you need to prepare in advance. This will be improved in SharePoint 2016, though.
The ID column is automatically indexed, so by default, you can sort on the ID column with 5k+ items present. You can have up to 20 columns indexed. As described above, in SharePoint 2016, column indices can be automatically managed - however, if you're planning to do some querying then you will want to explicitly specify your indices.
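If you would rather set an index from code than through the list settings page, a minimal CSOM sketch might look like the following. The site URL, the list title "BigList" and the field name "IndexedCol" are placeholders of mine, and I'm assuming the default credentials are enough to connect; on SP2013 this is still subject to the restriction above on lists that already hold more than 5000 items.

```csharp
using Microsoft.SharePoint.Client;

class IndexColumn
{
    static void Main()
    {
        // Placeholder URL, list title and field name: substitute your own.
        using (var clientContext = new ClientContext("https://example.sharepoint.local/sites/demo"))
        {
            List list = clientContext.Web.Lists.GetByTitle("BigList");
            Field field = list.Fields.GetByInternalNameOrTitle("IndexedCol");

            field.Indexed = true;          // request an index on this column
            field.Update();                // queue the change
            clientContext.ExecuteQuery();  // send it to the server
        }
    }
}
```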
Filtered views
Even when all relevant columns are indexed, you can't present a filtered view when that view would display more than 5000 items, even when it is paged. Unfortunately paging doesn't really help at all when navigating the issue, since you're still forcing an underlying scan of more than 5000 items. One of the tough things to understand is that the query, excluding paging, must never exceed 5000 results except in some trivial circumstances.
CAML
In these examples, I'll use a few conventions. My list has, let's say, a million items. It has the following columns: ID, IndexedCol, and NonIndexedCol, which should be fairly self-explanatory; IndexedCol is indexed, NonIndexedCol is not. All of the following are completely valid CAML and will always work if you have fewer than 5k items.
This simple CAML query will work:
<View>
  <RowLimit>10</RowLimit>
</View>
Now, even though it's not including a filter, only the start of the table is being scanned: just the first 10 items are being picked up. However, if we don't restrict it to 10 items, we'll get an error - this query will not work:
<View>
</View>
Let's assume there are only 1000 rows where IndexedCol equals 'match1k'. This query will work, even though we don't include a <RowLimit> tag:
<View>
  <Query>
    <Where>
      <Eq>
        <FieldRef Name='IndexedCol' />
        <Value Type='Text'>match1k</Value>
      </Eq>
    </Where>
  </Query>
</View>
That makes sense - in SQL, only 1000 rows are matched by the WHERE clause. Let's now assume there are 6000 rows where IndexedCol equals 'match6k'. This query will not work:
<View>
  <Query>
    <Where>
      <Eq>
        <FieldRef Name='IndexedCol' />
        <Value Type='Text'>match6k</Value>
      </Eq>
    </Where>
  </Query>
</View>
However, combining the queries using an AND operator will work in this instance:
<View>
  <Query>
    <Where>
      <And>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match1k</Value>
        </Eq>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match6k</Value>
        </Eq>
      </And>
    </Where>
  </Query>
</View>
Seems obvious, doesn't it? However, confusingly, the following query will not work even though it appears to be the same as the query above:
<View>
  <Query>
    <Where>
      <And>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match6k</Value>
        </Eq>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match1k</Value>
        </Eq>
      </And>
    </Where>
  </Query>
</View>
Why doesn't it work? Because 6000 matches are scanned from the first part of the query (IndexedCol = 'match6k'), and the threshold error occurs before hitting the second conditional of the WHERE clause. The lesson here is:
Order your WHERE conditionals with the most specific first.
Now, we'll try querying the non-indexed columns. This query will never work, even if it doesn't match any items:
<View>
  <Query>
    <Where>
      <Eq>
        <FieldRef Name='NonIndexedCol' />
        <Value Type='Text'>matchNone</Value>
      </Eq>
    </Where>
  </Query>
</View>
This is because:
Non-indexed columns can never be used for filtering in a list with 5000+ items - regardless of how many matches there are.
OR
Now we move on to the use of 'OR'. Unfortunately, we're pretty much stuck here. Using 'OR' against a list with more than 5000 items will ALWAYS result in a list view threshold error! So, the OR section is pretty short: don't use OR! Your only option here is to run multiple queries.
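Running multiple queries means merging the results yourself. Here is a rough sketch of that merge, not tied to any SharePoint API: each branch of the would-be OR is a callback (hypothetical, my own invention for illustration) that runs one safe query and returns ID/Title pairs, and the helper de-duplicates by item ID in case two branches return the same item.

```csharp
using System;
using System.Collections.Generic;

static class OrEmulation
{
    // Runs one query per OR branch and merges the results by item ID.
    // Each element of branchQueries is a hypothetical callback standing in for
    // a single threshold-safe CAML query; it returns (ID, Title) pairs.
    public static List<KeyValuePair<int, string>> QueryAny(
        params Func<IEnumerable<KeyValuePair<int, string>>>[] branchQueries)
    {
        var merged = new SortedDictionary<int, string>();
        foreach (var runBranch in branchQueries)
        {
            foreach (KeyValuePair<int, string> item in runBranch())
            {
                merged[item.Key] = item.Value; // an ID seen in two branches is kept once
            }
        }
        // SortedDictionary enumerates in ascending ID order.
        return new List<KeyValuePair<int, string>>(merged);
    }
}
```

Each branch still has to obey every other rule in this article on its own: indexed column, most specific condition first, fewer than 5000 matches.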
Ordering
You can order your results as long as you meet two requirements:
- Your query is valid according to the above rules and does not break the list view threshold (obviously),
- The field you are ordering by is indexed.
Hence this very simple query will work:
<View>
  <Query>
    <OrderBy>
      <FieldRef Name='IndexedCol' Ascending='False' />
    </OrderBy>
  </Query>
  <RowLimit>10</RowLimit>
</View>
This very simple query will not work as it's on a non-indexed column:
<View>
  <Query>
    <OrderBy>
      <FieldRef Name='NonIndexedCol' Ascending='False' />
    </OrderBy>
  </Query>
  <RowLimit>10</RowLimit>
</View>
Remember, if you are including a WHERE clause with the above, your WHERE should match a maximum of 5000 results, regardless of your use of the RowLimit element. So, this will work:
<View>
  <Query>
    <Where>
      <Eq>
        <FieldRef Name='IndexedCol' />
        <Value Type='Text'>match1k</Value>
      </Eq>
    </Where>
    <OrderBy>
      <FieldRef Name='IndexedCol' Ascending='False' />
    </OrderBy>
  </Query>
</View>
Paging
If you have large lists then you will almost always want to take advantage of paging. Paging works brilliantly when you have no filter, or a filter that returns fewer than 5000 items. So, you can query the first "page" of most recent items with a simple query like this, which will work:
<View>
  <Query>
    <OrderBy>
      <FieldRef Name='IndexedCol' Ascending='False' />
    </OrderBy>
  </Query>
  <RowLimit>10</RowLimit>
</View>
This query does not break the list view threshold: it has no WHERE clause, and the RowLimit restricts it to the first 10 items.
To retrieve the next page following on from the 10th item returned, you then specify the value to continue on from via the ListItemCollectionPosition field on the CamlQuery object:
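using Microsoft.SharePoint.Client;

CamlQuery camlQuery = new CamlQuery();
// ListItemCollectionPosition is an object; the paging string goes in its PagingInfo property
camlQuery.ListItemCollectionPosition = new ListItemCollectionPosition { PagingInfo = "Paged=TRUE&p_ID=1034" };
camlQuery.ViewXml = "..."; // the View element of the query
ListItemCollection listItems = list.GetItems(camlQuery);
clientContext.Load(listItems);
clientContext.ExecuteQuery();
// Keep listItems.ListItemCollectionPosition for the next page; it is null after the last page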
The value of the ListItemCollectionPosition property comes from the ListItemCollection.ListItemCollectionPosition of the previous page.
Again, this works if there is no filter, or there's a filter that returns fewer than 5000 items.
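Put together, a paging loop could be sketched roughly as follows. The class and method names are my own; it assumes you already have a connected ClientContext and a List, and that the View XML you pass in is safe on its own terms (it includes a RowLimit to set the page size, and has either no filter or a filter matching fewer than 5000 items).

```csharp
using System.Collections.Generic;
using Microsoft.SharePoint.Client;

static class PagedReader
{
    // Reads every page of a threshold-safe query, one round trip per page.
    public static List<ListItem> ReadAllPages(ClientContext clientContext, List list, string viewXml)
    {
        var results = new List<ListItem>();
        ListItemCollectionPosition position = null; // null means "start at the first page"

        do
        {
            var camlQuery = new CamlQuery
            {
                ViewXml = viewXml,
                ListItemCollectionPosition = position
            };

            ListItemCollection page = list.GetItems(camlQuery);
            clientContext.Load(page);
            clientContext.ExecuteQuery();

            results.AddRange(page);
            // The server hands back the position of the next page, or null when done.
            position = page.ListItemCollectionPosition;
        }
        while (position != null);

        return results;
    }
}
```

For a very large list you would probably process each page as it arrives rather than collecting everything in memory.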
Advanced paging techniques
The paged query above works because there is no WHERE clause in the query that can potentially cause a list view threshold error. For example, there's no way to retrieve all the items of this query that we saw earlier (this will not work):
<View>
  <Query>
    <Where>
      <Eq>
        <FieldRef Name='IndexedCol' />
        <Value Type='Text'>match6k</Value>
      </Eq>
    </Where>
  </Query>
</View>
If you really need to execute a query like this that potentially exceeds the list view threshold, then you may be able to craft your queries to achieve the effect of paging by adding additional WHERE clauses. For example, by adding a filter on ID, this will work:
<View>
  <Query>
    <Where>
      <And>
        <And>
          <Gt><FieldRef Name='ID' /><Value Type='Number'>0</Value></Gt>
          <Lt><FieldRef Name='ID' /><Value Type='Number'>5000</Value></Lt>
        </And>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match6k</Value>
        </Eq>
      </And>
    </Where>
    <OrderBy>
      <FieldRef Name='ID' Ascending='True' />
    </OrderBy>
  </Query>
  <RowLimit>60</RowLimit>
</View>
In the above example, the first part of the query narrows the result set down to items with ID between 0 and 5000. This prevents any possibility of exceeding the list view threshold. Then, it filters those into the items where IndexedCol = match6k. Finally, the RowLimit ensures that only 60 of the items are returned.
There are a few implications of this technique:
- You cannot predict how many results are returned, only that it's less than or equal to the RowLimit (60, in this case).
- You may need to re-run the query repeatedly to receive sufficient results.
To retrieve the next page of results, you must get the last returned item's ID (the highest ID, assuming we're sorted ascending). Using that ID, form a new query - with ID greater than that value, and less than that value plus 5000.
For example, if the highest ID returned previously was 2074, the next query to execute looks like this:
<View>
  <Query>
    <Where>
      <And>
        <And>
          <Gt><FieldRef Name='ID' /><Value Type='Number'>2074</Value></Gt>
          <Lt><FieldRef Name='ID' /><Value Type='Number'>7074</Value></Lt>
        </And>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match6k</Value>
        </Eq>
      </And>
    </Where>
    <OrderBy>
      <FieldRef Name='ID' Ascending='True' />
    </OrderBy>
  </Query>
  <RowLimit>60</RowLimit>
</View>
The above query will reliably return the items without exceeding the list view threshold. Simply repeat until you have reached the list's Max ID (which you'll have to retrieve separately).
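Since only the two ID bounds change from one query to the next, it may be worth generating the XML rather than pasting strings together. A small sketch using System.Xml.Linq follows; the class and parameter names are mine, and it emits the query wrapped in the View element that CamlQuery.ViewXml expects. A side benefit of building it this way is that the filter value gets XML-escaped for you.

```csharp
using System.Xml.Linq;

static class IdWindowCaml
{
    // Builds the View XML for one ID window:
    // fieldName == value AND lastId < ID < lastId + span, ordered by ID ascending.
    public static string Build(string fieldName, string value, int lastId, int span, int rowLimit)
    {
        var view = new XElement("View",
            new XElement("Query",
                new XElement("Where",
                    new XElement("And",
                        // ID range first, so the most restrictive condition is evaluated first
                        new XElement("And",
                            new XElement("Gt", FieldRef("ID"), Value("Number", lastId)),
                            new XElement("Lt", FieldRef("ID"), Value("Number", lastId + span))),
                        new XElement("Eq", FieldRef(fieldName), Value("Text", value)))),
                new XElement("OrderBy",
                    new XElement("FieldRef",
                        new XAttribute("Name", "ID"),
                        new XAttribute("Ascending", "True")))),
            new XElement("RowLimit", rowLimit));

        return view.ToString(SaveOptions.DisableFormatting);
    }

    static XElement FieldRef(string name) =>
        new XElement("FieldRef", new XAttribute("Name", name));

    static XElement Value(string type, object content) =>
        new XElement("Value", new XAttribute("Type", type), content);
}
```

Calling IdWindowCaml.Build("IndexedCol", "match6k", 2074, 5000, 60) should produce the equivalent of the second query above, on a single line.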
There's a potential problem with this method, though. Consider the following scenario:
- There are 1 million items in the list
- The first item to match the 'IndexedCol' = 'match6k' clause is item 900,000
In this case, the query will have to run 900,000 / 5000 = 180 times before it returns even one item!
There is a very effective enhancement to make to this technique, and that's to intelligently adjust the min and max IDs to span a range greater than 5000. Apply these rules:
- If no items are returned, then for the next query, double the ID span (e.g. increase 5000 items to 10000)
- If the list view threshold is exceeded, then repeat the same query but halve the ID span (e.g. reduce 10000 to 5000)
In this way, the query only has to run 8 times before it starts retrieving items: the spans grow 5,000, 10,000, 20,000 and so on, so the first seven queries cover IDs up to 635,000 and the 8th attempts IDs 635,000 to 1,275,000 (big numbers!). If it exceeds the list view threshold at this point, that's ok - the algorithm above will ensure that the range then scales down until it runs successfully.
Don't forget to cache the results, so that this doesn't need to happen too often.
You can start with a number less than 5000, if you wish to tweak performance. Likewise, you can triple instead of double the 'scale-up' factor, if that makes more sense for your data set.
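Here is one way the double-and-halve loop could be sketched, kept independent of SharePoint so the logic is easy to follow. Everything named here is hypothetical: queryWindow stands in for executing the ID-window query (returning matching IDs in ascending order, already capped by RowLimit), and ThresholdExceededException stands in for however the threshold error reaches you. In CSOM I would expect that to be a ServerException, but check what your environment actually throws.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the list view threshold error raised by the real query.
class ThresholdExceededException : Exception { }

static class AdaptiveIdScan
{
    // Walks the ID space from 0 to maxId, widening the window when a query
    // returns nothing and narrowing it when the threshold is exceeded.
    // queryWindow(min, max) must return matching IDs with min < ID < max, ascending.
    public static List<long> FindAll(long maxId, long initialSpan,
                                     Func<long, long, IList<long>> queryWindow)
    {
        var found = new List<long>();
        long lower = 0;
        long span = initialSpan;

        while (lower < maxId)
        {
            long upper = lower + span;
            IList<long> ids;
            try
            {
                ids = queryWindow(lower, upper);
            }
            catch (ThresholdExceededException)
            {
                if (span <= 2) throw;            // cannot narrow any further
                span = Math.Max(2, span / 2);    // halve the span and retry the same start
                continue;
            }

            if (ids.Count == 0)
            {
                lower = upper - 1;               // nothing in (lower, upper): move past it
                span *= 2;                       // and look further next time
            }
            else
            {
                found.AddRange(ids);
                lower = ids[ids.Count - 1];      // continue after the last ID returned
            }
        }
        return found;
    }
}
```

With a starting span of 5000 and the first match at ID 900,000, this reaches its first results on the eighth query, in line with the figures above. Whether to shrink the span again after a hit is a tuning decision I've left out.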
Less than & Greater than
You can't use Less than or Equal (Lte) or Greater than or Equal (Gte) in CAML queries that involve large numbers of items. I don't know why, but it doesn't work. Stick to Less than or Greater than (Lt/Gt).
Indexable field types
Not all field types are indexable. For example, User fields can be indexed and used in queries involving large numbers of items. However, a Multi-User field cannot. Please see the following link for more information: Creating SharePoint indexed columns.
Other List Thresholds
The List View Threshold is not the only limit you need to be aware of! Another list limit is the item-level permissions limit of 50,000 items. Often, permissions are set at the list level. However, if you choose to set unique permissions for each individual list item, then you can only do this for 50k items within any given list. This is a hard and absolute limit in SharePoint and if you need to exceed it, then you need to split your data across multiple lists.
See the following page for more information about boundaries and limits in SharePoint 2013.
Search API
If all of this is too much to handle, you might want to consider using the Search API. I would absolutely recommend this any time over grappling with the nuances of CAML!
However, if you choose to persevere with CAML, hopefully this guide helps. Please let me know in the comments any other tips or tricks, errors or omissions. Meanwhile, I'm going to go and cry in a corner and attempt to come to terms with all this CAML horribleness....