Overcoming the List View Threshold in SharePoint CAML queries
From: https://www.codeproject.com/articles/1076854/overcoming-the-list-view-threshold-in-sharepoint-c
Introduction
When your CAML queries start to hit the list view threshold, you'll think it will never work. It can work, but it's tough. This article brings together the tips and tricks for building CAML queries that I've gathered over the past year or so.
When using large lists in SharePoint, you will undoubtedly encounter the List View Threshold. This is a fixed limit of 5000 rows which can be returned in a single view. Now, that's a vast oversimplification - in reality there are ways to avoid seeing this limit. In this article, I will focus on methods of handling this limit in your CAML query code. Specifically, I will be using C# and the Client-Side Object Model (CSOM), although the JavaScript Object Model will be exactly the same and most of the issues are also relevant in the Server Object Model.
Do not confuse the list view threshold (5000) with the capacity limit of a list, which is somewhere in the region of 50 million items - or 50,000 items with unique permissions.
History
First, a quick history of the 5000-item limit. It is present in SharePoint 2010, 2013 and 2016, as well as SharePoint Online (Office 365). You can change the limit in an on-premises environment, but that's not recommended, so I'm not even going to say how. You could change your limit from 5,000 to 20,000, for example, but what happens when your list grows to 20,000 items? You will be better served by changing your schema and writing queries that respect this limit, using the techniques in this article.
Underlying a SharePoint list is an SQL Server table. When you perform a CAML query, the query results in an SQL query against SQL Server. Now, in SQL Server, locking items during a query execution is a small performance hit. When you lock a large enough number of items, the lock is escalated to the *entire* table - which, as you can imagine, causes a general performance hit with other queries against that table. So, SharePoint prevents this from happening by enforcing a threshold of 5000 items returned in a single query. This way, as developers, we're forced to improve our schema and querying skills to avoid this situation.
In SharePoint 2016, this problem is mitigated slightly in a few ways:
- List View Auto-Indexing: columns are indexed automatically if SharePoint detects that it would result in a performance improvement.
- Retrospective creation of indices: in SP2013, you cannot add an index to a column of a list containing more than 5000 items. In SP2016, this will be allowed.
- Smarter list-view-threshold violation detection: it will more reliably detect when a query should be throttled.
- Improved default Document Library views: the out-of-the-box document library view will no longer sort folders first, avoiding a potential list view threshold error.
We can see from the above points that some progress has been made in managing large lists. However, the list view threshold remains - so from a querying perspective nothing has changed.
For more information, see Bill Baer's blog post on the topic: http://blogs.technet.com/b/wbaer/archive/2015/08/27/navigating-list-view-thresholds-in-sharepoint-server-2016-it-preview.aspx
SharePoint UI
If you have more than 5000 items in a list you'll get a warning in the list settings - "The number of items in this list exceeds the list view threshold". This means that many UI functions will no longer work, and your custom views will probably no longer function.

The list this warning was captured from has about three quarters of a million items, and is a test list for Repstor custodian - so this proves that yes, you can use large lists with some smart querying!
Column indexing
Sorting will no longer work except on indexed columns. Unfortunately, you can't even add an index to a column while the list contains more than 5000 items, so if your list may grow to this size, you need to prepare in advance. This will be improved in SharePoint 2016, though.
The ID column is automatically indexed, so by default, you can sort on the ID column with 5k+ items present. You can have up to 20 columns indexed. As described above, in SharePoint 2016, column indices can be automatically managed - however, if you're planning to do some querying then you will want to explicitly specify your indices.
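As a rough sketch of specifying an index explicitly from CSOM, the helper below flips the Indexed flag on a field. The class and method names (ColumnIndexing, EnsureIndexed) are hypothetical, and it assumes the list is still small enough for the index to be created:

```csharp
using Microsoft.SharePoint.Client;

public static class ColumnIndexing
{
    // Marks a column as indexed if it is not indexed already.
    // Assumes the caller supplies a connected ClientContext and the target list.
    public static void EnsureIndexed(ClientContext clientContext, List list, string fieldName)
    {
        Field field = list.Fields.GetByInternalNameOrTitle(fieldName);
        clientContext.Load(field, f => f.Indexed);
        clientContext.ExecuteQuery();

        if (!field.Indexed)
        {
            field.Indexed = true;
            field.Update();
            clientContext.ExecuteQuery();
        }
    }
}
```

Calling something like this when the list is first provisioned avoids having to add the index after the list has grown past the threshold.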
Filtered views
Even when all relevant columns are indexed, you can't present a filtered view when that view would display more than 5000 items, even when it is paged. Unfortunately paging doesn't really help at all when navigating the issue, since you're still forcing an underlying scan of more than 5000 items. One of the tough things to understand is that the query, excluding paging, must never exceed 5000 results except in some trivial circumstances.
CAML
In these examples, I'll use a few conventions. My table has, let's say, a million items. It has the following columns: ID, IndexedCol, and NonIndexedCol, which should be fairly self explanatory; IndexedCol is indexed, NonIndexedCol is not. All of the following are completely valid CAML and will always work if you have fewer than 5k items.
This simple CAML query will work:
<View>
  <RowLimit>10</RowLimit>
</View>
Now, even though it's not including a filter, only the start of the table is being scanned: just the first 10 items are being picked up. However, if we don't restrict it to 10 items, we'll get an error - this query will not work:
<View>
</View>
Let's assume there are only 1000 rows where IndexedCol equals 'match1k'. This query will work, even though we don't include a <RowLimit> tag:
<View>
  <Query>
    <Where>
      <Eq>
        <FieldRef Name='IndexedCol' />
        <Value Type='Text'>match1k</Value>
      </Eq>
    </Where>
  </Query>
</View>
That makes sense - in SQL, only 1000 rows are matched by the WHERE clause. Let's now assume there are 6000 rows where IndexedCol equals 'match6k'. This query will not work:
<View>
  <Query>
    <Where>
      <Eq>
        <FieldRef Name='IndexedCol' />
        <Value Type='Text'>match6k</Value>
      </Eq>
    </Where>
  </Query>
</View>
However, combining the queries using an AND operator will work in this instance:
<View>
  <Query>
    <Where>
      <And>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match1k</Value>
        </Eq>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match6k</Value>
        </Eq>
      </And>
    </Where>
  </Query>
</View>
Seems obvious, doesn't it? However, confusingly, the following query will not work even though it appears to be the same as the query above:
<View>
  <Query>
    <Where>
      <And>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match6k</Value>
        </Eq>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match1k</Value>
        </Eq>
      </And>
    </Where>
  </Query>
</View>
Why doesn't it work? Because 6000 matches are scanned from the first part of the query (IndexedCol = 'match6k'), and the threshold error occurs before hitting the second conditional of the WHERE clause. The lesson here is:
Order your WHERE conditionals with the most specific first.
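One way to apply this rule in code is to sort the conditions before nesting them. This is only a sketch: the CamlWhere class is hypothetical, and it assumes the caller already has a rough estimate of how many items each condition matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CamlWhere
{
    // Builds <Eq><FieldRef Name='...'/><Value Type='Text'>...</Value></Eq>.
    public static XElement Eq(string field, string value) =>
        new XElement("Eq",
            new XElement("FieldRef", new XAttribute("Name", field)),
            new XElement("Value", new XAttribute("Type", "Text"), value));

    // Sorts conditions by estimated match count (smallest first) and nests them
    // in <And> elements, because a CAML <And> takes exactly two operands.
    public static XElement MostSpecificFirst(
        IEnumerable<(XElement Condition, int EstimatedMatches)> conditions)
    {
        List<XElement> ordered = conditions
            .OrderBy(c => c.EstimatedMatches)
            .Select(c => c.Condition)
            .ToList();
        if (ordered.Count == 0)
            throw new ArgumentException("At least one condition is required.");

        // Fold from the right so the most specific condition appears first.
        XElement result = ordered[ordered.Count - 1];
        for (int i = ordered.Count - 2; i >= 0; i--)
            result = new XElement("And", ordered[i], result);

        return new XElement("Where", result);
    }
}
```

Passing the 'match6k' condition (about 6000 matches) and the 'match1k' condition (about 1000 matches) in either order yields a Where element with 'match1k' first.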
Now, we'll try querying the non-indexed columns. This query will never work, even if it doesn't match any items:
<View>
  <Query>
    <Where>
      <Eq>
        <FieldRef Name='NonIndexedCol' />
        <Value Type='Text'>matchNone</Value>
      </Eq>
    </Where>
  </Query>
</View>
This is because:
Non-indexed columns can never be used for filtering in a list with 5000+ items - regardless of how many matches there are.
OR
Now we move on to the use of 'OR'. Unfortunately, we're pretty much stuck here. Using 'OR' against a list with more than 5000 items will ALWAYS result in a list view threshold error! So, the OR section is pretty short: don't use OR! Your only option here is to run multiple queries.
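A minimal sketch of the "multiple queries" approach: run one Eq query per value and merge the results by item ID. The OrEmulation class and the runEqQuery callback are hypothetical; the callback stands in for whatever code executes a single CAML query and returns its items.

```csharp
using System;
using System.Collections.Generic;

public static class OrEmulation
{
    // Runs one query per value and merges the results, keyed by item ID so that
    // an item returned by more than one query appears only once.
    public static List<TItem> QueryAny<TItem>(
        IEnumerable<string> values,
        Func<string, IEnumerable<TItem>> runEqQuery, // hypothetical: executes one <Eq> query
        Func<TItem, int> getId)
    {
        var merged = new SortedDictionary<int, TItem>();
        foreach (string value in values)
        {
            foreach (TItem item in runEqQuery(value))
            {
                merged[getId(item)] = item;
            }
        }
        return new List<TItem>(merged.Values); // ordered by ID
    }
}
```

Each individual query still has to obey the rules above, so every value must match fewer than 5000 items on an indexed column.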
Ordering
You can order your results as long as you meet two requirements:
- Your query is valid according to the above rules and does not break the list view threshold (obviously),
- The field you are ordering by is indexed.
Hence this very simple query will work:
<View>
  <Query>
    <OrderBy>
      <FieldRef Name='IndexedCol' Ascending='False' />
    </OrderBy>
  </Query>
  <RowLimit>10</RowLimit>
</View>
This very simple query will not work as it's on a non-indexed column:
<View>
  <Query>
    <OrderBy>
      <FieldRef Name='NonIndexedCol' Ascending='False' />
    </OrderBy>
  </Query>
  <RowLimit>10</RowLimit>
</View>
Remember, if you are including a WHERE clause with the above, your WHERE should match a maximum of 5000 results, regardless of your use of the RowLimit element. So, this will work:
<View>
  <Query>
    <Where>
      <Eq>
        <FieldRef Name='IndexedCol' />
        <Value Type='Text'>match1k</Value>
      </Eq>
    </Where>
    <OrderBy>
      <FieldRef Name='IndexedCol' Ascending='False' />
    </OrderBy>
  </Query>
</View>
Paging
If you have large lists then you will almost always want to take advantage of paging. Paging works brilliantly when you have no filter, or a filter that returns less than 5000 items. So, you can query the first "page" of most recent items with a simple query like this which will work:
<View>
  <Query>
    <OrderBy>
      <FieldRef Name='IndexedCol' Ascending='False' />
    </OrderBy>
  </Query>
  <RowLimit>10</RowLimit>
</View>
Because of the RowLimit, this query does not break the list view threshold.
To retrieve the next page following on from the 10th item returned, you then specify the value to continue on from via the ListItemCollectionPosition field on the CamlQuery object:
using Microsoft.SharePoint.Client;

CamlQuery camlQuery = new CamlQuery();
// The position is an object; the paging string goes in its PagingInfo property
camlQuery.ListItemCollectionPosition = new ListItemCollectionPosition { PagingInfo = "Paged=TRUE&p_ID=1034" };
camlQuery.ViewXml = "..."; // the <View> element
ListItemCollection listItems = list.GetItems(camlQuery);
clientContext.Load(listItems);
clientContext.ExecuteQuery();
// Note listItems.ListItemCollectionPosition for the next page
The value of the ListItemCollectionPosition property comes from the ListItemCollection.ListItemCollectionPosition of the previous page.
Again, this works if there is no filter, or there's a filter that returns fewer than 5000 items.
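Putting the paging steps together, a loop over all pages might look like the sketch below. The PagedReader class is hypothetical, and the query is deliberately unfiltered and ordered by ID so that it stays within the rules described so far.

```csharp
using System.Collections.Generic;
using Microsoft.SharePoint.Client;

public static class PagedReader
{
    // Reads every page of an unfiltered, ID-ordered view, pageSize items at a time.
    public static List<ListItem> ReadAll(ClientContext clientContext, List list, int pageSize)
    {
        var camlQuery = new CamlQuery
        {
            ViewXml =
                "<View><Query><OrderBy><FieldRef Name='ID' Ascending='TRUE' /></OrderBy></Query>" +
                "<RowLimit>" + pageSize + "</RowLimit></View>"
        };

        var allItems = new List<ListItem>();
        do
        {
            ListItemCollection page = list.GetItems(camlQuery);
            clientContext.Load(page);
            clientContext.ExecuteQuery();
            allItems.AddRange(page);

            // The returned position is null once the last page has been read.
            camlQuery.ListItemCollectionPosition = page.ListItemCollectionPosition;
        } while (camlQuery.ListItemCollectionPosition != null);

        return allItems;
    }
}
```

For a list with a million items you would normally process each page as it arrives rather than collect everything in memory; the list is accumulated here only to keep the sketch short.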
Advanced paging techniques
The paged query above works because there is no WHERE clause in the query that can potentially cause a list view threshold error. For example, there's no way to retrieve all the items of this query that we saw earlier (this will not work):
<View>
  <Query>
    <Where>
      <Eq>
        <FieldRef Name='IndexedCol' />
        <Value Type='Text'>match6k</Value>
      </Eq>
    </Where>
  </Query>
</View>
If you really need to execute a query like this that potentially exceeds the list view threshold, then you may be able to craft your queries to achieve the effect of paging by adding additional WHERE clauses. For example, by adding a filter on ID, this will work:
<View>
  <Query>
    <Where>
      <And>
        <And>
          <Gt><FieldRef Name='ID' /><Value Type='Number'>0</Value></Gt>
          <Lt><FieldRef Name='ID' /><Value Type='Number'>5000</Value></Lt>
        </And>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match6k</Value>
        </Eq>
      </And>
    </Where>
    <OrderBy>
      <FieldRef Name='ID' Ascending='True' />
    </OrderBy>
  </Query>
  <RowLimit>60</RowLimit>
</View>
In the above example, the first part of the query narrows the result set down to items with ID between 0 and 5000. This prevents any possibility of exceeding the list view threshold. Then, it filters those into the items where IndexedCol = match6k. Finally, the RowLimit ensures that only 60 of the items are returned.
There are a few implications of this technique:
- You cannot predict how many results are returned, only that it's less than or equal to the RowLimit (60, in this case).
- You may need to re-run the query repeatedly to receive sufficient results
To retrieve the next page of results, you must get the last returned item's ID (the highest ID, assuming we're sorted ascending). Using that ID, form a new query - with ID greater than that value, and less than that value plus 5000.
For example, if the highest ID returned previously was 2074, the next query to execute looks like this:
<View>
  <Query>
    <Where>
      <And>
        <And>
          <Gt><FieldRef Name='ID' /><Value Type='Number'>2074</Value></Gt>
          <Lt><FieldRef Name='ID' /><Value Type='Number'>7074</Value></Lt>
        </And>
        <Eq>
          <FieldRef Name='IndexedCol' />
          <Value Type='Text'>match6k</Value>
        </Eq>
      </And>
    </Where>
    <OrderBy>
      <FieldRef Name='ID' Ascending='True' />
    </OrderBy>
  </Query>
  <RowLimit>60</RowLimit>
</View>
The above query will reliably return the items without exceeding the list view threshold. Simply repeat until you have reached the list's Max ID (which you'll have to retrieve separately).
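Since only the two ID bounds change between pages, the query text can be generated. The helper below is a sketch with hypothetical names (IdWindowCaml, Build); it emits the View/Query/RowLimit nesting that CamlQuery.ViewXml expects.

```csharp
using System.Xml.Linq;

public static class IdWindowCaml
{
    // Builds the ViewXml for one "page": ID strictly between afterId and
    // afterId + span, combined with an equality filter on an indexed text column.
    public static string Build(int afterId, int span, string indexedField, string value, int rowLimit)
    {
        XElement IdBound(string op, int id) =>
            new XElement(op,
                new XElement("FieldRef", new XAttribute("Name", "ID")),
                new XElement("Value", new XAttribute("Type", "Number"), id));

        var view = new XElement("View",
            new XElement("Query",
                new XElement("Where",
                    new XElement("And",
                        // The ID range comes first, as in the queries above.
                        new XElement("And", IdBound("Gt", afterId), IdBound("Lt", afterId + span)),
                        new XElement("Eq",
                            new XElement("FieldRef", new XAttribute("Name", indexedField)),
                            new XElement("Value", new XAttribute("Type", "Text"), value)))),
                new XElement("OrderBy",
                    new XElement("FieldRef",
                        new XAttribute("Name", "ID"),
                        new XAttribute("Ascending", "TRUE")))),
            new XElement("RowLimit", rowLimit));

        return view.ToString(SaveOptions.DisableFormatting);
    }
}
```

For the second page in the example, IdWindowCaml.Build(2074, 5000, "IndexedCol", "match6k", 60) produces a query bounded by IDs 2074 and 7074.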
There's a potential problem with this method, though. Consider the following scenario:
- There are 1 million items in the list
- The first item to match the 'IndexedCol' = 'match6k' clause is item 900,000
In this case, the query will have to run 900,000 / 5000 = 180 times before it returns even one item!
There is a very effective enhancement to this technique: intelligently adjust the min and max IDs to span a range greater than 5000. Apply the following rules:
- If no items are returned, then for the next query, double the ID span (eg. increase 5000 items to 10000)
- If the list view threshold is exceeded, then repeat the same query but halve the ID span (eg. reduce 10000 to 5000)
In this way, the query will only have to run 8 times before its ID range reaches the first matching item. On the 8th iteration it'll be attempting to retrieve items from about 635k to about 1.28m (big numbers!). If it exceeds the list view threshold at this point, that's ok - the algorithm above will ensure that the range then scales down until it runs successfully.
Don't forget to cache the results, so that this doesn't need to happen too often.
You can start with a number less than 5000, if you wish to tweak performance. Likewise, you can triple instead of double the 'scale-up' factor, if that makes more sense for your data set.
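The doubling and halving rules can be sketched as a small loop. Everything here is illustrative: AdaptiveIdWindow, ThresholdExceededException and the queryWindow callback are hypothetical names, and the callback stands in for code that runs the ID-bounded CAML query and translates a list view threshold error into that exception. With CSOM the error typically surfaces as a ServerException, and how you recognise it is left to the caller.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exception standing in for a list view threshold error.
public sealed class ThresholdExceededException : Exception { }

public static class AdaptiveIdWindow
{
    // queryWindow(minExclusive, maxExclusive) is a hypothetical callback that runs
    // the query "ID > minExclusive AND ID < maxExclusive", ordered by ID ascending,
    // and returns the IDs of the matching items (at most RowLimit of them).
    public static IReadOnlyList<int> FindNextMatches(
        long lastId, long listMaxId,
        Func<long, long, IReadOnlyList<int>> queryWindow,
        long initialSpan = 5000)
    {
        if (initialSpan < 2)
            throw new ArgumentOutOfRangeException(nameof(initialSpan));

        long span = initialSpan;
        long minExclusive = lastId;

        while (minExclusive < listMaxId)
        {
            IReadOnlyList<int> ids;
            try
            {
                ids = queryWindow(minExclusive, minExclusive + span);
            }
            catch (ThresholdExceededException)
            {
                if (span <= 2) throw;           // a one-ID window cannot be narrowed further
                span = Math.Max(2, span / 2);   // same starting point, half the span
                continue;
            }

            if (ids.Count > 0) return ids;

            // Nothing matched: skip the IDs just scanned and widen the next window.
            minExclusive += span - 1;
            span *= 2;
        }

        return Array.Empty<int>(); // reached the end of the list without a match
    }
}
```

To fetch the following page, call the method again with the highest ID it returned as lastId. Remembering the span that last succeeded, instead of restarting from the initial span each time, is a natural refinement.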
Less than & Greater than
You can't use Less than or Equal (Lte) or Greater than or Equal (Gte) in CAML queries that involve large numbers of items. I don't know why, but it doesn't work. Stick to Less than or Greater than (Lt/Gt).
Indexable field types
Not all field types are indexable. For example, User fields can be indexed and used in queries involving large numbers of items. However, a Multi-User field cannot. Please see the following link for more information: Creating SharePoint indexed columns.
Other List Thresholds
The List View Threshold is not the only limit you need to be aware of! Another list limit is the item-level permissions limit of 50,000 items. Often, permissions are set at the list level. However, if you choose to set unique permissions for each individual list item, then you can only do this for 50k items within any given list. This is a hard and absolute limit in SharePoint, and if you need to exceed it, you need to split your data across multiple lists.
See the following page for more information about boundaries and limits in SharePoint 2013.
Search API
If all of this is too much to handle, you might want to consider using the Search API. I would absolutely recommend this any time over grappling with the nuances of CAML!
However, if you choose to persevere with CAML, hopefully this guide helps. Please let me know in the comments any other tips or tricks, errors or omissions. Meanwhile, I'm going to go and cry in a corner and attempt to come to terms with all this CAML horribleness....