Paging with LINQ – performance
Facing a scenario where I needed to select EPiServerProfile objects based on different criteria and display a paged view of these objects, I thought I should use LINQ. Neat syntax and readable code, but what about the performance? There are probably thousands of posts like this out there already, but here is mine :)
The test
I designed a test to compare the performance of LINQ to a simple looping scenario. Based on a simple collection, more advanced objects should be created, their properties inspected for matching criteria, and then a certain page of a certain size should be selected from the results. This corresponds to my scenario: having the list of all usernames, loading their corresponding EPiServerProfile and selecting objects based on properties of the profile.
The LINQ
This is the LINQ I use to select my objects:
List<Dummy> selection = items.Select(num => new Dummy(num, delay)).
                              Where(dum => dum.Good).
                              Skip((page - 1) * pageSize).
                              Take(pageSize).
                              ToList();
The items object is a List<bool> whose values are used to construct the more complex Dummy instances I want. Each boolean value is stored in the Dummy object's Good property and determines whether the object should be selected. The delay variable controls how “heavy” the object is to construct in the test (simulating operations performed by the object, including database requests etc.).
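Since the Dummy class itself isn't shown here, the following is a minimal sketch of what it might look like, inferred from the calls above; the Stopwatch busy-wait is an assumed stand-in for the construction cost:

using System.Diagnostics;

// Minimal sketch of the Dummy class, inferred from how it is used above.
// The busy-wait is an assumption, simulating "heavy" construction work
// such as database requests.
public class Dummy
{
    public bool Good { get; private set; }

    public Dummy(bool good, int delay)
    {
        Good = good;

        // Spin until the requested number of ticks has passed.
        Stopwatch timer = Stopwatch.StartNew();
        while (timer.ElapsedTicks < delay)
        {
        }
    }
}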
The for loop
To manually gather the same results I use the following method:
private static List<Dummy> ManualForSelect(List<bool> items, int page, int pageSize, int delay)
{
    List<Dummy> selection = new List<Dummy>();
    int matches = 0;
    int skip = (page - 1) * pageSize;
    for (int i = 0; i < items.Count; i++)
    {
        Dummy dum = new Dummy(items[i], delay);
        if (dum.Good)
        {
            matches++;

            // Only collect matches beyond the skipped pages.
            if (matches > skip)
            {
                selection.Add(dum);
            }
        }

        // Stop as soon as the requested page is full.
        if (selection.Count == pageSize)
        {
            break;
        }
    }
    return selection;
}
Collecting results
To get results that are as representative as possible, the two methods are compared with different collection sizes, object construction times (the delay variable above), scarcities of the objects to be selected (randomly distributed in the collection) and page sizes. Each set of variables is run for a number of loops in which a random page within the probable span is requested (1000 “valid” objects with page size 100 should give pages 1 to 10). The number of loops was set to match the number of pages, with some exceptions (I didn’t have the patience to wait for too many loops).
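As a rough illustration, the measurement loop could look something like the sketch below. The harness, its name and its delegate parameter are my assumptions, not code from the original tests:

using System;
using System.Collections.Generic;
using System.Diagnostics;

// Times one set of parameters: requests a number of random pages within
// the probable span and returns the accumulated time in milliseconds.
// The delegate lets the same harness time both the LINQ and the
// for-loop variants.
private static long TimePaging(
    Func<List<bool>, int, int, int, List<Dummy>> select,
    List<bool> items, int loops, int totalPages, int pageSize, int delay)
{
    Random rnd = new Random();
    Stopwatch watch = Stopwatch.StartNew();
    for (int i = 0; i < loops; i++)
    {
        int page = rnd.Next(1, totalPages + 1); // random page in 1..totalPages
        select(items, page, pageSize, delay);
    }
    watch.Stop();
    return watch.ElapsedMilliseconds;
}

It would be called as, for example, TimePaging(ManualForSelect, items, loops, totalPages, pageSize, delay).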
Results
The table below contains the test results. Scarcity means that, on average, one object in N is “good”, and the ratio is the LINQ time divided by the for-loop time.
Collection size | Loops | Delay (ticks) | Scarcity (1 in N) | Page size | Total pages | LINQ (ms) | For loop (ms) | Ratio |
10000 | 10 | 10000 | 100 | 100 | 1 | 19226 | 5506 | 349% |
10000 | 100 | 10000 | 10 | 100 | 10 | 12422 | 10915 | 114% |
10000 | 1000 | 10000 | 10 | 10 | 100 | 8040 | 7923 | 101% |
10000 | 100 | 10000 | 2 | 100 | 50 | 10020 | 9867 | 102% |
10000 | 100 | 1000 | 100 | 100 | 1 | 60 | 20 | 300% |
10000 | 1000 | 1000 | 10 | 100 | 10 | 31 | 20 | 155% |
10000 | 10000 | 1000 | 10 | 10 | 100 | 22 | 17 | 129% |
10000 | 100 | 10 | 100 | 100 | 1 | 65 | 14 | 464% |
10000 | 1000 | 10 | 10 | 100 | 10 | 30 | 24 | 125% |
10000 | 10000 | 10 | 10 | 10 | 100 | 22 | 17 | 129% |
1000000 | 10 | 10 | 10000 | 100 | 1 | 628 | 343 | 183% |
1000000 | 100 | 10 | 1000 | 100 | 10 | 412 | 321 | 128% |
1000000 | 1000 | 1000 | 10 | 100 | 1000 | 358 | 331 | 108% |
1000000 | 1000 | 1 | 10 | 100 | 1000 | 334 | 312 | 107% |
Conclusion
It seems that the for loop is always faster (not surprisingly), but how much faster differs widely. The for loop’s greatest advantage comes when the parameters are set so that the whole collection has to be looped, which seems fair. When the requested objects are less scarce relative to the collection and page size, however, LINQ closes in. And with heavy construction time added for the objects, they’re practically equal (because less time is spent looping and more constructing objects).
In my scenario I’m looking to select from user profiles (probably more like ten thousand than a million), which are relatively heavy to construct (they will probably include database calls, at least the first time). Matches will be relatively common and page sizes will be between ten and a hundred. In this scenario I can afford the performance overhead of LINQ.
Interesting post Magnus!
Just out of curiosity, have you, or could you, test if this renders a different result?

List<Dummy> selection = items.Select(num => new Dummy(num, delay)).Where(dum => dum.Good).ToList().Skip((page - 1) * pageSize).Take(pageSize).ToList();

/ Joel Abrahamsson
I only did a few tests, but as I suppose you expected, it is considerably slower to call ToList before paging. I guess that forces LINQ to evaluate the Select and Where completely, instead of progressing item by item until the Skip and Take are fulfilled. Neat that it seems to understand how to enumerate efficiently in my test case though. That was actually my main concern and my reason for doing the tests. I could probably have found information about how LINQ chains calls, but testing was more fun.
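To make the difference concrete, here is a small illustration of my own (not part of the original tests): a counter in the Select projection shows how many objects actually get constructed in each case.

using System;
using System.Collections.Generic;
using System.Linq;

List<bool> items = Enumerable.Repeat(true, 10000).ToList();
int constructed = 0;

// Streaming: Skip/Take pull items one at a time, so enumeration stops
// as soon as the page is filled.
var page1 = items.Select(b => { constructed++; return b; })
                 .Where(b => b)
                 .Skip(0).Take(10).ToList();
Console.WriteLine(constructed); // 10

// With ToList() before paging, the whole collection is materialized first.
constructed = 0;
var page2 = items.Select(b => { constructed++; return b; })
                 .Where(b => b)
                 .ToList()
                 .Skip(0).Take(10).ToList();
Console.WriteLine(constructed); // 10000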
Interesting post. It's too bad though, the LINQ code is so compact and nice looking!
/ Jonas Hjalmarsson