C# and .Net offer a couple different sorted classes. Arrays can be sorted, sure, but I'm looking at ones that offer some form of insertion sort. For this, I'm looking at these three classes:
Collection | Insert(Random) | Insert(Sorted) | Remove | Indexed Fetch |
---|---|---|---|---|
SortedDictionary | O(log n) | O(log n) | O(log n) | O(log n) |
SortedList | O(n) | ~O(log n) ? | O(n) | O(log n) |
SortedSet | O(log n) | O(log n) | O(log n) | - |
Runs
SortedDictionary and SortedList using custom comparer
Runs Dictionary SortedDict SortedList SortedSet Insert 50000 26,668 460,026 3,823,552 210,012 60000 33,335 566,699 5,513,648 253,347 70000 53,336 683,372 7,717,108 326,685 80000 53,336 863,382 12,737,395 340,019 90000 50,002 896,718 13,194,088 410,023 100000 83,338 1,066,728 19,034,422 460,026 110000 90,005 1,156,733 22,917,977 526,697 120000 90,005 1,306,741 27,244,891 650,037 Fetch 50000 13,334 313,351 293,350 60000 13,334 550,031 363,354 70000 16,667 506,695 463,359 80000 23,334 570,032 520,029 90000 26,668 666,705 610,034 100000 26,668 753,376 703,373 110000 30,001 870,049 796,712 120000 30,002 976,722 896,718
Part of the problem comparing all these collections is that don't compare in the same way. SortedDictionary and SortedList both default to comparing unique keys. If you can guarantee each item in the collection has a unique sorting key, then they work great. If you require more complicated sorting, such as comparing against a non-unique value within the TValue, then you add in the extra overhead of storage and lookup of the base collection. SortedSet works better for comparing TValues.
SortedDictionary and SortedList using default key sorter (insert in order)
Runs Dictionary SortedDict SortedList SortedSet Insert 50000 30,003 246,691 130,005 160,013 60000 33,336 356,702 126,679 273,360 70000 36,670 353,368 73,340 310,031 80000 56,672 413,374 80,008 416,708 90000 66,673 466,713 90,009 486,715 100000 70,007 503,383 103,343 576,724 110000 80,008 553,388 113,344 666,733 120000 103,343 646,731 180,018 653,398 Fetch 50000 10,001 123,345 73,340 60000 13,334 160,016 90,009 70000 16,668 186,685 106,677 80000 16,668 233,356 130,013 90000 20,002 246,691 140,014 100000 20,002 266,693 156,682 110000 23,335 310,031 190,019 120000 23,335 330,033 190,019
Using the default key comparer in order, SortedList performs optimally.
SortedDictionary and SortedList using default key sorter (insert in reverse order)
Runs Dictionary SortedDict SortedList SortedSet Insert 50000 30,000 273,333 23,233,332 210,000 60000 30,000 293,333 28,999,999 276,666 70000 46,666 350,000 29,679,999 343,333 80000 50,000 433,333 60,493,330 370,000 90000 50,000 526,666 49,526,662 500,000 100000 56,666 560,000 79,496,663 600,000 110000 56,666 716,666 114,516,661 640,000 120000 63,333 753,333 98,069,995 720,000 Fetch 50000 10,000 123,333 73,333 60000 13,333 150,000 93,333 70000 13,333 176,666 106,666 80000 20,000 200,000 123,333 90000 20,000 243,333 140,000 100000 23,333 263,333 153,333 110000 26,666 293,333 173,333 120000 30,000 323,333 186,666
Do the same run only with reversed keys and SortedList falls apart. SortedDictionary is most consistent with key comparisons (SortedSet still comparing values in these runs).
Notes
These are an average of three runs each. I'm not sure if I'm doing something wrong for the SortedList inserts or not - it's way above everything else. Otherwise for purely random data inserts, SortedSet seems to be the most consistent. If the data is mostly pre-sorted, then SortedList will beat them all, but then bombs with random data. Indexed fetch isn't available for SortedSet so it's not shown. SortedDictionary and List both perform and scale similarly with a custom comparer.