
C# Collection Fundamentals

What is a Collection? – A type responsible for managing groups of objects in memory

Three types of collections:

- Lists
- Dictionaries
- Sets

What are Arrays? – Index-based list. A collection where the elements are ordered

- Characteristics of arrays:
o Rich API
o Very lightweight
o Special C# syntax
o Fixed size!
- Arrays are reference types!
- Iterating arrays with foreach or for (typically in a for loop)
o Avoid using foreach when you need to:
 Enumerate only part of the array
 Access the index
 Modify items in the array
o In a foreach loop, replacing items in the array is illegal (assigning to the iteration variable is a
compile-time error), but making changes inside the elements of the array is still ok.
- For arrays, == does the same thing as ReferenceEquals() – it compares references of the
given arrays
- You can store a derived type in an array of a base type. This follows the OOP principle: a
derived reference can be used in place of a base reference
- You can always implicitly cast Derived[] to Base[] – this is known as array covariance
- Array covariance turns compile-time errors into run-time errors: storing an element of the wrong
type through the Base[] reference compiles, but throws ArrayTypeMismatchException
- You CANNOT normally cast a collection of a derived type to a collection of the base type
(for example, List<Derived> to List<Base>)
- Arrays are the exception: you can cast an array of a derived type to an array of the base type,
but this is a really bad thing to do
- Covariance is safe for enumerators because an enumerator can’t modify the collection.
Covariant interfaces: IEnumerable<T>, IEnumerator<T>, IReadOnlyCollection<T>, IReadOnlyList<T>
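The array behaviours listed above can be sketched in a few lines. This is a minimal illustration, assuming C# 9 or later (top-level statements); the variable names are made up for the example, and string[] / object[] stand in for Derived[] / Base[].

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// == on arrays compares references, not contents.
int[] a = { 1, 2, 3 };
int[] b = { 1, 2, 3 };
bool sameReference = a == b;             // false: two distinct array objects
bool sameContents = a.SequenceEqual(b);  // true: element-by-element comparison

// Array covariance: string[] converts implicitly to object[] ...
string[] names = { "Ada", "Grace" };
object[] objects = names;

// ... but storing a non-string compiles and then fails at run time.
bool storeFailed = false;
try
{
    objects[0] = 42;
}
catch (ArrayTypeMismatchException)
{
    storeFailed = true;
}

// List<string> does not convert to List<object> (that is a compile-time error),
// but the conversion to the covariant, enumerate-only IEnumerable<object> is allowed.
List<string> nameList = new List<string> { "Ada", "Grace" };
IEnumerable<object> readOnlyView = nameList;

Console.WriteLine($"{sameReference} {sameContents} {storeFailed} {readOnlyView.Count()}");
```

The failed store is the reason array covariance is considered risky: the compiler accepts the assignment, and the type check is deferred to run time.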

Array methods tend to work in place; LINQ methods tend to return new objects.

LINQ                               Array members
---------------------------------  ---------------------------------
More suited to interfaces          Performance
(good for best practices)          (no LINQ overhead!)
Consistent for all collections     Only arrays (and List<T>)
Return new objects                 Modify arrays in place
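As a small sketch of the difference between the two columns, the example below sorts the same data once with LINQ and once with an Array member. It assumes C# 9 or later; the variable names are illustrative only.

```csharp
using System;
using System.Linq;

int[] original = { 3, 1, 2 };

// LINQ: returns a new sequence; the source array is left untouched.
int[] sortedCopy = original.OrderBy(n => n).ToArray();
bool sourceUnchanged = original.SequenceEqual(new[] { 3, 1, 2 });

// Array member: sorts the existing array in place, no new array is created.
Array.Sort(original);
bool sortedInPlace = original.SequenceEqual(new[] { 1, 2, 3 });

Console.WriteLine(string.Join(", ", sortedCopy));
Console.WriteLine(string.Join(", ", original));
```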

Interface inheritance tree

- On the left side, there’s an older hierarchy of non-generic interfaces, which were written for
.NET 1.x, before generics.
- There’s a hierarchy of generic interfaces, which are the main general-purpose ones that
you’ll use most of the time.
- On the right side there is a newer hierarchy of read-only interfaces, which were
introduced with .NET 4.5 and essentially strip out all the functionality for
modifying collections from the generic interfaces, letting you expose interfaces to other
code without any risk of that code being able to use those interfaces to modify collections.
- In the upper region there are a couple of interfaces based on IEnumerator, which exist
separately from the main tree.

IEnumerable<T> – “You can iterate my elements” – the most used interface for collections.

- It declares that the object can be enumerated. In other words, it contains items that you can
iterate through
- Every collection implements IEnumerable<T> => foreach works for all collections
- It’s the main hook that LINQ attaches to in order to provide LINQ functionality
- If something implements IEnumerable<T>, you can run LINQ operations on it
- Enumerating a collection is done by a separate object called an enumerator. The sole
purpose of IEnumerable<T> is to supply that enumerator. The enumerator has to implement
the IEnumerator<T> interface.
o IEnumerator<T> GetEnumerator() – returns an enumerator – the thing that
does the enumerating
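To make the split between IEnumerable<T> and IEnumerator<T> concrete, here is a rough sketch of what a foreach loop does behind the scenes. It assumes C# 9 or later; the collection contents and variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;

IEnumerable<string> days = new List<string> { "Mon", "Tue", "Wed" };
var visited = new List<string>();

// Roughly what foreach does: ask the collection for an enumerator,
// then repeatedly call MoveNext() and read Current.
using (IEnumerator<string> enumerator = days.GetEnumerator())
{
    while (enumerator.MoveNext())
    {
        visited.Add(enumerator.Current);
    }
}

Console.WriteLine(string.Join(", ", visited));
```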

ICollection<T> – As the name suggests, ICollection<T> is the interface that declares something to be what
we think of as a collection.

- Identifies a collection
- Represents an in-memory data structure that contains zero or more elements
- Collections are expected to know how many items they contain, so the interface features a Count
property that returns that value. Count is in ICollection<T>, but not in IEnumerable<T>
o int ICollection<T>.Count – Number of items in the collection
- ICollection<T> also features various methods to modify which elements are in the collection,
whereas IEnumerable<T> is strictly for enumerating, not modifying the collection
- ICollection<T> says that an object is a collection, but crucially it doesn’t tell you anything
about the nature of the collection. Is it a list? Is it a dictionary?
- Members of ICollection<T>: Count, IsReadOnly, Add(), Clear(), Contains(), CopyTo(), Remove()

After ICollection<T>, the hierarchy splits into three branches in order to distinguish broadly
the three types of collections: Lists, Dictionaries, and Sets.

The read-only branches of the interface hierarchy are roughly duplicates of the main generic branch.

IReadOnlyCollection<T> - mirrors ICollection<T> - an Enumerable that knows how many elements it has

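In .NET Framework 4.5 there is no read-only interface for ISet<T> (IReadOnlySet<T> was added later,
in .NET 5). Apart from IReadOnlyCollection<T>, the only read-only interfaces are IReadOnlyList<T> and
IReadOnlyDictionary<TKey, TValue>.

To determine the number of elements: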

- Length for arrays: Array.Length
- Count for collections: ICollection<T>.Count
- There is also a Count() extension method provided by LINQ for IEnumerable<T>, but prefer the
property if it’s available
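The three ways of getting an element count can be compared side by side. This is a minimal sketch assuming C# 9 or later; the sample data is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] array = { 10, 20, 30 };
ICollection<int> collection = new List<int> { 10, 20, 30, 40 };
IEnumerable<int> sequence = Enumerable.Range(1, 5).Where(n => n % 2 == 1); // 1, 3, 5

int arrayLength = array.Length;          // property on arrays
int collectionCount = collection.Count;  // property from ICollection<T>
int sequenceCount = sequence.Count();    // LINQ extension: has to enumerate this sequence

Console.WriteLine($"{arrayLength} {collectionCount} {sequenceCount}");
```

The LINQ Count() call is the only option for a plain IEnumerable<T>, but for a lazily evaluated sequence like this one it walks every element, which is why the property is preferred when it exists.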

IList<T> – “You can look up my elements with an index”

- Arrays implement IList<T>
- IList<T> is also implemented by List<T>
- There is an IndexOf() method that returns the index of a given element (to search by a condition,
i.e. a predicate, use List<T>.FindIndex() instead)
- And methods to insert or remove an element at a specific index: Insert() and RemoveAt()
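A short sketch of the index-based members follows, assuming C# 9 or later with illustrative data. Note that IndexOf() takes an element, while predicate-based search is a List<T> member (FindIndex), and that an array exposed as IList<T> still has a fixed size.

```csharp
using System;
using System.Collections.Generic;

IList<string> list = new List<string> { "a", "c" };
list.Insert(1, "b");               // a, b, c
int indexOfC = list.IndexOf("c");  // 2
list.RemoveAt(0);                  // b, c
string first = list[0];            // "b" (lookup by index)

// Predicate-based search is on List<T>, not on the IList<T> interface.
var words = new List<string> { "to", "be", "or", "not" };
int firstLongWord = words.FindIndex(w => w.Length > 2);  // 3

// Arrays implement IList<T>, but their size is fixed, so Add() is rejected.
IList<string> arrayAsList = new[] { "x", "y" };
bool addRejected = false;
try
{
    arrayAsList.Add("z");
}
catch (NotSupportedException)
{
    addRejected = true;
}

Console.WriteLine($"{indexOfC} {first} {firstLongWord} {addRejected}");
```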

IReadOnlyList<T> – Is a collection that knows how many items it has and will let you look up an element
by index

IDictionary<TKey, TValue> – “You can look up my elements with a key” – like a list, but with keys
instead of an index

- Members that the interface exposes: the indexer this[TKey], Keys, Values, Add(), ContainsKey(),
Remove(), TryGetValue()
- Dictionaries represent collections of key-value pairs:
o IDictionary<TKey, TValue> derives from
ICollection<KeyValuePair<TKey, TValue>>

IReadOnlyDictionary<TKey, TValue> - mirrors IDictionary<TKey, TValue>

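ISet<T> – “I can do set operations with other collections”

- Members that the interface exposes: Add(), UnionWith(), IntersectWith(), ExceptWith(),
SymmetricExceptWith(), SetEquals(), Overlaps(), IsSubsetOf(), IsSupersetOf(), IsProperSubsetOf(),
IsProperSupersetOf()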


Lists:

- List<T>
o Extensive API
o Like an array, but with support for adding/removing elements
o Features:
 IList<T> - access by index
 foreach
 Collection initializers
 Everything an array can do
o List<T> is:
 Lightweight
 Efficient
 Not customizable
- ReadOnlyCollection<T>
o Read-only wrapper for lists
- Collection<T>
o Allows lists to be customized - is specifically designed to be derived from and
therefore is customizable
o Provides an implementation of IList<T>
- ObservableCollection<T>
o Lists with change notification
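Two of the list types above lend themselves to a quick sketch: ReadOnlyCollection<T> as a live read-only wrapper, and ObservableCollection<T> raising change notifications. The example assumes C# 9 or later; the variable names and the logging approach are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// ReadOnlyCollection<T> wraps an existing list; it exposes no way to modify it.
var inner = new List<int> { 1, 2, 3 };
var readOnly = new ReadOnlyCollection<int>(inner);

// The wrapper is a view, not a copy: changes to the inner list show through.
inner.Add(4);
int viewCount = readOnly.Count;  // 4

// ObservableCollection<T> raises CollectionChanged whenever its contents change.
var changes = new List<string>();
var observable = new ObservableCollection<string>();
observable.CollectionChanged += (sender, e) => changes.Add(e.Action.ToString());

observable.Add("first");
observable.Remove("first");

Console.WriteLine(viewCount);
Console.WriteLine(string.Join(", ", changes));
```

Because the wrapper reflects later changes to the inner list, it protects against modification through the wrapper only; code holding the inner list can still change it.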

Linked Lists, Stacks and Queues:

- LinkedList<T>
o List with fast adding / removing of elements
o Purpose: collection that’s quick at adding / removing elements
o Each element stores the next element’s address
o This means that you can’t jump directly to an element
o You can only get to an element by enumerating all the previous ones
o There are also lists that you can iterate through backwards, because
each element also contains a pointer to the previous element; these types of lists
are known as doubly linked lists (LinkedList<T> is one)
o Inserting an item requires only 4 operations: changing the next and prev references (2
for the inserted element, 1 on the previous element and 1 on the next element)
- LinkedList<T> stores its items in LinkedListNode<T> objects rather than as bare T values
o PROs and CONs:
 PROs:
 Adding / removing elements is FAST
 Good at enumerating
 CONs:
 No index-based access to elements
 Uses lots of memory (every node also stores references to its neighbours)
- LinkedListNode<T>
o Required to store items in a linked list
- Stack<T>
o Last-in, first-out (LIFO) list
o Implemented as a thin wrapper around T[]
o Operations: Push, Pop, Peek (Peek returns the top element without removing it)
- Queue<T>
o Removes items in the same order as they were added (first-in, first-out)
o Implemented as a thin wrapper around T[] (but more complicated than for
Stack<T>)
o Operations: Enqueue, Dequeue, Peek (Peek returns the next element without
removing it)
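The following sketch exercises the three types described above. It assumes C# 9 or later, and the sample values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// LinkedList<T>: insertion works through nodes rather than indexes.
var linked = new LinkedList<string>();
LinkedListNode<string> firstNode = linked.AddLast("a");
linked.AddLast("c");
linked.AddAfter(firstNode, "b");  // a, b, c

// Doubly linked: walk backwards by following Previous from the last node.
string backwards = "";
for (var node = linked.Last; node != null; node = node.Previous)
{
    backwards += node.Value;
}

// Stack<T>: last in, first out.
var stack = new Stack<int>();
stack.Push(1);
stack.Push(2);
stack.Push(3);
int stackTop = stack.Peek();  // 3, still on the stack
int popped = stack.Pop();     // 3, now removed

// Queue<T>: first in, first out.
var queue = new Queue<int>();
queue.Enqueue(1);
queue.Enqueue(2);
queue.Enqueue(3);
int queueNext = queue.Peek();   // 1, still in the queue
int dequeued = queue.Dequeue(); // 1, now removed

Console.WriteLine($"{backwards} {popped} {dequeued}");
```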

Dictionaries:

- Dictionary<TKey, TValue>
o General purpose dictionary
o IDictionary<TKey, TValue>.TryGetValue()
 Looks up items that might not be present
o dict[key] = value; - can replace or add
o dict.Add(key, value); - can only add (throws ArgumentException if the key already exists)
o IEqualityComparer<T> - checks whether T instances are equal; a dictionary uses one to compare keys
o A dictionary is composed of a set of buckets and uses a hash algorithm to place the
items evenly in the buckets. This leads to better query performance, because
there are not many elements to search in a bucket. Adding or removing
elements in a bucket’s linked list is also quick
- ReadOnlyDictionary<TKey, TValue>
o Is the dictionary’s equivalent of read-only collection for lists
o It provides a wrapper around an ordinary dictionary in order to give read-only
access
o To use a ReadOnlyDictionary, you always have to supply it with an existing
dictionary, which will be the thing it provides read-only access to
- SortedList<TKey, TValue>
o It’s a dictionary that automatically keeps its elements sorted by key
- SortedDictionary<TKey, TValue>
o Dictionaries that sort their elements
o Functionality is the same as SortedList<TKey, TValue>
o The difference is entirely in the internal implementation
o SortedDictionary sorts its elements using a data structure called a balanced tree
o This means that it keeps the elements sorted by key in a hierarchical structure, and as
elements are added or removed it just adjusts the structure of the tree to keep it
optimized for fast element look-up
o Prefer a SortedDictionary to a SortedList when elements are added or removed often
(modification is faster); SortedList is backed by sorted arrays, so it uses less memory
- KeyedCollection<TKey, TItem>
o Customizable (an abstract class: derive from it and override GetKeyForItem())
o It stores two collections in one: a List<TItem> and a Dictionary<TKey, TItem>
- Hash tables
o Are used by collections to store elements using hash functions for faster lookup
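The dictionary behaviours described above can be sketched as follows. This is an illustration only, assuming C# 9 or later; the keys, values, and the choice of StringComparer.OrdinalIgnoreCase as the IEqualityComparer<string> are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// The comparer decides when two keys count as equal (here: ignoring case).
var ages = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
ages.Add("Ada", 36);
ages["Ada"] = 37;    // indexer replaces the existing value
ages["Grace"] = 85;  // indexer adds a missing key

// Add() refuses a key that already exists ("ada" equals "Ada" under this comparer).
bool duplicateRejected = false;
try
{
    ages.Add("ada", 1);
}
catch (ArgumentException)
{
    duplicateRejected = true;
}

// TryGetValue() looks up a key that may be absent without throwing.
bool found = ages.TryGetValue("Linus", out int missingAge);  // false, missingAge == 0

// ReadOnlyDictionary wraps an existing dictionary for read-only access.
var readOnly = new ReadOnlyDictionary<string, int>(ages);
int viaWrapper = readOnly["Ada"];  // 37

// SortedDictionary enumerates its entries in key order, whatever the insertion order.
var stock = new SortedDictionary<string, int> { ["pear"] = 1, ["apple"] = 2, ["mango"] = 3 };
string keysInOrder = string.Join(",", stock.Keys);  // apple,mango,pear

Console.WriteLine($"{ages.Count} {duplicateRejected} {found} {viaWrapper} {keysInOrder}");
```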

Sets:

- HashSet<T>
o Collection of elements that has no sense of any order or location in the set
o Can’t look up elements in a HashSet by index or by key
o The only way to get to the elements in a set is by enumerating them
o Good performance for checking if an element is already in the set
o No element can ever be in the set twice
- SortedSet<T>
o Gives the same functionality as HashSet<T>, but it uses a balanced tree, which
keeps the elements sorted
- The ISet<T> interface
o Set operations
 Intersection – find values that are in BOTH collections
 Union – find values that are in EITHER collection
 Difference – find values that are in ONLY THE FIRST collection
 Symmetric Difference – find values that are in ONLY ONE of the collections
o Set comparisons
 Set equality – do two collections contain the same values?
 ISet<T> comparisons -> no LINQ equivalents for these operations
 SetEquals()
 IsSubsetOf()
 IsSupersetOf()
 Overlaps() – is at least one element in both sets?
 IsProperSubsetOf()
 IsProperSupersetOf()
- Uniqueness of Elements
- IEnumerator<T> interface
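The set operations and comparisons listed above map onto HashSet<T> members as sketched below. The example assumes C# 9 or later and uses made-up data; each operation works on a copy because the ISet<T> operations modify the set they are called on.

```csharp
using System;
using System.Collections.Generic;

var evens = new HashSet<int> { 2, 4, 6 };
bool addedAgain = evens.Add(2);  // false: no element can be in the set twice

int[] small = { 1, 2, 3, 4 };

var intersection = new HashSet<int>(evens);
intersection.IntersectWith(small);        // 2, 4 (in BOTH)

var union = new HashSet<int>(evens);
union.UnionWith(small);                   // 1, 2, 3, 4, 6 (in EITHER)

var difference = new HashSet<int>(evens);
difference.ExceptWith(small);             // 6 (only in the first)

var symmetric = new HashSet<int>(evens);
symmetric.SymmetricExceptWith(small);     // 1, 3, 6 (in only one)

// Comparisons accept any IEnumerable<T>, not just other sets.
bool overlaps = evens.Overlaps(small);                       // true: 2 and 4 are shared
bool subset = intersection.IsSubsetOf(evens);                // true
bool properSubset = intersection.IsProperSubsetOf(evens);    // true: evens also has 6
bool equal = evens.SetEquals(new[] { 6, 4, 2, 2 });          // true: order and duplicates ignored

// SortedSet<T> offers the same operations but enumerates in sorted order.
var sorted = new SortedSet<int> { 5, 1, 3 };
string sortedText = string.Join(",", sorted);                // 1,3,5

Console.WriteLine($"{addedAgain} {overlaps} {subset} {properSubset} {equal} {sortedText}");
```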

Multidimensional arrays:

- Syntax: float[,] tempsGrid = new float[4, 3];
- Get the length of each dimension (the index bounds of a multidimensional array) at runtime:
o tempsGrid.GetLength(0);
o tempsGrid.GetLength(1);
- Array.Rank – gives the number of dimensions (number of indices)
- Array.GetLowerBound() – returns the smallest possible index for a given dimension
- Array.GetUpperBound() – returns the largest possible index for a given dimension
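The members above can be tried out on the tempsGrid array from the syntax example. This is a minimal sketch assuming C# 9 or later; the fill values are invented.

```csharp
using System;

float[,] tempsGrid = new float[4, 3];

int rows = tempsGrid.GetLength(0);           // 4
int columns = tempsGrid.GetLength(1);        // 3
int rank = tempsGrid.Rank;                   // 2 dimensions
int lowerBound = tempsGrid.GetLowerBound(0); // 0
int upperBound = tempsGrid.GetUpperBound(1); // 2, i.e. GetLength(1) - 1
int totalElements = tempsGrid.Length;        // 12: Length counts every element

// Nested for loops driven by GetLength() visit every cell.
for (int row = 0; row < rows; row++)
{
    for (int column = 0; column < columns; column++)
    {
        tempsGrid[row, column] = row * 10 + column;
    }
}

Console.WriteLine($"{rows}x{columns}, rank {rank}, last cell {tempsGrid[3, 2]}");
```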

Multidimensional arrays vs. Jagged arrays:


- float[,] tempsGrid; - comma means this is a 2-dimensional array of floats
- float[][] tempsGrid; - additional square brackets mean this is an array of arrays of floats

The multidimensional syntax applies only to arrays;

The jagged (nested) approach can be applied to any collection, for example List<List<float>>;
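The difference can be sketched as follows, assuming C# 9 or later with invented sample values: each row of a jagged array is its own array with its own length, and the same nesting idea works with other collections.

```csharp
using System;
using System.Collections.Generic;

// Multidimensional: one rectangular block, every row has the same length.
float[,] grid = new float[2, 3];

// Jagged: an array of arrays; each row is allocated separately and may differ in length.
float[][] jagged = new float[3][];
jagged[0] = new float[] { 1f };
jagged[1] = new float[] { 1f, 2f, 3f };
jagged[2] = new float[0];

int gridElements = grid.Length;          // 6 elements in total
int jaggedRows = jagged.Length;          // 3 rows (inner arrays)
int secondRowLength = jagged[1].Length;  // 3
float cell = jagged[1][2];               // 3: two separate index operations

// The same nesting works with any collection, for example a list of lists.
var nested = new List<List<float>>
{
    new List<float> { 1f },
    new List<float> { 1f, 2f },
};
nested[0].Add(5f);  // rows can even grow independently

Console.WriteLine($"{gridElements} {jaggedRows} {secondRowLength} {cell} {nested[0].Count}");
```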
