You are on page 1of 48

JAVA Development

Collections
Java Data Structures

Split into two hierarchies:

● Collections: hold elements of the same type


“Ana”, “John”, “Mary”

● Maps: hold elements accessible by a key


(key-value pairs, a dictionary)
“galaxy8” -> new Phone(“Samsung S8”, 3000),
“nokia8” -> new Phone(“Nokia 8”, 2500”)

“PI” -> 3.14159,


“e” -> 2.71828
Collections
Collection

interface Collection<E> extends Iterable<E>

- represents a generic collection, holding multiple elements of the same type

- base interface, declares some common methods:


● add(E), addAll(Collection<? extends E>)
● remove(Object), removeAll(Collection<?>), retainAll(...)
● contains(Object), containsAll(Collection<?>)
● size(), isEmpty(), clear(), toArray() ...

- no restrictions/guarantees regarding:
● order (element position)
● uniqueness
● sorting
List

interface List<E> extends Collection<E>


● An ordered collection: keeps insertion order, allows index-based access
● No uniqueness constraints - allows duplicates
● Adds specific methods:
○ get(int index): like arrays, can get elements by index
○ set(int index, E e): update element at a specific index, overwriting the old one
○ add(int index, E e): insert an element at a specific index
○ remove(int index): remove element at index
○ indexOf(Object), lastIndexOf(Object): find the index of first/last occurence of elem
○ subList(int fromIndex, int toIndex): returns part of the list
● Also inherits all methods of Collection, but declares more specific meaning for some
(considering the new properties like ordering): add(E) - adds element at end of the list, etc..
List - Implementations

Most commonly used implementations:

ArrayList<E>
● fast access by index
● slow add/remove operations
● backed by an array, needs a contiguous
memory area

LinkedList<E>
● slow access by index
● fast add/remove operations
● no need for contiguous memory area
● needs slightly more memory
List - Example

List<String> names = new ArrayList<>();


names.add("john");
names.add("mary");
names.add("anna");
names.add("mary"); //duplicates are allowed, so this will be added

System.out.println(names.size()); //4
System.out.println(names); //[john, mary, anna, mary] - nice toString

System.out.println(names.get(1)); //mary - index-based access, starts at 0

//accessing all elements - may use index:


for (int i = 0; i < names.size(); i++) System.out.println(names.get(i));
//also possible with simplified for:
for (String name : names) System.out.println(name);
List - Removal

List<String> names = new LinkedList<>(); //different impl

names.addAll(Arrays.asList("john", "mary", "anna", "mary"));


System.out.println(names); //[john, mary, anna, mary]

names.remove(0); //remove elem at index 0 (‘john’)

names.remove("mary"); //removes only the first occurrence!


System.out.println(names.contains("mary")); //true
System.out.println(names); //[anna, mary]

names.remove("mary"); //removes remaining occurrence


System.out.println(names.contains("mary")); //false now
Set

interface Set<E> extends Collection<E>

● A collection with unique elements (like sets in mathematics)


Set<Integer> ints = new HashSet<>();
ints.add(1);
ints.add(2);
ints.add(2); //duplicate, will not be added!
System.out.println(ints.toString()); //-> [1, 2]

● No ordering: not required to preserve insertion ordering (so elements may


seem “randomly ordered”); also no possibility for index-based access

● No extra methods compared to Collection<E>


Set - Implementations

Most commonly used implementations:

HashSet<E>
● Safest option, good performance
● Iteration order can be random
● Uses a HashMap internally (details to follow)

LinkedHashSet<E>
● In addition to a normal HashSet, also stores the elements in a double linked list
● Preserves insertion order (not required by Set interface, but happens due to the
actual implementation)
Set - HashSet example

Set<String> names = new HashSet<>();


names.add("john");
names.add("mary");
names.add("anna");
names.add("mary"); //will not be added! (duplicate, not allowed)

System.out.println(names); //[mary, john, anna] - undefined/random order


System.out.println(names.size()); //3

//names.get(1); //compile error -> no index-based access


System.out.println(names.contains("mary")); //true

//iteration possible with simplified for (in same random order)


for (String name : names) System.out.println(name); //mary john anna
Set - LinkedHashSet example

Set<String> names = new LinkedHashSet<>();


names.add("john");
names.add("mary");
names.add("anna");
names.add("mary"); //will not be added! (duplicate, not allowed)

System.out.println(names); //[john, mary, anna] - keeps insertion order

for (String name : names) System.out.println(name); //john mary anna


Set - Other implementations

TreeSet<E>
● elements are sorted (don’t keep insertion order, but are ordered by a specific
criteria - using either compareTo or with a custom Comparator)
● specific methods:
■ first, last
■ floor: find the greatest element lower or equal to the given argument
■ ceiling: find the lowest element greater or equal to the given argument
■ higher, lower: greatest/lowest elem strictly lower/greater than given arg
■ subSet, tailSet, headSet, descendingSet, pollFirst, pollLast, etc.

See docs at: https://docs.oracle.com/javase/8/docs/api/java/util/TreeSet.html


Set - TreeSet example

Set<String> names = new TreeSet<>();

names.add("john");
names.add("mary");
names.add("anna");
names.add("mary"); //duplicate (so not added)
names.add("dan");
names.add("emma");

System.out.println(names); //[anna, dan, emma, john, mary] - sorted alphabetically!


Collections - Common implementations
Collections - Choosing the right one

- Most commonly used:


ArrayList , HashSet

- You need a good reason to


pick something else...

HashSet
TreeSet
Collection - Other types

● Stack: last in, first out

● Queue: first in, first out


○ ArrayQueue

● Deque: operations at both ends (“double ended queue”)


○ ArrayDeque, LinkedList

● PriorityQueue: a queue in which elements have a priority


○ elements are not sorted
○ only interested in getting the element with the highest priority
○ e.g.: support tickets, E.R. patients
Collections - Utility class

java.lang.Collections - class with some static utility methods you may use/explore:
- sort - sort a collection by natural ordering of the elements
- binarySearch - searches for an element in a sorted collection (fast)
- min, max - find min/max element of a collection (according to their natural ordering)
- replaceAll - replace all occurrences of a specific element
- reverse, rotate, shuffle - reorder the collection in various ways
- swap - swap 2 elements of the collection
- indexOfSublist - find position of a sublist
- frequency - return frequency (count) of a given element in the collection

Notes:
- remember about Arrays.asList() method (convert an array to a List)
- check the constructors of common Set/List implementations - besides the default one (for an
empty collection), they usually have copy constructors (start with elements of existing collection)
Maps
Maps – Map interface

interface Map<K,V>

• A Map is a dictionary, storing key-value pairs


○ key-based access (not index-based)
○ keys must be unique
○ key and values can be of different types (K,V)

• Useful when you have to search, update or delete (more complex) elements
based on a (simpler) unique key
Maps – Map.Entry interface

interface Map<K,V> {
interface Entry<K,V> {…}

}

• Entry is a sub-interface of Map (may access its type by Map.Entry name)


• Used in some methods of Map
• Represents a key-value pair (encapsulating both parts in a single object)
○ contains methods to access the two parts (getKey, getValue, setValue)
Map – Methods (1)

- V get(Object key) - return the value for the specified key (if found)
- boolean containsKey(Object key) - returns true if map contains given key
- boolean containsValue(Object value) - returns true if map contains (at least) a key
with given value

- Set<Map.Entry<K, V>> entrySet() - returns a Set view with all entries


- Set<K> keySet() - returns a Set view with only the keys
- Collection<V> values() - returns a Collection view with only the values

- int size() - return number of entries


- boolean isEmpty() - return true if map is empty
Map – Methods (2)

- V put(K key, V value) - add a map entry (or update existing one)
- void putAll(Map<? extends K, ? extends V> map) - insert all entries of specified
map into this map
- V remove(K key) - remove an entry from map (and returns it)
- void clear() - removes all map entries

- V getOrDefault(Object key, V default) - returns the value of specified key, or


default if missing
- V putIfAbsent(K key, V value) - insert an entry to map only if key is not present
- V replace(K,V) - replace the value for some existing key
Maps – HashMap

• HashMap - implements the Map interface by using a hashtable

• Characteristics:
○ holds key-value pairs (unique keys)
○ maintains no order
○ based on hashing (needs correct hashCode()/equals() for keys)
○ nulls: allows one null key, multiple null values

https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html
Maps – Example

Map<Integer, String> m = new HashMap<>();


m.put(100, "Alex");
m.put(101, "Bogdan");
m.put(102, "Cristi");
m.put(102, "Catalin"); //overwrites value for existing key

m.remove(101);
for (Map.Entry e : m.entrySet()) {
System.out.println(e.getKey() + ": " + e.getValue());
}
//prints entries in random order
//-> 102: Catalin
//-> 100: Alex
Maps – LinkedHashMap

● LinkedHashMap - Hash table and linked list implementation of


the Map interface, with predictable iteration order

● Characteristics:
○ holds key-value pairs (unique keys)
○ maintains insertion order (unlike HashMap)
○ based on hashing (needs correct hashCode()/equals() for keys)
○ slightly slower than HashMap
○ nulls: allows one null key, multiple null values

https://docs.oracle.com/javase/8/docs/api/java/util/LinkedHashMap.html
Maps – Example

Map<Integer, String> m = new LinkedHashMap<>();


m.put(100, "Alex");
m.put(101, "Bogdan");
m.put(102, "Cristi");
m.put(102, "Catalin"); //overwrites value for existing key

m.remove(101);
for (Map.Entry e : m.entrySet()) {
System.out.println(e.getKey() + ": " + e.getValue());
}
//prints entries in insertion order
//-> 100: Alex
//-> 102: Catalin
Maps – TreeMap

• TreeMap - implements the Map interface by using a tree, storing


in an efficient way the entries in sorted order (by key)

• Characteristics:
○ holds key-value pairs (unique keys)
○ maintains entries in sorted order (unlike HashMap)
■ key comparison: see Comparable or Comparator later
○ slower than HashMap, LinkedHashMap (due to sorting)
○ nulls: null keys not allowed, allows null values

https://docs.oracle.com/javase/8/docs/api/java/util/TreeMap.html
TreeMap– Example

Map<Integer, String> m = new TreeMap<>();


m.put(102, "Cristi");
m.put(100, "Alex");
m.put(101, "Bogdan");

for (Map.Entry e : m.entrySet()) {


System.out.println(e.getKey() + ": " + e.getValue());
}
//prints entries sorted alphabetically by key
//-> 100: Alex
//-> 101: Bogdan
//-> 102: Cristi
Map – common implementations
Maps – HashTable

● HashTable - implements Map using a hashtable

● Characteristics:
○ holds key-value pairs (unique keys)
○ maintains no order
○ based on hashing (needs correct hashCode()/equals() for keys)
○ nulls: doesn’t allow null keys or values
○ synchronized (safe for multi-threaded access)

● Considered deprecated - to be avoided! (old, slow, confusing api;


use ConcurrentHashMap instead, when needed)
https://docs.oracle.com/javase/8/docs/api/java/util/Hashtable.html
Enums
Enums - declaration

● enum - a special Java type used to define small collections of constants


○ is a special kind of Java class
○ all enums extend java.lang.Enum (and this cannot be changed/overridden)
○ besides constants, they may optionally contain fields, methods etc.

- Declaration: using ‘enum’ instead of class/interface, and giving the list of possible values:
enum Level {
HIGH, MEDIUM, LOW
}

- Referring the values:


Level level = Level.HIGH;
Enums - usage

- Comparing values - ok with “==”, no need for .equals() :

Level level = … ; //assign some Level constant to it

if (level == Level.HIGH) {/*do something*/}


else if (level == Level.MEDIUM) {/*do something else*/}
else if (level == Level.LOW) {/*else do something else*/}

- Iterating over all defined values:

for (Level level : Level.values()) {


System.out.println(level);
}
Enums - example

public enum Level { …


Level level = Level.HIGH;
HIGH(3), //calls constructor with value 3 System.out.println(level.getLevelCode());
MEDIUM(2), //calls constructor with value 2
LOW(1) //calls constructor with value 1 //prints out the value 3
; //optional, needed if fields/methods follow //which is the value of levelCode field
//for the enum constant HIGH
private final int levelCode;

Level(int levelCode) {
this.levelCode = levelCode;
}

public int getLevelCode() {


return this.levelCode;
}
}
EnumMap

class EnumMap < K extends Enum<K>, V > extends AbstractMap<K, V>

- EnumMap - a specialized Map implementation


- for keys of type enum (only)
- with better performance than HashMap

- commonly used constructor:


- EnumMap(Class<K> keyType) - creates an empty enum map
with the specified key type (K needs to extend Enum)
EnumMap – Example

import java.util.*;

class EnumMapExample {

//create an enum
enum Days { Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday }

public static void main(String[] args) {


//create and populate enum map
EnumMap<Days, String> map = new EnumMap<>(Days.class); //enum is just a kind of class
map.put(Days.Thursday, "4");
map.put(Days.Friday, "5");
map.put(Days.Saturday, "6");
map.put(Days.Sunday, "7");
map.put(Days.Monday, "1");
map.put(Days.Tuesday, "2");
map.put(Days.Wednesday, "3");

//print the entries of enum map, in the order of the constants in enum definition
for (Map.Entry m : map.entrySet()) { System.out.println(m.getKey() + " " + m.getValue()); }
}
}
Object.equals()
Object.equals()

Testing the equality of two values/objects:


- with ‘==’ operator -> works well only for primitives!

- with equals() - a method defined in base class Object, so inherited by all objects:

public class Object {


public boolean equals(Object obj) {
return this == obj; //default impl: same as “==” operator
}
}

You may override it in your classes, to make it test the logical equality of your objects!

Tip: IntelliJ IDEA can generate it: right click / Generate… / equals() and hashCode()
Object.equals() - Example

class Point {
private double x, y;
Point(int x, int y) { this.x = x; this.y = y; }

@Override
public boolean equals(Object obj) { //note the type: Object, not Point!
if (!(obj instanceof Point)) { return false; } //need to be of same type
Point that = (Point) obj; //may cast now from Object to specific type
return this.x == that.x && this.y == that.y; //check to have fields with same values
}
}

//using it to compare some Point instances:


Point origin = new Point(0, 0);
Point p1 = getSomePoint(…);
if (origin.equals(p1)) {…}
Object.equals() - Gotchas

Usage rule:
- == should be used for primitives (+ enums)
- equals() should always be used for reference types
- including String
- including wrapper types (Long, Integer, Double…)

Note: be careful when calling .equals(), as the object might be null !


- may use instead the static helper function Objects.equals() that also checks for null
String str = … ;
if (str == "some string") {} //Incorrect (not a primitive)
if (str.equals("some string")) {} //Correct, unsafe with null
if ("some string".equals(str)) {} //Correct, null safety, but harder to read
if (Objects.equals(str, "some string")) {} //Correct and safe
Object.equals() - Contract

equals() contract - the overridden equals() must be:


- reflexive: x.equals(x) == true for any x
- symmetric: x.equals(y) == y.equals(x) for any x,y
- transitive: x.equals(y) and y.equals(z) => x.equals(z)
- consistent: the value of equals() should change only if a property checked in
equals() changes (no randomness allowed)

Notes:
- pay attention to keeping equals() up to date (on changes of class structure/fields)
- pay attention to inheritance scenarios - if a base class overrides equals, derived
classes most probably need to override it too! (parent impl is not good enough)
Object.hashCode()
Object.hashCode()

- hashCode() - method inherited from Object by all reference types; may be overridden...
- return value:
- should be an integer value representing the current instance of the class
- the returned value should be consistent with definition of equality for that class

hashCode() contract - is closely related to equals() method:


- internal consistency: hashCode value may only change if a field used in equals() changes
- equals consistency: objects which are equal to each other must have the same hashCode
- collisions allowed: unequal objects may have the same hashCode
- for performance reasons, its desirable to have a hashCode with minimal collisions (and also fast)

Rule: equals() and hashCode() should always be overridden together!


Object.hashCode()

Why is this important?


- hash-based collections (HashMap, HashSet..) require a correct hashCode !
- if hashCode() breaks the contract, these collections may malfunction!

Implementation details:
- HashSet / HashMap(for keys): elements organized in a series of “buckets”, elements of a bucket having the
same hashCode value (may be multiple ones if collisions occurred)
- when putting a new object in a HashSet/HashMap:
- its hashCode value is computed, and based on it the element is added to the right bucket
- when getting an element from the collection, or checking existence:
- the hashCode of the given object is computed, and then based on this value the right bucket is
checked (and after this equals() will also be called)
- if hashCode is wrong/missing/breaks contract: may make collection actions wrong/unpredictable!
(failing to correctly add new elements, like allowing duplicates, or failing to get existing elements…)
Object.hashCode() - Example

class Cat { Cat c1 = new Cat("Pixel");


private String name; Cat c2 = new Cat("Pixel");
Cat(String name) { this.name = name; } Cat c3 = new Cat("Paka");

@Override System.out.println(c1.equals(c2)); //true


public String toString() { System.out.println(
return "Cat{name="+name+'}'; c1.hashCode() == c2.hashCode()); //false!!
}
Set<Cat> cats = new HashSet<>();
cats.add(c1); cats.add(c2); cats.add(c3);
@Override
public boolean equals(Object o) { //=> 3 cats in Set now, 2 of them equal!!
if (this == o) return true; // (but with different hashcode)
if (o==null||getClass()!=o.getClass()) System.out.println("cats:"+cats.size()); //3!!
return false;
return Objects.equals(name,((Cat) o).name); System.out.println(cats);
//=> [Cat{name=Pixel}, Cat{name=Paka}, Cat{name=Pixel}]
}
//but NO hashCode() override!! System.out.println("contains Pixel?: " +
} cats.contains(new Cat("Pixel"))); //false!
Questions?
Extra reading:
- http://tutorials.jenkov.com/java-collections/index.html
- https://www.geeksforgeeks.org/collections-in-java-2/
- https://en.wikipedia.org/wiki/Java_collections_framework
- https://www.baeldung.com/java-equals-hashcode-contracts

You might also like