Difference between revisions of "Collections"
|  (→Sets) | |||
| Line 136: | Line 136: | ||
| ** Sorted sets implement the SortedSet<E> interface. | ** Sorted sets implement the SortedSet<E> interface. | ||
| ** They have additional methods - first() and last() | ** They have additional methods - first() and last() | ||
| − | ** headSet(e) - returns all elements less than e,  | + | ** headSet(e) - returns all elements less than e (e is the headelement),   | 
| + | ** tailSet(e), - e is the tail element, returns all elements greater than or equal to e. | ||
| + | ** combining both is subSet(e1, e2) - includes e1, excludes e2 | ||
| * NavigableSet    | * NavigableSet    | ||
Revision as of 02:01, 29 August 2011
Collections
Contents
Equals and Hashcode
- equals()
- An equals() method must satisfy all the conditions below:
- Reflexive: self.equals(self) must always be true
- Symmetric: x.equals(y) means y.equals(x)
- Transitive: if x.equals(y) and y.equals(z) then x.equals(z)
- Consistent: multiple invocations of x.equals(y) must return same result
- null Comparision: obj.equals(null) should always be false
- hashCode() :
- Mainly used for storing and retrieving objects from hashed collections.
- First object's hashcode is used to figure out which hash bucket the object is in
- Then the equals() method is used to compare objects in the same hash bucket.
This means that:
- equal objects MUST have the same hashcode
- unequal objects can have the same (or different) hashcode (e.g. same bucket but objects are different)
- objects having different hashcodes MUST be unequal
Types of Collections
- There are four basic types of collections: Lists, Sets, Queues and Maps.
There are four flavours of Collections:
- Sorted -> Means that the order in the collection is determined according to the sort order i.e. properties of the object. Most common sort oder is the natural sort order. e.g. ascending for Integer objects, alphabetical for String objects.
Sort order is defined by implementing the Comparable<T> interface and defining the compareTo method. examples of Sorted collections are TreeMap and TreeSet. If you try to insert an object which is not comparable, it will fail.
- Unsorted -> An unsorted collection can be ordered or unordered. See below.
- Ordered -> Means that we can iterate through the collection in the same specific order every time. For e.g. all lists are ordered by index position. See example below:
import java.util.*;
class Foo implements Comparable<Foo> {
	private String str;
	
	public Foo(String str) {
		this.str = str;
	}
	
	public String toString() {
		return "Foo:" + str;
	}
	
	public int compareTo(Foo f) {
		return str.compareTo(f.str);
	}
}
public class Ordering {
	public static void main(String[] args) {
		List<String> list = new ArrayList<String>();
		list.add("D");
		list.add("A");
		list.add("C");
		list.add("B");
		
		for(Iterator<String> iter = list.iterator(); iter.hasNext();) {
			System.out.println(iter.next());
		}
		//Will print D,A,C,B each time
		
		List<Foo> fooList = new ArrayList<Foo>();
		fooList.add(new Foo("D"));
		fooList.add(new Foo("A"));
		fooList.add(new Foo("C"));
		fooList.add(new Foo("B"));
		
		for(Iterator<Foo> iter = fooList.iterator(); iter.hasNext();) {
			System.out.println(iter.next());
		}
		//Will print Foo:D,Foo:A,Foo:C,Foo:B each time
	}
}
So despite Foo having defined a comparator, ordering is performed according to the element's index position. LinkedHashSet is ordered by insertion, so the last element inserted is the last element in the LinkedHashSet.
This is an example of an ordered, but unsorted collection.
- Unordered ->
e.g. HashSet, iterating over a HashSet is not guaranteed. It might produced the same order many times, but is NOT guaranteed to do so every time. This is why it is unordered.
- The Comparable interface specifies the natural sort order.
- We can also define other sort orders using the Comparator interface.
Iteration
- An iterator can be used to traverse through a collection.
- An iterator can be used to remove values from the collection using the iterator.remove() method.
- The remove method removes the element that was returned by the last call to next().
- remove() will throw an IllegalStateException when a next() has not been called immediately prior to it.
 
- The enhanced for loop is also used to iterate through anything that is Iterable. Since collection is Iterable, it will work here.
- Note: Structural change to the collection during the for(:) loop or while an iterator is traversing the collection will result in a ConcurrentModificationException..
- So do not use collection.remove() instead use iterator.remove().
Implementation
- Set, List and Queue implement the interface Collection.
- Each one of the concrete collections has a constructor which accepts another Collection - so collections can be easily interchanged.
- The Collection interface implements basic methods such as:
- add/addAll, remove/removeAll,
- retainAll.
- contain/containsAll,
- size(), clear(), isEmpty()
- iterator() and toArray()
 
Sets
- Do not allow duplicates
Set interface models a mathematical set
- Assume a and b are sets.
- a.containsAll(b) is checking if b is a subset of a.
- a.addAll(b) is a union b
- a.removeAll(b) is a-b
- a.retainAll(b) is a intersection b
- a.clear is making a the empty set.
Implementations
- The HashSet implementation uses as Hashtable - so near constant time for most operations
- No ordering is preserved.
- Elements must implement equals() and hashCode()
- LinkedHashSet guarantees insertion order - choose it when frequent traversal of set is necessary. LinkedHashSet extends HashSet.
- Sorted Sets
- Sorted sets implement the SortedSet<E> interface.
- They have additional methods - first() and last()
- headSet(e) - returns all elements less than e (e is the headelement),
- tailSet(e), - e is the tail element, returns all elements greater than or equal to e.
- combining both is subSet(e1, e2) - includes e1, excludes e2
 
- NavigableSet
- NavigableSet extends the SortedSet interface with navigation methods to find the closest matches for specific search targets.
- NavigableSet is the prefered choice when a sortedset is required.
- Two new methods added are:
- E pollFirst() - removes and returns the first element.
- E pollLast() - removes and returns the last element.
- All the range-view operations such as headSet(), tailSet and subSet() have boolean flags to toggle whether the bound e is included or not. (The bounds are excluded by default in SortedSet)
- Then there are closest matches methods
- ceiling(e) return least element that is greater than or equal to e and floor(e) - greatest element less than or equal to e. Note: Suppose e is contained in the set - e itself will be returned !
- higher(e) and lower(e) - difference from the ceiling and floor is they return values strictly higher and lower than e.
 
 
- TreeSet implements NavigableSet - so it is also a SortedSet.
public class NavigableSetTest {
	public static void main(String[] args) {
		
		NavigableSet<Integer> ns = new TreeSet<Integer>();
		int[] ia = new int[] {1,2,3,4,6,7,8,9,10};
		
		for(int i : ia) {
			ns.add(i);
		}
		
		System.out.println(ns.headSet(6, true)); //prints 1,2,3,4,6
		
		System.out.println(ns.higher(1)); //prints 2
		
		System.out.println(ns.ceiling(1)); //prints 1
		
		System.out.println(ns.ceiling(5)); //prints 6
		
		System.out.println(ns.pollLast()); //prints 10
		 
		System.out.println(ns.pollFirst()); //prints 1
		
		System.out.println(ns); //prints [2, 3, 4, 6, 7, 8, 9]
		
	}
}
Lists
- Lists maintain their elements in insertion order.
- List offers a ListIterator<E> which is a bidirectional Iterator.
Implementation
- ArrayList and Vector are implemented using resizable arrays and provide fast random access.
- The Vector is threadsafe.
- LinkedList is a doubly-linked list. Insertions and Deletions are very efficient. Access is in linear time.
Queues
- A queue is a collection that maintains elements in processing order. The Queue<E> interface specifies a general contract for queues.
- A Queue implementation provides the policy that yields the next element for processing - the head position specifies where the next element to be processed can be obtained.
- A queue is used by removing the head element according to a processing order.
- Methods for addition - add() and offer() - offer does not throw IllegalStateException when Q is full as opposed to add.
- For removal - poll() and remove() - both method retrieve the head and remove it. Difference if Q is empty poll() returns null but remove() will throw NoSuchElementException.
- peek() allows us to examine the head of the Q without removing it.
Implementation
- LinkedList - works as a FIFO Queue. Unless bidirectional traversal is necessary - other Q implementations should be considered.
- PriorityQueue is used to order elements according to their priority. Implementation is done using a priority heap.
- The priority order is defined either by the natural ordering of elements OR by a comparator. If several elements have same priority, one is chosen randomly.
- This implies that elements of a PQ have to implement Comparable or a comparator must be used.
- Elements of a PQ are NOT SORTED.. Only removal is in priority order - not traversal. The iterator can return elements in any order.
- e.g. if Integers are used as elements, they are natural ordered in an ascending fashion - so the smallest integer will have the highest priority.
Difference. See the below code example, where a FIFO Queue and a PriorityQueue are populated and elements are retrieved from the head of the queue, using poll().
public class QueueTest {
	public static void main(String[] args) {
		
		Queue<Integer> fifo = new LinkedList<Integer>();
		fifo.offer(2);
		fifo.offer(3);
		fifo.offer(1);
		
		Integer h = null;
		
		while((h = fifo.poll()) != null) {
			System.out.println(h);
		}
		//prints out elements in the FIFO order i.e. 2, 3, 1
		
		Queue<Integer> pQ = new PriorityQueue<Integer>();
		
		pQ.offer(2);
		pQ.offer(3);
		pQ.offer(1);
		
		while((h = pQ.poll()) != null) {
			System.out.println(h);
		}
		//prints out elements according to priority i.e. 1, 2, 3
		//in this case since Integers are used - natural ordering is used as priority
		
	}
}
 Deque 
- Deque extends the Queue interface to allow double-ended queues (deque).
- Operations are allowed at head and the tail - elements can be inserted or removed at either end.
- Used as FIFO queue - add at tail, remove at head.
- Used as LIFO queue (stack) - add and remove at the same end.
- offerLast() is same as Q's offer(). There is also an offerFirst() - insert at head.
- pollFirst() is same as Q's poll(). Also is a pollLast() - remove at tail.
- Also addFirst, removeLast etc, peekFirst(), ...
- push() - equal to addFirst() and pop() - removeFirst() (Stack convenience methods)
- getFirst()/Last() - throw NoSuchElementException() if Deque is empty. peek doesnt.
- ArrayDeque and LinkedList implement the Deque interface.
- ArrayDeque has better performance than the LinkedList for FIFO queues and better than java.util.Stack for implementing Stacks.
Maps
- A map defines mappings from keys to values. Represented by java.util.Map<E>
- A key,value pair is called an entry.
- No duplicate keys.
- A map is NOT a Collection.
- Methods: usual put(), get(), containsKey(), containsValue()
- putAll(Map)
Collection View methods - different methods to get info from map.
- Changes in map are reflected in view and vice-versa!
- keySet() - returns a Set<K> of all the keys.
- values() - returns a Collection<V> of all values. This is not a set since values can be duplicate.
- entrySet() - returns a Set<Entry<K,V>>. This set will contain all the key-value pairs.
- Entry<K,V> represents a single key,value pair. It is an interface defined within Map. Has the following methods:
- getKey()
- getValue()
- setValue()
 
Implementations
- Four implementations - HashMap, LinkedHashMap, TreeMap and Hashtable
- HashMap and Hashtable are unordered maps.
- HashMap permits one null key and many null values and is not thread-safe.
- Adding/removing/finding an entry are in constant time.
- Hashtable - no null keys and values. Is thread-safe.
 
- LinkedHashMap implements an ordered map. Is a subclass of HashMap.
- Ordering is key-insertion order. The first key inserted will be the first key returned.
 
- TreeMap implements a sorted map.
 SortedMap<K,V> and NavigableMap<K,V> Interfaces
- SortedMap<E> extends Map<E> and provides functionality for sorted keys.
- Methods: Correspond to methods in SortedSet<E> interface.
- firstKey() and lastKey(). Like first() and last() methods corresponding to Set.
- headMap(), tailMap(), subMap() - return a SortedMap analogous to set methods.
- NavigableMap<E> extends SortedMap<E> - has navigation methods to find closest matches for specific search targets.
- Methods analogous to NavigableSet<E>:
- pollFirstEntry(), pollLastEntry(), firstEntry(), lastEntry()
- headMap(), tailMap() and subMap() add the boolean inclusive flag.
- then all the celing, floor, higher and lower methods. they are two variants one for entry and for key.
- e.g. ceilingEntry(key) - returns the least entry which is >= to key. ceilingKey(k) returns only the key.
- descendingMap() returns a reverse-order map.
- The TreeMap<K,V> implements NavigableMap<K,V> so it is navigablemap and sortedmap.
| For all "view" methods that take a from element and to element, the rule is that: the from element is always included but the to element is excluded. e.g. headSet(e) - will not include e - because e is the "to" element, whereas in tailSet(e) - e will be included because it is the from element. In navigable sets maps, inclusion/exclusion can be controlled by flags. | 
Working with Collections
- Two classes java.util.Collections and java.util.Arrays provide various utility methods to work on collections and arrays.
- All the utlity methods are public and static.
Sorting
- Collections.sort() takes only List. (Because, Set and Map have sorted versions) and an optional comparator.
- There are other methods which change the order of the list.
- reverse()
- rotate(l, distance) - +ve distance moves all the elements towards the end of the list, -ve towards the beginning of the list.
- shuffle() - randomly shuffles the element.
Sorting Arrays
- Arrays.sort() takes an array and an optional comparator.
- sort(a, fromIndex, toIndex) also provided. as usual from - inclusive, to - exclusive.
- Primitive arrays are sorted based on natural order, so no comparator is permitted for primitive sorts.
Searching
- Collections class provides binarySearch for searching for items in Lists.
- Why only Lists ? Because binarySearch returns the position/index, which is applicable only for List types as they used index-based access.
- Since it is a binary search, The List being searched MUST be sorted.
- Search must be performed using the same sort order of the List.
- Successful searches return the index of element that is found.
- Unsuccessful searches return the possible insert position of the search element as a negative number. The insert position will be = (-(insert position) -1).
                String[] str = {"B", "D", "F", "H", "J"};
		String[] search = {"A", "C", "E", "G", "I", "K"};
		
		Arrays.sort(str);
		
		for(String s : search) {
			System.out.println(Arrays.binarySearch(str, s));
		}
                //prints: -1, -2, -3, -4, -5, -6
- Collections also offers max and min methods for Collections whose elements are comparable.
Searching Arrays
- Arrays provides binarySearch method like collections.
- Array must be sorted, like a Collection
- Also a variant takes a half-open interval - with from and to indices.
- index is returned if found, else (-(index) -1 )
- if duplicates are there - any one of the duplicates can be returned.
Converting Arrays To Lists To Arrays
- Converting Arrays To Lists - The asList() method : This "joins" the array to the list:
	        String[] sa = {"a", "b", "c"};
		
		List<String> sl = Arrays.asList(sa);
		
		System.out.println("1. sa : " + Arrays.toString(sa));
                //prints: a, b, c
		System.out.println("1. sl : " + sl);
		//prints: a, b, c
		sa[1] = "bb";
		//changes to the array write-through to the string
		System.out.println("2. sa : " + Arrays.toString(sa)); 
                //prints: a, bb, c
		System.out.println("2. sl : " + sl);
                //prints: a, bb, c
	
		sl.set(1, "b");
		//similarly, changes to the list write-through to the array
		System.out.println("3. sa : " + Arrays.toString(sa));
                //prints a, b, c
		System.out.println("3. sl : " + sl);
                //prints a, b, c
- Converting Lists to Arrays - the crazy toArray() method
- There are two versions of toArray(). The first:
public Object[] toArray(); 
//Returns an object array which must be cast into a specific array type
String[] strArray = (String[]) strList.toArray();
- The second:
public <T> T[] toArray(T[] a)
//Returns an array of the type specified in the parameter.
//Advantage is that you avoid the explicit cast as in the first method.
//The method will allocate a new array, if the argument is too small to contain the list.
//If it's too large, the left-over elements will be padded with null ! 
//There are different ways to use this:
List<String> list = new ArrayList<String>();
String[] sA = list.toArray(new String[0]);
//Here an anonymous String[] is passed, just for type identification and the returned array 
//will be newly allocated with the list's elements.
String[] sa = new String[list.size()];
//Here the array is pre-allocated with the correct size.
list.toArray(sa);
//sa will be populated with the list's elements, no need to use the returned value.
- Collections has a useful Collections.reverseOrder() method that returns a reverse Comparator for a given's collections sort order.
- e.g. sort Integers in descending order:
Map<Integer> ascMap = new TreeMap<Integer>();
//Sort in descending order.
Collections.sort(ascMap, Collections.reverseOrder());
