Wednesday, July 6, 2011

JAVA:Fail fast and Fail safe Iteratotrs

A Fail Fast Iterator is an iterator that attempts to raise an error if the sequence of elements processed by the iterator is changed during iteration.
Fail-safe means, The ability of a system to fail but not produce a catastrophic result.

An iterator is considered fail-fast if it throws a ConcurrentModificationException under either of the following two conditions:

1. In multithreaded processing: if one thread is trying to modify a Collection while another thread is iterating over it.

2. In single-threaded or in multithreaded processing: if after the creation of the Iterator, the container is modified at any time by any method other than the Iterator's own remove or add methods.

Note in particular what is implied by the second condition: After we create a container's iterator, during a loop that iterates over the container we must only use the remove (and when applicable add) methods defined for the iterator and that we must NOT use the same methods defined for the container itself. To illustrate this point, suppose we declare and initialize a List in the following manner

List list = new ArrayList();

list.add("Peter");

list.add("Paul");

list.add("Mary");

Let's say we wish to iterate over this list. We'd need to declare a ListIterator as follows:

ListIterator iter = list.listIterator();

Having created this iterator, we could now set up a loop like:

while(iter1.hasNext()){

String str = iter1.next();

// do something with str

}

Because iter is fail-fast, we are not allowed to invoke List's add or remove methods inside the loop. Inside the loop, we are only allowed to use ListIterator's add and remove methods. This makes sense because it is the Iterator object that knows where it is in a List as the List is being scanned. The List object itself would have no idea of that.

The Iterators supported by all the work-horse container classes, such as ArrayList, LinkedList, TreeSet, and HashSet, are fail-fast. The Iterator type retrofitted to the older container class Vector is also fail-fast. For associative containers, such as HashMap and the older HashTable, the Iterator type for the Collections corresponding to either the keys or the values or the pairs are fail-fast with respect to the container itself. That means that even if you are iterating over, say, just the keys of the container, any illegal concurrent modifications to the underlying container would be detected.

One final note regarding iterators versus enumerations: It is also possible to use an Enumeration object returned by the elements() method for iterating over the older container types such as Vector. However, Enumerations do not provide a fail-fast method. On the other hand, the more modern Iterator returned by a Vector's iterator() and listIterator() methods are fail-fast. Hence, iterators are recommended over enumerations for iterating over the elements of the older container types.

A PRACTICAL APPROACH----------->Iterators can be designed so that they are fail fast or fail safe. Depending on the underlying implementation of the Iterator, a ConcurrentModificationException is thrown if the Collection is modified while Iterating over the data structure. It pays to understand how an Iterator will behave under both conditions. Lets try to implement fail fast Vs fail safe iterators of our own.

Our data structure for this example is pretty simple. It defines an interface that abstracts set and get operations on a structure. How the underlying classes handle invalid set operations or the size of the structure is implementation dependent

A data structure interface:

1
2
3
4
5
6
public interface Data<T> extends Iterable<T>
{
    int size();
    T getElement(int position);
    void setElement(int position, T t);
}

An underlying implementation of such an Interface, could be say an array of Integers whose size is fixed. Invalid indexes on bounds are not allowed and the internal structure will not grow or shrink. An implementation is given below

An array of integers implementing the interface:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
public class ArrayOfIntegers implements Data<Integer>
{
    private static int DEFAULT_SIZE = 1024;
    private int mods = 0;
 
    private Integer[] integers = new Integer[DEFAULT_SIZE];
 
    @Override
    public Integer getElement(int position)
    {
        checkRange(position);
        return integers[position];
    }
 
    @Override
    public void setElement(int position, Integer integer)
    {
        checkRange(position);
        mods++;
        integers[position] = integer;
    }
 
    @Override
    public int size()
    {
        return integers.length;
    }
 
    @Override
    public Iterator<Integer> iterator()
    {
        // return new FailFastIter();
        // return new NoFailIter();
        return null;
    }
 
    private void checkRange(int position)
    {
        if (position > DEFAULT_SIZE || position < 0)
        {
            throw new ArrayIndexOutOfBoundsException(position);
        }
    }
 
    // Iterator implementations go here as a private class
}

If we wanted to Iterate over the ArrayOfIntegers structure, there are 2 ways to do it. Either ensure that the underlying data structure Integer[] integers is not modified while we iterate over ArrayOfIntegers, or make a copy of Integer[] integers so that any changes made to the internal structure will not affect the caller in any way. Let us look at 2 private Iterator classes that we can place into the ArrayOfIntegers class that will help us achieve both flavors of Iteration

Fail fast:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
private class FailFastIter implements Iterator<Integer>
{
    int currentIndex = 0;
    int check = -1;
    int size = integers.length;
 
    public FailFastIter()
    {
        check = mods;
    }
 
    @Override
    public boolean hasNext()
    {
        checkForModification();
        if (currentIndex < size)
        {
            return true;
        }
        return false;
    }
 
    @Override
    public Integer next()
    {
        checkForModification();
        Integer result = integers[currentIndex];
        currentIndex++;
        return result;
    }
 
    @Override
    public void remove()
    {
        throw new UnsupportedOperationException();
    }
 
    private void checkForModification()
    {
        if (check != mods)
        {
            throw new ConcurrentModificationException();
        }
    }
}

The fail fast Iterator in this example, refers to the internal data structure directly. Every modification to the internal structure is tracked using the ‘mods‘ variable. Our iterator stores this value when it was initially created using the ‘check‘ variable. Both variables are compared every time the hasNext() or next() methods are called. If they are unequal then it means that the underlying structure was changed. This is when the code throws a ConcurrentModificationException

Fail safe:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
private class NoFailIter implements Iterator<Integer>
{
    int currentIndex = 0;
    Integer[] internal = Arrays.copyOf(integers, size());
    int internalSize = -1;
 
    public NoFailIter()
    {
        internalSize = internal.length;
    }
 
    @Override
    public boolean hasNext()
    {
        if (currentIndex < internalSize)
        {
            return true;
        }
        return false;
    }
 
    @Override
    public Integer next()
    {
        if (hasNext())
        {
            Integer result = internal[currentIndex];
            currentIndex++;
            return result;
        }
        else
        {
            throw new NoSuchElementException();
        }
    }
 
    @Override
    public void remove()
    {
        throw new UnsupportedOperationException();
    }
}

The fail safe iterator makes a copy of the internal array data structure and uses it to iterate over the elements. This prevents any concurrent modification exceptions from being thrown if the underlying data structure changes. Of course, the overhead of copying the entire array is introduced. Both implementation can be tested using a program

Iterator test for both iterators:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
public class IterTest
{
    public static void main(String... args)
    {
        new IterTest().go();
    }
 
    public void go()
    {
        Data<Integer> dataArray = new ArrayOfIntegers();
        for (int iCounter = 0; iCounter < 100; iCounter++)
        {
            dataArray.setElement(iCounter, iCounter);
        }
        Iterator<Integer> iterator = dataArray.iterator();
        while (iterator.hasNext())
        {
            System.out.println(iterator.next());
            dataArray.setElement(1, 12);
        }
    }
}

By changing the Iterator implementation that the iterator() method returns, this test program will either throw a ConcurrentModificationException or iterate over all the elements without knowing what changed under the hood.

It makes sense to throw ConcurrentModificationException from an Iterator in certain cases. Under other scenarios you just don’t care if the Iterator will fail or survive an internal structure change. Taking a leaf out of the java Collection API, take a look at the Iterator behavior of an ArrayList Vs that of a CopyOnWriteArrayList.

Iterator test for 2 iterators within the JDK API:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
public class IterTest2
{
    public static void main(String... args)
    {
        new IterTest2().go();
    }
 
    public void go()
    {
        List<String> arList = new ArrayList<String>();
        List<String> copyList = new CopyOnWriteArrayList<String>();
        populate(arList);
        populate(copyList);
        iterate(arList);
        iterate(copyList);
    }
 
    private void populate(List<String> list)
    {
        for (int iCounter = 0; iCounter < 100; iCounter++)
        {
            list.add(new Integer(iCounter).toString());
        }
    }
 
    private void iterate(List<String> list)
    {
        try
        {
            for (String x : list)
            {
                System.out.println(x);
                list.add(x);
            }
        }
        catch (RuntimeException e)
        {
            e.printStackTrace();
        }
    }
}

The CopyOnWriteArrayList does not fail when the internal structure changes, but ArrayList does. This example better highlights the problem of fail fast Vs fail safe iterators. The next time your write a custom iterator for your classes, consider this decision.

No comments:

Post a Comment