Java lovers: Talk about the JDK List-ArrayList, Vector, LinkedList

to facilitate developers, JDK provides a set of primary data structure implementation, such as List, Map, etc. Comin talk about the List interface.

List some columns interface implementations, the most important thing is the most common of these three: ArrayList, Vector, LinkedList.

JDK in three classes defined:

public class ArrayList<E> extends AbstractList<E> 
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable 
 
public class Vector<E> 
    extends AbstractList<E> 
    implements List<E>, RandomAccess, Cloneable, java.io.Serializable 
 
public class LinkedList<E> 
    extends AbstractSequentialList<E> 
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable

from the three class definition you can see some information:

three AbstractList directly implements this abstract class
ArrayList and Vector both implements RandomAccess interface, LinkedList No, this is what does that mean? In the JDK, RandomAccess interface is an empty interface, so it has no real meaning, is a marker, marking this class supports fast random access, so, arrayList and vector supports random access, but does not support the LinkedList
serializbale interface table name, and they all support serialization

talk about this in detail below three List implementation.

three inside, ArrayList and Vector using an array implementation, the equivalent of the operation of the array package. This is why they are able to support fast random access reasons, say a, JDK implementation of all array-based data structures are able to support fast random access.

ArrayList and Vector realization of almost all use the same algorithm, their main difference is the ArrayList does not synchronize to any one way to do so is not thread safe; while most of the Vector method has done a thread synchronization is thread safe.

LinkedList using a two-way circular linked list data structure. Because it is based on the list, so there is no way to achieve random access, only sequential access, which also officially it does not implement RandomAccess interface causes.

officially as ArrayList, Vector and LinkedList with the data structure is different, they are destined for a completely different scene.

By reading these type of source, we can see them realize the difference. ArrayList and Vector essentially the same, we took the ArrayList and LinkedList do comparison.

add an element to the end

ArrayList in the add method to achieve the following:

1     public boolean add(E e) { 
2         ensureCapacityInternal(size + 1);  // Increments modCount!! 
3         elementData[size++] = e; 
4         return true; 
5     }

this method to do two things, first ensure that the array is big enough, and then added to the end of the array elements and to be completed after the + + makes the size +1.

can be seen from this code, if the array is large enough, then just add the array operation is O (1) performance and very efficient.

a look at ensureCapacityInternal implementation of this method:

 1     private void ensureCapacityInternal(int minCapacity) { 
 2         modCount++; 
 3         // overflow-conscious code 
 4         if (minCapacity - elementData.length > 0) 
 5             grow(minCapacity); 
 6     } 
 7  
 8     private void grow(int minCapacity) { 
 9         // overflow-conscious code 
10         int oldCapacity = elementData.length; 
11         int newCapacity = oldCapacity + (oldCapacity >> 1); 
12         if (newCapacity - minCapacity < 0) 
13             newCapacity = minCapacity; 
14         if (newCapacity - MAX_ARRAY_SIZE > 0) 
15             newCapacity = hugeCapacity(minCapacity); 
16         // minCapacity is usually close to size, so this is a win: 
17         elementData = Arrays.copyOf(elementData, newCapacity); 
18     }

can be seen, if the array is not enough space, then this method will do the expansion arrays and array copy operations, see line 11, JDK using the shift operator for expansion calculation >> 1 indicates a right shift In addition to 2, so that the expansion newCapacity 1.5 times the original.

PS: Here's the code is implemented in JDK1.7, JDK1.7 for 1.6 has been optimized a lot of code, such as expansion of the code above paragraph, the first 11 rows in JDK1.6 is a direct addition to two, apparently , shift operators to be more efficient.

a look at the LinkedList add method:

 1     public boolean add(E e) { 
 2         linkLast(e); 
 3         return true; 
 4     } 
 5  
 6     void linkLast(E e) { 
 7         final Node<E> l = last; 
 8         final Node<E> newNode = new Node<>(l, e, null); 
 9         last = newNode; 
10         if (l == null) 
11             first = newNode; 
12         else 
13             l.next = newNode; 
14         size++; 
15         modCount++; 
16     }

1         Node(Node<E> prev, E element, Node<E> next) { 
2             this.item = element; 
3             this.next = next; 
4             this.prev = prev; 
5         }

add code can be seen from this, LinkedList due to the use of the list, so do not need for expansion, linked directly to the element to the last, the precursor of the new element before the last element pointing and pointing to the last element of the new element it ok. It is also an O (1) performance.

test:

 1     public static void main(String[] args) { 
 2         // TODO Auto-generated method stub 
 3         long begin = System.currentTimeMillis(); 
 4          
 5 //        List<Object> list = new ArrayList<Object>(); 
 6         List<Object> list = new LinkedList<Object>(); 
 7         Object obj = new Object(); 
 8         for(int i=0; i<50000; i++){ 
 9             list.add(obj); 
10         } 
11          
12         long end = System.currentTimeMillis(); 
13         long time = end - begin; 
14         System.out.println(time+""); 
15  
16     }

respectively ArrayList and LinkedList to do at the end of the add operation, circulation 50,000 times, ArrayList consuming 6ms, while LinkedList consuming 8ms, which is due to LinkedList need more time in add object creation and assignment.

in any position to insert elements

ArrayList in the realization of the following:

1     public void add(int index, E element) { 
2         rangeCheckForAdd(index); 
3  
4         ensureCapacityInternal(size + 1);  // Increments modCount!! 
5         System.arraycopy(elementData, index, elementData, index + 1, 
6                          size - index); 
7         elementData[index] = element; 
8         size++; 
9     }

this code, first check the array capacity expansion capacity is not enough first, and then the index after the next move an array, and finally put a new element at the index position. Since the array is a contiguous memory space, so inserted in any position, can lead to the subsequent move an array after the situation needs to be done once the array copy operation, obviously, if there are a large number of randomly inserted, then the overhead of copying the array will be great, and insert more forward, the greater the array replication overhead.

LinkedList in the realization of:

 1     public void add(int index, E element) { 
 2         checkPositionIndex(index); 
 3  
 4         if (index == size) 
 5             linkLast(element); 
 6         else 
 7             linkBefore(element, node(index)); 
 8     } 
 9  
10     void linkBefore(E e, Node<E> succ) { 
11         // assert succ != null; 
12         final Node<E> pred = succ.prev; 
13         final Node<E> newNode = new Node<>(pred, e, succ); 
14         succ.prev = newNode; 
15         if (pred == null) 
16             first = newNode; 
17         else 
18             pred.next = newNode; 
19         size++; 
20         modCount++; 
21     }

this code, take to the original node at index precursor into the precursor of the new node, while the original index into a new node after the flooding, thus completing the insertion of the new node. This is the list of the advantages, there is no data copying operation, performance, and the last inserted is the same.

test an extreme case, each element in the forefront insert:

 1     public static void main(String[] args) { 
 2         // TODO Auto-generated method stub 
 3         long begin = System.currentTimeMillis(); 
 4          
 5 //        List<Object> list = new ArrayList<Object>(); 
 6         List<Object> list = new LinkedList<Object>(); 
 7         Object obj = new Object(); 
 8         for(int i=0; i<50000; i++){ 
 9             list.add(0,obj); 
10         } 
11          
12         long end = System.currentTimeMillis(); 
13         long time = end - begin; 
14         System.out.println(time+""); 
15  
16     }

test results are: ArrayList takes 1400ms, while consuming only LinkedList 12ms. As can be seen, when the random insertion, the performance difference between the two is quite clear.

summary, from the above source code analysis and test results can be seen to achieve these three List some typical application scenarios, if often done on an array of random insertion operation, particularly in the more forward inserted, then the performance advantages of LinkedList it is very obvious, but if just inserted at the end, then ArrayList is more an advantage, if you need thread safety, the non Vector perfectly.

Java lovers

2013年8月21日星期三

Talk about the JDK List-ArrayList, Vector, LinkedList

没有评论:

发表评论