April 21, 2018

面试官逼问系列之:List

本章介绍java.util.*包和java.util.concurrent.*包下的列表类,即java.util.Collection接口的实现类。

Collection接口的实现类们

这里只讲解列表类ArrayList、LinkedList、CopyOnWriteArrayList;
集合类HashSet、CopyOnWriteArraySet;
队列类ConcurrentLinkedQueue、ArrayBlockingQueue、LinkedBlockingQueue;

Collection接口定义了如下方法

1. ArrayList

ArrayList就是典型的java版本顺序表

                  
public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
...

    //数组
    private transient Object[] elementData;

    //数组大小
    private int size;
                  
                

数组扩容函数

                  
public void ensureCapacity(int minCapacity) {
    if (minCapacity > 0)
        ensureCapacityInternal(minCapacity);
}

private void ensureCapacityInternal(int minCapacity) {
    modCount++;
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    // 最少扩充为1.5倍
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}
                  
                

向数组中添加元素

                  
public void add(int index, E element) {
    // 检查数组越界
    rangeCheckForAdd(index);

    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // index以后的元素后移
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}
                  
                

删除数组指定位置元素

                  
public E remove(int index) {
    rangeCheck(index);

    modCount++;
    E oldValue = elementData(index);

    int numMoved = size - index - 1;
    if (numMoved > 0)
        // index以后的元素前移
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // Let gc do its work
    return oldValue;
}
                  
                

2. LinkedList

LinkedList就是典型的java版本的双向链表

                  
public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{
    transient int size = 0;

    //头结点
    transient Node<E> first;

    //尾结点
    transient Node<E> last;
                  
                

清空链表

                  
public void clear() {
    for (Node<E> x = first; x != null; ) {
        Node<E> next = x.next;
        x.item = null;
        x.next = null;
        x.prev = null;
        x = next;
    }
    //头尾相连
    first = last = null;
    size = 0;
    modCount++;
}
                  
                

向链表中添加元素

                  
public boolean add(E e) {
    linkLast(e);
    return true;
}
...

void linkLast(E e) {
    final Node<E> l = last;
    //Node<>(l, e, null)的3个参数依次为prev, element, next
    //放入链表尾部
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}
                  
                

删除链表指定位置元素

                  
public E remove(int index) {
    //检查越界
    checkElementIndex(index);
    return unlink(node(index));
}
...

//找到该位置Node,也就是链表查找比数组慢的原因
Node<E> node(int index) {

    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}
...

E unlink(Node<E> x) {
    // assert x != null;
    final E element = x.item;
    final Node<E> next = x.next;
    final Node<E> prev = x.prev;

    //删除中间Node,将左右Node连接
    if (prev == null) {
        first = next;
    } else {
        prev.next = next;
        x.prev = null;
    }

    if (next == null) {
        last = prev;
    } else {
        next.prev = prev;
        x.next = null;
    }

    x.item = null;
    size--;
    modCount++;
    return element;
}
                  
                

3. CopyOnWriteArrayList

向数组中添加删除元素,先用ReentrantLock锁住,然后copy出新数组进行插入删除操作,原数组不做任何改变,就不会影响读取了。

                  
public void add(int index, E element) {
    final ReentrantLock lock = this.lock;
    //先锁住
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        if (index > len || index < 0)
            throw new IndexOutOfBoundsException("Index: "+index+
                                                ", Size: "+len);
        Object[] newElements;
        int numMoved = len - index;
        if (numMoved == 0)
            newElements = Arrays.copyOf(elements, len + 1);
        else {
            //copy出新数组进行插入操作,原数组不做任何改变,就不会影响get()方法了
            newElements = new Object[len + 1];
            System.arraycopy(elements, 0, newElements, 0, index);
            System.arraycopy(elements, index, newElements, index + 1,
                             numMoved);
        }
        newElements[index] = element;
        //插入后将数组指向copy出的新数组
        setArray(newElements);
    } finally {
        lock.unlock();
    }
}
                  
                

4. HashSet

HashSet其实基于HashMap做了一层封装,HashMap里面放键值对,HashSet放的是对象。

                  
public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable
{

    private transient HashMap<E,Object> map;

    // PRESENT用来表示HashMap key-value的value
    private static final Object PRESENT = new Object();

    public HashSet() {
        map = new HashMap<>();
    }
    ...

    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

    public boolean remove(Object o) {
        return map.remove(o)==PRESENT;
    }
    ...
                  
                

5. CopyOnWriteArraySet

CopyOnWriteArraySet基于CopyOnWriteArrayList实现,CopyOnWriteArraySet具有以下特性:
1. 它最适合于Set大小通常保持很小,只读操作远多于可变操作的情景。
2. 它是线程安全的。
3. 因为通常需要复制整个基础数组,所以可变操作(add()、set() 和 remove() 等等)的开销很大。
4. 迭代器支持hasNext(), next()等不可变操作,但不支持可变 remove()等操作。
5. 使用迭代器进行遍历的速度很快,并且不会与其他线程发生冲突。在构造迭代器时,迭代器依赖于不变的数组快照。

                  
public class CopyOnWriteArraySet<E> extends AbstractSet<E>
        implements java.io.Serializable {

    private final CopyOnWriteArrayList<E> al;

    public CopyOnWriteArraySet() {
        al = new CopyOnWriteArrayList<E>();
    }
    ...

    public boolean contains(Object o) {
        return al.contains(o);
    }

    public boolean remove(Object o) {
        return al.remove(o);
    }

    public boolean add(E e) {
        return al.addIfAbsent(e);
    }
                  
                

6. Queue

(1)分为没有实现的阻塞接口的:
  * PriorityQueue
  * ConcurrentLinkedQueue
(2)实现阻塞接口BlockingQueue的:
  * ArrayBlockingQueue :一个由数组支持的有界队列。
  * LinkedBlockingQueue :一个由链接节点支持的可选有界队列。
  * PriorityBlockingQueue :一个由优先级堆支持的无界优先级队列。
  * DelayQueue :一个由优先级堆支持的、基于时间的调度队列。
  * SynchronousQueue :一个利用BlockingQueue接口的简单聚集(rendezvous)机制。

ConcurrentLinkedQueue

ConcurrentLinkedQueue是典型的基于链表实现的队列,同时基于CAS实现同步操作,以下代码表示该队列结构和链表结点类型:

                  
public class ConcurrentLinkedQueue<E> extends AbstractQueue<E>
        implements Queue<E>, java.io.Serializable {

    //Node结构体声明
    private static class Node<E> {
        volatile E item;
        volatile Node<E> next;
        ...
    }

    //头结点
    private transient volatile Node<E> head;

    //尾结点
    private transient volatile Node<E> tail;
                  
                

ConcurrentLinkedQueue的入队列操作使用了CAS算法,CAS无锁算法的C实现如下:

                  
int compare_and_swap (int* reg, int oldval, int newval) 
{
  ATOMIC();
  int old_reg_val = *reg;
  if (old_reg_val == oldval) 
     *reg = newval;
  END_ATOMIC();
  return old_reg_val;
}
                  
                

ConcurrentLinkedQueue的入队列操作。

                  
public boolean offer(E e) {
    checkNotNull(e);
    final Node<E> newNode = new Node<E>(e);

    for (Node<E> t = tail, p = t;;) {
        Node<E> q = p.next;
        if (q == null) {
            // 通过CAS方法更新最后结点的next属性
            if (p.casNext(null, newNode)) {
                // Successful CAS is the linearization point
                // for e to become an element of this queue,
                // and for newNode to become "live".
                if (p != t) // hop two nodes at a time
                    casTail(t, newNode);  // Failure is OK.
                return true;
            }
            // 与另外的线程CAS竞争失败,重新读取next结点
        }
        else if (p == q)
            // We have fallen off list.  If tail is unchanged, it
            // will also be off-list, in which case we need to
            // jump to head, from which all live nodes are always
            // reachable.  Else the new tail is a better bet.
            p = (t != (t = tail)) ? t : head;
        else
            // Check for tail updates after two hops.
            p = (p != t && t != (t = tail)) ? t : q;
    }
}
...

/**
 * compareAndSwapObject(Object var1, long var2, Object var3, Object var4)
 * var1 操作的对象
 * var2 内存值
 * var3 旧的预期值
 * var4 更新值
 */
boolean casNext(Node<E> cmp, Node<E> val) {
    return UNSAFE.compareAndSwapObject(this, nextOffset, cmp, val);
}
                  
                

ArrayBlockingQueue

ArrayBlockingQueue是典型的基于数组实现的队列,同时使用ReentrantLock实现同步操作。

                  
public class ArrayBlockingQueue<E> extends AbstractQueue<E>
      implements BlockingQueue<E>, java.io.Serializable {

  //数组
  final Object[] items;

  //指向队头
  int takeIndex;

  //指向队尾的下一个位置
  int putIndex;

  //数组长度
  int count;

  //入队列和出队列共用一把锁
  final ReentrantLock lock;
  /** Condition for waiting takes */
  private final Condition notEmpty;
  /** Condition for waiting puts */
  private final Condition notFull;

  //ArrayBlockingQueue是有界的队列
  final int inc(int i) {
      return (++i == items.length) ? 0 : i;
  }

  //ArrayBlockingQueue是有界的队列
  final int dec(int i) {
      return ((i == 0) ? items.length : i) - 1;
  }
                  
                

在这里发现一个很有意思的细节:对于泛型的使用,java集合类都这样来使用泛型

                  
@SuppressWarnings("unchecked")
static <E> E cast(Object item) {
    return (E) item;
}

final E itemAt(int i) {
    return this.<E>cast(items[i]);
}
                  
                

入队列操作:很简单的操作,先锁住-->入队列-->释放锁

                  
public boolean offer(E e) {
    checkNotNull(e);
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        if (count == items.length)
            return false;
        else {
            insert(e);
            return true;
        }
    } finally {
        lock.unlock();
    }
}
                  
                

LinkedBlockingQueue

LinkedBlockingQueue是典型的基于链表实现的队列,因为其对于入队列和出队列分别采用了独立的锁来控制数据同步,这也意味着在高并发的情况下生产者和消费者可以并行地操作队列中的数据,以此来提高整个队列的并发性能。

                  
public class LinkedBlockingQueue<E> extends AbstractQueue<E>
        implements BlockingQueue<E>, java.io.Serializable {
    private static final long serialVersionUID = -6903933977591709194L;

    //Node结构体声明
    static class Node<E> {
        E item;

        Node<E> next;

        Node(E x) { item = x; }
    }

    //链表容量
    private final int capacity;

    //链表长度
    private final AtomicInteger count = new AtomicInteger(0);

    //头结点
    private transient Node<E> head;

    //尾结点
    private transient Node<E> last;

    //读锁
    private final ReentrantLock takeLock = new ReentrantLock();

    //获取读锁条件
    private final Condition notEmpty = takeLock.newCondition();

    //写锁
    private final ReentrantLock putLock = new ReentrantLock();

    //获取写锁条件
    private final Condition notFull = putLock.newCondition();
                  
                

先来看下入队列操作,很简单:

                  
public boolean offer(E e) {
    //不允许空值
    if (e == null) throw new NullPointerException();
    //原子类型count
    final AtomicInteger count = this.count;
    if (count.get() == capacity)
        return false;
    int c = -1;
    Node<E> node = new Node(e);
    final ReentrantLock putLock = this.putLock; //写锁
    putLock.lock();
    try {
        if (count.get() < capacity) {
            enqueue(node);  //入队列
            c = count.getAndIncrement();  //使用CAS同步方法更新链表长度
            if (c + 1 < capacity)
                notFull.signal(); //更新写锁条件
        }
    } finally {
        putLock.unlock(); //写操作完成,释放写锁
    }
    if (c == 0)
        signalNotEmpty(); //更新读锁条件
    return c >= 0;
}
...

private void signalNotEmpty() {
    final ReentrantLock takeLock = this.takeLock;
    takeLock.lock();
    try {
        notEmpty.signal();
    } finally {
        takeLock.unlock();
    }
}
                  
                

ArrayBlockingQueue与LinkedBlockingQueue的不同之处:

1、锁的实现不同

ArrayBlockingQueue队列中的锁是没有分离的,即入队列和出队列用的是同一个锁,由此也意味着两者无法真正并行运行。
LinkedBlockingQueue实现的队列中的锁是分离的,即入队列用的是putLock,出队列是takeLock。

2、内部元素操作不同

ArrayBlockingQueue实现的队列中在入队列和出队列的时候,是直接将枚举对象插入或移除的,即不会产生任何额外的对象实例。
LinkedBlockingQueue实现的队列中在入队列和出队列的时候,需要把枚举对象转换为Node进行插入或移除,这在长时间内需要高效并发地处理大批量数据的系统中,对GC和性能会有一定影响。

3、 队列初始化方式不同

ArrayBlockingQueue实现的队列中必须指定队列的大小。
LinkedBlockingQueue实现的队列中可以不指定队列的大小,默认是Integer.MAX_VALUE。