[数据结构与算法] Leetcode 每日一题 -- 217. 存在重复元素 -- 简单

开发: C++知识库 Java知识库 JavaScript Python PHP知识库人工智能区块链大数据移动开发嵌入式开发工具数据结构与算法开发测试游戏开发网络协议系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑笔记本显卡显示器固态硬盘硬盘耳机手机 iphone vivo oppo 小米华为单反装机图拉丁

-> 数据结构与算法 -> Leetcode 每日一题 -- 217. 存在重复元素 -- 简单 -> 正文阅读

[数据结构与算法]Leetcode 每日一题 -- 217. 存在重复元素 -- 简单

文章目录

1. 题目描述

给定一个整数数组，判断是否存在重复元素。

如果存在一值在数组中出现至少两次，函数返回 true 。如果数组中每个元素都不相同，则返回 false

示例 1:

输入: [1,2,3,1]
输出: true

示例 2:

输入: [1,2,3,4]
输出: false

示例 3:

输入: [1,1,1,3,3,4,3,2,4,2]
输出: true

2. 解题想法

我看到题目的第一反应是将元素装入 List 集合中，在放入元素时判断，元素是否已经存在，存在就直接返回 true ，不存在就放入元素，不管三七二十一先跑起来再说，初步测试能通过，提交时，直接挂了。。。，然后在考虑其他的容器，map逻辑相同，但是是可以执行的，set天生支持重复元素判断，但是之前没有仔细看过set 是如何判断的~~，最后看了其他的答案，发现使用数组辅助类 Arrays 进行排序后，在对数组进行一次遍历，也很快。

所以在实现之后，需要看下问题出现的原因进行自查：

集合的 contains方法为什么会慢， HashMap 的 containsKey()为啥快
set 集合是如何判断元素重复的，为什么快
Arrays 内部排序逻辑

3. 代码实现

通过 List 集合的 contains 方法判断，存在就返回true，不存在就加入到集合中, 运行结果没有

    /**
     * 通过集合List contains 方法判断，但是当输入元素较多时，存在性能问题
     * */
    public boolean containsDuplicateWithList(int[] nums) {
        if(null == nums || nums.length <= 1){
            return false;
        }

        List<Integer> list = new ArrayList<>();

        for(int i = 0 ; i < nums.length ; i++){
            // contains 方法本身会进行内部整体循环直到找到为止，当元素在很多时会有问题
            if(list.contains(nums[i])){
                return true;
            }else{
                list.add(nums[i]);
            }
        }

        return false;
    }

通过Map 的key 判断，有点类型set

    public boolean containsDuplicateWithMap(int[] nums) {
        if(null == nums || nums.length == 0){
            return false;
        }

        Map<Integer,String> map = new HashMap<>();

        for(int i = 0 ; i < nums.length ; i++){
            // contains 方法本身会进行内部整体循环直到找到为止，当元素在很多时会有问题
            if (map.containsKey(nums[i])){
                return  true;
            }else{
                map.put(nums[i],"");
            }

        }

        return false;
    }

执行用时：9 ms

内存消耗：44.1 MB

通过Set 判断

    public boolean containDuplicateWithSet(int[] nums){

        if(null == nums || nums.length == 0){
            return false;
        }

        Set<Integer> set = new HashSet<>();

        for (int i = 0 ; i < nums.length ; i++){
            if (!set.add(nums[i])){
                return true;
            }
        }

        return false;
    }

执行用时：5 ms

内存消耗：42.3 MB

通过排序后比较

 public boolean containDuplicateWithSort(int[] nums){

        if(null == nums || nums.length < 2){
            return false;
        }

        Arrays.sort(nums);

        for (int i = 1 ; i < nums.length ; i++){
            if (nums[i] == nums[i-1]){
                return true;
            }
        }

        return false;
    }

执行用时：3 ms

内存消耗：41.6 MB

4 问题分析

为什么 ArrayList 的container 方法比较慢？

    public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i]))
                    return i;
        }
        return -1;
    }

在contains 方法内部也是用过遍历所有元素进行判断元素是否相等的,由于外部循环n 次，内部也会循环n次，所以时间复杂度为
$f(x) =O(n^2)$

HashMap 的containsKey 方法又是如何判断的呢？

    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

可以看到是先hash 函数定位到数组位置，然后遍历红黑树，或者遍历链表判断key是否相等，时间复杂度为
$f (x) = O (n)$

set是在通过add 方法添加元素的时候，是有返回值的，之前没有注意，当添加的元素set容器中没有时，返回true，存在时返回false。那set又是如何判断元素存不存在，还比较高效的呢？

其实HashSet本身是借助 HashMap 来实现的，通过Set元素作为 Map 的key ，value为同一个Object，所以在add时，调用的HashMap 的Put方法

  final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }