[AI] AttributeError: EXPERIMENTAL_LIST_DEVICES and an h5py error while training networks with TensorFlow and Keras

Errors when training networks with TensorFlow and Keras

- 1. AttributeError: module 'tensorflow_core._api.v2.config' has no attribute 'experimental_list_devices'
- 2. 'str' object has no attribute 'decode', and the checkpoint file not being found when loading the model

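Some background: this code was given to me by a senior labmate for testing. He wrote it with Keras 2.2.4 and TensorFlow 1.14, while I have Keras 2.3.1 and TensorFlow 2.1.0 installed. For convenience I did not create a separate conda environment, so some problems were to be expected.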

1. AttributeError: module 'tensorflow_core._api.v2.config' has no attribute 'experimental_list_devices'

AttributeError: module 'tensorflow_core._api.v2.config' has no attribute 'experimental_list_devices'

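The problem: the list of GPU devices cannot be retrieved. As the message itself shows, Keras 2.3.1 calls tf.config.experimental_list_devices(), and that function is missing from tf.config in TensorFlow 2.1.0, so this is a version mismatch. The fix I found in other people's answers is to modify the Keras source. I have tried it twice and it ran fine both times, without affecting other projects that already worked.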

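The fix: open the file named in the traceback and edit the source.

Source location: <conda environment path>\Lib\site-packages\keras\backend\tensorflow_backend.py

For example, mine is E:\Anaconda\envs\TF2.1\Lib\site-packages\keras\backend\tensorflow_backend.py

On line 506, delete: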

_LOCAL_DEVICES = tf.config.experimental_list_devices()

and replace it with:

devices = tf.config.list_logical_devices()
_LOCAL_DEVICES = [x.name for x in devices]

Before and after, for comparison. Before the change:

def _get_available_gpus():
    """Get a list of available gpu devices (formatted as strings).

    # Returns
        A list of available GPU devices.
    """
    global _LOCAL_DEVICES
    if _LOCAL_DEVICES is None:
        if _is_tf_1():
            devices = get_session().list_devices()
            _LOCAL_DEVICES = [x.name for x in devices]
        else:
            _LOCAL_DEVICES = tf.config.experimental_list_devices()
    return [x for x in _LOCAL_DEVICES if 'device:gpu' in x.lower()]

After the change:

def _get_available_gpus():
    """Get a list of available gpu devices (formatted as strings).

    # Returns
        A list of available GPU devices.
    """
    global _LOCAL_DEVICES
    if _LOCAL_DEVICES is None:
        if _is_tf_1():
            devices = get_session().list_devices()
            _LOCAL_DEVICES = [x.name for x in devices]
        else:
            # _LOCAL_DEVICES = tf.config.experimental_list_devices()
            devices = tf.config.list_logical_devices()
            _LOCAL_DEVICES = [x.name for x in devices]
    return [x for x in _LOCAL_DEVICES if 'device:gpu' in x.lower()]
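The same idea can be written as a small standalone helper that works with either listing API, which is handy for checking what TensorFlow reports before touching site-packages. This is only a sketch: the name list_gpu_names is hypothetical, and a stand-in object replaces tf.config here so the snippet runs without TensorFlow installed. In real use you would pass tf.config itself.

```python
from types import SimpleNamespace


def list_gpu_names(tf_config):
    """Return GPU device names using whichever listing API is available.

    tf_config is expected to behave like tf.config (hypothetical helper,
    not part of Keras or TensorFlow).
    """
    if hasattr(tf_config, "list_logical_devices"):
        # Newer API: returns device objects that carry a .name attribute.
        names = [d.name for d in tf_config.list_logical_devices()]
    else:
        # Older API: returns plain device-name strings.
        names = list(tf_config.experimental_list_devices())
    # Same filter as Keras' _get_available_gpus().
    return [n for n in names if "device:gpu" in n.lower()]


# Stand-in for tf.config with one CPU and one GPU (made-up device names).
fake_config = SimpleNamespace(
    list_logical_devices=lambda: [
        SimpleNamespace(name="/device:CPU:0"),
        SimpleNamespace(name="/device:GPU:0"),
    ]
)
print(list_gpu_names(fake_config))
```

With TensorFlow installed, the call would be list_gpu_names(tf.config) after import tensorflow as tf.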

2. 'str' object has no attribute 'decode', and the checkpoint file not being found when loading the model

best_model_file = './Result/Best_CNN_Model_Weights.h5'
best_model = ModelCheckpoint(best_model_file, monitor='val_accuracy', verbose=1, save_best_only=True, save_weights_only=True, mode='auto')

The problem occurs on this line:

model.load_weights(best_model_file)

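This is where the saved model is loaded. I originally saved to an .h5 file, and when execution reached this line the file could not be found: Error: Unable to open file(…)

(I could not find the file on disk either. The checkpoint never saved it and I do not know why. If anyone knows, please let me know.)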
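A quick check before load_weights can at least tell apart "the folder does not exist" from "the callback never wrote the file". The helper below is a hypothetical sketch using only the standard library; it does not explain why the checkpoint was skipped in my case. One thing that may be worth checking when the file is missing is whether the metric name passed to monitor matches the name printed in the training log, but that is an assumption, not something I verified.

```python
import os


def checkpoint_status(path):
    """Report whether a checkpoint file is ready to be loaded.

    Returns "missing-dir", "missing-file" or "ok" (hypothetical helper).
    """
    folder = os.path.dirname(path) or "."
    if not os.path.isdir(folder):
        # The target folder was never created.
        return "missing-dir"
    if not os.path.isfile(path):
        # The folder exists but the callback never wrote the file.
        return "missing-file"
    return "ok"


best_model_file = './Result/Best_CNN_Model_Weights.h5'
print(checkpoint_status(best_model_file))
```

Calling this right after training and before model.load_weights(best_model_file) gives a clearer message than the raw "Unable to open file" error.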

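Note that .h5 and .hdf5 are two extensions for the same HDF5 file format, so the extension does not decide what is stored. What matters is how the file was written. With save_weights_only=True, as above, the checkpoint holds only the weights, so the model has to be built first and then load_weights is called. A file saved with the full model (architecture plus weights) can be loaded directly with load_model. So the two loading calls are used differently:

model.load_weights()	# weights-only file: build the model first
model = load_model()	# full-model file: architecture and weights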

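After I changed the .h5 above to .hdf5, the file was indeed saved and could be found (probably for some other reason, since the extension alone should not matter). But when execution reached load_weights, a new problem appeared: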

AttributeError: 'str' object has no attribute 'decode'

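Most posts I found blame Python 2 versus Python 3 encoding, but my own code does not call decode or encode anywhere. So I suspected versions again and checked h5py: it was 3.3.0, which is too new. h5py 3.x returns stored strings as str rather than bytes, so the .decode('utf8') calls inside the older Keras loading code fail. I decided to uninstall it and install 2.10.0 instead. (My TensorFlow is 2.1.)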
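The error is easy to reproduce without Keras or h5py at all, which makes the cause clearer. The sketch below calls .decode on a str, as the old loading code effectively does with a value read by h5py 3.x, and then shows a tolerant helper that accepts both types. The name to_text and the sample string are hypothetical and only for illustration.

```python
def to_text(value, encoding="utf8"):
    """Return text whether the input arrives as bytes or as str."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Calling .decode on a str raises the same error seen in the traceback.
try:
    "2.3.1".decode("utf8")
except AttributeError as err:
    message = str(err)
print(message)

# The tolerant helper handles both cases.
print(to_text(b"2.3.1"), to_text("2.3.1"))
```

Downgrading h5py avoids the problem without editing any library code, which is why I went that way.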

Here I recommend using:

pip uninstall h5py
pip install h5py==2.10.0
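To confirm which h5py is actually active in the environment, a small version check can be run before and after the reinstall. This is a sketch under the assumption that any h5py with major version 3 or higher triggers the error with this Keras version. The function name h5py_too_new is hypothetical.

```python
def h5py_too_new(version, first_bad_major=3):
    """Return True if the h5py version string has a major version >= 3.

    The major number is parsed as an integer so that "2.10.0" is not
    misjudged by plain string comparison.
    """
    major = int(version.split(".")[0])
    return major >= first_bad_major


print(h5py_too_new("3.3.0"))   # the version I had installed
print(h5py_too_new("2.10.0"))  # the version I downgraded to
```

In the real environment the argument would come from h5py.__version__ after import h5py.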

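I strongly advise against "conda uninstall h5py, then reinstall with conda". I tried it: conda removes a lot of packages and reinstalls a lot as well, and the downloads are painfully slow. Even with several domestic mirrors, and even with offline installation, the packages were not recognized (and my steps were correct), so I ended up rebuilding the whole environment and going through everything above again. (In hindsight I should have just set up a TensorFlow 1.14 environment from the start.)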

Summary: version mismatches are a real trap.

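Closing note: this is my first time posting an error log. I hope it helps, and if anything here is wrong, corrections and discussion are welcome. Thanks!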
