[Dev Tools] Jupyter Association Rule Mining: A Puzzling Problem

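While practicing association rule mining in Jupyter I ran into a puzzling problem. It is solved now, so here is a record of it.

The example I was given is as follows: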

from efficient_apriori import apriori
import pandas as pd
def data_generator(filename):
    """
    Data generator, needs to return a generator to be called several times.
    """
    def data_gen():
        with open(filename) as file:
            for line in file:
                yield tuple(k.strip() for k in line.split(','))
                #transactions.append(list(line.strip().split(',')))
    return data_gen
# file_path = "https://github.com/seratch/apriori.js/blob/master/dataset.csv"
transactions = data_generator("dataset.csv")
itemsets, rules = apriori(transactions, min_support=0.5,  min_confidence=1)
itemsets
rules
transactions_2 = data_generator("store_data.csv")
itemsets_2, rules_2 = apriori(transactions_2, min_support=0.0045,  min_confidence=0.2)
rules
for rule in rules[:10]:
    print(rule)

The fourth code cell then raised this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-5d285d16ed6b> in <module>()
      1 # file_path = "https://github.com/seratch/apriori.js/blob/master/dataset.csv"
      2 transactions = data_generator("dataset.csv")
----> 3 itemsets, rules = apriori(transactions, min_support=0.5,  min_confidence=1)

C:\ProgramData\Anaconda3\lib\site-packages\efficient_apriori\apriori.py in apriori(transactions, min_support, min_confidence, max_length, verbosity, output_transaction_ids)
     61         max_length,
     62         verbosity,
---> 63         output_transaction_ids=True,
     64     )
     65 

C:\ProgramData\Anaconda3\lib\site-packages\efficient_apriori\itemsets.py in itemsets_from_transactions(transactions, min_support, max_length, verbosity, output_transaction_ids)
    276 
    277     # Store in transaction manager
--> 278     manager = TransactionManager(transactions)
    279 
    280     # STEP 1 - Generate all large itemsets of size 1

C:\ProgramData\Anaconda3\lib\site-packages\efficient_apriori\itemsets.py in __init__(self, transactions)
     29 
     30         # Populate
---> 31         for i, transaction in enumerate(transactions):
     32             for item in transaction:
     33                 self._indices_by_item[item].add(i)

TypeError: 'function' object is not iterable

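After some searching, the example appears to come from the PyPI page for efficient-apriori 1.0.0 (and possibly a few other versions): efficient-apriori · PyPI

The page for the latest version no longer includes this example: efficient-apriori · PyPI

After several days of digging, the fix turned out to be a pair of parentheses at the return statement: change return data_gen to return data_gen(). The traceback shows the installed version looping over transactions directly with enumerate(transactions), so it needs an iterable (such as a generator) rather than a function that returns one.

The final version, with a few changes to paths and parameters, is as follows: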
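The difference between the two return statements can be reproduced without the library. The following is a minimal sketch in plain Python: `sample_lines` is hypothetical data standing in for dataset.csv, and `enumerate` stands in for the loop shown in the traceback.

```python
def data_generator(lines):
    """Return a function that, when called, yields one tuple of items per line."""
    def data_gen():
        for line in lines:
            yield tuple(k.strip() for k in line.split(","))
    return data_gen  # the function object itself, as in the original example


# Hypothetical sample data standing in for dataset.csv.
sample_lines = ["a,b,c", "a,b", "b,c"]

gen_func = data_generator(sample_lines)

# Iterating over the function object fails, just like in the traceback.
try:
    enumerate(gen_func)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Calling the function first produces a generator, which is iterable.
gen = gen_func()
first_pass = list(gen)

# A generator can only be consumed once: a second pass yields nothing.
second_pass = list(gen)

print(error_message)
print(first_pass)
print(second_pass)
```

One consequence of returning `data_gen()` is that the result can be iterated only once. If the same transactions are needed again, call `data_generator(...)` again or store them in a list first.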

from efficient_apriori import apriori
import pandas as pd
def data_generator(filename):
    """
    Data generator: returns a generator that yields one tuple of items per line.
    """
    def data_gen():
        with open(filename) as file:
            for line in file:
                yield tuple(k.strip() for k in line.split(','))
                #transactions.append(list(line.strip().split(',')))
    return data_gen()
# Example from https://pypi.org/project/efficient-apriori/1.0.0/ ; the next statement raised an error, fixed by changing return data_gen to return data_gen()
# file_path = "https://github.com/seratch/apriori.js/blob/master/dataset.csv"
transactions = data_generator(r"C:\Users\userab\Desktop\第二次实验\dataset.csv")
itemsets, rules = apriori(transactions, min_support=0.1,  min_confidence=1)
itemsets
{1: {('Brooklyn',): 216,
  ('',): 1413,
  ('MBE',): 953,
  ('WBE',): 678,
  ('BLACK',): 427,
  ('ASIAN',): 287,
  ('New York',): 419,
  ('HISPANIC',): 233,
  ('NON-MINORITY',): 426},
 2: {('', 'ASIAN'): 287,
  ('', 'BLACK'): 423,
  ('', 'Brooklyn'): 215,
  ('', 'HISPANIC'): 231,
  ('', 'MBE'): 946,
  ('', 'NON-MINORITY'): 426,
  ('', 'New York'): 418,
  ('', 'WBE'): 671,
  ('ASIAN', 'MBE'): 284,
  ('BLACK', 'MBE'): 427,
  ('Brooklyn', 'MBE'): 160,
  ('HISPANIC', 'MBE'): 233,
  ('MBE', 'New York'): 242,
  ('MBE', 'WBE'): 240,
  ('NON-MINORITY', 'New York'): 168,
  ('NON-MINORITY', 'WBE'): 426,
  ('New York', 'WBE'): 249},
 3: {('', 'ASIAN', 'MBE'): 284,
  ('', 'BLACK', 'MBE'): 423,
  ('', 'Brooklyn', 'MBE'): 159,
  ('', 'HISPANIC', 'MBE'): 231,
  ('', 'MBE', 'New York'): 241,
  ('', 'MBE', 'WBE'): 233,
  ('', 'NON-MINORITY', 'New York'): 168,
  ('', 'NON-MINORITY', 'WBE'): 426,
  ('', 'New York', 'WBE'): 248,
  ('NON-MINORITY', 'New York', 'WBE'): 168},
 4: {('', 'NON-MINORITY', 'New York', 'WBE'): 168}}
rules
[{ASIAN} -> {},
 {NON-MINORITY} -> {},
 {BLACK} -> {MBE},
 {HISPANIC} -> {MBE},
 {NON-MINORITY} -> {WBE},
 {ASIAN, MBE} -> {},
 {, BLACK} -> {MBE},
 {, HISPANIC} -> {MBE},
 {NON-MINORITY, New York} -> {},
 {NON-MINORITY, WBE} -> {},
 {, NON-MINORITY} -> {WBE},
 {NON-MINORITY} -> {, WBE},
 {NON-MINORITY, New York} -> {WBE},
 {NON-MINORITY, New York, WBE} -> {},
 {, NON-MINORITY, New York} -> {WBE},
 {NON-MINORITY, New York} -> {, WBE}]
transactions_2 = data_generator(r"C:\Users\userab\Desktop\第二次实验\store_data.csv")
itemsets_2, rules_2 = apriori(transactions_2, min_support=0.0045,  min_confidence=0.2)
rules
[{ASIAN} -> {},
 {NON-MINORITY} -> {},
 {BLACK} -> {MBE},
 {HISPANIC} -> {MBE},
 {NON-MINORITY} -> {WBE},
 {ASIAN, MBE} -> {},
 {, BLACK} -> {MBE},
 {, HISPANIC} -> {MBE},
 {NON-MINORITY, New York} -> {},
 {NON-MINORITY, WBE} -> {},
 {, NON-MINORITY} -> {WBE},
 {NON-MINORITY} -> {, WBE},
 {NON-MINORITY, New York} -> {WBE},
 {NON-MINORITY, New York, WBE} -> {},
 {, NON-MINORITY, New York} -> {WBE},
 {NON-MINORITY, New York} -> {, WBE}]
for rule in rules[:10]:
    print(rule)
{ASIAN} -> {} (conf: 1.000, supp: 0.202, lift: 1.005, conv: 4929577.465)
{NON-MINORITY} -> {} (conf: 1.000, supp: 0.300, lift: 1.005, conv: 4929577.465)
{BLACK} -> {MBE} (conf: 1.000, supp: 0.301, lift: 1.490, conv: 328873239.437)
{HISPANIC} -> {MBE} (conf: 1.000, supp: 0.164, lift: 1.490, conv: 328873239.437)
{NON-MINORITY} -> {WBE} (conf: 1.000, supp: 0.300, lift: 2.094, conv: 522535211.268)
{ASIAN, MBE} -> {} (conf: 1.000, supp: 0.200, lift: 1.005, conv: 4929577.465)
{, BLACK} -> {MBE} (conf: 1.000, supp: 0.298, lift: 1.490, conv: 328873239.437)
{, HISPANIC} -> {MBE} (conf: 1.000, supp: 0.163, lift: 1.490, conv: 328873239.437)
{NON-MINORITY, New York} -> {} (conf: 1.000, supp: 0.118, lift: 1.005, conv: 4929577.465)
{NON-MINORITY, WBE} -> {} (conf: 1.000, supp: 0.300, lift: 1.005, conv: 4929577.465)
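The itemsets and rules above include an empty-string item (`''`), which presumably comes from empty fields in the CSV lines. If those fields are not meaningful, one option is to drop them while reading. The following is a minimal sketch under that assumption: `read_transactions` is a hypothetical helper, and the sample text is made up for illustration, not taken from dataset.csv.

```python
import io


def read_transactions(file_obj):
    """Yield one tuple of items per line, skipping empty fields and empty lines."""
    for line in file_obj:
        items = tuple(k.strip() for k in line.split(",") if k.strip())
        if items:
            yield items


# Made-up sample text with empty fields and a blank line.
sample = io.StringIO("MBE,BLACK,,Brooklyn\nWBE,,NON-MINORITY,\n\n")

transactions = list(read_transactions(sample))
print(transactions)
```

Collecting the transactions into a list also means they can be iterated more than once.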

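One last gripe: the third-to-last code cell does not seem to output any results, does it? Note also that the cell after it displays rules rather than rules_2, so its output just repeats the rules from the first dataset.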

Added: 2021-11-11 12:55:01  Updated: 2021-11-11 12:57:07
 