
[Big Data] ERROR SparkContext: Error initializing SparkContext. org.apache.spark.SparkException: Could not parse Master URL

ERROR SparkContext: Error initializing SparkContext.

org.apache.spark.SparkException: Could not parse Master URL: '<pyspark.conf.SparkConf object at 0x7fb5da40a850>'

Error output

22/03/14 10:58:26 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
22/03/14 10:58:26 INFO SparkContext: Running Spark version 3.0.0
22/03/14 10:58:26 INFO ResourceUtils: ==============================================================
22/03/14 10:58:26 INFO ResourceUtils: Resources for spark.driver:

22/03/14 10:58:26 INFO ResourceUtils: ==============================================================
22/03/14 10:58:26 INFO SparkContext: Submitted application: Spark01_FindWord.py
22/03/14 10:58:26 INFO SecurityManager: Changing view acls to: root
22/03/14 10:58:26 INFO SecurityManager: Changing modify acls to: root
22/03/14 10:58:26 INFO SecurityManager: Changing view acls groups to: 
22/03/14 10:58:26 INFO SecurityManager: Changing modify acls groups to: 
22/03/14 10:58:26 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
22/03/14 10:58:27 INFO Utils: Successfully started service 'sparkDriver' on port 44307.
22/03/14 10:58:27 INFO SparkEnv: Registering MapOutputTracker
22/03/14 10:58:27 INFO SparkEnv: Registering BlockManagerMaster
22/03/14 10:58:27 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/03/14 10:58:27 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/03/14 10:58:27 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
22/03/14 10:58:27 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-a4554f81-46f5-4499-b6fd-2d5a9005552b
22/03/14 10:58:27 INFO MemoryStore: MemoryStore started with capacity 366.3 MiB
22/03/14 10:58:27 INFO SparkEnv: Registering OutputCommitCoordinator
22/03/14 10:58:27 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
22/03/14 10:58:27 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
22/03/14 10:58:27 WARN Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
22/03/14 10:58:27 WARN Utils: Service 'SparkUI' could not bind on port 4043. Attempting port 4044.
22/03/14 10:58:27 WARN Utils: Service 'SparkUI' could not bind on port 4044. Attempting port 4045.
22/03/14 10:58:27 INFO Utils: Successfully started service 'SparkUI' on port 4045.
22/03/14 10:58:27 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://node01:4045
22/03/14 10:58:27 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Could not parse Master URL: '<pyspark.conf.SparkConf object at 0x7fb5da40a850>'
	at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2924)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:528)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:238)
	at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
	at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)
22/03/14 10:58:27 INFO SparkUI: Stopped Spark web UI at http://node01:4045
22/03/14 10:58:27 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/03/14 10:58:27 INFO MemoryStore: MemoryStore cleared
22/03/14 10:58:27 INFO BlockManager: BlockManager stopped
22/03/14 10:58:28 INFO BlockManagerMaster: BlockManagerMaster stopped
22/03/14 10:58:28 WARN MetricsSystem: Stopping a MetricsSystem that is not running
22/03/14 10:58:28 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/03/14 10:58:28 INFO SparkContext: Successfully stopped SparkContext
Traceback (most recent call last):
  File "/ext/servers/spark3.0/MyCode/Spark01_FindWord.py", line 4, in <module>
    sc = SparkContext(conf)
  File "/ext/servers/spark3.0/python/lib/pyspark.zip/pyspark/context.py", line 131, in __init__
  File "/ext/servers/spark3.0/python/lib/pyspark.zip/pyspark/context.py", line 193, in _do_init
  File "/ext/servers/spark3.0/python/lib/pyspark.zip/pyspark/context.py", line 310, in _initialize_context
  File "/ext/servers/spark3.0/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1569, in __call__
  File "/ext/servers/spark3.0/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.spark.SparkException: Could not parse Master URL: '<pyspark.conf.SparkConf object at 0x7fb5da40a850>'
	at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2924)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:528)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:238)
	at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
	at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)

22/03/14 10:58:28 INFO ShutdownHookManager: Shutdown hook called
22/03/14 10:58:28 INFO ShutdownHookManager: Deleting directory /tmp/spark-32987338-486b-448a-b9f4-3cf4d68aa85d
22/03/14 10:58:28 INFO ShutdownHookManager: Deleting directory /tmp/spark-2160f024-2430-4ab1-b5ed-5c3afc402351
(Spark_python) root@node01:/ext/servers/spark3.0/MyCode# spark-submit Spark01_FindWord.py 

How to fix it

The Python traceback points at line 4 of Spark01_FindWord.py, the call sc = SparkContext(conf): the SparkConf object is passed positionally, so it lands in the slot where the constructor expects a master URL string.

The broken code

from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("local").setAppName("MyApp")
sc = SparkContext(conf)
hdfs_file = "hdfs://node01:/user/hadoop/test06.txt"
hdfs_rdd = sc.textFile(hdfs_file).count()
print(hdfs_rdd)
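
Why the positional call fails becomes clear from the constructor's parameter order. A minimal sketch that just inspects the signature (shown as in Spark 3.0; exact defaults may vary by version):

import inspect
from pyspark import SparkContext

# The first positional parameter of SparkContext.__init__ is `master`,
# so SparkContext(conf) binds the SparkConf object to `master`. Spark
# then stringifies it and fails to parse it as a master URL.
print(inspect.signature(SparkContext.__init__))
# Roughly: (self, master=None, appName=None, sparkHome=None, pyFiles=None,
#           environment=None, batchSize=0, serializer=..., conf=None, ...)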

The corrected code

from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("local").setAppName("MyApp")
sc = SparkContext(conf=conf)
hdfs_file = "hdfs://node01:/user/hadoop/test06.txt"
hdfs_rdd = sc.textFile(hdfs_file).count()
print(hdfs_rdd)
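
In Spark 3.x the SparkSession builder is another common entry point, and it accepts a whole SparkConf through the conf keyword. A minimal sketch reusing the configuration and HDFS path from the example above:

from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = SparkConf().setMaster("local").setAppName("MyApp")

# builder.config(conf=...) takes the whole SparkConf at once;
# getOrCreate() reuses an already-running session if there is one.
spark = SparkSession.builder.config(conf=conf).getOrCreate()
sc = spark.sparkContext  # the underlying SparkContext

print(sc.textFile("hdfs://node01:/user/hadoop/test06.txt").count())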


Summary

The root cause is how the argument is passed to the SparkContext constructor. Writing SparkContext(conf) binds the SparkConf object positionally to the constructor's first parameter, master, not to conf, and Spark then fails trying to parse the object as a master URL. Passing it as a keyword argument, SparkContext(conf=conf), resolves the error.
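
One caveat: the classmethod SparkContext.getOrCreate differs from the constructor in that its first (and only) parameter is conf, so the positional form that breaks __init__ is safe there. A quick sketch:

from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("local").setAppName("MyApp")

# getOrCreate(conf=None) takes conf first, unlike __init__(master=None, ...),
# so passing the SparkConf positionally is safe here.
sc = SparkContext.getOrCreate(conf)
print(sc.master)  # -> local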
