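This post draws on many articles, including those by the Bilibili uploader 老程序员老关 and the CSDN author Liu Zhian, among others.
1. Extract the useful data from the OSM file into CSV
An OSM file has two main parts, the nodes and the ways. They are handled separately, so the file is split into two files, nodes.xml and links.xml, shown below: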
<?xml version='1.0' encoding='UTF-8'?>
<nodes>
<node id='-101752' lat='40.6592449' lon='109.7662487' />
<node id='-101753' lat='40.6587893' lon='109.7727179' />
<node id='-101755' lat='40.6585103' lon='109.7767641' />
<node id='-101756' lat='40.6581603' lon='109.7838497' />
<node id='-101757' lat='40.6581441' lon='109.7841119' />
<node id='-101758' lat='40.6459848' lon='109.886676' />
<node id='-101759' lat='40.6456042' lon='109.8865427' />
<node id='-101760' lat='40.6426309' lon='109.8838162' />
<node id='-101761' lat='40.6425291' lon='109.8836902' />
<node id='-101762' lat='40.6424558' lon='109.8835789' />
<node id='-101763' lat='40.6424222' lon='109.8833187' />
<node id='-101764' lat='40.6426996' lon='109.8780347' />
</nodes>
<?xml version='1.0' encoding='UTF-8'?>
<links>
<way id='-101783' action='modify'>
<nd ref='-101752' />
<nd ref='-101753' />
<nd ref='-101755' />
<nd ref='-101756' />
<nd ref='-101757' />
<tag k='highway' v='primary' />
<tag k='osm_id' v='13456925' />
<tag k='z_order' v='7' />
</way>
<way id='-101785' action='modify'>
<nd ref='-101758' />
<nd ref='-101759' />
<nd ref='-101760' />
<nd ref='-101761' />
<nd ref='-101762' />
<nd ref='-101763' />
<nd ref='-101764' />
<nd ref='-101767' />
<nd ref='-101768' />
<nd ref='-101769' />
<nd ref='-101771' />
<nd ref='-101772' />
<nd ref='-101773' />
<nd ref='-101774' />
<nd ref='-101775' />
<nd ref='-101777' />
<nd ref='-101778' />
<tag k='highway' v='primary' />
<tag k='osm_id' v='13457299' />
<tag k='z_order' v='7' />
</way>
</links>
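The split itself can also be scripted instead of done by hand. Below is a minimal sketch, assuming the source file has `<node>` and `<way>` elements directly under its root; the function name `split_osm` and the input file name `map.osm` are hypothetical and not part of the original workflow.

```python
import xml.etree.ElementTree as ET


def split_osm(osm_path, nodes_path="nodes.xml", links_path="links.xml"):
    """Copy <node> elements into one file and <way> elements into another.

    Returns the number of nodes and ways written.
    """
    osm_root = ET.parse(osm_path).getroot()

    nodes_root = ET.Element("nodes")
    links_root = ET.Element("links")
    for elem in osm_root:
        if elem.tag == "node":
            # Keep only the attributes used later on: id, lat, lon
            attrs = {k: elem.attrib[k] for k in ("id", "lat", "lon") if k in elem.attrib}
            ET.SubElement(nodes_root, "node", attrs)
        elif elem.tag == "way":
            # Keep the way with all of its <nd> and <tag> children
            links_root.append(elem)

    ET.ElementTree(nodes_root).write(nodes_path, encoding="UTF-8", xml_declaration=True)
    ET.ElementTree(links_root).write(links_path, encoding="UTF-8", xml_declaration=True)
    return len(nodes_root), len(links_root)
```

Calling `split_osm("map.osm")` would then produce the two files in the current directory.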
The Python code for processing them follows:
import xml.etree.ElementTree as ET

# Parse the node file and inspect its root element
tree = ET.parse('nodes.xml')
print(type(tree))
root = tree.getroot()
print(type(root))
print(root.tag)

# Write the attribute dict (id, lat, lon) of each node on its own line
with open("nodes.txt", "w") as f:
    for index, child in enumerate(root):
        f.write('{0}'.format(child.attrib))
        f.write('\n')
from xml.dom.minidom import parse

def readXML():
    domTree = parse("./links.xml")
    rootNode = domTree.documentElement
    print(rootNode.nodeName)
    # Each <way> element is one link
    links = rootNode.getElementsByTagName("way")
    with open("links.txt", "w") as f:
        f.write("****All way information****")
        f.write('\n')
        for way in links:
            ways = way.getElementsByTagName("nd")
            if way.hasAttribute("id"):
                # Start a new line for each way, beginning with its id
                f.write('\n')
                f.write('id:{0}'.format(way.getAttribute("id")))
            # Append the ref of every node on this way to the same line
            for nd in ways:
                if nd.hasAttribute("ref"):
                    f.write('ref:{0}'.format(nd.getAttribute("ref")))

if __name__ == '__main__':
    readXML()
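Clean up the generated output by removing the unneeded characters, and you have the data. Convert it to comma-separated format and it can be saved as CSV.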
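The manual cleanup can be avoided by writing CSV straight from the XML. The sketch below assumes the nodes.xml and links.xml layout shown earlier; the function names and the column layout (way id first, followed by its node refs) are my own assumptions, not the format the original workflow produced.

```python
import csv
import xml.etree.ElementTree as ET


def nodes_to_csv(xml_path, csv_path):
    """Write one id,lat,lon row per <node>; return the number of nodes."""
    root = ET.parse(xml_path).getroot()
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "lat", "lon"])  # header row
        for node in root.iter("node"):
            writer.writerow([node.get("id"), node.get("lat"), node.get("lon")])
            count += 1
    return count


def ways_to_csv(xml_path, csv_path):
    """Write one row per <way>: the way id, then its node refs in order."""
    root = ET.parse(xml_path).getroot()
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for way in root.iter("way"):
            refs = [nd.get("ref") for nd in way.findall("nd")]
            writer.writerow([way.get("id")] + refs)
            count += 1
    return count
```

For example, `nodes_to_csv("nodes.xml", "nodes.csv")` and `ways_to_csv("links.xml", "links.csv")` would write both files in one pass each, with no stray characters to strip afterwards.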
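2. Convert the extracted data to XML format
In the extracted data, the nodes part needs its coordinates converted from the geographic coordinate system to a projected coordinate system; the links part needs each node ID matched to its geographic coordinates so that the length of each path can be computed.
Target format for nodes and links:
Processing the from-to data for the links part: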
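To illustrate the coordinate conversion for the nodes, here is a minimal sketch using spherical Web Mercator (EPSG:3857). This projection is an assumption chosen only because its formula is short; this post does not say which projected coordinate system the target format expects, and the name `to_web_mercator` is hypothetical.

```python
import math

WGS84_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used as the sphere radius


def to_web_mercator(lat, lon):
    """Project latitude/longitude in degrees to Web Mercator x/y in metres."""
    x = WGS84_RADIUS_M * math.radians(lon)
    y = WGS84_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# First node from the nodes file shown earlier
x, y = to_web_mercator(40.6592449, 109.7662487)
print(round(x, 1), round(y, 1))
```

Note that distances measured in Web Mercator are stretched by roughly 1/cos(latitude), so if the projected coordinates are later used to measure lengths, a local projection such as the matching UTM zone is probably a better fit.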
from xml.etree.ElementTree import Element, ElementTree, tostring, SubElement
from itertools import islice
import argparse
import os
import csv

# Read every comma-separated row of links.txt
with open('links.txt', 'r', newline='') as f:
    reader = csv.reader(f)
    rows = [row for row in reader]
count = len(rows)
print("Number of rows:", count)
print(rows[0])
print(len(rows[0]))
print(type(rows[0]))
print(rows[0][1])

# Write each pair of consecutive entries in a row as one "from,to" line
with open("links2.txt", "w") as f:
    for i in range(count):
        length = len(rows[i]) - 1
        for j in range(length):
            f.write('{0}'.format(rows[i][j]) + ',' + '{0}'.format(rows[i][j+1]))
            f.write('\n')
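Once the from-to pairs exist, the length of each segment follows from the coordinates of its two endpoints. The sketch below uses the haversine great-circle formula directly on latitude and longitude; that choice, the mean Earth radius constant, and the names `haversine_m` and `segment_lengths` are assumptions for illustration, not the method the original workflow necessarily used.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against rounding error before asin
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def segment_lengths(pairs, coords):
    """Return (from_id, to_id, length_m) for each from-to pair.

    coords maps a node id to its (lat, lon) in degrees.
    """
    result = []
    for from_id, to_id in pairs:
        lat1, lon1 = coords[from_id]
        lat2, lon2 = coords[to_id]
        result.append((from_id, to_id, haversine_m(lat1, lon1, lat2, lon2)))
    return result


# The first three nodes of way -101783 from the files shown earlier
coords = {
    "-101752": (40.6592449, 109.7662487),
    "-101753": (40.6587893, 109.7727179),
    "-101755": (40.6585103, 109.7767641),
}
pairs = [("-101752", "-101753"), ("-101753", "-101755")]
segments = segment_lengths(pairs, coords)
for from_id, to_id, length in segments:
    print(from_id, to_id, round(length, 1))
```

Summing the lengths of all segments that belong to one way gives the length of that path.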
To be continued this afternoon.