一.背景概述
本周接到一个需求,需要将SQLite中一张表的数据迁移到MySQL库里,拆分为多张表存放数据。SQLite中的迁移对象表为:tm_world_region,其表结构如下:
CREATE TABLE tm_world_region(id integer not null,name varchar(60),districtid varchar(20),fullname varchar(60),geoloc blob, lng double default 0, lat double default 0, pic varchar, desc text, objectid varchar(30), level varchar(20), show_name varchar(60),constraint tm_world_region_key primary key (id));
在迁移过程中,需要将列名为geoloc,类型为Blob的数据存放到MySQL表lbs_rgc_wkt的列名为wkt,类型为geometry的列中。geoloc为gzip压缩的二进制类型数据,属于空间地理类型数据,而非图片和音频数据。
二.过程
简单的将这个任务分析一下,要完成数据迁移,首先得从SQLite库中读取数据,然后将读到的数据解压,转换为字符串类型,最后使用MySQL的GeomFromText(旧版数据库用GeomFromText,新版的MySQL8.0已经更名为ST_GeomFromText))将数据转换存入到wkt列中。
首先面临的问题就是读取,笔者尝试了多种类型,比如:
String geoLoc = rs.getString("geoloc");
将Blob转换为字符串类型,不会报错,但是控制台打印输出啥也没有。
换个思路,先拿到Blob,然后再转其他类型,可不可以?比如:
Blob geoLoc = rs.getBlob("geoloc");
测试运行,会抛异常,如图:
?那么转换为流,可不可以?比如:
InputStream inputStream = rs.getBinaryStream("geoloc");
测试发现,还是不行,结果如图:
那么使用SQLite的函数呢?比如hex()和quote()这两个函数都可以成功读取geoloc字段的数据,得到字符串数据,比如:
X'1F8B080000000000000085954D8E93310C86AF32CB562A551CFFE60408696058C082FB5F0427B6F365A419A8BA4A9EDAAF5FDBE9F7DFAFBFBEFD7C7BFDF3F5EDC7ED76FB22F01413117B017DB64E44D0D607D01EF356D11A8FFCC80BC873D850B03C09C8FA0029A82F68C8208A508D7A502A009894AE84FEAB4AD82C63793E81A428A8A17D9F7052283B9648A84769A38225C586458D803C3C26C49A94F6DEF30C1685D838D54393A00600F4CB08A75CBB94378A9FC482C65D4B438927D0560E067494986E6157AD7260250441B3ED733A2F384AD548CA25ECE8145DF466D0EE5905C332A25938A153D87BF5D3C08A9F101DEA5783005DBC968605893A145E8EA32D9A97AD173E307A3CFC888BCA1862364A0D2ECAC8E557193DA9E9CA69662317BD2B4BC88357191C19A931575F2C335AE7CCD49546F5D6F8F1CF45B9DF1FB14AEA7BA0D5469EA434F36F89C966A04F929C8A998EA91909CDF69FA3CBE43F835A4A1FC7329CE2D6CBA9910D88C1BB5E33158C1F541ACEB10666B653B29BE77B93942485AE702F5224B4A3B25C37F4F87A8E0F4FCBB68D015157CB4B2FB11259EE0FF9C9DEEBA0DAEEA6EF5250BEC25CEE6862D7CAD6C6B2B73DB48E4E7B6E3463B889834EB13E6E6E6B9DE4DBA0DCAD9C5E6F96F8B0347D5F926293F2B5AD8CDADCC47247B2369EAFCFE9A1777D3B003D33923764C70F6A4E7DC6EADBEA6BD224A96B756BD2E66B546EF61476CD49937CE3E7634AA73019BAE7B1E5D3A684D75BBA9644661BEAF258698B18E47F0A52438D1FFE9B4C8A762A00AE0591FD6446AC6941C5927C87FEB76CF7FBFD2F494C949BEF060000
或者
1F8B080000000000000085954D8E93310C86AF32CB562A551CFFE60408696058C082FB5F0427B6F365A419A8BA4A9EDAAF5FDBE9F7DFAFBFBEFD7C7BFDF3F5EDC7ED76FB22F01413117B017DB64E44D0D607D01EF356D11A8FFCC80BC873D850B03C09C8FA0029A82F68C8208A508D7A502A009894AE84FEAB4AD82C63793E81A428A8A17D9F7052283B9648A84769A38225C586458D803C3C26C49A94F6DEF30C1685D838D54393A00600F4CB08A75CBB94378A9FC482C65D4B438927D0560E067494986E6157AD7260250441B3ED733A2F384AD548CA25ECE8145DF466D0EE5905C332A25938A153D87BF5D3C08A9F101DEA5783005DBC968605893A145E8EA32D9A97AD173E307A3CFC888BCA1862364A0D2ECAC8E557193DA9E9CA69662317BD2B4BC88357191C19A931575F2C335AE7CCD49546F5D6F8F1CF45B9DF1FB14AEA7BA0D5469EA434F36F89C966A04F929C8A998EA91909CDF69FA3CBE43F835A4A1FC7329CE2D6CBA9910D88C1BB5E33158C1F541ACEB10666B653B29BE77B93942485AE702F5224B4A3B25C37F4F87A8E0F4FCBB68D015157CB4B2FB11259EE0FF9C9DEEBA0DAEEA6EF5250BEC25CEE6862D7CAD6C6B2B73DB48E4E7B6E3463B889834EB13E6E6E6B9DE4DBA0DCAD9C5E6F96F8B0347D5F926293F2B5AD8CDADCC47247B2369EAFCFE9A1777D3B003D33923764C70F6A4E7DC6EADBEA6BD224A96B756BD2E66B546EF61476CD49937CE3E7634AA73019BAE7B1E5D3A684D75BBA9644661BEAF258698B18E47F0A52438D1FFE9B4C8A762A00AE0591FD6446AC6941C5927C87FEB76CF7FBFD2F494C949BEF060000
但是在导入数据时,会报错,如图:
?
那么,该怎么办呢?我主动找同事求助,同事给出了思路:这是个二进制数据,你先读到java程序里,然后gzip解压缩,最后转字符串。
再次尝试,代码如下:
public List<String> selectToList() {
Connection connection = null;
List<String> selectResList = new ArrayList<>();
try {
connection = SqLiteConnection.getConnection();
connection.setAutoCommit(false);
String sql = "SELECT objectid,geoloc FROM tm_world_region;";
PreparedStatement stmt = connection.prepareStatement(sql);
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
String objectId = rs.getString("objectid");
byte[] bytes = rs.getBytes("geoloc");
System.out.println(bytes.length);
byte[] b = null;
// 读取字节数组
ByteArrayInputStream bis = new ByteArrayInputStream(bytes);
// gzip解压字节数组
GZIPInputStream gzip = new GZIPInputStream(bis);
byte[] storage = new byte[1024];
int num = -1;
ByteArrayOutputStream bos = new ByteArrayOutputStream();
while ((num = gzip.read(storage, 0, storage.length)) != -1) {
bos.write(storage, 0, num);
}
// 获取解压后的字节数组,并转换为字符串
b = bos.toByteArray();
System.out.println(b.length);
bos.flush();
bos.close();
gzip.close();
bis.close();
String geoLoc = String.valueOf(b);
System.out.println(geoLoc);
String content = objectId + "\t" + geoLoc;
selectResList.add(content);
}
rs.close();
stmt.close();
connection.close();
} catch (Exception e) {
e.printStackTrace();
}
return selectResList;
}
?但是获得的数据明显不符合要求,测试也无法通过。
最后还是同事给出了解决方法,代码如下:
public List<String> selectToList() {
Connection connection = null;
List<String> selectResList = new ArrayList<>();
try {
connection = SqLiteConnection.getConnection();
connection.setAutoCommit(false);
String sql = "SELECT objectid,geoloc FROM tm_world_region;";
PreparedStatement stmt = connection.prepareStatement(sql);
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
String objectId = rs.getString("objectid");
byte[] bytes = rs.getBytes("geoloc");
System.out.println(bytes.length);
ByteArrayInputStream byteInput = new ByteArrayInputStream(bytes);
BufferedInputStream bis = new BufferedInputStream(new GZIPInputStream(byteInput));
BufferedReader reader = new BufferedReader(new InputStreamReader(bis));
StringBuilder builder = new StringBuilder();
String str = "";
while ((str = reader.readLine()) != null) {
builder.append(str);
}
reader.close();
bis.close();
byteInput.close();
String geoLoc = builder.toString();
System.out.println(geoLoc);
String content = objectId + "\t" + geoLoc;
FileUtils.appendInfoToFile(dataOutput, content);
selectResList.add(content);
}
rs.close();
stmt.close();
connection.close();
} catch (Exception e) {
e.printStackTrace();
}
return selectResList;
}
这段代码可以直接将Blob类型数据转换为MULTIPOLYGON空间类型数据,之后使用GeomFromText函数即可将数据导入到库中。
需要注意的是,在使用第三方软件比如secureCRT通过跳板机进入服务器操做数据导入时,选择一条SQL插入语句在主库测试,会发现数据导入后再查询,会产生乱码问题。而且笔者测试,即使是小批数据量直接使用insert插入,也无法完成。还有查询相关空间数据也需要使用AsText()函数才能正确显示MULTIPOLYGON数据。
三.结尾
本篇算是一个笔记吧,记录了工作中一个需求的小小细节。在完成该需求的时候,也曾查找了很多资料,走了不少弯路。所以记录下这个大概过程,以供自己以后参考。也希望能对大家有所帮助。
|