CREATE TABLE `dwd.dwd_all_devicelocus_new`(
`projectid` string,
`vin` string,
`equipmentid` string,
`starttime` string,
`kmtotal` string,
`lat` string,
`lon` string,
`endtime` string,
`e_kmtotal` string,
`e_lat` string,
`e_lon` string,
`data` map<string,string>,
`tpdsn` string)
PARTITIONED BY (
`pdate` string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
当创建表的字段中有map复杂类型的时候,我们在装载数据的时候,如果使用的是下边的这种方式
load data local inpath "/var/lib/hadoop-hdfs/test/hm.txt" into table dwd.dwd_all_devicelocus_new PARTITIONED(pdate='20220406')
当查询数据的时候,可能会出现data字段的数据和装载的数据出现一定的差异,如下所示
?
?为了解决这个问题,我们需要重新建一张新的表,之后通过insert into? select * from 新表的方式,将数据插入到表中
话不多说,上代码
CREATE TABLE `dwd.dwd_all_devicelocus_new`(
`projectid` string,
`vin` string,
`equipmentid` string,
`starttime` string,
`kmtotal` string,
`lat` string,
`lon` string,
`endtime` string,
`e_kmtotal` string,
`e_lat` string,
`e_lon` string,
`data` string,
`tpdsn` string)
PARTITIONED BY (
`pdate` string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
load data local inpath "/var/lib/hadoop-hdfs/test/hm.txt" into table dwd.dwd_all_devicelocus_new PARTITION(pdate='20220406')
注意:这个地方data字段是string类型。
insert overwrite table dwd.dwd_all_devicelocus_new PARTITION(pdate='20220408')
select
`projectid` ,
`vin` ,
`equipmentid` ,
`starttime` ,
`kmtotal` ,
`lat` ,
`lon` ,
`endtime` ,
`e_kmtotal` ,
`e_lat` ,
`e_lon` ,
str_to_map(regexp_replace(regexp_replace(regexp_replace(data,'\"',''),'\\{',''),'\\}',''), ',', ':') ,
`tpdsn` from dwd.dwd_all_devicelocus_new
利用str_to_map将数据类型进行了转换。
最后查询插入的结果和我们插入的数据是一致的
|