MongoDB是NoSQL数据库的典型代表,支持文档结构的存储方式数据存储和使用更为便捷,数据存取效率也很高,但计算能力较弱,实际使用中涉及MongoDB的计算尤其是复杂计算会很麻烦,这就需要具备强计算能力的数据处理引擎与其配合。
开源集算器SPL是一款专业结构化数据计算引擎,拥有丰富的计算类库和完备、不依赖数据库的计算能力。SPL提供了独立的过程计算语法,尤其擅长复杂计算,可以增强MongoDB的计算能力,完成分组汇总、关联计算、子查询等通通不在话下。
常规查询
MongoDB不容易搞定的连接JOIN运算,用SPL很容易搞定:
| A | B | 1 | =mongo_open("mongodb://127.0.0.1:27017/raqdb") | /连接MongDB | 2 | =mongo_shell(A1,"c1.find()").fetch() | /获取数据 | 3 | =mongo_shell(A1,"c2.find()").fetch() | | 4 | =A2.join(user1:user2,A3:user1:user2,output) | /关联计算 | 5 | >A1.close() | /关闭连接 |
?
单表多次参与运算,复用计算结果:
| A | B | 1 | =mongo_open("mongodb://127.0.0.1:27017/raqdb") | | 2 | =mongo_shell(A1,“course.find(,{_id:0})”).fetch() | /获取数据 | 3 | =A2.group(Sno).((avg ? = ~.avg(Grade), ~.select(Grade>avg))).conj() | /计算成绩大于平均值 | 4 | >A1.close() | |
?
IN计算:
| A | B | 1 | =mongo_open("mongodb://localhost:27017/test") | | 2 | =mongo_shell(A1,"orders.find(,{_id:0})") | /获取数据 | 3 | =mongo_shell(A1,"employee.find({STATE:'California'},{_id:0})").fetch() | /过滤employee数据 | 4 | =A3.(EID).sort() | /取出EID并排序 | 5 | =A2.select(A4.pos@b(SELLERID)).fetch() | /二分法查找 | 6 | >A1.close() | |
?
外键对象化,外键指针不仅方便,效率也高:
| A | B | 1 | =mongo_open("mongodb://localhost:27017/local") | | 2 | =mongo_shell(A1,"Progress.find({}, ? {_id:0})").fetch() | /获取Progress数据 | 3 | =A2.groups(courseid; ? count(userId):popularityCount) | /按课程分组计数 | 4 | =mongo_shell(A1,"Course.find(,{title:1})").fetch() | /获取Course数据 | 5 | =A3.switch(courseid,A4:_id) | /外键连接 | 6 | =A5.new(popularityCount,courseid.title) | /创建结果集 | 7 | =A1.close() | |
?
APPLY算法的简单实现:
| A | B | 1 | =mongo_open("mongodb://127.0.0.1:27017/raqdb") | | 2 | =mongo_shell(A1,"users.find()").fetch() | /获取users数据 | 3 | =mongo_shell(A1,"workouts.find()").fetch() | /获取workouts数据 | 4 | =A2.conj(A3.select(A2.workouts.pos(_id)).derive(A2.name)) | /查询_id 值workouts 序列的记录 | 5 | >A1.close() | |
?
集合运算,合并交差:
| A | B | 1 | =mongo_open("mongodb://127.0.0.1:27017/raqdb") | | 2 | =mongo_shell(A1,"emp1.find()").fetch() | | 3 | =mongo_shell(A1,"emp2.find()").fetch() | | 4 | =[A2,A3].conj() | /多序列合集 | 5 | =[A2,A3].merge@ou() | /全行对比求并集 | 6 | =[A2,A3].merge@ou(_id, ? NAME) | /键值对比求并集 | 7 | =[A2,A3].merge@oi() | /全行对比求交集 | 8 | =[A2,A3].merge@oi(_id, ? NAME) | /键值对比求交集 | 9 | =[A2,A3].merge@od() | /全行对比求差集 | 10 | =[A2,A3].merge@od(_id, ? NAME) | /键值对比求差集 | 11 | >A1.close() | |
?
在序列中查找成员序号:
| A | B | 1 | =mongo_open("mongodb://localhost:27017/local) | | 2 | =mongo_shell(A1,"users.find({name:'jim'},{name:1,friends:1,_id:0})") ? .fetch() | | 3 | =A2.friends.pos("luke") | /从friends序列中获取成员序号 | 4 | =A1.close() | |
?
多成员集合的交集:
| A | B | 1 | [Chemical, ? Biology, Math] | /课程 | 2 | =mongo_open("mongodb://127.0.0.1:27017/raqdb") | | 3 | =mongo_shell(A2,"student.find()").fetch() | /获取student数据 | 4 | =A3.select(Lesson^A1!=[]) | /查询选修至少一门的记录 | 5 | =A4.new(_id, ? Name, ~.Lesson^A1:Lession) | /计算出结果 | 6 | >A2.close() | |
复杂计算
TOPN运算:
| A | B | | 1 | =mongo_open("mongodb://127.0.0.1:27017/test") | | | 2 | =mongo_shell(A1,"last3.find(,{_id:0};{variable:1})") | /获取last3数据,并按variable排序 | | 3 | for A2;variable | =A3.top(3;-timestamp) | /选出timestamp最晚的3个 | 4 | | =@|B3 | /将选出文档追加到B4中 | 5 | =B4.minp(~.timestamp)????? | /选出timstamp最早的文档 | | 6 | >mongo_close(A1) | | |
?
嵌套结构的聚合:
| A | 1 | =mongo_open("mongodb://127.0.0.1:27017/raqdb") | 2 | =mongo_shell(A1,"computer.find()").fetch() | 3 | =A2.new(_id:ID,income.array().sum():INCOME,output.array().sum():OUTPUT) | 4 | >A1.close() |
?
合并多属性子文档:
| A | B | C | 1 | =mongo_open("mongodb://localhost:27017/local") | | | 2 | =mongo_shell(A1,"c1.find(,{_id:0};{name:1})") | | | 3 | =create(_id, ? readUsers) | | /创建结果序表 | 4 | for ? A2;name | =A4.conj(acls.read.users|acls.append.users|acls.edit.users|acls.fullControl.users).id() | /取出所有users字段 | 5 | | >A3.insert(0, ? A4.name, B4) | /插入本组数据 | 6 | =A1.close() | | |
?
嵌套List子文档的查询
| A | B | 1 | =mongo_open("mongodb://localhost:27017/local") | | 2 | =mongo_shell(A1,"Cbettwen.find(,{_id:0})").fetch() | | 3 | =A2.conj((t=~.objList.data.dataList, ? t.select((s=float(~.split@c1()(1)), s>6154 ? && s<=6155)))) | /找到符合条件的字符串 | 4 | =A1.close() | |
?
交叉汇总:
| A | 1 | =mongo_open("mongodb://localhost:27017/local") | 2 | =mongo_shell(A1,"student.find()").fetch() | 3 | =A2.group(school) | 4 | =A3.new(school:school,~.align@a(5,sub1).(~.len()):sub1,~.align@a(5,sub2).(~.len()):sub2) | 5 | =A4.new(school,sub1(5):sub1-5,sub1(4):sub1-4,sub1(3):sub1-3,sub1(2):sub1-2,sub1(1):sub1-1,sub2(5):sub2-5,sub2(4):sub2-4,sub2(3):sub2-3,sub2(2):sub2-2,sub2(1):sub2-1) | 6 | =A1.close() |
?
分段分组
| A | B | 1 | [3000,5000,7500,10000,15000] | /Sales分段区间 | 2 | =mongo_open("mongodb://127.0.0.1:27017/raqdb") | | 3 | =mongo_shell(A2,"sales.find()").fetch() | | 4 | =A3.groups(A1.pseg(~.SALES):Segment;count(1): ? number) | /根据 SALES 区间分组统计员工数 | 5 | >A2.close() | |
?
分类分组
| A | B | 1 | =mongo_open("mongodb://127.0.0.1:27017/raqdb") | | 2 | =mongo_shell(A1,"books.find()") | | 3 | =A2.groups(addr,book;count(book): ? Count) | /分组计数 | 4 | =A3.groups(addr;sum(Count):Total) | /分组统计 | 5 | =A3.join(addr,A4:addr,Total) | /关联计算 | 6 | >A1.close() | |
?
数据写入
导出成CSV:
| A | B | 1 | =mongo_open("mongodb://localhost:27017/raqdb") | | 2 | =mongo_shell(A1,"carInfo.find(,{_id:0})") | | 3 | =A2.conj((t=~,cars.car.new(t.id:id, ? t.cars.name, ~:car))) | /对car字段进行拆分成行 | 4 | =file("D:\\data.csv").export@tc(A3) | /导出生成csv文件 | 5 | >A1.close() | |
?
更新数据库(MongoDB到MySQL):
| A | B | 1 | =mongo_open("mongodb://localhost:27017/raqdb") | /连接MongDB | 2 | =mongo_shell(A1,"course.find(,{_id:0})").fetch() | | 3 | =connect("myDB1") | /连接mysql | 4 | =A3.query@x("select ? * from course2").keys(Sno, Cno) | | 5 | >A3.update(A2:A4, ? course2, Sno, Cno, Grade; Sno,Cno) | /向mysql更新数据 | 6 | >A1.close() | |
?
更新数据库(MySQL到MongoDB):
| A | B | 1 | =connect("mysql") | /连接mysql | 2 | =A1.query@x("select ? * from course2") | /获取表course2数据 | 3 | =mongo_open("mongodb://localhost:27017/raqdb") | /连接MongDB | 4 | =mongo_insert(A3, ? "course",A2) | /将MySQL表course2导入MongoDB集合course | 5 | >A3.close() | |
?
混合计算
借助SPL还很容易实现MongoDB与其他数据源进行混合计算:
| A | B | 1 | =mongo_open("mongodb://localhost:27017/test") | /连接MongDB | 2 | =mongo_shell(A1,"emp.find({'$and':[{'Birthday':{'$gte':'"+string(begin)+"'}},{'Birthday':{'$lte':'"+string(end)+"'}}]},{_id:0})").fetch() | /查询某时间段的记录 | 3 | =A1.close() | /关闭MongoDB | 4 | =myDB1.query("select ? * from cities") | /获取mysql中表cities数据 | 5 | =A2.switch(CityID,A4: ? CityID) | /外键关联 | 6 | =A5.new(EID,Dept,CityID.CityName:CityName,Name,Gender) | /创建结果集 | 7 | return ? A6 | /返回 |
SQL支持
SPL除了原生语法,还提供了相当于SQL92标准的SQL支持,可以使用SQL查询MongoDB了,比如前面的关联计算:
| A | 1 | =mongo_open("mongodb://127.0.0.1:27017/test") | 2 | =mongo_shell(A1,"c1.find()").fetch() | 3 | =mongo_shell@x(A1,"c2.find()").fetch() | 4 | $select s.* from {A2} as s left join {A3} ? as r on s.user1=r.user1 and s.user2=r.user2 where r.income>0.3 |
应用集成
不仅如此,SPL提供了标准JDBC/ODBC等应用程序接口,集成调用很方便。如JDBC的使用:
…
Class.forName("com.esproc.jdbc.InternalDriver");
Connection conn = DriverManager.getConnection("jdbc:esproc:local://");
PrepareStatement st=con.prepareStatement("call splScript(?)"); // splScript为spl脚本文件名
st.setObject(1,"California");
st.execute();
ResultSet rs = st.getResultSet();
…
有了这些功能,增强MongoDB的计算能力可不是说说而已,要不要下载试试?
SPL资料
|