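What filters do:
- A filter is evaluated on the server side: the server checks whether the data satisfies the condition and returns only the matching data to the client.
- There are many filter types, but they fall into two broad groups:
  - Comparison filters: can be applied to the rowkey, the column family, the column (qualifier), or the cell value.
  - Dedicated filters: each one serves only one specific purpose.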
1. Comparison filters
1.1 Comparison operators
- LESS <
- LESS_OR_EQUAL <=
- EQUAL =
- NOT_EQUAL <>
- GREATER_OR_EQUAL >=
- GREATER >
- NO_OP excludes everything
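One way to read these operators: the stored bytes are compared with the filter's value, and the operator decides which comparison outcomes pass. The sketch below models that in plain Java with no HBase dependency; the `Op` enum and the `passes` method are illustrative names, not HBase API.

```java
/** Illustrative model of the comparison operators; not HBase code. */
final class CompareOpSketch {
    enum Op { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER, NO_OP }

    /**
     * @param cmp result of comparing the stored value with the filter value:
     *            negative if stored < filter, zero if equal, positive if stored > filter
     * @return true if the stored value passes the filter
     */
    static boolean passes(Op op, int cmp) {
        switch (op) {
            case LESS:             return cmp < 0;
            case LESS_OR_EQUAL:    return cmp <= 0;
            case EQUAL:            return cmp == 0;
            case NOT_EQUAL:        return cmp != 0;
            case GREATER_OR_EQUAL: return cmp >= 0;
            case GREATER:          return cmp > 0;
            default:               return false; // NO_OP: nothing passes
        }
    }
}
```

For example, `passes(Op.LESS, -1)` is true, while `passes(Op.NO_OP, cmp)` is false for every `cmp`.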
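1.2 Comparators
The commonly used comparators all extend the abstract class ByteArrayComparable:
- BinaryComparator: compares the given byte array byte by byte in lexicographic order, using Bytes.compareTo.
- BinaryPrefixComparator: same as BinaryComparator, but only compares the leading (left-hand) bytes, up to the length of the given prefix.
- NullComparator: checks whether the given value is null.
- BitComparator: compares bit by bit.
- RegexStringComparator: matches against a regular expression; supports only EQUAL and NOT_EQUAL.
- SubstringComparator: checks whether the given substring occurs in the value.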
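The behaviour of the byte-oriented and string-oriented comparators can be approximated in plain Java. This is a minimal sketch without any HBase dependency; the class and method names are made up for illustration, and the case-insensitive substring match reflects how SubstringComparator is documented rather than something stated in this article.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

/** Plain-Java approximations of four comparators; names are illustrative. */
final class ComparatorSketch {
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Lexicographic comparison of unsigned bytes, in the style of BinaryComparator. */
    static int binary(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length; // the shorter array sorts first
    }

    /** True if value starts with prefix: the EQUAL case of BinaryPrefixComparator. */
    static boolean hasPrefix(byte[] value, byte[] prefix) {
        if (value.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (value[i] != prefix[i]) return false;
        }
        return true;
    }

    /** Case-insensitive containment: the EQUAL case of SubstringComparator. */
    static boolean containsSubstring(String value, String sub) {
        return value.toLowerCase().contains(sub.toLowerCase());
    }

    /** Regex search anywhere in the value: the EQUAL case of RegexStringComparator. */
    static boolean matchesRegex(String value, String regex) {
        return Pattern.compile(regex).matcher(value).find();
    }
}
```

Note that `binary(utf8("9"), utf8("10"))` is positive: byte order is not numeric order, which matters whenever numbers are stored as strings.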
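1.3 Common comparison filters
(Figure: inheritance diagram of the comparison filters.) Explanation:
- RowFilter: rowkey filter; filters data by row key.
- FamilyFilter: column family filter; filters data by column family.
- QualifierFilter: column filter; filters data by column qualifier (column name).
- ValueFilter: value filter; filters data by cell value.
1.3.1 RowFilter example
Use RowFilter with BinaryComparator to return every row whose rowkey is less than 1500100010.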
@Test
public void BinaryComparatorFilter() throws IOException {
    // conn is assumed to be an open HBase Connection created in the test setup
    Table students = conn.getTable(TableName.valueOf("students"));
    // The rowkey is read back with Bytes.toString below, so the bound must be
    // string-encoded too; Bytes.toBytes(1500100010) would yield a 4-byte int
    BinaryComparator binaryComparator = new BinaryComparator(Bytes.toBytes("1500100010"));
    RowFilter rowFilter = new RowFilter(CompareFilter.CompareOp.LESS, binaryComparator);
    Scan scan = new Scan();
    scan.setFilter(rowFilter);
    // try-with-resources closes the scanner once iteration is done
    try (ResultScanner scanner = students.getScanner(scan)) {
        for (Result rs : scanner) {
            String id = Bytes.toString(rs.getRow());
            String name = Bytes.toString(rs.getValue("info".getBytes(), "name".getBytes()));
            int age = Bytes.toInt(rs.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(rs.getValue("info".getBytes(), "gender".getBytes()));
            String clazz = Bytes.toString(rs.getValue("info".getBytes(), "clazz".getBytes()));
            System.out.println(id + "\t" + name + "\t" + age + "\t" + gender + "\t" + clazz + "\t");
        }
    }
}
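Because BinaryComparator works on raw bytes, the bound has to be encoded the same way as the stored rowkey. The sketch below, in plain Java with illustrative names, contrasts the 4-byte big-endian layout of an int with the 10-byte layout of the same digits as a string, under the assumption that rowkeys are stored as strings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Shows why an int-encoded bound does not compare sensibly with string rowkeys. */
final class RowKeyEncodingSketch {
    /** 4-byte big-endian encoding of an int. */
    static byte[] intBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    /** Encoding of the digits as text, one byte per digit. */
    static byte[] stringBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Lexicographic comparison of unsigned bytes. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }
}
```

With a string-encoded bound, "1500100009" sorts before "1500100010" and "1500100999" sorts after it, as expected. Against the int-encoded bound, whose first byte is 0x59, every rowkey that starts with an ASCII digit sorts lower, so a LESS filter would let all such rows through.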
1.3.2 FamilyFilter example
Use FamilyFilter with SubstringComparator to return the data of every column family whose name contains "in".
@Test
public void SubstringComparatorFilter() throws IOException {
    Table students = conn.getTable(TableName.valueOf("students"));
    // Matches any column family whose name contains "in", for example "info"
    SubstringComparator substringComparator = new SubstringComparator("in");
    FamilyFilter familyFilter = new FamilyFilter(CompareFilter.CompareOp.EQUAL, substringComparator);
    Scan scan = new Scan();
    scan.setFilter(familyFilter);
    try (ResultScanner scanner = students.getScanner(scan)) {
        for (Result rs : scanner) {
            String id = Bytes.toString(rs.getRow());
            String name = Bytes.toString(rs.getValue("info".getBytes(), "name".getBytes()));
            int age = Bytes.toInt(rs.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(rs.getValue("info".getBytes(), "gender".getBytes()));
            String clazz = Bytes.toString(rs.getValue("info".getBytes(), "clazz".getBytes()));
            System.out.println(id + "\t" + name + "\t" + age + "\t" + gender + "\t" + clazz + "\t");
        }
    }
}
1.3.3 QualifierFilter example
Use QualifierFilter with SubstringComparator to return the values of every column whose name contains "in".
// Prints every cell of every row returned by the scanner
public void printRS(ResultScanner scanner) throws IOException {
    for (Result rs : scanner) {
        String rowkey = Bytes.toString(rs.getRow());
        System.out.println("rowkey of the current row: " + rowkey);
        for (Cell cell : rs.listCells()) {
            String family = Bytes.toString(CellUtil.cloneFamily(cell));
            String qualifier = Bytes.toString(CellUtil.cloneQualifier(cell));
            byte[] bytes = CellUtil.cloneValue(cell);
            if ("age".equals(qualifier)) {
                // age is decoded as an int; every other column is decoded as a string
                int value = Bytes.toInt(bytes);
                System.out.println(family + ":" + qualifier + " = " + value);
            } else {
                String value = Bytes.toString(bytes);
                System.out.println(family + ":" + qualifier + " = " + value);
            }
        }
    }
}
@Test
public void SubstringComparatorQualifierFilter() throws IOException {
    Table students = conn.getTable(TableName.valueOf("students"));
    SubstringComparator substringComparator = new SubstringComparator("in");
    // QualifierFilter matches on the column name, not on the column family
    QualifierFilter qualifierFilter = new QualifierFilter(CompareFilter.CompareOp.EQUAL, substringComparator);
    Scan scan = new Scan();
    scan.setFilter(qualifierFilter);
    // Only the matching cells come back, so print whatever cells each row contains
    try (ResultScanner scanner = students.getScanner(scan)) {
        printRS(scanner);
    }
}
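1.3.4 ValueFilter example
ValueFilter compares every cell individually: any cell in a row whose value satisfies the condition is returned, whichever column it belongs to. The example lists all students older than 23.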
@Test
public void ValueFilter() throws IOException {
    TableName students = TableName.valueOf("students");
    Table table = conn.getTable(students);
    // BinaryComparator compares raw bytes, so the bound must be encoded the same way
    // as the stored values; "23".getBytes() assumes age was written as a string
    ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.GREATER, new BinaryComparator("23".getBytes()));
    Scan scan = new Scan();
    scan.setFilter(valueFilter);
    try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result result : scanner) {
            String id = Bytes.toString(result.getRow());
            // Cells that fail the filter are absent, so getValue may return null for them
            String name = Bytes.toString(result.getValue("info".getBytes(), "name".getBytes()));
            String age = Bytes.toString(result.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(result.getValue("info".getBytes(), "gender".getBytes()));
            String clazz = Bytes.toString(result.getValue("info".getBytes(), "clazz".getBytes()));
            System.out.println(id + "," + name + "," + age + "," + gender + "," + clazz);
        }
    }
}
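Since the comparison is applied to each cell on its own, a bound meant for one column also lets through cells from other columns. The following plain-Java sketch illustrates this with a hypothetical row whose values are all stored as strings; the class, method, and sample data are invented for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Cell-level filtering in the style of ValueFilter; illustrative, not HBase code. */
final class ValueFilterSketch {
    /** Lexicographic comparison of the UTF-8 bytes of two strings. */
    static int compareBytes(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xff) - (y[i] & 0xff);
            if (d != 0) return d;
        }
        return x.length - y.length;
    }

    /** Keeps each cell whose value sorts after bound, regardless of its column. */
    static Map<String, String> greaterThan(Map<String, String> row, String bound) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            if (compareBytes(cell.getValue(), bound) > 0) {
                kept.put(cell.getKey(), cell.getValue());
            }
        }
        return kept;
    }
}
```

For a row with name "Tom", age "22", and gender "male", `greaterThan(row, "23")` keeps name and gender but drops age. A string-encoded age of "9" would also pass, because "9" sorts after "23" byte-wise.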
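2. Dedicated filters
2.1 Single-column value filter
SingleColumnValueFilter: when the tested cell satisfies the condition, the filter returns all cells of the row that cell belongs to (that is, the whole row).
Example: use SingleColumnValueFilter with RegexStringComparator to return all students in the liberal-arts classes (clazz starting with 文科).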
@Test
public void RegexStringComparatorFilter() throws IOException {
    Table students = conn.getTable(TableName.valueOf("students"));
    // Test info:clazz against the regular expression ^文科.*
    SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
            "info".getBytes(),
            "clazz".getBytes(),
            CompareFilter.CompareOp.EQUAL,
            new RegexStringComparator("^文科.*")
    );
    Scan scan = new Scan();
    scan.setFilter(singleColumnValueFilter);
    try (ResultScanner scanner = students.getScanner(scan)) {
        for (Result rs : scanner) {
            String id = Bytes.toString(rs.getRow());
            String name = Bytes.toString(rs.getValue("info".getBytes(), "name".getBytes()));
            int age = Bytes.toInt(rs.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(rs.getValue("info".getBytes(), "gender".getBytes()));
            String clazz = Bytes.toString(rs.getValue("info".getBytes(), "clazz".getBytes()));
            System.out.println(id + "\t" + name + "\t" + age + "\t" + gender + "\t" + clazz + "\t");
        }
    }
}
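2.2 Single-column value exclude filter
SingleColumnValueExcludeFilter: selects rows the same way as SingleColumnValueFilter, but leaves the tested column out of the result and returns all the other columns.
Example: use SingleColumnValueExcludeFilter with BinaryComparator to return all students in class 文科一班; the clazz column is not returned.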
@Test
public void RegexStringComparatorExcludeFilter() throws IOException {
    Table students = conn.getTable(TableName.valueOf("students"));
    // Select rows where info:clazz equals 文科一班 exactly
    SingleColumnValueExcludeFilter singleColumnValueExcludeFilter = new SingleColumnValueExcludeFilter(
            "info".getBytes(),
            "clazz".getBytes(),
            CompareFilter.CompareOp.EQUAL,
            new BinaryComparator("文科一班".getBytes())
    );
    Scan scan = new Scan();
    scan.setFilter(singleColumnValueExcludeFilter);
    try (ResultScanner scanner = students.getScanner(scan)) {
        for (Result rs : scanner) {
            String id = Bytes.toString(rs.getRow());
            String name = Bytes.toString(rs.getValue("info".getBytes(), "name".getBytes()));
            int age = Bytes.toInt(rs.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(rs.getValue("info".getBytes(), "gender".getBytes()));
            // clazz is the tested column, so it is not part of the result and is not read here
            System.out.println(id + "\t" + name + "\t" + age + "\t" + gender);
        }
    }
}
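The difference between the two single-column filters can be modelled in a few lines of plain Java. This is a rough sketch with invented names; the `filterIfMissing` flag mirrors the option HBase exposes through `setFilterIfMissing`, which is off by default, so rows that lack the tested column are returned unless it is switched on.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Row-level selection on one column; illustrative, not HBase code. */
final class SingleColumnSketch {
    /** Returns the whole row if the tested column passes, otherwise null. */
    static Map<String, String> singleColumnValue(
            Map<String, String> row, String column,
            Predicate<String> test, boolean filterIfMissing) {
        String value = row.get(column);
        if (value == null) {
            // Column absent: the row is kept unless filterIfMissing is set
            return filterIfMissing ? null : row;
        }
        return test.test(value) ? row : null;
    }

    /** Same selection, but the tested column is removed from the returned row. */
    static Map<String, String> singleColumnValueExclude(
            Map<String, String> row, String column,
            Predicate<String> test, boolean filterIfMissing) {
        Map<String, String> kept = singleColumnValue(row, column, test, filterIfMissing);
        if (kept == null) return null;
        Map<String, String> copy = new LinkedHashMap<>(kept);
        copy.remove(column);
        return copy;
    }
}
```

Given a row with `name` and `clazz`, the first method returns both columns when `clazz` matches, while the second returns only `name`.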
2.3 Rowkey prefix filter
PrefixFilter. Example: use PrefixFilter to return every row whose rowkey starts with 150010008.
@Test
public void PrefixFilterFilter() throws IOException {
    Table students = conn.getTable(TableName.valueOf("students"));
    // Keep only the rows whose rowkey begins with these bytes
    PrefixFilter prefixFilter = new PrefixFilter("150010008".getBytes());
    Scan scan = new Scan();
    scan.setFilter(prefixFilter);
    try (ResultScanner scanner = students.getScanner(scan)) {
        for (Result rs : scanner) {
            String id = Bytes.toString(rs.getRow());
            String name = Bytes.toString(rs.getValue("info".getBytes(), "name".getBytes()));
            int age = Bytes.toInt(rs.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(rs.getValue("info".getBytes(), "gender".getBytes()));
            String clazz = Bytes.toString(rs.getValue("info".getBytes(), "clazz".getBytes()));
            System.out.println(id + "\t" + name + "\t" + age + "\t" + gender + "\t" + clazz + "\t");
        }
    }
}
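A common companion to a prefix filter is to bound the scan to the rowkey range covered by the prefix, so that rows outside that range need not be visited. The range is [prefix, stopRow), where stopRow is the smallest key that sorts after every key with that prefix. The sketch below shows only the arithmetic, in plain Java; the class and method names are hypothetical and it is not taken from the HBase code base.

```java
import java.util.Arrays;

/** Computes the exclusive upper bound of a rowkey prefix range. */
final class PrefixRangeSketch {
    /**
     * @return the smallest key that sorts after every key starting with prefix;
     *         an empty array means there is no upper bound
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xff bytes cannot be incremented, so drop them
        while (end > 0 && prefix[end - 1] == (byte) 0xff) {
            end--;
        }
        if (end == 0) {
            return new byte[0]; // prefix was empty or all 0xff
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // increment the last remaining byte
        return stop;
    }
}
```

For the prefix "150010008" the computed stop row is "150010009", so the range is ["150010008", "150010009").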
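2.4 Page filter
PageFilter: paging with PageFilter is relatively inefficient, because every request has to scan all the preceding rows before it reaches the rows being asked for. A well-designed rowkey can serve paging needs instead.
Example: use PageFilter to fetch the third page, with 10 rows per page.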
@Test
public void PageFilter() throws IOException {
    Table students = conn.getTable(TableName.valueOf("students"));
    int pageNum = 3;
    int pageSize = 10;
    Scan scan = new Scan();
    if (pageNum > 1) {
        // Scan up to and including the first row of the requested page
        int rowsToScan = (pageNum - 1) * pageSize + 1;
        scan.setFilter(new PageFilter(rowsToScan));
        String currentPageStartRow = "";
        try (ResultScanner scanner = students.getScanner(scan)) {
            for (Result rs : scanner) {
                // After the loop this holds the last row seen: the first row of the page
                currentPageStartRow = Bytes.toString(rs.getRow());
            }
        }
        scan.withStartRow(currentPageStartRow.getBytes());
    }
    // Read one page from the start row (the beginning of the table for page 1)
    scan.setFilter(new PageFilter(pageSize));
    try (ResultScanner scanner = students.getScanner(scan)) {
        printRS(scanner);
    }
}
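An alternative that avoids rescanning earlier pages is to remember the last rowkey of the previous page and start the next read just after it. The sketch below models this over an in-memory sorted set of rowkeys in plain Java; it is an illustration of the idea under that assumption, with hypothetical names, not HBase code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

/** Paging over sorted rowkeys by remembering the last key of the previous page. */
final class PagingSketch {
    /**
     * @param lastSeen last rowkey of the previous page, or null for the first page
     * @return up to pageSize rowkeys that sort strictly after lastSeen
     */
    static List<String> nextPage(NavigableSet<String> sortedRowKeys, String lastSeen, int pageSize) {
        NavigableSet<String> tail = (lastSeen == null)
                ? sortedRowKeys
                : sortedRowKeys.tailSet(lastSeen, false); // exclusive start
        List<String> page = new ArrayList<>();
        for (String key : tail) {
            if (page.size() == pageSize) break;
            page.add(key);
        }
        return page;
    }
}
```

Each call touches only the rows of the page it returns, at the cost of the client having to carry the last rowkey from one request to the next.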
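3. Combining multiple filters
Example: among the students in the liberal-arts classes (clazz starting with 文科), find those whose student ID starts with 150010008 and whose age is less than 23.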
@Test
public void FilterListFilter() throws IOException {
    Table students = conn.getTable(TableName.valueOf("students"));
    Scan scan = new Scan();
    // Condition 1: info:clazz starts with 文科
    SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
            "info".getBytes(),
            "clazz".getBytes(),
            CompareFilter.CompareOp.EQUAL,
            new RegexStringComparator("^文科.*"));
    // Condition 2: the rowkey (student ID) starts with 150010008
    PrefixFilter prefixFilter = new PrefixFilter("150010008".getBytes());
    // Condition 3: info:age < 23; printRS reads age with Bytes.toInt, so the bound is int-encoded
    SingleColumnValueFilter singleColumnValueFilter1 = new SingleColumnValueFilter(
            "info".getBytes(),
            "age".getBytes(),
            CompareFilter.CompareOp.LESS,
            new BinaryComparator(Bytes.toBytes(23)));
    // The no-arg FilterList uses MUST_PASS_ALL: a row must satisfy all three filters
    FilterList filterList = new FilterList();
    filterList.addFilter(singleColumnValueFilter);
    filterList.addFilter(prefixFilter);
    filterList.addFilter(singleColumnValueFilter1);
    scan.setFilter(filterList);
    try (ResultScanner scanner = students.getScanner(scan)) {
        printRS(scanner);
    }
}
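The way a filter list combines its members can be pictured as a logical AND or OR over predicates. The sketch below uses plain Java predicates; the operator names match HBase's `FilterList.Operator` constants, while the `Student` class, the sample conditions, and the ASCII class name "arts" are hypothetical stand-ins for the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** AND/OR combination of row predicates; illustrative, not HBase code. */
final class FilterListSketch {
    enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

    /** Hypothetical row type used only by this sketch. */
    static final class Student {
        final String id;
        final String clazz;
        final int age;

        Student(String id, String clazz, int age) {
            this.id = id;
            this.clazz = clazz;
            this.age = age;
        }
    }

    /** MUST_PASS_ALL requires every filter to match; MUST_PASS_ONE requires at least one. */
    static <T> Predicate<T> combine(Operator op, List<Predicate<T>> filters) {
        return row -> op == Operator.MUST_PASS_ALL
                ? filters.stream().allMatch(f -> f.test(row))
                : filters.stream().anyMatch(f -> f.test(row));
    }

    /** The three conditions of the example, expressed as predicates. */
    static List<Predicate<Student>> exampleFilters() {
        List<Predicate<Student>> filters = new ArrayList<>();
        filters.add(s -> s.clazz.startsWith("arts"));    // class condition
        filters.add(s -> s.id.startsWith("150010008"));  // rowkey prefix condition
        filters.add(s -> s.age < 23);                    // age condition
        return filters;
    }
}
```

A student with ID 1500100081, class "arts-1", and age 22 passes under MUST_PASS_ALL. With age 25 the same student fails MUST_PASS_ALL but still passes MUST_PASS_ONE.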