If all the information needed to perform filtering is in the index, there’s no need to write your own filter because the QueryWrapperFilter can handle it, as described in section 5.6.5. But there are good reasons to factor external information into a custom filter. In this section we tackle the following example: using our book example data and pretending we’re running an online bookstore, we want users to be able to search within our special hot deals of the day.
You might be tempted to simply store the specials flag as an indexed field, but keeping this up to date might prove too costly. Rather than reindex entire documents when specials change, we’ll implement a custom filter that reads the current specials from our (hypothetical) relational database. Then we’ll see how to apply our filter during searching, and finally we’ll explore an alternative option for applying the filter.
Implementing a custom filter
We start with abstracting away the source of our specials by defining this interface:
- package demo.ch6;
- public interface SpecialsAccessor {
- String[] isbns();
- }
Listing 6.14 Retrieving filter information from external source with SpecialsFilter
- import java.io.IOException;
- import org.apache.lucene.index.AtomicReader;
- import org.apache.lucene.index.AtomicReaderContext;
- import org.apache.lucene.index.DocsEnum;
- import org.apache.lucene.index.Term;
- import org.apache.lucene.search.BitsFilteredDocIdSet;
- import org.apache.lucene.search.DocIdSet;
- import org.apache.lucene.search.DocIdSetIterator;
- import org.apache.lucene.search.Filter;
- import org.apache.lucene.util.Bits;
- import org.apache.lucene.util.OpenBitSet;
- public class SpecialsFilter extends Filter {
- private final SpecialsAccessor accessor;
- public SpecialsFilter(SpecialsAccessor accessor) {
- this.accessor = accessor;
- }
- @Override
- public DocIdSet getDocIdSet(AtomicReaderContext ctx, Bits acceptDocs)
- throws IOException {
- AtomicReader reader = ctx.reader();
- // One bit per document in this segment; a set bit means "on special"
- OpenBitSet specials = new OpenBitSet(reader.maxDoc());
- // Fetch the ISBNs currently on special from the external source
- for (String isbn : accessor.isbns()) {
- // termDocsEnum returns null when the term is absent from this segment
- DocsEnum docsEnum = reader.termDocsEnum(new Term("isbn", isbn));
- if (docsEnum == null) {
- continue;
- }
- int doc;
- while ((doc = docsEnum.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
- specials.set(doc);
- }
- }
- // Honor acceptDocs (deletions and any restriction passed in by the caller)
- return BitsFilteredDocIdSet.wrap(specials, acceptDocs);
- }
- }
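One design point worth noting: getDocIdSet runs once per index segment, so the filter above asks the accessor for its ISBN list several times per search. If the accessor really were backed by a database, a small caching wrapper might be worthwhile. The following is only a sketch under assumptions, not part of the original example: CachingSpecialsAccessor, its time-to-live parameter, and the injectable clock are hypothetical names, the interface is repeated (without its package) so the snippet compiles on its own, and the lambdas require Java 8 or later.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Same shape as the SpecialsAccessor interface above, repeated so this sketch is self-contained.
interface SpecialsAccessor {
    String[] isbns();
}

// Hypothetical decorator: reuses the delegate's answer until it is older than the time-to-live.
final class CachingSpecialsAccessor implements SpecialsAccessor {
    private final SpecialsAccessor delegate;
    private final long ttlNanos;
    private final LongSupplier nanoClock; // injectable so a test can control time
    private String[] cached;
    private long loadedAt;

    CachingSpecialsAccessor(SpecialsAccessor delegate, long ttl, TimeUnit unit,
                            LongSupplier nanoClock) {
        this.delegate = delegate;
        this.ttlNanos = unit.toNanos(ttl);
        this.nanoClock = nanoClock;
    }

    @Override
    public synchronized String[] isbns() {
        long now = nanoClock.getAsLong();
        if (cached == null || now - loadedAt >= ttlNanos) {
            cached = delegate.isbns().clone(); // the only place the slow source is consulted
            loadedAt = now;
        }
        return cached.clone(); // defensive copy so callers cannot alter the cache
    }
}

final class CachingSpecialsAccessorDemo {
    public static void main(String[] args) {
        int[] lookups = {0};
        // Stand-in for the (hypothetical) database-backed accessor
        SpecialsAccessor database = () -> {
            lookups[0]++;
            return new String[] {"9781935182023"};
        };
        SpecialsAccessor accessor =
            new CachingSpecialsAccessor(database, 60, TimeUnit.SECONDS, System::nanoTime);
        accessor.isbns();
        accessor.isbns();
        System.out.println("lookups: " + lookups[0]); // prints "lookups: 1"
    }
}
```

Because the wrapper is itself a SpecialsAccessor, it could be handed to SpecialsFilter unchanged. The trade-off is that a change to the specials becomes visible only after the cached list expires.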
Using our custom filter during searching
To test that our filter is working, we created a simple TestSpecialsAccessor to return a specified set of ISBNs, giving our test case control over the set of specials:
- public class TestSpecialsAccessor implements SpecialsAccessor {
- private String[] isbns;
- public TestSpecialsAccessor(String[] isbns) {
- this.isbns = isbns;
- }
- public String[] isbns() {
- return isbns;
- }
- }
- public void testCustomFilter() throws Exception {
- Query allBooks = new TermQuery(new Term("contents", "manning"));
- String[] isbns = new String[] { "1933988940", "9781935182023" };
- SpecialsAccessor accessor = new TestSpecialsAccessor(isbns);
- Filter filter = new SpecialsFilter(accessor);
- TopDocs hits = searcher.search(allBooks, filter, 10);
- assertEquals("the specials", isbns.length, hits.totalHits);
- }
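To make the effect of searching with a filter concrete, here is a toy model that uses only the JDK, not Lucene. It is a sketch under assumptions: the map of ISBN to document numbers stands in for the index postings, java.util.BitSet stands in for OpenBitSet, and ToyFilterModel and its placeholder ISBN strings are invented for illustration. It shows the two steps involved: building the filter bits from the list of specials, then keeping only the query matches whose bit is set.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the filter logic; none of these names are Lucene APIs.
final class ToyFilterModel {

    // Builds the filter bits: one bit per document, set when the document's ISBN is on special.
    static BitSet specialsBits(Map<String, int[]> isbnPostings, String[] specials, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (String isbn : specials) {
            int[] docs = isbnPostings.get(isbn);
            if (docs == null) {
                continue; // ISBN is not present in this toy index
            }
            for (int doc : docs) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // A filtered search keeps only the query matches that the filter allows.
    static BitSet filteredMatches(BitSet queryMatches, BitSet filterBits) {
        BitSet result = (BitSet) queryMatches.clone();
        result.and(filterBits);
        return result;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = new HashMap<>();
        postings.put("isbn-a", new int[] {0});
        postings.put("isbn-b", new int[] {2});
        postings.put("isbn-c", new int[] {3});

        BitSet queryMatches = new BitSet(4);
        queryMatches.set(0, 3); // the query matches documents 0, 1 and 2

        BitSet filter = specialsBits(
            postings, new String[] {"isbn-b", "isbn-c", "isbn-missing"}, 4);

        System.out.println(filteredMatches(queryMatches, filter)); // prints "{2}"
    }
}
```

Document 3 is on special but does not match the query, and documents 0 and 1 match the query but are not on special, so only document 2 survives.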
An alternative: FilteredQuery
To add to the filter terminology overload, one final option is FilteredQuery. FilteredQuery inverts the situation that searching with a filter presents. Using a filter, an IndexSearcher’s search method applies a single filter during querying. Using the FilteredQuery, though, you can turn any filter into a query, which opens up neat possibilities, such as adding a filter as a clause to a BooleanQuery.
Let’s take the SpecialsFilter as an example again. This time, we want a more sophisticated query: books in an education category on special, or books on Logo. We couldn’t accomplish this with a direct query using the techniques shown thus far, but FilteredQuery makes this possible. Had our search been only for books in the education category on special, we could’ve used the technique shown in the previous code snippet instead.
Our test case, in listing 6.15, demonstrates the described query using a BooleanQuery with a nested TermQuery and FilteredQuery.
Listing 6.15 Using a FilteredQuery
- public void testFilteredQuery() throws Exception {
- // 1) ISBN of the one book on special
- String[] isbns = new String[] { "9781935182023" };
- // 2) Education books, restricted to those on special
- SpecialsAccessor accessor = new TestSpecialsAccessor(isbns);
- Filter filter = new SpecialsFilter(accessor);
- WildcardQuery educationBooks = new WildcardQuery(new Term("category", "*education*"));
- FilteredQuery edBooksOnSpecial = new FilteredQuery(educationBooks, filter);
- // 3) All books with "logo" in the subject field
- TermQuery logoBooks = new TermQuery(new Term("subject", "logo"));
- // 4) Combine the two; a document may match either clause
- BooleanQuery logoOrEdBooks = new BooleanQuery();
- logoOrEdBooks.add(logoBooks, BooleanClause.Occur.SHOULD);
- logoOrEdBooks.add(edBooksOnSpecial, BooleanClause.Occur.SHOULD);
- TopDocs hits = searcher.search(logoOrEdBooks, 10);
- System.out.println(logoOrEdBooks.toString());
- assertEquals("Papert and Steiner", 2, hits.totalHits);
- }
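The matching logic of this combined query can be written as set algebra: (education AND on special) OR logo. The snippet below is a conceptual sketch only, using java.util.BitSet rather than Lucene, and it says nothing about how Lucene actually executes or scores the query. FilteredClauseModel and the document numbers are made up for illustration.

```java
import java.util.BitSet;

// Conceptual model of the query's matching logic; these names are not Lucene APIs.
final class FilteredClauseModel {

    // What a query wrapped with a filter matches: query AND filter.
    static BitSet filtered(BitSet query, BitSet filter) {
        BitSet result = (BitSet) query.clone();
        result.and(filter);
        return result;
    }

    // What a boolean query made only of SHOULD clauses matches: the union of its clauses.
    static BitSet anyOf(BitSet... clauses) {
        BitSet result = new BitSet();
        for (BitSet clause : clauses) {
            result.or(clause);
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet educationBooks = new BitSet();
        educationBooks.set(0);
        educationBooks.set(1);

        BitSet onSpecial = new BitSet();
        onSpecial.set(1);
        onSpecial.set(3);

        BitSet logoBooks = new BitSet();
        logoBooks.set(2);

        BitSet edBooksOnSpecial = filtered(educationBooks, onSpecial); // {1}
        BitSet logoOrEdBooks = anyOf(logoBooks, edBooksOnSpecial);

        System.out.println(logoOrEdBooks); // prints "{1, 2}"
    }
}
```

Document 3 is on special but is neither an education book nor a Logo book, so the filter alone never adds it to the result. The filter narrows only the clause it wraps.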
Filtering is a powerful means of overriding which documents a query may match, and in this section you’ve seen how to create custom filters and use them during searching, as well as how to wrap a filter as a query so that it may be used wherever a query may be used. Filters give you a lot of flexibility for advanced searching.