Searching Repository Content
1 Introduction
You can find the JCR configuration file here: .../portal/WEB-INF/conf/jcr/repository-configuration.xml. See also Search Configuration for more information about index configuration.
2 Bi-directional RangeIterator (since 1.9)
QueryResult.getNodes() returns a bi-directional NodeIterator implementation that can be cast to TwoWayRangeIterator. TwoWayRangeIterator interface:
/**
 * Skip a number of elements in the iterator.
 *
 * @param skipNum the non-negative number of elements to skip
 * @throws java.util.NoSuchElementException if skipped past the first element
 *           in the iterator.
 */
public void skipBack(long skipNum);
Example:
NodeIterator iter = queryResult.getNodes();
while (iter.hasNext()) {
    if (skipForward) {
        iter.skip(10); // Skip 10 nodes in forward direction
    } else if (skipBack) {
        TwoWayRangeIterator backIter = (TwoWayRangeIterator) iter;
        backIter.skipBack(10); // Skip 10 nodes back
    }
    // ...
}
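To make the skip semantics concrete, here is a minimal sketch of a two-way iterator over an in-memory list. The class ListTwoWayIterator is hypothetical and is not part of the JCR implementation; only the skipBack(long) contract (throw NoSuchElementException when skipping past the first element) is taken from the interface above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical illustration of two-way skipping; not the eXo JCR class. */
class ListTwoWayIterator<T> implements Iterator<T> {
    private final List<T> items;
    private int position = 0; // index of the element that next() would return

    ListTwoWayIterator(List<T> items) {
        this.items = items;
    }

    @Override
    public boolean hasNext() {
        return position < items.size();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return items.get(position++);
    }

    /** Skips forward; fails if that would move past the last element. */
    public void skip(long skipNum) {
        if (skipNum < 0) {
            throw new IllegalArgumentException("skipNum must be non-negative");
        }
        if (skipNum > items.size() - position) {
            throw new NoSuchElementException("skipped past the last element");
        }
        position += (int) skipNum;
    }

    /** Skips backward; fails if that would move past the first element. */
    public void skipBack(long skipNum) {
        if (skipNum < 0) {
            throw new IllegalArgumentException("skipNum must be non-negative");
        }
        if (skipNum > position) {
            throw new NoSuchElementException("skipped past the first element");
        }
        position -= (int) skipNum;
    }

    public static void main(String[] args) {
        ListTwoWayIterator<String> it =
            new ListTwoWayIterator<>(Arrays.asList("a", "b", "c", "d", "e"));
        it.skip(3);
        System.out.println(it.next()); // prints "d"
        it.skipBack(2);
        System.out.println(it.next()); // prints "c"
    }
}
```

Tracking a single cursor position keeps both directions symmetric: skip(n) followed by skipBack(n) returns the iterator to where it started.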
3 Fuzzy Searches (since 1.0)
JCR supports Lucene fuzzy searches (see Apache Lucene - Query Parser Syntax). To run a fuzzy search, append a tilde (~) to the search term in the contains() function:
QueryManager qman = session.getWorkspace().getQueryManager();
Query q = qman.createQuery("select * from nt:base where contains(field, 'ccccc~')", Query.SQL);
QueryResult res = q.execute();
4 SynonymSearch (since 1.9)
Searching with synonyms is integrated in the jcr:contains() function and uses the same syntax as synonym searches in Google. If a search term is prefixed with a tilde symbol (~), synonyms of the search term are also taken into consideration. Example:
SQL: select * from nt:resource where contains(., '~parameter')
XPath: //element(*, nt:resource)[jcr:contains(., '~parameter')]
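The synonym provider is configured with the following parameters in the query-handler element of the JCR configuration file:
<param name="synonymprovider-config-path" value="...your path to configuration file..."/>
<param name="synonymprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.PropertiesSynonymProvider"/>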
SynonymProvider interface:
/**
 * <code>SynonymProvider</code> defines an interface for a component that
 * returns synonyms for a given term.
 */
public interface SynonymProvider {

    /**
     * Initializes the synonym provider and passes the file system resource to
     * the synonym provider configuration defined by the configuration value of
     * the <code>synonymProviderConfigPath</code> parameter. The resource may be
     * <code>null</code> if the configuration parameter is not set.
     *
     * @param fsr the file system resource to the synonym provider
     *            configuration.
     * @throws IOException if an error occurs while initializing the synonym
     *                     provider.
     */
    public void initialize(InputStream fsr) throws IOException;

    /**
     * Returns an array of terms that are considered synonyms for the given
     * <code>term</code>.
     *
     * @param term a search term.
     * @return an array of synonyms for the given <code>term</code> or an empty
     *         array if no synonyms are known.
     */
    public String[] getSynonyms(String term);
}
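As an illustration of the contract above, the following sketch shows what a custom provider might look like. The interface is repeated in trimmed form only so that the block compiles on its own. The class SimpleSynonymProvider and its "term=synonym1, synonym2" file format are assumptions made for this example; they do not describe how PropertiesSynonymProvider reads its configuration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

/** Trimmed copy of the interface above so that this sketch is self-contained. */
interface SynonymProvider {
    void initialize(InputStream fsr) throws IOException;
    String[] getSynonyms(String term);
}

/** Hypothetical provider; the "term=syn1, syn2" format is an assumption. */
class SimpleSynonymProvider implements SynonymProvider {
    private final Map<String, String[]> synonyms = new HashMap<>();

    @Override
    public void initialize(InputStream fsr) throws IOException {
        synonyms.clear();
        if (fsr == null) {
            return; // the resource may be null if no configuration path is set
        }
        Properties props = new Properties();
        props.load(fsr);
        for (String term : props.stringPropertyNames()) {
            String value = props.getProperty(term).trim();
            String[] values = value.isEmpty() ? new String[0] : value.split("\\s*,\\s*");
            synonyms.put(term.toLowerCase(Locale.ROOT), values);
        }
    }

    @Override
    public String[] getSynonyms(String term) {
        // Return an empty array, never null, when no synonyms are known.
        String[] found = synonyms.get(term.toLowerCase(Locale.ROOT));
        return found == null ? new String[0] : found.clone();
    }

    public static void main(String[] args) throws IOException {
        String config = "parameter=argument, option\n";
        SimpleSynonymProvider provider = new SimpleSynonymProvider();
        provider.initialize(
            new ByteArrayInputStream(config.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println(String.join("|", provider.getSynonyms("Parameter")));
        // prints "argument|option"
    }
}
```

Lower-casing terms on both load and lookup is a design choice of this sketch, so that '~Parameter' and '~parameter' resolve to the same entry.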
5 Highlighting (Since 1.9)
An ExcerptProvider retrieves text excerpts for a node in the query result and marks up the words in the text that match the query terms. By default, highlighting of matched words is disabled because it requires additional information to be written to the search index. To enable it, add the following parameters to the query-handler element in your JCR configuration file:
<param name="support-highlighting" value="true"/>
<param name="excerptprovider-class" value="org.exoplatform.services.jcr.impl.core.query.lucene.DefaultXMLExcerpt"/>
5.1 DefaultXMLExcerpt
This excerpt provider creates an XML fragment of the following form:
<excerpt>
    <fragment>
        <highlight>exoplatform</highlight> implements both the mandatory XPath and optional SQL <highlight>query</highlight> syntax.
    </fragment>
    <fragment>
        Before parsing the XPath <highlight>query</highlight> in <highlight>exoplatform</highlight>, the statement is surrounded
    </fragment>
</excerpt>
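A client that receives such a fragment as a string may want to pull out the highlighted terms. The following sketch does this with the DOM parser that ships with the JDK. The class ExcerptHighlights and its method are hypothetical helpers, and the sample input is shortened from the fragment above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical helper; not part of the JCR API. */
class ExcerptHighlights {

    /** Returns the text of every highlight element, in document order. */
    static List<String> highlightedTerms(String excerptXml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(excerptXml.getBytes(StandardCharsets.UTF_8)));
        NodeList highlights = doc.getElementsByTagName("highlight");
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < highlights.getLength(); i++) {
            terms.add(highlights.item(i).getTextContent());
        }
        return terms;
    }

    public static void main(String[] args) throws Exception {
        String excerpt =
            "<excerpt><fragment>"
            + "<highlight>exoplatform</highlight> implements both the mandatory XPath"
            + " and optional SQL <highlight>query</highlight> syntax."
            + "</fragment></excerpt>";
        System.out.println(highlightedTerms(excerpt)); // prints "[exoplatform, query]"
    }
}
```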
5.2 DefaultHTMLExcerpt
This excerpt provider creates an HTML fragment of the following form:
<div>
    <span>
        <strong>exoplatform</strong> implements both the mandatory XPath and optional SQL <strong>query</strong> syntax.
    </span>
    <span>
        Before parsing the XPath <strong>query</strong> in <strong>exoplatform</strong>, the statement is surrounded
    </span>
</div>
5.3 How to use it
If you are using XPath, you must use the rep:excerpt() function in the last location step, just like you would select properties:
QueryManager qm = session.getWorkspace().getQueryManager();
Query q = qm.createQuery("//*[jcr:contains(., 'exoplatform')]/(@Title|rep:excerpt(.))", Query.XPATH);
QueryResult result = q.execute();
for (RowIterator it = result.getRows(); it.hasNext(); ) {
    Row r = it.nextRow();
    Value title = r.getValue("Title");
    Value excerpt = r.getValue("rep:excerpt(.)");
}
If you are using SQL, put the excerpt() function in the select clause:
QueryManager qm = session.getWorkspace().getQueryManager();
Query q = qm.createQuery("select excerpt(.) from nt:resource where contains(., 'exoplatform')", Query.SQL);
QueryResult result = q.execute();
for (RowIterator it = result.getRows(); it.hasNext(); ) {
    Row r = it.nextRow();
    Value excerpt = r.getValue("rep:excerpt(.)");
}
6 SpellChecker
The Lucene-based query handler implementation supports a pluggable spell checker mechanism. By default, spell checking is not available and you have to configure it first. See the spellCheckerClass parameter on the Search Configuration page. JCR currently provides an implementation class that uses the lucene-spellchecker contrib. The dictionary is derived from the fulltext-indexed content of the workspace and is updated periodically. You can configure the refresh interval by picking one of the available inner classes of org.exoplatform.services.jcr.impl.core.query.lucene.spell.LuceneSpellChecker:
- OneMinuteRefreshInterval
- FiveMinutesRefreshInterval
- ThirtyMinutesRefreshInterval
- OneHourRefreshInterval
- SixHoursRefreshInterval
- TwelveHoursRefreshInterval
- OneDayRefreshInterval
6.1 How do I use it?
You can spell check a fulltext statement either with an XPath or a SQL query:
QueryManager qm = session.getWorkspace().getQueryManager();
// rep:spellcheck('explatform') will always evaluate to true
Query query = qm.createQuery("/jcr:root[rep:spellcheck('explatform')]/(rep:spellcheck())", Query.XPATH);
RowIterator rows = query.execute().getRows();
// the above query will always return the root node no matter what string we check
Row r = rows.nextRow();
// get the result of the spell checking
Value v = r.getValue("rep:spellcheck()");
if (v == null) {
// no suggestion returned, the spelling is correct or the spell checker
// does not know how to correct it.
} else {
String suggestion = v.getString();
}
The same check using SQL:
// SPELLCHECK('explatform') will always evaluate to true
Query query = qm.createQuery("SELECT rep:spellcheck() FROM nt:base WHERE jcr:path = '/' AND SPELLCHECK('explatform')", Query.SQL);
RowIterator rows = query.execute().getRows();
// the above query will always return the root node no matter what string we check
Row r = rows.nextRow();
// get the result of the spell checking
Value v = r.getValue("rep:spellcheck()");
if (v == null) {
// no suggestion returned, the spelling is correct or the spell checker
// does not know how to correct it.
} else {
String suggestion = v.getString();
}
7 Similarity (Since 1.12)
Starting with version 1.12, JCR allows you to search for nodes that are similar to an existing node. Similarity is determined by looking up terms that are common to nodes. A term must meet the following conditions to be considered, which limits the number of possibly relevant terms:
- Only terms with at least 4 characters are considered.
- Only terms that occur at least 2 times in the source node are considered.
- Only terms that occur in at least 5 nodes are considered.
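The three conditions above can be sketched as a simple filter over term statistics. The class SimilarTermFilter, its constants and its map-based inputs are hypothetical and only illustrate the rules; they are not the query handler's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of the term conditions listed above. */
class SimilarTermFilter {
    static final int MIN_TERM_LENGTH = 4;    // at least 4 characters
    static final int MIN_TERM_FREQUENCY = 2; // at least 2 occurrences in the source node
    static final int MIN_NODE_FREQUENCY = 5; // occurs in at least 5 nodes

    /**
     * @param termFrequencyInSource how often each term occurs in the source node
     * @param nodeFrequency         in how many nodes each term occurs
     * @return the terms that satisfy all three conditions, sorted alphabetically
     */
    static Set<String> relevantTerms(Map<String, Integer> termFrequencyInSource,
                                     Map<String, Integer> nodeFrequency) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : termFrequencyInSource.entrySet()) {
            String term = entry.getKey();
            boolean longEnough = term.length() >= MIN_TERM_LENGTH;
            boolean frequentInSource = entry.getValue() >= MIN_TERM_FREQUENCY;
            boolean commonEnough =
                nodeFrequency.getOrDefault(term, 0) >= MIN_NODE_FREQUENCY;
            if (longEnough && frequentInSource && commonEnough) {
                result.add(term);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> inSource = new HashMap<>();
        inSource.put("repository", 3);
        inSource.put("jcr", 4);     // rejected: fewer than 4 characters
        inSource.put("lucene", 1);  // rejected: occurs only once in the source node
        inSource.put("query", 2);   // rejected: occurs in only 3 nodes

        Map<String, Integer> inNodes = new HashMap<>();
        inNodes.put("repository", 12);
        inNodes.put("jcr", 40);
        inNodes.put("lucene", 9);
        inNodes.put("query", 3);

        System.out.println(relevantTerms(inSource, inNodes)); // prints "[repository]"
    }
}
```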
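Similarity search requires highlighting support to be enabled in the query-handler configuration:
<param name="support-highlighting" value="true"/>
Example (XPath):
//element(*, nt:resource)[rep:similar(., '/parentnode/node.txt/jcr:content')]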
on 23/10/2009 at 14:53