Package org.ametys.web.indexing.solr
Class SolrPageIndexer
java.lang.Object
  org.ametys.runtime.plugin.component.AbstractLogEnabled
    org.ametys.web.indexing.solr.SolrPageIndexer
All Implemented Interfaces:
SolrFieldNames, LogEnabled, SolrWebFieldNames, Component, Contextualizable, Serviceable
public class SolrPageIndexer extends AbstractLogEnabled implements Component, Serviceable, SolrWebFieldNames, Contextualizable
Component responsible for indexing a page with all its contents.
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected AdditionalPropertyIndexerExtensionPoint | _additionalPropertiesIndexerEP | The additional property indexer extension point.
protected AmetysObjectResolver | _ametysObjectResolver | The Ametys object resolver
protected Context | _context | The avalon context
private ContentTypesHelper | _cTypesHelper |
protected PageVisibleAttachmentIndexerExtensionPoint | _pageVisibleAttachmentIndexerEP | The extension point for PageVisibleAttachmentIndexers
protected ServiceExtensionPoint | _serviceExtensionPoint | The service extension point.
protected SolrClientProvider | _solrClientProvider | The Solr client provider
protected SolrContentIndexer | _solrContentIndexer | Solr Ametys contents indexer
protected SolrIndexer | _solrIndexer | The Solr indexer
protected SolrResourceIndexer | _solrResourceIndexer | Solr Ametys resources indexer
protected TagProviderExtensionPoint | _tagProviderEP | The tag provider extension point.
static String | ROLE | The avalon role.
Fields inherited from interface org.ametys.cms.content.indexing.solr.SolrFieldNames
ACL_INIT_VALUE_ALLOWED_GROUPS, ACL_INIT_VALUE_ALLOWED_USERS, ACL_INIT_VALUE_ANONYMOUS, ACL_INIT_VALUE_ANYCONNECTED, ACL_INIT_VALUE_DENIED_GROUPS, ACL_INIT_VALUE_DENIED_USERS, ALL_CONTENT_TYPES, ALL_MIXIN_TYPES, ALL_TAGS, ATTACHMENT_CONTENT_ID, CONTENT_COMMENTS, CONTENT_COMMENTS_NONVALIDATED, CONTENT_COMMENTS_VALIDATED, CONTENT_CREATOR, CONTENT_LANGUAGE, CONTENT_LANGUAGES, CONTENT_LAST_CONTRIBUTOR, CONTENT_NAME, CONTENT_OUTGOING_REFEERENCES_RESOURCE_IDS, CONTENT_TITLES, CONTENT_TYPE_RESOURCE, CONTENT_TYPES, CONTENT_VISIBLE_ATTACHMENT_RESOURCE_IDS, CREATION_DATE, DC_CONTRIBUTOR, DC_COVERAGE, DC_CREATOR, DC_DATE, DC_DESCRIPTION, DC_FORMAT, DC_IDENTIFIER, DC_LANGUAGE, DC_PUBLISHER, DC_RELATION, DC_RIGHTS, DC_SOURCE, DC_SUBJECT, DC_TITLE, DC_TYPE, DOCUMENT_TYPE, EXCERPT, FILENAME, FIRST_VALIDATION, FULL_EXACT_WS, FULL_GENERAL, FULL_PREFIX, FULL_STEMMED_PREFIX, ID, IS_AMETYS_OBJECT, LAST_MAJOR_VALIDATION, LAST_MODIFIED, LAST_VALIDATION, LENGTH, MIME_TYPES, MIXIN_TYPES, PATH, PSEUDO_CONTENT_TYPE_VALUE_RESOURCE, PSEUDO_CONTENT_TYPES, REPEATER_ENTRY_POSITION, RESOURCE_ANCESTOR_AND_SELF_IDS, RESOURCE_ANCESTOR_IDS, RESOURCE_CREATOR, RESOURCE_DATE, RESOURCE_LAST_MODIFIED, RESOURCE_MIME_TYPE_GROUP, RESOURCE_ROOT_ID, SIMPLE_CONTENT_PARENTS, SUB_CONTENT, TAGS, TITLE, TITLE_SORT, TYPE_CONTENT, TYPE_CONTENT_ATTACHMENT_RESOURCE, TYPE_CONTENT_ATTRIBUTE_RESOURCE, TYPE_REPEATER, TYPE_RESOURCE, TYPE_WF_ENTRY, TYPE_WF_STEP, WORKFLOW_CURRENT_STEPS, WORKFLOW_CURRENT_STEPS_DV, WORKFLOW_ENTRY_STATE, WORKFLOW_HISTORY_STEPS, WORKFLOW_HISTORY_STEPS_DV, WORKFLOW_NAME, WORKFLOW_REF, WORKFLOW_REF_DV, WORKFLOW_STEP, WORKFLOW_STEP_ACTIONID, WORKFLOW_STEP_CALLER, WORKFLOW_STEP_DUEDATE, WORKFLOW_STEP_FINISHDATE, WORKFLOW_STEP_ID, WORKFLOW_STEP_OWNER, WORKFLOW_STEP_STARTDATE, WORKFLOW_STEP_STATUS
-
Fields inherited from interface org.ametys.web.indexing.solr.SolrWebFieldNames
ATTACHMENT_PAGE_ID, CONTENT_IDS, CONTENT_INTERESTING_DATES, DATE_FOR_SORTING, DATES_FACET, FACETABLE_CONTENT_FIELD_PREFIX, LASTNAME_FOR_SORTING, ORPHAN, PAGE_ANCESTOR_IDS, PAGE_CONTENT_TYPES, PAGE_DEPTH, PAGE_IDS, PAGE_LONG_TITLE, PAGE_OUTGOING_REFEERENCES_RESOURCE_IDS, PAGE_PARENT_ID, PAGE_TITLE, PAGE_TYPE, PAGE_VISIBLE_ATTACHMENT_RESOURCE_IDS, PRIVACY, SECTION_PAGE_TITLE, SERVICE_IDS, SHARED, SITE_NAME, SITE_TYPE, SITEMAP_NAME, TEMPLATE, TYPE_PAGE, TYPE_PAGE_RESOURCE
-
-
Constructor Summary
Constructors
Constructor | Description
SolrPageIndexer() |
-
Method Summary
Methods
Modifier and Type | Method | Description
protected void | _findAndIndexFacetableField(String[] pathSegments, String lang, CompositeMetadata metadata, MetadataDefinition definition, IndexingField field, org.apache.solr.common.SolrInputDocument pageDocument) | Index the facetable fields of a content into the page Solr document.
protected Date | _getFirstDate(Page page, Function<Content,Date> dateRetriever) | Computes a "first date" of a page using a simple, naive algorithm: of all the dates of its contents, keep the earliest.
protected Date | _getFirstValidationDate(Page page) | Computes the first validation date of a page.
protected Date | _getLastDate(Page page, Function<Content,Date> dateRetriever) | Computes a "last date" of a page using a simple, naive algorithm: of all the dates of its contents, keep the latest.
protected Date | _getLastMajorValidationDate(Page page) | Computes the last major validation date of a page.
protected Date | _getLastModificationDate(Page page) | Computes the last modification date of a page.
protected Date | _getLastValidationDate(Page page) | Computes the last validation date of a page.
protected Set<String> | _getTagsWithAncestors(Page page) | Get all the page tags with their ancestors.
protected void | _indexFacetableField(Content content, org.apache.solr.common.SolrInputDocument document) | Index the facetable fields of a content into the page Solr document.
private void | _indexPage(Page page, String workspaceName, boolean indexRecursively, boolean indexAttachments, org.apache.solr.client.solrj.SolrClient solrClient) |
private void | _indexPageAttachment(Resource resource, Page page, org.apache.solr.client.solrj.SolrClient solrClient) |
private void | _indexPageAttachments(ResourceCollection collection, Page page, org.apache.solr.client.solrj.SolrClient solrClient) |
protected void | _indexPageDocument(Page page, org.apache.solr.common.SolrInputDocument document, String workspaceName, org.apache.solr.client.solrj.SolrClient solrClient) | Index a populated Solr input document of type Page.
protected void | _indexResourceDocument(Resource resource, org.apache.solr.common.SolrInputDocument document, org.apache.solr.client.solrj.SolrClient solrClient) | Index a populated Solr input document of type Resource.
private void | _indexStringFields(org.apache.solr.common.SolrInputDocument document, String documentId, String fieldName, String fieldValue, String language) |
private void | _indexVisibleAttachments(Page page, org.apache.solr.common.SolrInputDocument document) |
protected void | _populateAdditionalProperties(Page page, org.apache.solr.common.SolrInputDocument document) | Populate the Solr input document by adding fields to index.
protected void | _populateDatesOfPage(Page page, org.apache.solr.common.SolrInputDocument document) | Populate the Solr input document with dates from the page.
private void | _populatePageAttachmentDocument(Resource resource, org.apache.solr.common.SolrInputDocument document, Page page) |
protected void | _populatePageContentsDocument(Page page, org.apache.solr.common.SolrInputDocument document) | Index the content of the page.
protected void | _populatePageDocument(Page page, org.apache.solr.common.SolrInputDocument document) | Populate the Solr input document by adding fields to index.
protected void | _unindexPageDocument(String pageId, String workspaceName, boolean unindexRecursively, boolean unindexAttachments) | Deindex a document of type Page.
void | contextualize(Context context) |
void | indexPage(String pageId, boolean indexRecursively, boolean indexAttachments) | Index a page and, optionally, its children recursively, in all workspaces, then commit. Child pages are indexed only if indexRecursively is true and they are not already indexed.
void | indexPage(String pageId, String workspaceName, boolean indexRecursively, boolean indexAttachments) | Index a page and, optionally, its children recursively. Child pages are indexed only if indexRecursively is true and they are not already indexed.
void | indexPage(String pageId, String workspaceName, boolean indexRecursively, boolean indexAttachments, org.apache.solr.client.solrj.SolrClient solrClient) | Index a page and, optionally, its children recursively. Child pages are indexed only if indexRecursively is true and they are not already indexed.
void | indexPageAttachment(Resource resource, Page page) | Index a page attachment.
void | indexPageAttachments(ResourceCollection collection, Page page) | Index page attachments as new entries in the index.
void | reindexPage(String pageId, boolean reindexRecursively, boolean reindexAttachments) | Reindex a page by its ID for all workspaces, then commit.
void | reindexPage(String pageId, String workspaceName, boolean reindexRecursively, boolean reindexAttachments) | Reindex a page by its ID.
void | service(ServiceManager manager) |
void | unindexPage(String pageId, boolean unindexRecursively, boolean unindexAttachments) | Unindex a page by its ID for all workspaces, then commit.
void | unindexPage(String pageId, String workspaceName, boolean unindexRecursively, boolean unindexAttachments) | Unindex a page (and optionally its child pages).
Methods inherited from class org.ametys.runtime.plugin.component.AbstractLogEnabled
getLogger, setLogger
-
-
-
-
Field Detail
-
_solrClientProvider
protected SolrClientProvider _solrClientProvider
The Solr client provider
-
_solrIndexer
protected SolrIndexer _solrIndexer
The Solr indexer
-
_solrContentIndexer
protected SolrContentIndexer _solrContentIndexer
Solr Ametys contents indexer
-
_solrResourceIndexer
protected SolrResourceIndexer _solrResourceIndexer
Solr Ametys resources indexer
-
_pageVisibleAttachmentIndexerEP
protected PageVisibleAttachmentIndexerExtensionPoint _pageVisibleAttachmentIndexerEP
The extension point for PageVisibleAttachmentIndexers
-
_additionalPropertiesIndexerEP
protected AdditionalPropertyIndexerExtensionPoint _additionalPropertiesIndexerEP
The additional property indexer extension point.
-
_tagProviderEP
protected TagProviderExtensionPoint _tagProviderEP
The tag provider extension point.
-
_serviceExtensionPoint
protected ServiceExtensionPoint _serviceExtensionPoint
The service extension point.
-
_ametysObjectResolver
protected AmetysObjectResolver _ametysObjectResolver
The Ametys object resolver
-
_cTypesHelper
private ContentTypesHelper _cTypesHelper
-
-
Constructor Detail
-
SolrPageIndexer
public SolrPageIndexer()
-
-
Method Detail
-
service
public void service(ServiceManager manager) throws ServiceException
Specified by:
service in interface Serviceable
Throws:
ServiceException
-
contextualize
public void contextualize(Context context) throws ContextException
Specified by:
contextualize in interface Contextualizable
Throws:
ContextException
-
indexPage
public void indexPage(String pageId, boolean indexRecursively, boolean indexAttachments) throws Exception
Index a page and, optionally, its children recursively, in all workspaces, then commit.
By default, child pages are indexed only if indexRecursively is true and they are not already indexed.
Parameters:
pageId - the page to be indexed.
indexRecursively - to also process child pages.
indexAttachments - to index page attachments.
Throws:
Exception - if an error occurs during indexing.
-
indexPage
public void indexPage(String pageId, String workspaceName, boolean indexRecursively, boolean indexAttachments) throws IndexingException
Index a page and, optionally, its children recursively.
By default, child pages are indexed only if indexRecursively is true and they are not already indexed.
Parameters:
pageId - the page to be indexed.
workspaceName - the workspace in which to index.
indexRecursively - to also process child pages.
indexAttachments - to index page attachments.
Throws:
IndexingException - if an error occurs during indexing.
-
indexPage
public void indexPage(String pageId, String workspaceName, boolean indexRecursively, boolean indexAttachments, org.apache.solr.client.solrj.SolrClient solrClient) throws IndexingException
Index a page and, optionally, its children recursively.
By default, child pages are indexed only if indexRecursively is true and they are not already indexed.
Parameters:
pageId - the page to be indexed.
workspaceName - the workspace in which to index.
indexRecursively - to also process child pages.
indexAttachments - to index page attachments.
solrClient - the Solr client to use.
Throws:
IndexingException - if an error occurs during indexing.
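The recursion rule stated for indexPage (child pages are processed only when indexRecursively is true, and only if they are not already indexed) can be illustrated with a minimal, self-contained sketch. This is not the Ametys implementation: PageNode, RecursiveIndexSketch and the in-memory indexed set are hypothetical stand-ins for Page and for the Solr index. The sketch also assumes that skipping an already-indexed child skips its subtree, which the documentation does not state.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an Ametys Page: an id plus child pages.
class PageNode {
    final String id;
    final List<PageNode> children = new ArrayList<>();

    PageNode(String id) {
        this.id = id;
    }

    // Adds a child and returns this node so that trees can be built fluently.
    PageNode add(PageNode child) {
        children.add(child);
        return this;
    }
}

class RecursiveIndexSketch {
    // Hypothetical stand-in for the Solr index: ids of indexed pages, in indexing order.
    final Set<String> indexed = new LinkedHashSet<>();

    // The requested page is always indexed. Children are visited only when
    // indexRecursively is true, and a child already in the index is skipped
    // (assumption: its subtree is skipped with it).
    void indexPage(PageNode page, boolean indexRecursively) {
        indexed.add(page.id);
        if (!indexRecursively) {
            return;
        }
        for (PageNode child : page.children) {
            if (!indexed.contains(child.id)) {
                indexPage(child, true);
            }
        }
    }
}
```

With a tree root -> (a -> a1, b), calling indexPage(root, false) touches only the root, while indexPage(root, true) walks the children and leaves any page already present in the index untouched.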
-
_indexPage
private void _indexPage(Page page, String workspaceName, boolean indexRecursively, boolean indexAttachments, org.apache.solr.client.solrj.SolrClient solrClient) throws IndexingException
Throws:
IndexingException
-
_populatePageDocument
protected void _populatePageDocument(Page page, org.apache.solr.common.SolrInputDocument document) throws Exception
Populate the Solr input document by adding fields to index.
Parameters:
page - the page to index.
document - the Solr input document.
Throws:
Exception - if something goes wrong while indexing the page.
-
_indexVisibleAttachments
private void _indexVisibleAttachments(Page page, org.apache.solr.common.SolrInputDocument document)
-
_populateDatesOfPage
protected void _populateDatesOfPage(Page page, org.apache.solr.common.SolrInputDocument document)
Populate the Solr input document with dates from the page.
Parameters:
page - The page
document - The Solr document
-
_indexStringFields
private void _indexStringFields(org.apache.solr.common.SolrInputDocument document, String documentId, String fieldName, String fieldValue, String language)
-
_getTagsWithAncestors
protected Set<String> _getTagsWithAncestors(Page page)
Get all the page tags with their ancestors.
Parameters:
page - The page.
Returns:
All the page tags with their ancestors.
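As an illustration of what "tags with their ancestors" means, here is a minimal sketch that walks up a tag hierarchy and collects every name once. TagNode and TagsWithAncestorsSketch are hypothetical; the real method works from a Page and presumably resolves tags through the tag provider extension point, which is not reproduced here.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a tag: a name and an optional parent tag.
class TagNode {
    final String name;
    final TagNode parent; // null for a root tag

    TagNode(String name, TagNode parent) {
        this.name = name;
        this.parent = parent;
    }
}

class TagsWithAncestorsSketch {
    // Collects each tag's name plus the names of all its ancestors.
    // A Set is used so that ancestors shared by several tags appear only once.
    static Set<String> tagsWithAncestors(List<TagNode> pageTags) {
        Set<String> names = new LinkedHashSet<>();
        for (TagNode tag : pageTags) {
            for (TagNode current = tag; current != null; current = current.parent) {
                names.add(current.name);
            }
        }
        return names;
    }
}
```

Indexing the ancestors alongside the tags themselves lets a search on a parent tag match pages that carry only one of its descendants.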
-
_populatePageContentsDocument
protected void _populatePageContentsDocument(Page page, org.apache.solr.common.SolrInputDocument document) throws Exception
Index the content of the page.
Parameters:
page - the page to index.
document - the document to populate.
Throws:
Exception - if an error occurs.
-
_indexFacetableField
protected void _indexFacetableField(Content content, org.apache.solr.common.SolrInputDocument document)
Index the facetable fields of a content into the page Solr document.
Parameters:
content - The content
document - The main page Solr document.
-
_findAndIndexFacetableField
protected void _findAndIndexFacetableField(String[] pathSegments, String lang, CompositeMetadata metadata, MetadataDefinition definition, IndexingField field, org.apache.solr.common.SolrInputDocument pageDocument)
Index the facetable fields of a content into the page Solr document.
Parameters:
pathSegments - The path of metadata
lang - The language
metadata - The parent composite metadata
definition - The metadata definition
field - The indexing field
pageDocument - The Solr page document
-
_getLastModificationDate
protected Date _getLastModificationDate(Page page)
Computes the last modification date of a page.
Parameters:
page - the page.
Returns:
the last modification date, or null.
-
_getFirstValidationDate
protected Date _getFirstValidationDate(Page page)
Computes the first validation date of a page.
Parameters:
page - the page.
Returns:
the first validation date, or null.
-
_getLastValidationDate
protected Date _getLastValidationDate(Page page)
Computes the last validation date of a page.
Parameters:
page - the page.
Returns:
the last validation date, or null.
-
_getLastMajorValidationDate
protected Date _getLastMajorValidationDate(Page page)
Computes the last major validation date of a page.
Parameters:
page - the page.
Returns:
the last major validation date, or null.
-
_getLastDate
protected Date _getLastDate(Page page, Function<Content,Date> dateRetriever)
Computes a "last date" of a page using a simple, naive algorithm:
of all the dates of its contents, keep the latest.
Parameters:
page - the page.
dateRetriever - The function to retrieve a Date from a Content of the Page
Returns:
the "last date", or null.
-
_getFirstDate
protected Date _getFirstDate(Page page, Function<Content,Date> dateRetriever)
Computes a "first date" of a page using a simple, naive algorithm:
of all the dates of its contents, keep the earliest.
Parameters:
page - the page.
dateRetriever - The function to retrieve a Date from a Content of the Page
Returns:
the "first date", or null.
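The "last date" and "first date" rules described for _getLastDate and _getFirstDate reduce to a max/min over the dates extracted from each content. The sketch below shows that reduction on a plain list. PageDateSketch is a hypothetical class, the generic type C stands in for Content, and skipping null dates is an assumption suggested by the "or null" return value rather than something the documentation confirms.

```java
import java.util.Comparator;
import java.util.Date;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

class PageDateSketch {
    // Latest non-null date across the contents, or null when there is none.
    static <C> Date lastDate(List<C> contents, Function<C, Date> dateRetriever) {
        return contents.stream()
                .map(dateRetriever)
                .filter(Objects::nonNull) // assumption: contents without a date are ignored
                .max(Comparator.naturalOrder())
                .orElse(null);
    }

    // Earliest non-null date across the contents, or null when there is none.
    static <C> Date firstDate(List<C> contents, Function<C, Date> dateRetriever) {
        return contents.stream()
                .map(dateRetriever)
                .filter(Objects::nonNull) // assumption: contents without a date are ignored
                .min(Comparator.naturalOrder())
                .orElse(null);
    }
}
```

Passing the date accessor as a Function is what lets one pair of helpers serve the modification, validation and major-validation variants: only the retriever changes.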
-
_populateAdditionalProperties
protected void _populateAdditionalProperties(Page page, org.apache.solr.common.SolrInputDocument document) throws Exception
Populate the Solr input document by adding fields to index.
Parameters:
page - the page to index.
document - the Solr input document.
Throws:
Exception - if something goes wrong while indexing the page.
-
indexPageAttachments
public void indexPageAttachments(ResourceCollection collection, Page page) throws Exception
Index page attachments as new entries in the index.
Parameters:
collection - the collection of attachments
page - the page whose attachments will be indexed
Throws:
Exception - if something goes wrong when indexing the attachments of the page
-
_indexPageAttachments
private void _indexPageAttachments(ResourceCollection collection, Page page, org.apache.solr.client.solrj.SolrClient solrClient) throws Exception
Throws:
Exception
-
indexPageAttachment
public void indexPageAttachment(Resource resource, Page page) throws Exception
Index a page attachment
-
_indexPageAttachment
private void _indexPageAttachment(Resource resource, Page page, org.apache.solr.client.solrj.SolrClient solrClient) throws Exception
Throws:
Exception
-
_populatePageAttachmentDocument
private void _populatePageAttachmentDocument(Resource resource, org.apache.solr.common.SolrInputDocument document, Page page) throws Exception
Throws:
Exception
-
_indexPageDocument
protected void _indexPageDocument(Page page, org.apache.solr.common.SolrInputDocument document, String workspaceName, org.apache.solr.client.solrj.SolrClient solrClient) throws org.apache.solr.client.solrj.SolrServerException, IOException
Index a populated Solr input document of type Page.
Parameters:
page - the page from which the input document is created
document - the input document to add to the Solr index
workspaceName - The workspace name
solrClient - The Solr client to use
Throws:
org.apache.solr.client.solrj.SolrServerException - if there is an error on the Solr server
IOException - if there is a communication error with the server
-
_indexResourceDocument
protected void _indexResourceDocument(Resource resource, org.apache.solr.common.SolrInputDocument document, org.apache.solr.client.solrj.SolrClient solrClient) throws org.apache.solr.client.solrj.SolrServerException, IOException
Index a populated Solr input document of type Resource.
Parameters:
resource - the resource from which the input document is created
document - the input document
solrClient - The Solr client to use
Throws:
org.apache.solr.client.solrj.SolrServerException - if there is an error on the server
IOException - if there is a communication error with the server
-
unindexPage
public void unindexPage(String pageId, boolean unindexRecursively, boolean unindexAttachments) throws Exception
Unindex a page by its ID for all workspaces, then commit.
Parameters:
pageId - The page ID.
unindexRecursively - also unindex child pages if requested.
unindexAttachments - also unindex page attachments
Throws:
Exception - if an error occurs during index update.
-
unindexPage
public void unindexPage(String pageId, String workspaceName, boolean unindexRecursively, boolean unindexAttachments) throws Exception
Unindex a page (and optionally its child pages).
Parameters:
pageId - the page to be unindexed.
workspaceName - The workspace to work in
unindexRecursively - also unindex child pages if requested.
unindexAttachments - also unindex page attachments
Throws:
Exception - if an error occurs during index update.
-
_unindexPageDocument
protected void _unindexPageDocument(String pageId, String workspaceName, boolean unindexRecursively, boolean unindexAttachments) throws org.apache.solr.client.solrj.SolrServerException, IOException, QuerySyntaxException
Unindex a document of type Page. Also unindex the attachments of the page.
Parameters:
pageId - the id of the page to unindex
workspaceName - The workspace name
unindexRecursively - also unindex child pages if requested.
unindexAttachments - also unindex page attachments
Throws:
org.apache.solr.client.solrj.SolrServerException - if there is an error on the server
IOException - if there is a communication error with the server
QuerySyntaxException - if the uri query can't be built because of a syntax error.
-
reindexPage
public void reindexPage(String pageId, boolean reindexRecursively, boolean reindexAttachments) throws Exception
Reindex a page by its ID for all workspaces, then commit.
Parameters:
pageId - The page ID.
reindexRecursively - also reindex child pages if requested.
reindexAttachments - also reindex page attachments
Throws:
Exception - if an error occurs during index update.
-
reindexPage
public void reindexPage(String pageId, String workspaceName, boolean reindexRecursively, boolean reindexAttachments) throws IndexingException
Reindex a page by its ID.
Parameters:
pageId - The page ID.
workspaceName - The workspace to work in
reindexRecursively - also reindex child pages if requested.
reindexAttachments - also reindex page attachments
Throws:
IndexingException - if an error occurs during index update.
-
-