Package org.ametys.web.indexing.solr
Class SolrPageIndexer
java.lang.Object
org.ametys.runtime.plugin.component.AbstractLogEnabled
org.ametys.web.indexing.solr.SolrPageIndexer
- All Implemented Interfaces:
SolrFieldNames, LogEnabled, SolrWebFieldNames, Component, Contextualizable, Serviceable
public class SolrPageIndexer
extends AbstractLogEnabled
implements Component, Serviceable, SolrWebFieldNames, Contextualizable
Component responsible for indexing a page with all its contents.
-
Field Summary
Modifier and Type, Field, Description
protected AdditionalPropertyIndexerExtensionPoint _additionalPropertiesIndexerEP
- The additional property indexer extension point.
protected AmetysObjectResolver _ametysObjectResolver
- The Ametys object resolver.
protected Context _context
- The Avalon context.
_pageVisibleAttachmentIndexerEP
- The extension point for PageVisibleAttachmentIndexers.
protected ServiceExtensionPoint _serviceExtensionPoint
- The service extension point.
protected SolrClientProvider _solrClientProvider
- The Solr client provider.
protected SolrIndexer _solrIndexer
- The Solr indexer.
protected SolrResourceIndexer _solrResourceIndexer
- The Solr indexer for Ametys resources.
protected TagProviderExtensionPoint _tagProviderEP
- The tag provider extension point.
static final String ROLE
- The Avalon role.
Fields inherited from interface org.ametys.cms.content.indexing.solr.SolrFieldNames
ACL_INIT_VALUE_ALLOWED_GROUPS, ACL_INIT_VALUE_ALLOWED_USERS, ACL_INIT_VALUE_ANONYMOUS, ACL_INIT_VALUE_ANYCONNECTED, ACL_INIT_VALUE_DENIED_GROUPS, ACL_INIT_VALUE_DENIED_USERS, ALL_CONTENT_TYPES, ALL_MIXIN_TYPES, ALL_TAGS, ATTACHMENT_CONTENT_ID, CONTENT_COMMENTS_NONVALIDATED, CONTENT_COMMENTS_VALIDATED, CONTENT_CREATOR, CONTENT_FIRST_VALIDATOR, CONTENT_LANGUAGE, CONTENT_LANGUAGES, CONTENT_LAST_CONTRIBUTOR, CONTENT_LAST_MAJOR_VALIDATOR, CONTENT_LAST_VALIDATOR, CONTENT_NAME, CONTENT_OUTGOING_REFEERENCES_RESOURCE_IDS, CONTENT_TITLES, CONTENT_TYPE_RESOURCE, CONTENT_TYPES, CONTENT_VISIBLE_ATTACHMENT_RESOURCE_IDS, CREATION_DATE, DC_CONTRIBUTOR, DC_COVERAGE, DC_CREATOR, DC_DATE, DC_DESCRIPTION, DC_FORMAT, DC_IDENTIFIER, DC_LANGUAGE, DC_PUBLISHER, DC_RELATION, DC_RIGHTS, DC_SOURCE, DC_SUBJECT, DC_TITLE, DC_TYPE, DOCUMENT_TYPE, EXACT_WS_OPERATOR, EXCERPT, FILENAME, FULL, ID, IS_AMETYS_OBJECT, LANGUAGE_SEPARATOR, LAST_MODIFIED, LENGTH, MIME_TYPES, MIXIN_TYPES, PATH, PSEUDO_CONTENT_TYPE_VALUE_RESOURCE, PSEUDO_CONTENT_TYPES, REPEATER_ENTRY_POSITION, RESOURCE_ANCESTOR_AND_SELF_IDS, RESOURCE_ANCESTOR_IDS, RESOURCE_CREATOR, RESOURCE_DATE, RESOURCE_LAST_MODIFIED, RESOURCE_MIME_TYPE_GROUP, RESOURCE_ROOT_ID, SIMPLE_CONTENT_PARENTS, STEMMED_OPERATOR, SUB_CONTENT, SYSTEM_FULL, TAGS, TITLE, TITLE_SORT, TYPE_CONTENT, TYPE_CONTENT_ATTACHMENT_RESOURCE, TYPE_CONTENT_ATTRIBUTE_RESOURCE, TYPE_REPEATER, TYPE_RESOURCE, TYPE_WF_ENTRY, TYPE_WF_STEP, WORKFLOW_CURRENT_STEPS, WORKFLOW_CURRENT_STEPS_DV, WORKFLOW_ENTRY_STATE, WORKFLOW_HISTORY_STEPS, WORKFLOW_HISTORY_STEPS_DV, WORKFLOW_NAME, WORKFLOW_REF, WORKFLOW_REF_DV, WORKFLOW_STEP_ACTIONID, WORKFLOW_STEP_CALLER, WORKFLOW_STEP_DUEDATE, WORKFLOW_STEP_FINISHDATE, WORKFLOW_STEP_ID, WORKFLOW_STEP_OWNER, WORKFLOW_STEP_STARTDATE, WORKFLOW_STEP_STATUS
Fields inherited from interface org.ametys.web.indexing.solr.SolrWebFieldNames
ATTACHMENT_PAGE_ID, CONTENT_IDS, CONTENT_INTERESTING_DATES, DATE_FOR_SORTING, DATES_FACET, FACETABLE_CONTENT_FIELD_PREFIX, LASTNAME_FOR_SORTING, PAGE_ANCESTOR_IDS, PAGE_CONTENT_TYPES, PAGE_DEPTH, PAGE_IDS, PAGE_LONG_TITLE, PAGE_OUTGOING_REFEERENCES_RESOURCE_IDS, PAGE_PARENT_ID, PAGE_TITLE, PAGE_TYPE, PAGE_VISIBLE_ATTACHMENT_RESOURCE_IDS, SECTION_PAGE_TITLE, SERVICE_IDS, SHARED, SITE_NAME, SITE_TYPE, SITEMAP_NAME, TEMPLATE, TYPE_PAGE, TYPE_PAGE_RESOURCE
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
protected void _findAndIndexFacetableField(org.apache.solr.common.SolrInputDocument pageDocument, ModelAwareDataHolder dataHolder, ModelItem modelItem, DataContext context)
- Index the facetable fields of a data holder into the page Solr document.
protected ZonedDateTime _getFirstDate(Page page, Function<Content, ZonedDateTime> dateRetriever)
- Computes a "first date" of a page using a simple, naive algorithm: from all the dates of its contents, keep the earliest.
protected ZonedDateTime _getFirstValidationDate(Page page)
- Computes the first validation date of a page.
protected ZonedDateTime _getLastDate(Page page, Function<Content, ZonedDateTime> dateRetriever)
- Computes a "last date" of a page using a simple, naive algorithm: from all the dates of its contents, keep the latest.
protected ZonedDateTime _getLastMajorValidationDate(Page page)
- Computes the last major validation date of a page.
protected ZonedDateTime _getLastModificationDate(Page page)
- Computes the last modification date of a page.
protected ZonedDateTime _getLastValidationDate(Page page)
- Computes the last validation date of a page.
_getTagsWithAncestors(Page page)
- Get all the page tags with their ancestors.
protected Collection<String> _getValuesToIndex(ModelAwareDataHolder dataHolder, ElementDefinition elementDefinition, DataContext context)
- Retrieves the values to index if the field is facetable, or an empty collection otherwise.
protected void _indexFacetableFields(Content content, org.apache.solr.common.SolrInputDocument document)
- Index the facetable fields of a content into the page Solr document.
protected void _indexPageDocument(Page page, org.apache.solr.common.SolrInputDocument document, String workspaceName, org.apache.solr.client.solrj.SolrClient solrClient)
- Index a populated Solr input document of type Page.
protected void _indexResourceDocument(Resource resource, org.apache.solr.common.SolrInputDocument document, org.apache.solr.client.solrj.SolrClient solrClient)
- Index a populated Solr input document of type Resource.
protected void _populateAdditionalProperties(Page page, org.apache.solr.common.SolrInputDocument document)
- Populate the Solr input document by adding fields to index.
protected void _populateDatesOfPage(Page page, org.apache.solr.common.SolrInputDocument document)
- Populate the Solr input document with dates from the page.
protected void _populatePageContentsDocument(Page page, org.apache.solr.common.SolrInputDocument document)
- Index the content of the page.
protected void _populatePageDocument(Page page, org.apache.solr.common.SolrInputDocument document)
- Populate the Solr input document by adding fields to index.
protected void _unindexPageDocument(String pageId, String workspaceName, boolean unindexRecursively, boolean unindexAttachments)
- Unindex a document of type Page.
void contextualize(Context context)
void indexPage(String pageId, boolean indexRecursively, boolean indexAttachments)
- Index a page and, optionally, its child pages recursively, in all workspaces, then commit. By default, child pages are indexed only if indexRecursively is true and they are not already indexed.
void indexPage(String pageId, String workspaceName, boolean indexRecursively, boolean indexAttachments)
- Index a page and, optionally, its child pages recursively. By default, child pages are indexed only if indexRecursively is true and they are not already indexed.
void indexPage(String pageId, String workspaceName, boolean indexRecursively, boolean indexAttachments, org.apache.solr.client.solrj.SolrClient solrClient)
- Index a page and, optionally, its child pages recursively. By default, child pages are indexed only if indexRecursively is true and they are not already indexed.
void indexPageAttachment(Resource resource, Page page)
- Index a page attachment.
void indexPageAttachments(ResourceCollection collection, Page page)
- Index page attachments as new entries in the index.
void reindexPage(String pageId, boolean reindexRecursively, boolean reindexAttachments)
- Reindex a page by its ID in all workspaces, then commit.
void reindexPage(String pageId, String workspaceName, boolean reindexRecursively, boolean reindexAttachments)
- Reindex a page by its ID.
void service(ServiceManager manager)
void unindexPage(String pageId, boolean unindexRecursively, boolean unindexAttachments)
- Unindex a page by its ID in all workspaces, then commit.
void unindexPage(String pageId, String workspaceName, boolean unindexRecursively, boolean unindexAttachments)
- Unindex a page (and optionally its child pages).
Methods inherited from class org.ametys.runtime.plugin.component.AbstractLogEnabled
getLogger, setLogger
-
Field Details
-
ROLE
The Avalon role.
-
_solrClientProvider
The Solr client provider.
-
_solrIndexer
The Solr indexer.
-
_solrResourceIndexer
The Solr indexer for Ametys resources.
-
_pageVisibleAttachmentIndexerEP
The extension point for PageVisibleAttachmentIndexers.
-
_additionalPropertiesIndexerEP
The additional property indexer extension point.
-
_tagProviderEP
The tag provider extension point.
-
_serviceExtensionPoint
The service extension point.
-
_ametysObjectResolver
The Ametys object resolver.
-
_context
The Avalon context.
-
-
Constructor Details
-
SolrPageIndexer
public SolrPageIndexer()
-
-
Method Details
-
service
- Specified by:
service in interface Serviceable
- Throws:
ServiceException
-
contextualize
- Specified by:
contextualize in interface Contextualizable
- Throws:
ContextException
-
indexPage
public void indexPage(String pageId, boolean indexRecursively, boolean indexAttachments) throws Exception
Index a page and, optionally, its child pages recursively, in all workspaces, then commit.
By default, child pages are indexed only if indexRecursively is true and they are not already indexed.
- Parameters:
pageId - the page to be indexed.
indexRecursively - to also process child pages.
indexAttachments - to index page attachments.
- Throws:
Exception - if an error occurs during indexing.
-
indexPage
public void indexPage(String pageId, String workspaceName, boolean indexRecursively, boolean indexAttachments) throws IndexingException
Index a page and, optionally, its child pages recursively.
By default, child pages are indexed only if indexRecursively is true and they are not already indexed.
- Parameters:
pageId - the page to be indexed.
workspaceName - the workspace in which to index.
indexRecursively - to also process child pages.
indexAttachments - to index page attachments.
- Throws:
IndexingException - if an error occurs during indexing.
-
indexPage
public void indexPage(String pageId, String workspaceName, boolean indexRecursively, boolean indexAttachments, org.apache.solr.client.solrj.SolrClient solrClient) throws IndexingException
Index a page and, optionally, its child pages recursively.
By default, child pages are indexed only if indexRecursively is true and they are not already indexed.
- Parameters:
pageId - the page to be indexed.
workspaceName - the workspace in which to index.
indexRecursively - to also process child pages.
indexAttachments - to index page attachments.
solrClient - the Solr client to use.
- Throws:
IndexingException - if an error occurs during indexing.
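The recursion rule stated for the indexPage overloads can be illustrated with a small, self-contained sketch. It does not call the Ametys API: the class RecursiveIndexingSketch, the childrenOf map and the alreadyIndexed set are hypothetical stand-ins, and skipping the whole subtree of an already indexed child is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not part of org.ametys.web.indexing.solr. */
final class RecursiveIndexingSketch
{
    private RecursiveIndexingSketch() { }

    /** Returns, in visiting order, the IDs of the pages that would be indexed. */
    static List<String> pagesToIndex(String pageId, boolean indexRecursively,
            Map<String, List<String>> childrenOf, Set<String> alreadyIndexed)
    {
        List<String> result = new ArrayList<>();
        collect(pageId, indexRecursively, childrenOf, alreadyIndexed, result);
        return result;
    }

    private static void collect(String pageId, boolean indexRecursively,
            Map<String, List<String>> childrenOf, Set<String> alreadyIndexed, List<String> result)
    {
        // The requested page itself is always indexed.
        result.add(pageId);
        if (!indexRecursively)
        {
            return;
        }
        for (String child : childrenOf.getOrDefault(pageId, Collections.emptyList()))
        {
            // Assumption: a child that is already indexed is skipped together with its subtree.
            if (!alreadyIndexed.contains(child))
            {
                collect(child, true, childrenOf, alreadyIndexed, result);
            }
        }
    }
}
```

With a page "home" whose children are "news" and "about", where "about" is already indexed, a recursive call visits "home" and "news" (plus the children of "news"), while a non-recursive call visits "home" only.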
-
_populatePageDocument
protected void _populatePageDocument(Page page, org.apache.solr.common.SolrInputDocument document) throws Exception
Populate the Solr input document by adding fields to index.
- Parameters:
page - the page to index.
document - the Solr input document.
- Throws:
Exception - if something goes wrong while indexing the page.
-
_populateDatesOfPage
Populate the Solr input document with dates from the page.
- Parameters:
page - the page.
document - the Solr document.
-
_getTagsWithAncestors
Get all the page tags with their ancestors.
- Parameters:
page - the page.
- Returns:
all the page tags with their ancestors.
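As a rough illustration of what "tags with their ancestors" means, the sketch below expands a set of tag names through a parent lookup. Everything in it is hypothetical: the class TagAncestorsSketch, the parentOf map and the tag names are illustration devices, and the real method works on a Page and the tag provider extension point, whose API is not shown in this page.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not the Ametys implementation. */
final class TagAncestorsSketch
{
    private TagAncestorsSketch() { }

    /**
     * Returns the given tags plus every ancestor reachable through parentOf,
     * in discovery order and without duplicates.
     */
    static Set<String> withAncestors(Collection<String> tags, Map<String, String> parentOf)
    {
        Set<String> result = new LinkedHashSet<>();
        for (String tag : tags)
        {
            String current = tag;
            // Stop at a root tag (no parent) or at a tag already collected,
            // which also protects against cycles in the parent map.
            while (current != null && result.add(current))
            {
                current = parentOf.get(current);
            }
        }
        return result;
    }
}
```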
-
_populatePageContentsDocument
protected void _populatePageContentsDocument(Page page, org.apache.solr.common.SolrInputDocument document) throws Exception
Index the content of the page.
- Parameters:
page - the page to index.
document - the document to populate.
- Throws:
Exception - if an error occurs.
-
_indexFacetableFields
protected void _indexFacetableFields(Content content, org.apache.solr.common.SolrInputDocument document)
Index the facetable fields of a content into the page Solr document.
- Parameters:
content - the content.
document - the main page Solr document.
-
_findAndIndexFacetableField
protected void _findAndIndexFacetableField(org.apache.solr.common.SolrInputDocument pageDocument, ModelAwareDataHolder dataHolder, ModelItem modelItem, DataContext context)
Index the facetable fields of a data holder into the page Solr document.
- Parameters:
pageDocument - the Solr page document.
dataHolder - the parent data holder.
modelItem - the model item.
context - the context of the data to index.
-
_getValuesToIndex
protected Collection<String> _getValuesToIndex(ModelAwareDataHolder dataHolder, ElementDefinition elementDefinition, DataContext context)
Retrieves the values to index if the field is facetable, or an empty collection otherwise.
- Parameters:
dataHolder - the data holder.
elementDefinition - the definition of the field.
context - the context of the data to index.
- Returns:
the values to index if the field is facetable, or an empty collection otherwise.
-
_getLastModificationDate
Computes the last modification date of a page.
- Parameters:
page - the page.
- Returns:
the last modification date, or null.
-
_getFirstValidationDate
Computes the first validation date of a page.
- Parameters:
page - the page.
- Returns:
the first validation date, or null.
-
_getLastValidationDate
Computes the last validation date of a page.
- Parameters:
page - the page.
- Returns:
the last validation date, or null.
-
_getLastMajorValidationDate
Computes the last major validation date of a page.
- Parameters:
page - the page.
- Returns:
the last major validation date, or null.
-
_getLastDate
Computes a "last date" of a page using a simple, naive algorithm: from all the dates of its contents, keep the latest.
- Parameters:
page - the page.
dateRetriever - the function that retrieves a date from a content of the page.
- Returns:
the "last date", or null.
-
_getFirstDate
Computes a "first date" of a page using a simple, naive algorithm: from all the dates of its contents, keep the earliest.
- Parameters:
page - the page.
dateRetriever - the function that retrieves a date from a content of the page.
- Returns:
the "first date", or null.
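The "first date" and "last date" algorithms described for _getFirstDate and _getLastDate amount to a minimum and a maximum over the dates of the contents. The sketch below shows one way to write that with streams. It is not the Ametys implementation: the class PageDatesSketch is hypothetical, the generic type C stands in for Content, and ignoring contents whose retrieved date is null is an assumption.

```java
import java.time.ZonedDateTime;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical illustration only; C stands in for the Ametys Content type. */
final class PageDatesSketch
{
    private PageDatesSketch() { }

    /** Latest non-null date among the contents, or null if there is none. */
    static <C> ZonedDateTime lastDate(List<C> contents, Function<C, ZonedDateTime> dateRetriever)
    {
        return contents.stream()
                .map(dateRetriever)
                .filter(Objects::nonNull) // assumption: contents without a date are ignored
                .max((a, b) -> a.compareTo(b))
                .orElse(null);
    }

    /** Earliest non-null date among the contents, or null if there is none. */
    static <C> ZonedDateTime firstDate(List<C> contents, Function<C, ZonedDateTime> dateRetriever)
    {
        return contents.stream()
                .map(dateRetriever)
                .filter(Objects::nonNull)
                .min((a, b) -> a.compareTo(b))
                .orElse(null);
    }
}
```

Passing the retriever as a Function keeps a single traversal reusable for modification dates, validation dates and so on, which matches the dateRetriever parameter documented above.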
-
_populateAdditionalProperties
protected void _populateAdditionalProperties(Page page, org.apache.solr.common.SolrInputDocument document) throws Exception
Populate the Solr input document by adding fields to index.
- Parameters:
page - the page to index.
document - the Solr input document.
- Throws:
Exception - if something goes wrong while indexing the page.
-
indexPageAttachments
Index page attachments as new entries in the index.
- Parameters:
collection - the collection of attachments.
page - the page whose attachments will be indexed.
- Throws:
Exception - if something goes wrong while indexing the attachments of the page.
-
indexPageAttachment
Index a page attachment.
-
_indexPageDocument
protected void _indexPageDocument(Page page, org.apache.solr.common.SolrInputDocument document, String workspaceName, org.apache.solr.client.solrj.SolrClient solrClient) throws org.apache.solr.client.solrj.SolrServerException, IOException
Index a populated Solr input document of type Page.
- Parameters:
page - the page from which the input document is created.
document - the input document to add to the Solr index.
workspaceName - the workspace name.
solrClient - the Solr client to use.
- Throws:
org.apache.solr.client.solrj.SolrServerException - if there is an error on the Solr server.
IOException - if there is a communication error with the server.
-
_indexResourceDocument
protected void _indexResourceDocument(Resource resource, org.apache.solr.common.SolrInputDocument document, org.apache.solr.client.solrj.SolrClient solrClient) throws org.apache.solr.client.solrj.SolrServerException, IOException
Index a populated Solr input document of type Resource.
- Parameters:
resource - the resource from which the input document is created.
document - the input document.
solrClient - the Solr client to use.
- Throws:
org.apache.solr.client.solrj.SolrServerException - if there is an error on the server.
IOException - if there is a communication error with the server.
-
unindexPage
public void unindexPage(String pageId, boolean unindexRecursively, boolean unindexAttachments) throws Exception
Unindex a page by its ID in all workspaces, then commit.
- Parameters:
pageId - the page ID.
unindexRecursively - also unindex child pages if requested.
unindexAttachments - also unindex page attachments.
- Throws:
Exception - if an error occurs during index update.
-
unindexPage
public void unindexPage(String pageId, String workspaceName, boolean unindexRecursively, boolean unindexAttachments) throws Exception
Unindex a page (and optionally its child pages).
- Parameters:
pageId - the page to be unindexed.
workspaceName - the workspace to work in.
unindexRecursively - also unindex child pages if requested.
unindexAttachments - also unindex page attachments.
- Throws:
Exception - if an error occurs during index update.
-
_unindexPageDocument
protected void _unindexPageDocument(String pageId, String workspaceName, boolean unindexRecursively, boolean unindexAttachments) throws org.apache.solr.client.solrj.SolrServerException, IOException, QuerySyntaxException
Unindex a document of type Page. Also unindex the attachments of the page.
- Parameters:
pageId - the ID of the page to unindex.
workspaceName - the workspace name.
unindexRecursively - also unindex child pages if requested.
unindexAttachments - also unindex page attachments.
- Throws:
org.apache.solr.client.solrj.SolrServerException - if there is an error on the server.
IOException - if there is a communication error with the server.
QuerySyntaxException - if the URI query can't be built because of a syntax error.
-
reindexPage
public void reindexPage(String pageId, boolean reindexRecursively, boolean reindexAttachments) throws Exception
Reindex a page by its ID in all workspaces, then commit.
- Parameters:
pageId - the page ID.
reindexRecursively - also reindex child pages if requested.
reindexAttachments - also reindex page attachments.
- Throws:
Exception - if an error occurs during index update.
-
reindexPage
public void reindexPage(String pageId, String workspaceName, boolean reindexRecursively, boolean reindexAttachments) throws IndexingException
Reindex a page by its ID.
- Parameters:
pageId - the page ID.
workspaceName - the workspace to work in.
reindexRecursively - also reindex child pages if requested.
reindexAttachments - also reindex page attachments.
- Throws:
IndexingException - if an error occurs during index update.
-