Search – Indexing Google Drive using delta queries to track changes

Introduction

The administrator can configure Foldr to allow users to search Google Drive or Google Shared Drives sites by one of two methods:

1. Using the Google cloud service API. This requires very little setup and configuration in Foldr, but relies on the cloud service for search results. Complex searches are not available using this method.

2. Using Foldr’s built-in search to crawl Google Drive and store its contents in an index. Using Foldr’s own search requires additional configuration, but allows administrators to provide users with powerful search through its query builder, plus extra functionality from other Foldr features such as Captur, MaSH, custom fields and built-in or third-party cloud OCR.

When using option 2, Foldr can optionally leverage delta queries to quickly pick up changes made to files outside of Foldr. Without delta queries, every crawl after the initial one would require the Foldr index crawler to check each file in turn to determine whether it already resides in the index and whether it has been updated or modified.
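The difference between the two crawl strategies can be illustrated with a minimal Python sketch. It is not Foldr's implementation: the dictionaries standing in for the drive and the index, the `Change` record and both function names are hypothetical, and the hand-written change list plays the role of what a delta query would return.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    """One entry in a hypothetical delta feed."""
    file_id: str
    modified: Optional[int]  # new modification stamp, or None if the file was deleted


def full_crawl(drive: dict, index: dict) -> int:
    """Check every file in turn; return how many files were examined."""
    examined = 0
    for file_id, modified in drive.items():
        examined += 1
        if index.get(file_id) != modified:
            index[file_id] = modified  # new or updated file
    # Deleted files are only found by comparing the index against the drive.
    for file_id in list(index):
        if file_id not in drive:
            del index[file_id]
    return examined


def delta_crawl(changes: list, index: dict) -> int:
    """Apply only the reported changes; return how many files were examined."""
    for change in changes:
        if change.modified is None:
            index.pop(change.file_id, None)
        else:
            index[change.file_id] = change.modified
    return len(changes)


# Baseline: 1,000 files, all of them indexed.
drive = {f"file-{n}": 1 for n in range(1000)}
index = dict(drive)

# Outside Foldr, one file is edited, one is deleted and one is created.
drive["file-7"] = 2
del drive["file-8"]
drive["file-new"] = 1
changes = [Change("file-7", 2), Change("file-8", None), Change("file-new", 1)]

delta_index = dict(index)
delta_examined = delta_crawl(changes, delta_index)
full_examined = full_crawl(drive, index)
```

Both approaches leave the index in the same state, but in this sketch the delta crawl examines only the three changed files while the full crawl examines every file on the drive.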

Requirements

Foldr search must be configured and a full crawl performed to establish a baseline.
Delta queries cannot be used when using the cloud service API to provide search to users.

Enabling Delta Queries

Assuming Foldr search has been enabled and configured, navigate to Google Drive or the Google Shared Drive as configured in Foldr Settings > Files & Storage.

Click the Search & Data tab

Click the Settings tab

Scroll down and enable the Index Deltas toggle

Click Save Changes

Establishing a baseline

Now that delta queries are enabled, Foldr must perform a full index (or a forced full reindex if the site is already indexed) to establish its baseline. Once a baseline has been established, subsequent crawls can use delta queries to find any changes made since the baseline was created. This allows Foldr to quickly pick up changes that are made outside of Foldr.

Click the Search & Data > Activity tab and select Crawl Now

Enable the toggle labelled Force re-indexing of all data 

Click Crawl Now

The server will now run a full reindex and establish the baseline, allowing delta queries to be used in later scheduled or manual crawl jobs.
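The relationship between the baseline and later crawls can be pictured with a short Python sketch. Everything here is an assumption made for illustration: the `CrawlPlanner` class, the `feed_position` number and the rule that each delta crawl advances the baseline are hypothetical and do not describe Foldr's internals.

```python
class CrawlPlanner:
    """Hypothetical model of how a crawler might choose between crawl types."""

    def __init__(self, index_deltas: bool) -> None:
        self.index_deltas = index_deltas  # stands in for the Index Deltas toggle
        self.baseline = None              # position in the change feed, once known

    def crawl(self, feed_position: int, force_reindex: bool = False) -> str:
        """Return the kind of crawl that would run, updating the baseline."""
        if not self.index_deltas:
            return "full"
        if force_reindex or self.baseline is None:
            self.baseline = feed_position  # a full crawl records the baseline
            return "full"
        self.baseline = feed_position      # assumed: a delta crawl moves it forward
        return "delta"


planner = CrawlPlanner(index_deltas=True)
modes = [
    planner.crawl(feed_position=100, force_reindex=True),  # baseline crawl
    planner.crawl(feed_position=130),                      # later scheduled crawl
    planner.crawl(feed_position=145),                      # later scheduled crawl
]
```

In this model only the first, forced crawl is a full one. The two crawls that follow can rely on deltas because a baseline already exists.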

Configure Schedule

It is recommended to configure a suitable schedule so that Foldr runs subsequent crawl jobs without manual intervention. These jobs automatically use delta queries, allowing Google Workspace to report which files have been modified, created or deleted since the baseline was established.

Click the Search & Data > Settings tab

Under Crawl Jobs > Schedule, select a suitable option from the drop-down menu: Daily, Weekly or Monthly, with the required time. Alternatively, you can use Cron to set a custom schedule. Custom schedules work well with delta queries, as they allow the site to be queried on a more frequent basis.

Using Cron for Advanced Scheduling

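Using the Cron option, it is possible to configure granular schedules. A Cron expression consists of five space-separated fields, in this order: minute (0-59), hour (0-23), day of month (1-31), month (1-12) and day of week (0-6, where 0 is Sunday). An asterisk (*) matches every value, a comma separates a list of values and */n matches every nth value.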


Examples – Run crawl job:

Every 20 minutes: */20 * * * *

Every hour:  0 * * * *

Every day at 12pm: 0 12 * * *

Every day at 1:30pm, 3:30pm and 5:30pm: 30 13,15,17 * * *

Every Monday, Wednesday and Friday at 8pm:  0 20 * * 1,3,5

Additional Cron examples can be found on the useful online resource at https://cron.help
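To check what a schedule does before saving it, the expressions above can be tested with a small Python sketch. This is a simplified matcher written for illustration, not the scheduler Foldr uses: the function names are hypothetical, and it only understands `*`, `*/n`, single numbers and comma-separated lists. It also requires every field to match, whereas standard cron treats day of month and day of week as alternatives when both are restricted.

```python
from datetime import datetime


def field_matches(field: str, value: int, low: int) -> bool:
    """Match one field; 'low' is the smallest value the field can take."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if (value - low) % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression: str, when: datetime) -> bool:
    """Return True if a five-field cron expression fires at the given minute."""
    minute, hour, day, month, weekday = expression.split()
    cron_weekday = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return (
        field_matches(minute, when.minute, 0)
        and field_matches(hour, when.hour, 0)
        and field_matches(day, when.day, 1)
        and field_matches(month, when.month, 1)
        and field_matches(weekday, cron_weekday, 0)
    )


# 6 January 2025 was a Monday, 7 January 2025 a Tuesday.
monday_8pm = datetime(2025, 1, 6, 20, 0)
tuesday_8pm = datetime(2025, 1, 7, 20, 0)

runs_on_monday = cron_matches("0 20 * * 1,3,5", monday_8pm)
runs_on_tuesday = cron_matches("0 20 * * 1,3,5", tuesday_8pm)
runs_at_1530 = cron_matches("30 13,15,17 * * *", datetime(2025, 1, 6, 15, 30))
runs_at_2040 = cron_matches("*/20 * * * *", datetime(2025, 1, 6, 20, 40))
```

Here `runs_on_monday`, `runs_at_1530` and `runs_at_2040` are true, while `runs_on_tuesday` is false because Tuesday is not among the listed days.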
