SharePoint to Syncplicity

DataHub is an enterprise data integration platform that enables organizations to maximize business value and productivity from their content.  

It connects disparate storage platforms and business applications together, allowing organizations to move, copy, synchronize, gather and organize files as well as their related data across any system. 

DataHub empowers your users with unified access to the most relevant, complete and up-to-date content – no matter where it resides.    

DataHub delivers a user-friendly web-based experience that is optimized for PC, tablet and mobile phone interfaces—so you can monitor and control your file transfers anywhere, from any device.   

DataHub’s true bi-directional hybrid/sync capabilities enable organizations to leverage and preserve content across on-premises systems and any cloud service. Seamless to users, new files/file changes from either system are automatically reflected in the other.  

How does DataHub Work? 

Cloud storage and collaboration platforms continue to be the driving force of digital transformation within the enterprise. However, users need ready access to the content that resides within your existing network file systems, ECM, and other storage platforms so that they can be productive wherever they are. DataHub is purpose-built to provide boundless enterprise content integration possibilities. The DataHub Platform is 100% open and provides a highly scalable architecture that enables enterprises to easily meet evolving technology and user demands, no matter how complex.

The DataHub platform provides: 

  • A low risk approach to moving content to the cloud while maintaining on-premises systems

  • No impact to users, IT staff, business operations or existing storage integrations

  • Ability to extend cloud storage anywhere/any device capabilities to locally-stored content

  • Easy integration of newly acquired business storage platforms into existing infrastructures

The Engine 

DataHub’s bi-directional synchronization engine enables your enterprise to fully integrate and synchronize your existing on-premises platforms with any cloud service.

It empowers your users to freely access the content they need while IT staff maintains full governance and control. DataHub integrates with each system's published Application Program Interface (API) at the deepest level—optimizing transfer speeds and preserving all file attributes. 

Security 

DataHub’s 100 percent security-neutral model does not incorporate or use any type of proxy cloud service or other intermediary presence point. All content and related data is streamed directly via HTTPS [256-bit encryption] from the origin to the destination system(s). Additionally, DataHub works with native database encryption. 

Analyzer - Simulation Mode

The DataHub analyzer is a powerful enterprise file transfer simulation that eliminates the guesswork. You will gain granular insight into your entire content landscape including its structure, the use of your files, how old and what type they are, what the metadata contains and more, no matter where the files are located—whether in local storage, remote offices or on user desktops. 

Simulation mode allows you to create a job with all desired configuration options set and execute it as a dry-run. In this mode, no data will actually transfer, no permissions will be set, no changes will be made to either the source or the destination. This can be useful in answering several questions about your content prior to actually running any jobs against your content.

Features and Functionality 

The DataHub Platform gives you complete integration and control over:

  • User accounts 

  • User networked home drives 

  • User and group permissions 

  • Document types, notes, and file attributes 

  • Timestamps 

  • Versions 

  • Departmental, project, and team folders 

  • Defined and custom metadata 

Architecture and Performance 

The DataHub platform is built upon a pluggable, content-streaming architecture that enables highly automated file/data transfer and synchronization capabilities for up to billions of files. File bytes stream in from one connection endpoint (defined by the administrator), travel across a customer owned and operated network and/or cloud service, and then stream out to a second connection endpoint. Content can also flow bidirectionally across two endpoints, rather than solely from an "origin" to a "destination".
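The streaming idea described above can be pictured with a minimal sketch. This is not DataHub's implementation: the function name `stream_copy` and the chunk size are hypothetical, and in-memory buffers stand in for the two connection endpoints. It only shows bytes flowing from one endpoint to another one chunk at a time, without staging the whole file in between.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, chosen only for illustration


def stream_copy(source, destination, chunk_size=CHUNK_SIZE):
    """Copy bytes from a readable stream to a writable stream in chunks.

    Returns the number of bytes copied. Only one chunk is held in memory
    at a time.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the source is exhausted
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# In-memory buffers stand in for the two connection endpoints.
origin = io.BytesIO(b"example file bytes" * 1000)
target = io.BytesIO()
copied = stream_copy(origin, target)
```

For a bidirectional flow, the same routine would simply be called a second time with the roles of the two streams swapped.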

Supported Features

The DataHub Platform Comparison tool allows you to compare platform features and technical details to determine which are supported for your transfer scenario.

Viewing the Platform Comparison results for your integration will display a list of features of each platform and provide insight early in the integration planning process on what details may need further investigation.

The Platform Comparison tool is available via the Connection, Platforms menu options.

Connection Setup

DataHub is built on a concept of connections.  A connection is made to the source platform and then another connection is made to the destination platform. A job is created to tie the two platforms together. 

When DataHub connects to a content platform, it does so by using the publicly available Application Programming Interface (API) for the specific platform.  This ensures that DataHub is “playing by the rules” for each platform. 

Connections “connect” to a platform as a specific user account. The user account requires the proper permissions to the platform to read/write/update/delete the content, according to what actions the DataHub job is to perform. 

The connection user account should also be set up so that the password does not expire; otherwise, the connection will no longer be able to access the platform until it has been refreshed with the new password.

Most connections require a specific user account and its corresponding password.  The user account is typically an email address. 

Authenticated Connections

Authenticated Connections are accounts that have been verified with the cloud-based or network-based platform when created. The connection can be user/password-based or established through an OAuth2 flow, where a token is generated based on the authorization granted to DataHub through a user login. This authorization allows DataHub to access the user's drive information (files and folders) on the platform. These connections are used as the source or the destination authentication to transfer your content.

OAuth2 Interactive (Web) Flow

  • Connectors such as Box, Google Drive and Dropbox use the OAuth2 interactive (or web) flow

OAuth2 Client Credentials Flow

  • Connections such as Syncplicity and GSuite use the OAuth2 client credentials flow

SharePoint

  • SharePoint (all versions, CSOM) uses a custom username/password authentication model

OAuth2 Interactive (Web) Flow

You will need the following information when creating a connection to Network File System, Box, Dropbox and Dropbox for Business:

  • A name for the connection

  • The account User ID, such as jsmith@company.com

  • The password for the User ID

Creating a Connection

Creating a connection in the DataHub Platform user interface is easy. Simply add a connection, select your platform, and enter the requested information. DataHub will securely validate your credentials and connect to your source content.

Microsoft SharePoint 2010/2013/2016

DataHub connections to SharePoint 2010, 2013, and 2016 on-premises platforms can be made by using any valid SharePoint user account with applicable permissions to access content. There are several ways to transfer content; this document outlines each of those situations and how to configure them.

Microsoft SharePoint supported versions:

  • 2010

  • 2013

  • 2016

Create Connection - DataHub Application User-Interface

  • Create New Connection → Select Platform

  • Populate the dialog, referencing the table below for your connection type

  • Authorize the connection → Test Connection

  • Confirm to add connection

| Field | Value | Type | Notes |
| --- | --- | --- | --- |
| Display As | Any Value | Optional | Use any value you would like for the connection name. It will be displayed in the application and you can search and filter on it. |
| URL | SharePoint URL | Required | URL of SharePoint, including the site collection where the target Document Library exists |
| User Name | SharePoint Account User Name | Required | SharePoint user name. Can coincide with the Active Directory UPN (User Principal Name) |
| Password | SharePoint Account Password | Required | |
| SSO Provider | SSO Provider Type | Optional | Examples: Okta, ADFS |
| SSO URL | SSO URL | Optional | URL of configured SSO Provider |

Features and Limitations 

Platforms all have unique features and limitations. DataHub’s transfer engine manages these differences between platforms and allows you to configure actions based on Job Policies and Behaviors. Utilize the Platform Comparison tool to see how your integration platforms may interact regarding features and limitations. 

Note that features and limitations may be different across SharePoint versions. 

SharePoint has the following file/folder restrictions:

  • The max file-size DataHub can upload is 2GB for older SharePoint versions

  • The maximum length of any folder or file name is 128 characters

  • The maximum total length is 255 characters

  • SharePoint 2010/2013 restricted characters in file/folder names include |, #, {, }, %, &, ", ~, +, :, *, ?, <, > and the tab character

  • SharePoint 2010/2013 additional file/folder name restrictions:

    • Leading or trailing periods and whitespace

    • Trailing periods and whitespace before extension

    • Two consecutive periods

    • Non-printable ASCII characters.

    • ~$ prefix

  • SharePoint 2016 restricted characters in file/folder name include |, #, %, ", , /, :, *, ?, < and >

  • SharePoint 2016 additional file/folder name restrictions:

    • Leading whitespace

    • Trailing periods and whitespace. If a file extension is present, trailing periods and whitespace are allowed before the extension

    • Non-printable ASCII characters.

    • ~$ prefix when there is no extension

  • SharePoint restricts certain file extensions. Click here for the complete list

  • Restricted file/folder names:

    • _vti_test

    • _w

    • _t

  • Restricted folder names:

    • Test_files

  • Restricted folder names ending in:

    • .files

    • _files

    • -Dateien

    • _fichiers

    • _bestanden

    • _file

    • _archivos

    • -filer

    • _tiedostot

    • _pliki

    • _soubory

    • _elemei

    • _ficheiros

    • _arquivos

    • _dosyalar

    • _datoteke

    • _fitxers

    • _failid

    • _fails

    • _bylos

    • _fajlovi

    • _fitxategiak

  • SharePoint does not allow duplicate Enterprise Keywords

  • The longest string allowed in an Enterprise Keyword is 255 characters

  • For more information on SharePoint restrictions, see Microsoft’s official documentation.
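As an illustration only, a few of the SharePoint 2010/2013 naming rules listed above can be expressed as a small pre-flight check. The function name `name_problems` is hypothetical and this is not how DataHub itself validates names. The sketch covers only the character, length, period/whitespace and ~$ prefix rules, not the restricted names or folder suffixes.

```python
RESTRICTED_CHARS = set('|#{}%&"~+:*?<>\t')  # SharePoint 2010/2013 list from above
MAX_NAME_LENGTH = 128


def name_problems(name):
    """Return a list of reasons why `name` violates the 2010/2013 rules above."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append("name longer than 128 characters")
    bad = sorted(set(name) & RESTRICTED_CHARS)
    if bad:
        problems.append("restricted characters: " + " ".join(repr(c) for c in bad))
    if name != name.strip(" \t."):
        problems.append("leading or trailing period/whitespace")
    # Split off the extension to inspect the end of the base name.
    stem, dot, _ext = name.rpartition(".")
    if dot and stem and stem != stem.rstrip(" \t."):
        problems.append("trailing period/whitespace before extension")
    if ".." in name:
        problems.append("two consecutive periods")
    if any(ord(c) < 32 or ord(c) == 127 for c in name):
        problems.append("non-printable ASCII character")
    if name.startswith("~$"):
        problems.append("~$ prefix")
    return problems
```

Running such a check against a file listing before a migration gives a rough idea of how many names would need to be changed. A simulation-mode job remains the authoritative way to find them.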

Other platform restrictions:

  • Versioning is supported, if versioning is enabled on the library

  • Date preservation works with SharePoint if you are using author-preservation

  • The transformation of URLs into SharePoint can sometimes increase the overall URL length, which can cause either 400 or 414 URL failures (spaces are the primary cause). We recommend removing spaces or shortening the URL to resolve this error

  • DataHub does not support Libraries that have Require Check-out enabled

Create a Syncplicity Connection

The Syncplicity connector in DataHub allows you to analyze, migrate, copy, and synchronize files from your Syncplicity service to cloud storage repositories and on-premises network file shares. DataHub connections to Syncplicity require OAuth 2.0 access. To create a connection from DataHub to Syncplicity, you will need to complete configuration on the Syncplicity side and provide several pieces of authentication information.

  1. Select Connections > Add connection.

  2. Select Syncplicity as the platform on the Add connection modal.

  3. Enter the connection information. Reference the table below for details about each field.

  4. Test the connection to ensure DataHub can connect using the information entered.

  5. Select Done.


| # | Field | Value | Description | Optional/Required |
| --- | --- | --- | --- | --- |
| 1 | Display as | User-Defined Text Field | Enter the display name for the connection. If you will be creating multiple connections, ensure the name readily identifies the connection. The name displays in the application, and you can use it to search for the connection and filter lists. | Required |
| 2 | Application token | Provided by your Syncplicity administrator | Each user can provision a personal application token, which may be used to authenticate in UI-less scenarios via API. This is especially useful for account administration tasks that run in a headless session. If provisioned, an application token is the only information required to log in a user using the OAuth 2.0 resource owner credentials grant. You should protect this token. | Required |
| 3 | App key | Provided by your Syncplicity administrator | Identifier of the third-party application as defined by OAuth 2.0. | Required |
| 4 | App secret | Provided by your Syncplicity administrator | The secret (password) of the third-party application as defined by OAuth 2.0. Used with an application key to authenticate a third-party application. | Required |
| 5 | New SyncPoint type | Syncpoint type choice | This option instructs DataHub as to what type of folder should be created when a top-level folder is created through a DataHub process. | Optional |

Features and Limitations

Platforms all have unique features and limitations. DataHub’s transfer engine manages these differences between platforms and allows you to configure actions based on Job Policies and Behaviors. Utilize the Platform Comparison tool to see how your integration platforms may interact regarding features and limitations. 

| Supported Features | Unsupported Features | Other Features/Limitations |
| --- | --- | --- |
| Version preservation | File lock propagation | Segment path length: 260 |
| Timestamp preservation | Mirror lock ownership | No leading spaces in file name/folder names |
| Author/Owner preservation | File size maximum | No trailing spaces before or after file extensions |
| Account map | Path length maximum | No non-printable ASCII characters |
| Group map | Restricted types | Invalid characters: \  /  <  > |
| Permission preservation | Metadata map | Only syncpoints can be shared with other users and have permissions persist. |
| User impersonation | Tags map | Users with a large number of syncpoints are not supported by Syncplicity. |

If you are creating a new impersonation job with a Syncplicity connection and the source or destination location is empty, the user you are impersonating has too many syncpoints. You will need to delete the syncpoints before you can create the job.

Connection Pooling

When transferring data between a source and destination, there are a number of factors that can limit the transfer speed. Most cloud providers have rate limitations that reduce the transfer rate, but if those limits are account-based and the platform supports impersonation, DataHub can create a pool of accounts that issues commands in a round-robin format across all of the accounts connected to the pool. Any modifications to the connection pool will be used on the next job run.

For example, if a connection pool has two accounts, all commands will be alternated between them. If a third account is added to the pool, the next run of the job will use all three accounts.
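The round-robin behavior in that example can be sketched as follows. The class name `ConnectionPool`, its methods, and the account names are all hypothetical and are not part of DataHub. The sketch only illustrates commands alternating across pooled accounts, with pool changes picked up at the start of the next run.

```python
from itertools import cycle


class ConnectionPool:
    """Hands out pooled accounts in round-robin order (illustrative only)."""

    def __init__(self, accounts):
        self._accounts = list(accounts)

    def add(self, account):
        # Changes take effect the next time a run starts.
        self._accounts.append(account)

    def start_run(self):
        """Snapshot the pool for one job run and return an endless rotation."""
        return cycle(list(self._accounts))


pool = ConnectionPool(["svc-a@example.com", "svc-b@example.com"])
run1 = pool.start_run()
first_run = [next(run1) for _ in range(4)]   # alternates between two accounts

pool.add("svc-c@example.com")
run2 = pool.start_run()
second_run = [next(run2) for _ in range(6)]  # rotates across all three accounts
```

Taking a snapshot of the account list when a run starts is what keeps a run already in progress from seeing an account added halfway through.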

Not Supported:

  • "My Computer" and Network File Share (NFS) connections are not supported with Connection Pooling.

User & Group Maps 

A user account or group map provides the ability to explicitly associate users and groups for the purposes of setting ownership and permissions on items transferred.  These mappings can happen automatically using rules or explicitly using an exception.  Accounts or groups can be excluded by specifying an exclusion, and unmapped users can be defaulted to a known user.  

Here are a few things to consider when creating an account or group map: 

  • A source and destination connection are required and need to match the source and destination of the job that will be referencing the user or group map. 

  • A map can be created before or during the creation of the job. 

  • A map can be used across multiple jobs. 

  • Updates to a map are not reapplied to content that has already been transferred.

User & Group Map Import Templates

Please see Account Map / Group Map | CSV File Guidelines for map templates and sample downloads.

User & Group Map Exceptions

A user or group map exception provides the ability to explicitly map a specific user from one platform to another.  These are exceptions to the automatic account or group mapping policies specified.  User account or group map exceptions can be defined during the creation of the map or can be imported from a comma-separated values (CSV) file. 

User & Group Map Exclusions

A user or group map exclusion provides the ability to explicitly exclude an account or group from owner or permissions preservation.  User account or group map exclusions can be defined during the creation of the map or can be imported from a comma-separated values (CSV) file.  
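To make the interplay of rules, exceptions, exclusions and the default user concrete, here is a minimal sketch. The function `map_account`, the order of precedence, and the automatic rule (matching on the part of the address before the "@") are assumptions made for illustration. DataHub's actual mapping policies may differ.

```python
def map_account(source_user, dest_users, exceptions=None, exclusions=(), default_user=None):
    """Resolve a source account to a destination account.

    Order assumed in this sketch: exclusions, explicit exceptions,
    automatic rule, then the default user. Returns None when the account
    is excluded or cannot be mapped at all.
    """
    exceptions = exceptions or {}
    if source_user in exclusions:
        return None  # excluded from owner/permission preservation
    if source_user in exceptions:
        return exceptions[source_user]  # explicit mapping wins over rules
    # Hypothetical automatic rule: match on the part before the "@".
    local_part = source_user.split("@", 1)[0].lower()
    for candidate in dest_users:
        if candidate.split("@", 1)[0].lower() == local_part:
            return candidate
    return default_user  # unmapped users fall back to a known user, if any


dest_users = ["jsmith@dest.example", "admin@dest.example"]
auto = map_account("jsmith@src.example", dest_users)
explicit = map_account(
    "mary@src.example", dest_users,
    exceptions={"mary@src.example": "mjones@dest.example"},
)
```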

Transfer Planner 

 At the start of a project, it is common to begin planning with questions like "How long should I expect this to take?" 

Transfer Planner allows you to outline the basic assumptions of any integration, primarily around the initial content copy at the beginning of a migration or first synchronization.  It uses basic assumptions to begin visualization of the process, without requiring any setup of connections or jobs. 

The tool estimates and graphs a time line to complete the transfer based on the information entered in the Assumptions area. The time line assumes a start date of today and uses the values in the Assumptions section to model the content transfer.  

The Transfer Planner automatically recalculates the predicted time line if you change any of the values, making simple “what if?” scenario evaluations easy. Press Reset to restore default values for the transfer planner tool.

The window displays projected Total Transfer in dark blue and Daily Transfer Rate in light blue. Hovering the mouse pointer over the graph displays estimated transfer details for that day. 

You can see the impact on the project timeline by changing the values in the Assumptions area. The graph will redraw to reflect your new values. 

Note that the Transfer Planner is primarily driven by the amount of data needing to be processed. DataHub has various tools for transferring versions of files (if the platform supports this feature), which can increase the size of your data set. It also has the ability to filter out specific files by their type or by other rules you set. At this stage, a rough estimate of total size is recommended, as it can be refined later using Simulation Mode.
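The kind of back-of-the-envelope arithmetic behind such a time line can be sketched as below. This is not the Transfer Planner's actual model. The function `estimate_completion`, the constant daily rate and the sample figures are assumptions chosen only to show how total size and daily throughput translate into a finish date.

```python
import math
from datetime import date, timedelta


def estimate_completion(total_gb, daily_rate_gb, start=None):
    """Return (days_needed, finish_date) for a constant daily transfer rate."""
    if daily_rate_gb <= 0:
        raise ValueError("daily_rate_gb must be positive")
    start = start or date.today()
    # Round up: a partial day of transfer still occupies a calendar day.
    days_needed = math.ceil(total_gb / daily_rate_gb)
    return days_needed, start + timedelta(days=days_needed)


# Hypothetical scenario: 5,000 GB of content at 400 GB per day.
days, finish = estimate_completion(5000, 400, start=date(2024, 1, 1))
```

Changing either input and calling the function again is the code equivalent of a “what if?” adjustment in the Assumptions area.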

Simulation Mode

Simulation mode allows you to create a job with all desired configuration options set and execute it as a dry run. In this mode, no data will actually transfer, no permissions will be set, and no changes will be made to either the source or the destination.

This can be useful in answering several questions about your content prior to actually running any jobs against your content.

How much content do I have?

  • An important first step in any migration is to determine how much content you actually have, as this can help in determining how long a migration will take.

What kinds of content do I have?

  • Another important step in any migration is to determine what kinds of content you actually have.

  • Many organizations have accumulated a lot of content and some of that may not be useful on the desired destination platform.

  • The results of a simulation mode job can help you determine if you should introduce any filter rules to narrow the scope of the job.

  • For example, you might decide to exclude executable files (.exe or .bat files) or files more than 3 years old.

What kinds of issues should I expect to run into?

  • During the course of a migration, there are many things to consider and unknown issues that can arise, many of which will only present themselves once you start doing something with the source and destination.

  • Running a job in simulation mode can help you identify some of those issues before you actually start transferring content.

Examples can include:

  • Are my user mappings configured correctly?

  • Does the scope of the job capture everything that I expected it to capture?

  • Do I have files that are too large for the destination platform?

  • Do I have permissions that are incompatible with the destination platform (e.g., ACL vs. waterfall)?

  • Do I have file or folder names that are too long or contain invalid characters that the destination platform will not accept?

Create a Simulation Job

In the last stage of the job creation workflow, before the job is created, there is an option to enable simulation mode.

When a job is in simulation mode, it can be run and scheduled like any other job, but no data will be transferred. 

Transition a Simulation Job to Transfer Content

After review, a simulation job can be transitioned to a live job that will begin to transfer your content to the destination platform.

Create a Job

DataHub uses jobs to perform specific actions between the source and destination platforms. The most common job types are copy and sync; please see Create New Job | Transfer Direction for more information.

All jobs can be configured to run manually or on a defined schedule. This option will be presented as the last configuration step.

To create a job, select the Jobs option from the left menu and click on Create Job. DataHub will lead you through a wizard to select all the applicable options for your scenario.

The main job creation steps include:

  • Selecting a Job Type

  • Configuring Locations

  • Defining Transfer Policies

  • Defining Job Transfer Behaviors

  • Advanced Options

  • Summary | Review, Create Job, and Schedule

Job Type 

Job type defines the kind of job and the actions the job will perform with the content. There are two main job types available: Basic Transfer and Folder Mapping.

Basic Transfer - Transfer items between one connection and another

This will copy all content (files, folders) from the source to the destination. Each job run will detect any new content on the source and copy it to the destination.

For more information, please see Create New Job | Transfer Direction.

Define Source & Destination Locations

All platform connections made in the DataHub Platform application will be available in the locations drop-down lists when creating a job. 

  • If your connections were created with Administrative privileges, you may also have the ability to impersonate another user within your organization.

  • Source defines the location of your current content you wish to transfer.

  • Destination defines the location of where you would like your content to go.

Configuring Your Locations - Impersonation

Impersonation allows a site admin access to all the folders on the site, including those that belong to other users. With DataHub, a job can be set up using the username and password of the site admin to sync/migrate/copy files to or from a different user's account without ever having the username or password of that user.

Enable Run as user...

Choose Source User

Job Category

The category function allows for the logical grouping of jobs for reporting and filtering purposes.  The category is optional and does not alter the job function in any way.

DataHub comes with two default job categories:

Maintenance: DataHub maintenance jobs only. This category allows you to view the report of background maintenance jobs and is not intended for newly created transfer jobs.

Default: When a category is not defined during job creation, the job is automatically given the default category. This option allows you to create a report for all jobs to which no custom category was assigned.

Create Job Category

Enable feature and select from existing job categories or create a new category.

From the jobs grid, filter by category

Job Policies

Define what should happen once items have been successfully transferred and set up rules around how to deal with content as it is updated on your resources while the job is running.

  • DataHub works on the concept of “deltas” where the transfer engine only transfers files after they have been updated.

  • File version conflicts occur when the same file has been updated on both the source and destination platforms between job executions.

  • Policies define how DataHub handles file version conflicts and whether or not it persists a detected file deletion. 

  • Each job has its own policies defined and the settings are NOT global across all jobs.
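The delta and conflict concepts above can be illustrated with a short sketch. It assumes change detection by comparing modification timestamps against the time of the previous run, which is a simplification chosen for illustration. The function `classify` and its return labels are hypothetical and do not describe how DataHub detects changes internally.

```python
from datetime import datetime


def classify(source_modified, dest_modified, last_run):
    """Classify one file by which side changed since the previous job run."""
    source_changed = source_modified > last_run
    dest_changed = dest_modified > last_run
    if source_changed and dest_changed:
        return "conflict"      # both sides updated between executions
    if source_changed:
        return "transfer"      # delta: only the source copy is newer
    if dest_changed:
        return "dest-updated"  # relevant only for bi-directional jobs
    return "skip"              # nothing changed, nothing to transfer


last_run = datetime(2024, 3, 1, 12, 0)
before = datetime(2024, 2, 20, 9, 0)
after = datetime(2024, 3, 2, 9, 0)
```

What happens to an item classified as a conflict is exactly what the job's Conflict Policy decides.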

Conflict Policy - File Version Conflicts

When a conflict is detected on either the source or the destination, Conflict Policy determines how DataHub will behave.

For more information, please see Conflict Policy.

Delete Policy - Deleted Items

When a delete is detected on either the Source or the Destination, Delete Policy determines how DataHub will behave. 

For more information, please see Delete Policy.

Behaviors

Behaviors determine how this job should execute and what course of action to take in different scenarios. All behaviors are enabled by default as recommended settings to ensure content is transferred successfully to the destination.

Zip Unsupported Files / Restricted Content

Enabling this behavior allows DataHub to compress any file that is not supported on the destination into a .zip format before being transferred. This will be done instead of flagging the item for manual remediation and halting the transfer of the file.

For example, if you attempt to transfer the file "db123.cmd" from a Network File Share to SharePoint, DataHub will compress the file to "db123.zip" before transferring it over, avoiding an error message. 
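A rough sketch of this behavior, using Python's standard `zipfile` module, is shown below. The block list and the function `prepare_for_transfer` are hypothetical. The sketch only illustrates wrapping an unsupported file in a .zip archive with the same base name, not DataHub's actual implementation.

```python
import io
import zipfile
from pathlib import PurePath

BLOCKED_EXTENSIONS = {".cmd", ".exe", ".bat"}  # hypothetical destination block list


def prepare_for_transfer(filename, data):
    """Return (name, payload), zipping the file if its extension is blocked."""
    path = PurePath(filename)
    if path.suffix.lower() not in BLOCKED_EXTENSIONS:
        return filename, data  # supported file: pass through unchanged
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(filename, data)  # original name is kept inside the archive
    return path.with_suffix(".zip").name, buffer.getvalue()


name, payload = prepare_for_transfer("db123.cmd", b"echo hello\r\n")
```

The original file name survives inside the archive, so the content can be restored unchanged after download.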

Allow unsupported file names to be changed

The Segment Transformation policy controls whether DataHub can change folder and file names to comply with the platform's restrictions.

Enabling this behavior allows DataHub to change the names of folders and files that contain characters that are not supported by the destination before transferring the file. This will be done instead of flagging the file for manual remediation and preventing it from being transferred.

When this occurs, the unsupported character will be transformed into an underscore.

For example, if you attempted to transfer the file "Congrats!.txt" from Box to NFS, it would be transformed to "Congrats_.txt" and appear that way on the destination.
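The substitution itself is simple to picture. In this sketch the set of unsupported characters is hypothetical (it includes "!" only so that the example above can be reproduced), and `transform_segment` is an illustrative name rather than a DataHub function.

```python
UNSUPPORTED_CHARS = set('!|#{}%&"~+:*?<>')  # hypothetical destination rule set


def transform_segment(name, unsupported=UNSUPPORTED_CHARS, replacement="_"):
    """Replace each unsupported character in a file or folder name."""
    return "".join(replacement if ch in unsupported else ch for ch in name)


renamed = transform_segment("Congrats!.txt")
```

Each path segment (folder or file name) would be transformed on its own, so the path separators are never touched.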

Preserve file versioning between locations

DataHub will preserve and transfer all versions of a file on supported platforms.

Advanced

These optional job configurations determine what features you want to preserve, filter or add during your content transfer. 

Filtering

Filtering defines rules for determining which items are included or excluded during transfer. For more information, please see Job Filters.

Job Filters | Filter By Name Pattern

Job Filters | Filter By Extensions or Type

Job Filters | Filter By Size

Job Filters | Filter By Date Range or Age

Job Filters | Filter by Metadata

Job Filters | Metadata Conjunctions
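As a sketch of how extension, size and age rules combine, consider the following. The `Item` class, the thresholds and the function `is_included` are hypothetical and do not reflect DataHub's filter syntax. They only show an item being excluded as soon as any one rule rejects it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Item:
    name: str
    size_bytes: int
    modified: datetime


EXCLUDED_EXTENSIONS = (".exe", ".bat")  # hypothetical extension filter
MAX_SIZE_BYTES = 2 * 1024 ** 3          # hypothetical size filter (2 GB)
MAX_AGE = timedelta(days=3 * 365)       # hypothetical age filter (about 3 years)


def is_included(item, now):
    """Return True if the item passes the extension, size and age filters."""
    if item.name.lower().endswith(EXCLUDED_EXTENSIONS):
        return False
    if item.size_bytes > MAX_SIZE_BYTES:
        return False
    if now - item.modified > MAX_AGE:
        return False
    return True


now = datetime(2024, 6, 1)
items = [
    Item("setup.exe", 10, datetime(2024, 5, 1)),
    Item("old.docx", 10, datetime(2019, 1, 1)),
    Item("plan.docx", 10, datetime(2024, 1, 1)),
]
kept = [item.name for item in items if is_included(item, now)]
```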

Permission Preservation

This setting enables DataHub to determine how permissions are transferred across platforms.

Permissions | Author / Owner Preservation

Permissions | Permissions Preservation

Permissions | Permissions Import

Permissions | Preserve Shared Links

Metadata Mapping

Metadata mapping allows you to document your source metadata and map how you want it applied to the destination in CSV format. Enabling this feature will offer the ability to import the CSV file and apply it during job creation.

For more information, please see Metadata Import.

Scripting

Some DataHub features are not yet available in the user interface. The scripting feature allows advanced DataHub users to enable advanced transfer features by inserting JSON-formatted job controls.

Enabling this option will allow you to leverage these features and apply them during job creation.

Job Summary - Review your job configuration

Before you create your job, review all your configurations and adjust as needed. Modifying your job after creation is not supported; however, the option to duplicate your current job will allow you to make any adjustments without starting from the beginning. 

  • The Edit option will take you directly to the configuration to make changes.

Define Job Schedule

During job creation, the final step is to define when the job will run and what criteria will define when it stops. 

  • Save job will launch the job scheduler.

  • Save job and run it right now will trigger the job to start immediately. It will then run again every 15 minutes after the last execution completes.

Schedule Stop Policies 

Stop policies determine when a job should stop running.  If none of the stop policies are enabled, a scheduled job will continue to run until it is manually stopped or removed. 

The options for the stop policy are: 

Stop after a number of total runs

The number of total executions before the job will move to "complete" status

Stop after a number of runs with no changes

The job has run and detected no further changes; all content has transferred successfully. 

If new content is added to the source and the job runs again, this will not increment your stop policy count. However, job executions that detect no changes do not need to be consecutive to increment your stop policy count.

Stop after a number of failures

Most failures are resolved through automatic retries. If the retries fail to resolve the failures, then manual intervention is required. This policy takes the job out of rotation so that the issue can be investigated. 

Job executions that detect failures do not need to be consecutive to increment your stop policy count.

Stop after a specific date

The job will "complete" on the date defined.
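
The stop policies above can be read as counters that are checked after each run. The following is a minimal Python sketch of that logic, not DataHub's implementation. The class and field names are hypothetical, and two details are assumptions: a run with failures is not also counted as a run with no changes, and the date policy triggers on the defined date itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class StopPolicy:
    # None means the policy is not enabled.
    max_total_runs: Optional[int] = None
    max_runs_without_changes: Optional[int] = None
    max_failed_runs: Optional[int] = None
    stop_on_date: Optional[date] = None


@dataclass
class RunCounters:
    total_runs: int = 0
    runs_without_changes: int = 0  # cumulative, not consecutive
    failed_runs: int = 0           # cumulative, not consecutive

    def record(self, detected_changes, had_failures):
        self.total_runs += 1
        if had_failures:
            self.failed_runs += 1
        elif not detected_changes:
            # Assumption: only a clean run with nothing to transfer counts here.
            self.runs_without_changes += 1


def should_stop(policy, counters, today):
    """Return True once any enabled stop policy has been met."""
    checks = [
        (policy.max_total_runs, counters.total_runs),
        (policy.max_runs_without_changes, counters.runs_without_changes),
        (policy.max_failed_runs, counters.failed_runs),
    ]
    if any(limit is not None and count >= limit for limit, count in checks):
        return True
    return policy.stop_on_date is not None and today >= policy.stop_on_date


policy = StopPolicy(max_runs_without_changes=2)
counters = RunCounters()
# No changes, then new content arrives, then no changes again.
for detected_changes in (False, True, False):
    counters.record(detected_changes, had_failures=False)
print(should_stop(policy, counters, date.today()))
```

In the usage lines, the run that transfers new content does not reset the no-change counter, so the policy is met on the third run even though the two no-change runs were not consecutive.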

Reports

Reporting is paramount with the DataHub Platform. Whether you choose to use the DataHub manager application, CLI, or ReST API, reporting options are available to help you manage and surface data about your content in real time.

Out-of-the-box reports include:

  • Dashboard: Provides an overview of what is happening across all your content

  • Job Overview: Detailed job information including source, destination, schedule and current status

  • Flagged Items: Content that did not transfer and requires attention

  • Content Insights: Breakdown of your transferred data

  • Sharing Insights: Breakdown of all permissions associated based on your source content

  • User Mappings: The permission associations of your content

  • Item Report: Information on each item that transferred

  • Validation: At any time, you may run a validation run, which will trigger a full inspection of all content relating to the option you select for the next run only.

Job Overview Report

This report provides detailed transfer information for the individual job.

Schedule: Provides information on how many times the job has executed, when the job will run again and progress towards meeting the job stop policy defined

Transfer Details | Identified Chart: Reflects content identified on the source platform and the status summary for items

Transfer Details | Revised Chart: Reflects content that DataHub revised during transfer to meet destination requirements and user-defined job configurations

Transfer Details | Flagged Chart: Reflects content that DataHub could not transfer. Manual remediation is required.

Run Breakdown Report: Provides job history information for each execution for the given job

  • Note: Last Activity in the Run Breakdown will only appear during the job execution.

In some circumstances, bytes on the destination can be higher than listed on the source. This discrepancy is caused by property promotion on Word documents. For more information, see Report Values | Potential Differences due to Post Processing.

Values in the run breakdown may differ from values presented in the charts. This is because the run breakdown tracks each individual occurrence, whereas an item can only exist in a single chart category.

Example: When an item is both truncated AND ignored, it would not show up in the "Revised" chart but would show up in the "Revised" run breakdown.

The run breakdown also shows both file and folder values. The charts display file and folder values separately, with the "Transfer Details" dropdown available to switch between display values.
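
The difference between the two counting methods can be illustrated with a small Python sketch. The item names, category labels and precedence order are hypothetical, chosen only to mirror the truncated-and-ignored example; they are not taken from DataHub.

```python
from collections import Counter

# Hypothetical occurrences recorded during one run: (item, category).
occurrences = [
    ("report.docx", "revised"),  # name truncated
    ("report.docx", "ignored"),  # then skipped by policy
    ("notes.txt", "revised"),
]

# Assumed precedence used to pick the single chart category for an item.
CHART_PRECEDENCE = ["flagged", "ignored", "revised", "transferred"]

# Run breakdown: every occurrence counts.
run_breakdown = Counter(category for _, category in occurrences)

# Chart: each item lands in exactly one category.
categories_by_item = {}
for item, category in occurrences:
    categories_by_item.setdefault(item, set()).add(category)
chart = Counter(
    next(c for c in CHART_PRECEDENCE if c in categories)
    for categories in categories_by_item.values()
)

print(run_breakdown["revised"], chart["revised"])
```

Here the run breakdown counts two "revised" occurrences, while the chart counts only one "revised" item because report.docx is placed in a single other category.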

Job Content Insights Report

This report provides detailed content information for the individual job.

Use the drop-down options to change the chart views.

Job Sharing Insights Report

This report offers a breakdown of all permissions associated with your content. The values presented are based on the source content.

On the Sharing Insights tab for a job, the value "Not Shared" represents both items that have no permissions as well as content shared by inheritance from the parent folder. At this time, DataHub only tracks permissions applied during transfer, not permissions that result from inheritance within the hierarchy.

Job User Mappings Report

The User Mappings report for a given job presents the permission breakdown of your content. 

If any of the following features are enabled, the User Mappings report will populate:

Job Validation Report

Control the level of tracking and reporting for content that exists on both the source and destination platform, including content that has been configured to be excluded from transfer and content that existed on the destination prior to the initial transfer. 

Items that have been ignored / skipped by policy or not shown because they already existed on the destination can now be seen on reports with the defined categories.

The default validation option is inspect none. This option does not need to be configured in the application user-interface or through the ReST API; it is the system default.

This configuration will not track all items but will offer additional tracking with performance in mind. Inspect none will track all items on the source at all levels of the hierarchy but not including those configured to be ignored/skipped through policy. For the destination, all content in the root (files & folders) that existed prior to the initial transfer will be tracked as destination only items and reported as ignored/skipped.

This option has the following features:

Source: All content (files and folders) at all levels in the hierarchy, but not including those configured to be ignored/skipped through policy. However, if the connection does not have access to a given folder in the hierarchy, we cannot track and report these items.

Destination: All content in the root (files and folders) that existed prior to the initial transfer will be tracked as destination only items and reported as ignored/skipped.

Destination: All content (files and folders) at lower depths of the directory (sub-folders) that existed prior to the initial transfer will not be tracked.

If the connection does not have access to a given folder in the hierarchy, we cannot track and report these items.
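
The inspect none rules above can be summarized as a small decision function. This is a hedged Python sketch rather than DataHub's logic: the Item fields and the returned labels are hypothetical, and it assumes that destination items created by the job itself are tracked normally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    side: str                       # "source" or "destination"
    depth: int                      # 0 = root of the job path
    excluded_by_policy: bool = False
    accessible: bool = True         # the connection can read the containing folder
    existed_before_first_transfer: bool = False


def tracking_under_inspect_none(item: Item) -> Optional[str]:
    """Return how the item is reported, or None if it is not tracked."""
    if not item.accessible:
        return None  # content the connection cannot reach is never tracked
    if item.side == "source":
        return None if item.excluded_by_policy else "tracked"
    if item.existed_before_first_transfer:
        # Pre-existing destination content is only tracked at the root.
        return "destination only (ignored/skipped)" if item.depth == 0 else None
    return "tracked"  # assumption: content written by the job is tracked


print(tracking_under_inspect_none(Item("destination", 0, existed_before_first_transfer=True)))
print(tracking_under_inspect_none(Item("destination", 2, existed_before_first_transfer=True)))
```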

Job Reports - Validation tab: At any time, you may run a validation run, which will trigger a full inspection of all content relating to the option you select for the next run only.

Generate Job Reports

DataHub Reports provide several options to combine many jobs into a single report for review. Reports are generated by category, individually selected jobs or by convention job parent (user account mapping, network home drive mapping or folder mapping job types). 

Reports are separated into two tabs so you can clearly distinguish between jobs that are actively transferring content and simulation jobs that imitate transfer.

If no category is defined during job creation, it will be assigned to the default job category. 

Generate Report

Select Report Type

Define what the report will contain

  • Category: Defined during job creation

  • Parent jobs: Relating to convention jobs such as user account mapping, network home drive mapping or folder mapping job types

  • Manually select jobs: Choose each job individually that you want in your report

Remediation

Items that the DataHub Platform was unable to transfer will be flagged for manual remediation. Items can be flagged for many reasons and, in some cases, are still transferred to the destination platform. Each item is a package, consisting of the media itself, version history, author, sharing and any other metadata. DataHub ensures all pieces of the item package are transferred to the destination to preserve data integrity. When an item is flagged, DataHub is indicating that all or some portion of this package failed to migrate.

All migrations require some amount of manual intervention by the client to move content that fails to transfer automatically.

  • Note that one of the uses of simulation mode is to get an understanding prior to a live transfer of how many files might fail to transfer and the reasons.

  • This can be used to adjust the job parameters to achieve a higher number of automatic remediation successes.

General Reasons Content does not Transfer

Errors from Source & Destination Platforms

This is a broad error category that indicates DataHub was prevented from reading, downloading, uploading or writing content during content transfer by either the source or destination platform provider. Each situation is dependent on the storage provider rejection reason and will require manual investigation to resolve.

Insufficient Permissions

Many platforms may require additional permissions in order to perform certain functions, even for site administrator accounts. These permissions typically require a special request from the storage provider. For example, content that has been locked, hidden or has been flagged to disable download may require this special permission request from your storage provider.

Scenario-Specific Configuration

Content on your source storage platform is diverse, and users across your business will structure their data in a wide variety of ways. A single one-size-fits-all project configuration may not be suitable and can result in some content not transferring to the destination platform. DataHub will assist in assessing these situations to help provide custom, scenario-specific configuration that may work around the issue that is preventing the transfer.

Disparate Platform Features

Each platform provider has a given set of features that are generally shared concepts in the storage industry. However, within each storage platform, there can be behavioral or rule differences within these features, and aligning these discrepancies can be challenging. For example, permission levels (edit, view, view+upload, etc.) may not align exactly with those of the destination platform, file size restrictions may apply, and file names may need to be altered to conform to the destination platform's policies. DataHub will attempt to accommodate these restrictions through configurations in the system; however, not all scenarios can be covered in a diverse data set.

Interruption in Service

DataHub must maintain connection to the database at all times during the transfer process. If there is an interruption in service, DataHub will fail the transfer as it is unable to track / write to the database.

How do I validate my content transferred successfully?

Verify the destination

DataHub will report all content that has transferred to the destination. Log into your destination platform and verify the content is located as expected.

DataHub is reporting items in "pending" or "retrying" status, what are my next steps?

Run the job again

DataHub defaults to retrying the job 3 times to reconcile items that are in pending/retry status. Depending on your job configuration, this may occur with the defined schedule or you can start the job manually.

Review the log message

DataHub logs a reason why the item is in pending/retry status. On the job "Overview" tab, click on the Transfer Details breakdown status "retrying". This will direct you to the filtered "Items" list. Select the item then click the "View item history" link on the right toolbox.

DataHub is reporting items in "Flagged" status, what are my next steps?

When an item is in "flagged" status, this means DataHub has made all attempts to transfer the file without success, and it requires manual remediation. 

Review the log message

DataHub logs a reason why the item has been flagged. On the job "Overview" tab, click on the Transfer Details breakdown status "flagged". This will direct you to the filtered "Items" list. Select the item and click the "View item history" link on the right toolbox.

Review the message and determine if you can resolve on the source platform.

Review all flagged items

These are the recommended ways to view all flagged items: export the flagged item report or review the "Flagged Items" page. 

Export report:

  • Job Report → Items tab → Filter by Status: Flagged

  • Click "Export this report" → Save CSV file for review

Review "Flagged Items" page:

  • Retry or Ignore individual items

  • View Item History for individual items

  • Link back to the job the flagged item is associated with

  • Export all Flagged Items report
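
Once a report has been exported to CSV, the flagged items can be grouped by reason to see which issues are most common. The Python sketch below shows one way to do this with the standard library. The column names "Status" and "Message" are assumptions and should be adjusted to match the headers in your exported file; the inline sample stands in for a real export.

```python
import csv
import io
from collections import Counter


def summarize_flagged(csv_file, status_column="Status", reason_column="Message"):
    """Count flagged rows per reason. Column names are assumptions; match them to the export."""
    reasons = Counter()
    for row in csv.DictReader(csv_file):
        status = (row.get(status_column) or "").strip().lower()
        if status == "flagged":
            reason = (row.get(reason_column) or "").strip() or "(no message)"
            reasons[reason] += 1
    return reasons


# Inline sample for illustration; a real export will have different columns and rows.
sample = io.StringIO(
    "Name,Status,Message\n"
    "a.docx,Flagged,Insufficient permissions\n"
    "b.docx,Flagged,Insufficient permissions\n"
    "c.docx,Success,\n"
)
for reason, count in summarize_flagged(sample).most_common():
    print(f"{count:>5}  {reason}")
```

To run this against a saved export, open the file with open(path, newline="") and pass the file object to summarize_flagged.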


Related Links