Deployment guide

DataHub is an Enterprise Content and Data Integration platform that manages file and data transfer and synchronization operations across many different storage management platforms at scale. It offers significant business flexibility across all aspects of Enterprise Content and Data Integration, including:

  • Large-scale file migration
  • Enterprise file analysis
  • Classification, compliance, and actions/outcome management
  • Multi-system hybrid/sync

Solution Architecture

The DataHub platform is built on a pluggable, content-streaming architecture that enables highly automated file and data transfer and synchronization for up to billions of files. File bytes stream in from one connection endpoint (defined by the administrator), across a customer-owned and customer-operated network and/or cloud service, and then stream out to a second connection endpoint. Content can also flow bidirectionally between two endpoints rather than solely from an "origin" to a "destination."


DataHub uses a "security-neutral" model that fully utilizes your existing infrastructure and security schema. Content and data bytes exist in DataHub memory only in chunks, which are immediately streamed in from one endpoint, through DataHub, and out to another endpoint. Both incoming and outgoing bytes are transferred using the most secure protocol available for each connector. For example, DataHub uses SSL or TLS encryption along with OAuth (token-based) authorization for most cloud service connectors.
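
The chunked streaming model described above can be illustrated with a minimal Python sketch. It is not DataHub's implementation: the function name stream_copy, the 64 KiB chunk size, and the in-memory streams standing in for the two endpoints are all assumptions made for illustration.

```python
import io

CHUNK_SIZE = 64 * 1024  # assumed chunk size; this guide does not state one


def stream_copy(source, destination, chunk_size=CHUNK_SIZE):
    """Copy bytes from source to destination one chunk at a time.

    Only a single chunk is held in memory at any moment, mirroring the
    idea that content passes through without being stored.
    Returns the total number of bytes transferred.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# Hypothetical endpoints: in-memory streams stand in for real connectors.
origin = io.BytesIO(b"x" * 200_000)
target = io.BytesIO()
copied = stream_copy(origin, target)
```

Because the loop never reads more than one chunk ahead, memory use stays bounded by the chunk size regardless of file size, which is the property the streaming design relies on.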

Platform Components

Server Manager Admin Console

The server component of DataHub runs as a background service on at least one server. The DataHub service can also be deployed on multiple server nodes in a cluster.

Database Server

DataHub utilizes an embedded PostgreSQL database as part of its standard installation process. Additional options include Microsoft SQL Server or hosted PostgreSQL. DataHub requires network connectivity with the database to function properly in all cases.

Agents

DataHub agents are designed to serve distinct use cases, depending on where they are deployed:

  • User Desktop Agents - local user desktops
  • Remote Agent server(s) - remote office NFS, eliminating the need for a VPN solution

Connectors

DataHub's Platform Connectors provide storage endpoint integration. Each storage connector has been carefully developed to implement all of the available security features via its native API. The connectors themselves can be deployed to execute in the context of the primary DataHub Server or in the context of one or more DataHub Remote Sites depending on the deployment model selected.

DataHub stores connection information such as the authenticated security token, URL, and UNC path in the database. In some cases, such as network file shares, it also stores the user name and password, which are encrypted within the database. DataHub connectors use the default ports of each platform's required communication protocol.

System Locale

Region Settings must be set to English (United States).

  1. Go to Control Panel > Clock and Region.
  2. Select Region > Administrative tab.
  3. Select Change system locale.
  4. Set the current system locale to "English (United States)".

DataHub Processing Servers (1 - # Servers)

  • CPU cores: 8
  • RAM: 32GB (minimum)
  • OS disk: 500GB (minimum)
  • OS: Windows Server 2016

If using a cloud server, the following templates are recommended: 

  • AWS: m4.2xlarge 

  • Azure: D8S_V3 

SQL Server (1 Server or Availability Group)

  • CPU cores: 16
  • RAM: 64GB (minimum)
  • OS_Vol: 100GB+ SSD – redundant / fault tolerant
  • OS: Windows Server 2016
  • Data_Vol1: 1TB premium SSD
  • Data_Vol2: 1TB premium SSD (optional)
  • Software/templates: SQL Server 2016 SP1 (or higher) Enterprise

If using a cloud server, the following templates are recommended: 

  • AWS: m4.4xlarge 

  • Azure: D16S_V3 

Supported Operating Systems

Server / Manager & Remote Sites

  • Windows Server 2019

  • Windows Server 2016

  • Windows Server 2012 R2

Supported Databases

Server / Manager and Remote Sites

  • Embedded PostgreSQL 10.10-2+
    • If you are using a PostgreSQL database, it must be configured to use English for messages.
  • SQL Server 2016+
    • Implement Database Planning and Tuning Concepts; this includes noting the database name, instance, and port number for SQL access if they differ from the defaults.

Browser Support

Supported: 

  • Chrome

Note:

  • DataHub Platform installation is not supported in Firefox, Edge, or Safari.
  • The DataHub Platform application generally works as expected in Firefox, Edge, and Safari, although minor issues may be observed.
  • DataHub Platform is not supported in any version of Internet Explorer.

Open Port 9090

Ensure that the Windows firewall on your VM is either disabled or, as the more advanced option, configured to allow traffic on port 9090.
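
One way to confirm that port 9090 is reachable is a simple TCP connection test. The Python sketch below is illustrative only; the function name is_port_open and the 3-second timeout are assumptions, not part of DataHub.

```python
import socket


def is_port_open(host, port=9090, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling is_port_open("localhost") on the DataHub server checks the local service, and calling it from another machine with the server's host name checks the firewall path. A True result only shows that something accepts connections on the port; it does not confirm that the listener is DataHub.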

Administrator Password Requirements

Passwords must meet the following requirements: 

  • At least 8 characters
  • At least one uppercase letter (A-Z)
  • At least one digit (0-9)
  • At least one non-alphanumeric character (!@#$%)
  • Cannot contain the username
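
The rules above can be checked before submitting a password. The following Python sketch is a hypothetical helper, not DataHub's own validation logic. It assumes that any non-alphanumeric character satisfies the special-character rule and that the username check is case-insensitive; the guide specifies neither.

```python
import string


def password_problems(password, username):
    """Return a list of unmet requirements; an empty list means valid."""
    problems = []
    if len(password) < 8:
        problems.append("fewer than 8 characters")
    if not any(c in string.ascii_uppercase for c in password):
        problems.append("no uppercase letter (A-Z)")
    if not any(c in string.digits for c in password):
        problems.append("no digit (0-9)")
    # Assumption: any character that is not a letter or digit counts.
    if all(c.isalnum() for c in password):
        problems.append("no non-alphanumeric character")
    # Assumption: the username comparison ignores case.
    if username and username.lower() in password.lower():
        problems.append("contains the username")
    return problems
```

Returning the list of unmet rules, rather than a single true/false value, makes it easier to tell a user exactly what to change.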

Languages

DataHub Platform Application

  • English

DataHub strongly recommends leveraging the following tools:

Other Recommendations

Firewall

  • If a firewall is used, port 9090 must be open to the primary DataHub server so that the engine can communicate.

  • From the DataHub server, access to the source and destination locations should be open for normal traffic, including HTTP and HTTPS.

  • For best results when migrating data that resides on premises, install DataHub servers in the same data center as the source data to reduce access latency.

DataHub Service Account

These are the general account best practices when running a DataHub Platform clustered install.

  • Create a service account (DataHub User) whose password does not expire.
  • Create a new database named “DataHub” and make the DataHub user a DB Owner.
  • Ensure you have an Admin/Power User account that has full access to both sides (Source and Destination).
  • Ensure that api.portalarchitects.com is reachable from the server.
  • For Office 365/OneDrive for Business and Box, all users must be provisioned.
  • For Office 365/OneDrive for Business, the DataHub Transfer account(s) must be added as a Site Collection Administrator for each user.

Proxy Server

DataHub supports proxy server environments and utilizes the proxy settings from the underlying host Operating System; however, additional prerequisites may be necessary for a fully functional implementation. Note that proxy servers may introduce some latency in the transfer process. If a proxy is utilized, ensure that the DataHub servers have the manual proxy setup configured correctly. Proxy server environments add a layer of complexity and additional management that may impact troubleshooting, rate limiting, and overall performance.

  1. Log in to DataHub server(s) with the DataHub Service Account.
  2. Go to Windows Settings > Proxy Settings.
  3. Enable the "Use a proxy server..." setting.
  4. Update the Address and Port with the applicable proxy values.
  5. Select Save.

Review Database Options

Determine which database you will use:

Default Database | Embedded PostgreSQL

  • Ideal for those less familiar with managing separately installed databases

  • Does not require a paid license; free to use


Optional | Hosted PostgreSQL

  • During installation, you may configure a hosted PostgreSQL database that you can manage separately.


Optional | SQL Server

You can optionally create the database before running the installer.

  1. Log into SQL Server Management Studio.

  2. Create a new database.

    • Set the name appropriately.

    • Set the recovery model to Simple.

    • Add the Windows Service Account User or SQL Login to the database.

      • Security > Logins > {Windows Service Account User} > Properties > User Mapping > db_name > db_owner

Initial Install

Database Option

If you intend to use Microsoft SQL Server, you may use the following PowerShell command to skip parts of the install process.

  1. In PowerShell, navigate to the directory where skysync-{OS Type}-{Version}.exe exists.

  2. Run .\skysync-{OS Type}-{Version}.exe --install-pgsql 0

Verify Installer Properties

In some instances, Windows flags an executable as "untrusted" and blocks it. If this happens with the DataHub installer, the installation will run into problems. Before you begin, right-click the installer, select Properties, and verify the settings below. Select Apply to save any changes before selecting OK.

  • Read-only should not be selected.

  • Hidden should not be selected.

  • Unblock should be selected.

Run as Administrator

To install a new instance of DataHub, follow the steps below.

  1. Place the installer in a folder on the computer's hard drive (e.g., C:\DataHub), then right-click it and select Run as administrator.

  2. If prompted, allow the installer to make changes to your computer.

  3. Select Next on the Welcome screen.

Installation Directory

Specify Installation and Configuration directories. This can be any local path, but we recommend using the default settings.

Directory     | Default
Installation  | C:\Program Files\Syncplicity DataHub
Configuration | C:\ProgramData\Syncplicity DataHub\v4

Configure Windows Service

Configure which Windows user will run the DataHub service.

  • The installer defaults to a user named syncplicitydatahub with a randomly generated password. If you plan to re-use this account, we recommend entering your own password.

  • If you already have an account to run the DataHub service, you may enter that here.

  • If the entered user does not exist, it will be created as a local Windows account. Select Yes to create the user.

Begin Installation

If you are using SQL Server with Windows Authentication, ensure that the DataHub service account is a DB Creator.

  • Alternatively, you can pre-create a SQL Server database and grant this account DB Owner.

  • See Prepare for Installation section.

Select Next to begin installation.

PostgreSQL program files will be installed by default. This is intended. You may choose a different database further along in the installation process.

Once complete, leave Launch Syncplicity DataHub selected and select Finish.

Configuration Wizard

A new browser window will launch with the configuration wizard. If your browser doesn't immediately load to this screen, the DataHub service might still be starting. Your browser will continue to check until the website is started. The URL is http://localhost:9090/.

Select Get Started.

Select Continue on the next screen.

Configure Your Database

Default Database | PostgreSQL

  • Ideal for those less familiar with managing separately installed databases

  • Does not require a paid license; free to use

If you are using a PostgreSQL database, it must be configured to use English for messages. This is required for DataHub to properly deploy the server. 

Here, you can begin building your desired database.

  • For ease of use, the Use default database option uses an embedded version of PostgreSQL.

  • If you select Use a different database, you will have the option to either create your own PostgreSQL instance or deploy to SQL Server.

Use Default Database - Embedded PostgreSQL 

If you choose to use the default PostgreSQL database, select Create Database to continue.

Use a Different Database - Microsoft SQL Server

If you choose to use a different database, the configuration wizard will ask you to select a database provider. Select Microsoft SQL Server from the list.

Important - Please Review

After selecting a database provider, you will be prompted for connection details.

  1. Add the Server name and Database name and select the Authentication.

  2. Select Test Connection when complete to validate your configuration.

    1. You will see a “Connection Successful” message if the database is detected.

    2. If the database is not detected, a warning message will display to indicate that it will be created.

Use a Different Database - Microsoft SQL Server - Use Specific Credentials (Optional)

If you select Use specific credentials under Authentication, you will see two additional fields: User name and Password. These allow you to enter the credentials required to connect to the database.

Configure your Database - Summary

The information provided for your selected database displays. Review it to ensure it is correct. Select Create database to continue.

Use a Different Database - Hosted PostgreSQL 

If you do not use the default database, the configuration wizard will ask you to select a database provider. Select PostgreSQL from the list.

Select Hosted and enter the database information. Make sure you test the connection.

Optional Database Connection - Enter Manual Connection String

DataHub supports SQL Server and PostgreSQL databases. For complex database scenarios, DataHub allows you to supply a connection string with the options and format your database environment requires. These connection strings can also be extended with additional platform parameters and values.

If connectivity to the database cannot be made with default database configuration options, you may need to utilize the connection string to specify the correct parameters. For example, it is a common enterprise practice to change the default SQL Server port from 1433 to another port, requiring the use of a connection string.

SQL Server Connection Strings

SQL Server | Integrated Authentication (Windows Authentication)

Server=SqlServerMachineName;Database=SkySyncV4;Integrated Security=true


SQL Server | Use Specific Credentials (SQL Authentication)

Server=SqlServerMachineName;Database=SkySyncV4;User ID=SqlServerUserName;Password=MyPassword


SQL Server | Always On - Availability Group Listener with Multi-Subnet

Server=tcp:AGListenerName,1433;Database=SkySyncV4;Integrated Security=SSPI;MultiSubnetFailover=True


Azure SQL DB Standard Connection

Server=tcp:[ServerName].database.windows.net;Database=SkySyncV4;User ID=[LoginForDb]@[ServerName];Password=MyPassword;Trusted_Connection=False;Encrypt=True
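
The SQL Server strings above follow a regular pattern, so they can be assembled programmatically when a non-default port is involved. The Python sketch below is a hypothetical helper (the function name and parameters are not part of DataHub). It follows the "tcp:Name,Port" form used in the listener example and does not escape values that contain semicolons.

```python
def sql_server_connection_string(server, database, port=None,
                                 user=None, password=None):
    """Build a SQL Server connection string in the format shown in this guide."""
    # A non-default port is appended to the server name after a comma,
    # following the "tcp:Name,Port" form used in the listener example.
    target = f"tcp:{server},{port}" if port is not None else server
    parts = [f"Server={target}", f"Database={database}"]
    if user is None:
        # No SQL login supplied: fall back to Windows authentication.
        parts.append("Integrated Security=true")
    else:
        parts.append(f"User ID={user}")
        parts.append(f"Password={password}")
    return ";".join(parts)
```

For instance, passing port=14330 (an arbitrary example value) yields a string beginning with Server=tcp:SqlServerMachineName,14330 when the server name is SqlServerMachineName.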

PostgreSQL Connection Strings

PostgreSQL Standard Connection

dbProvider=npgsql,dbConnectionString=User ID=PostgreSqlUserName;Password=MyPassword;Host=ServerName;Port=5432;Database=SkySyncV4

Validate Your License

After configuring your database, DataHub will ask you for your license key. Your DataHub license will be provided to you after purchase. It should look something like this: 01234567-0123-0123-0123-0123456789ab.

Select Get started.

Enter your license key and select Activate (this button is disabled until you enter a license). You will see a green “License Activated” message indicating that the license was activated successfully. Congratulations! You've now successfully licensed DataHub.

Select Next to continue.

Validate Your License - Troubleshooting

If something goes wrong, an error message will be displayed in a box below the Activate button.

Verify your license key was entered correctly and try to activate again.

Common causes of failure include:

  • Firewalls blocking port 443 for HTTPS

  • Firewalls blocking access to https://api.portalarchitects.com, which is the address of our license server

  • DataHub not properly set up to use the corporate proxy server

    • DataHub uses the default proxy server specified for the user account under which it is executing

    • Ensure the proxy server is configured appropriately

  • Additional spaces or characters added accidentally with copy/paste of the license
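
Stray whitespace from copy/paste can be removed before the key is entered. The Python sketch below is a hypothetical helper, not part of DataHub. The pattern is an assumption inferred from the sample key shown earlier (groups of 8, 4, 4, 4, and 12 hexadecimal characters); real keys may differ.

```python
import re

# Assumed pattern, inferred from the sample key in this guide (8-4-4-4-12 hex).
KEY_PATTERN = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def clean_license_key(raw):
    """Strip whitespace picked up by copy/paste and check the key's shape.

    Returns the cleaned key, or None if it does not match the assumed pattern.
    """
    key = "".join(raw.split())  # removes spaces, tabs, and newlines anywhere
    return key if KEY_PATTERN.fullmatch(key) else None
```

A None result suggests that the pasted text is truncated or contains extra characters, which is worth ruling out before investigating firewall or proxy causes.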

Validate Your License - Accept End-User License Agreement

Scroll to the bottom and click Accept to accept the license agreement.

Create an Administrator Account

Create your first DataHub global admin account now. This user is granted all permissions by default. You can add more users later and set up different permissions, groups, and functionality.

Select Get started.

Create your administrator user for DataHub. By default, this user has access to everything in the application.

  • Username and Password are the only required fields.

  • Passwords must meet the following requirements: 

    • At least 8 characters

    • At least one uppercase letter (A-Z)

    • At least one digit (0-9)

    • At least one non-alphanumeric character (!@#$%)

    • Cannot contain the username

When complete, click Next and review the settings.

If you are satisfied with your credentials, click Create.

You will then be prompted to "Restart and launch." The restart and launch may take a few seconds to process. No loading icon is present during this time, but the browser will reload once the restart is complete.

This will not restart your computer. Only the DataHub service will be restarted.

Log in with your Administrator credentials and begin using DataHub!

Upgrade

The new installer also helps facilitate product updates. Follow the steps below to complete the upgrade process.

1. Stop the DataHub service on all machines in your cluster.

2. Place the installer in a folder on the computer's hard drive (e.g., C:\DataHub), then right-click it and select Run as administrator.

3. Click Next to begin upgrading DataHub's program files.

4. If any database updates are required, your browser will display a "We need to update" screen. Follow the prompts to update the database.

5. Once complete, select Finish, and DataHub will launch.
