Metadata mapping

Use metadata mapping to associate a set of file properties whose values you want to transfer between source and destination. It is analogous to metadata mapping in DataHub v3. You provide a common set of metadata properties between source and destination. When a file is transferred from source to destination (or vice versa), its property values are transferred according to the defined map.

Metadata Mapping Options

Metadata Map

You can define a transfer job's metadata mapping in the job's transfer options JSON object.

Property: schemas
Description: A list of metadata schemas.
Example:

"metadata_map": {
    "schemas": [
        {
            "mappings": [
                {
                    "source": {
                        "property": {
                            "type": "string",
                            "name": "Text"
                        }
                    },
                    "destination": {
                        "property": {
                            "type": "string",
                            "name": "Text"
                        }
                    }
                }
            ]
        }
    ]
}
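
As an illustration only, the nesting shown above can be generated programmatically. The following Python sketch uses a hypothetical helper, build_metadata_map (not part of the product), to produce the same structure from pairs of property names. How the resulting object is submitted with a job is outside its scope.

```python
import json


def build_metadata_map(pairs):
    """Build a "metadata_map" object from (source_name, destination_name) pairs.

    Hypothetical helper for illustration; it only assembles the JSON structure.
    """
    mappings = [
        {
            "source": {"property": {"name": src}},
            "destination": {"property": {"name": dst}},
        }
        for src, dst in pairs
    ]
    # A single schema with no source or destination id, as in the example above.
    return {"metadata_map": {"schemas": [{"mappings": mappings}]}}


options = build_metadata_map([("Text", "Text"), ("Caption", "Title")])
print(json.dumps(options, indent=4))
```

The printed object can be pasted into a job's transfer options in place of a hand-written metadata_map block.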

Schema Mapping ("schemas" in Metadata Map)

A schema mapping defines a metadata mapping between source and destination. You can provide a source and/or destination schema ID, which the transfer engine can use to retrieve additional schema information about the properties in the mapping from the platform provider.

Whether metadata is imported from one schema or from multiple schemas per file depends on the platform. If the platform supports multiple schemas, metadata from multiple schemas is imported. Office365 supports only one schema, so for each file the metadata importer first looks for custom metadata fields, then looks for metadata in the order in which the schemas are specified in the schemas block.
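
The lookup order for a single-schema platform can be sketched in Python. This is one reading of the paragraph above, not the importer's actual implementation; pick_schema, the CUSTOM label, and the shape of file_metadata are all hypothetical names chosen for the illustration.

```python
CUSTOM = "custom"  # hypothetical label for custom (non-template) metadata


def pick_schema(file_metadata, schema_order):
    """Return the name of the one schema whose metadata would be imported.

    file_metadata maps a schema name (or CUSTOM) to that schema's property values.
    schema_order lists schema names in the order they appear in the schemas block.
    Returns None when the file carries no metadata at all.
    """
    # Custom metadata fields are checked first.
    if file_metadata.get(CUSTOM):
        return CUSTOM
    # Otherwise the first schema in the configured order that has values wins.
    for schema in schema_order:
        if file_metadata.get(schema):
            return schema
    return None


order = ["Presales", "Legal"]
both_templates = {
    "Legal": {"Classification": "Restricted"},
    "Presales": {"Initiative": "Data Quality"},
}
print(pick_schema(both_templates, order))  # Presales
```

Under this reading, a file that has both Legal and Presales metadata keeps only the Presales values when Presales is listed first.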

Property: source
Description: The source schema definition to use in this mapping. When specified, it is used to load type information for all properties in the schema.
Example:

"source": {
    "id": "ptid:htc2v5PCAXAAAAAAAAARPw"
}

Property: destination
Description: The destination schema definition to use in this mapping. When specified, it is used to load type information for all properties in the schema.
Example:

"destination": {
    "id": "test schema"
}

Property: default
Description: Defines the schema as the fallback schema to use when mapping properties from source to destination if no mapper matching the specified schema is found.
Example:

"default": true

Property: mappings
Description: A list of property associations between source and destination. You can provide the type of each property, though it is not required.
Example:

"mappings": [
    {
        "source": {
            "property": {
                "name": "Text"
            }
        },
        "destination": {
            "property": {
                "name": "Text"
            }
        }
    },{
        "source": {
            "property": {
                "name": "Caption"
            }
        },
        "destination": {
            "property": {
                "name": "Title"
            }
        }
    }
]

Property Definition Mapping ("mappings" in Schema)

A property definition mapping defines the map between a source and destination property.

Property: source
Description: The source property.
Example:

"source": {
    "property": {
        "name": "Caption"
    }
}

Property: destination
Description: The destination property the source maps to.
Example:

"destination": {
    "property": {
        "name": "Title"
    }
}

Property: choices
Description: A map of choices. Use it when the property takes one of a set of values (e.g. yes, no) and those values differ between the source and destination.
Example:

"choices": [
    {
        "source": {
            "name": "true",
            "value": "True"
        },
        "destination": {
            "name": "y",
            "value": "Yes"
        }
    }
]

Property Choice ("choices" in Property Definition Mapping)

A property choice defines the map between a source and destination choice value. For instance, if the property has valid values of true/false in the source and yes/no in the destination, you can use a property choice to map them respectively.

Property: source
Description: The name and value of the source.
Example:

"source": {
    "name": "true",
    "value": "True"
}

Property: destination
Description: The name and value of the destination.
Example:

"destination": {
    "name": "y",
    "value": "Yes"
}
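
To make the effect of a choice map concrete, here is a minimal Python sketch. It assumes that entries are matched on the source "value" field and that unmatched values pass through unchanged; neither behavior is confirmed by this page. map_choice is a hypothetical helper, and the false/No entry is added only to complete the illustration.

```python
def map_choice(value, choices):
    """Translate a source choice value to its destination value.

    Assumption: entries are matched on the source "value" field, and a value
    with no matching entry is returned unchanged.
    """
    for choice in choices:
        if choice["source"]["value"] == value:
            return choice["destination"]["value"]
    return value


choices = [
    {
        "source": {"name": "true", "value": "True"},
        "destination": {"name": "y", "value": "Yes"},
    },
    # Hypothetical second entry, not taken from the example above.
    {
        "source": {"name": "false", "value": "False"},
        "destination": {"name": "n", "value": "No"},
    },
]

print(map_choice("True", choices))  # Yes
```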

Property Value Map ("source" and "destination" in Property Definition Mapping)

A property value map defines the actual property to map to (in either the source or destination) and what actions to take when a value is either missing or invalid during transfer.

Property: property
Description: The property to map to.
Example:

"property": {
    "name": "Caption"
}

Property: when_missing
Description: Action to take when the property value is missing during transfer.
Valid values: skip, default, calculate, fail
Example:

"property": {
    "name": "caption"
},
"when_missing": "skip"

Property: when_invalid
Description: Action to take when the property value is invalid or cannot be coerced to the destination type during transfer.
Valid values: fail, warn, skip, default
Example:

"property":{
    "name": "caption"
},
"when_invalid": "fail"

Property Definition ("property" in Property Value Map)

Property: name
Description: The name of the property. This is the only required field. Alternatively, you can use query_name, id, or caption.
Example:

"name": "Caption"

Property: type
Description: The type of the property. It is optional and is typically provided by the platform's schema definition when the schema ID is specified in the schema mapping.
Valid values: unknown, boolean, id, integer, datetime, decimal, html, string, uri, lookup, account
Example:

"type": "string"
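
Because the valid values above are fixed lists, a property value map can be checked before a job is submitted. The Python sketch below is a hypothetical pre-flight check, not a product feature. It assumes that any one of name, query_name, id, or caption is enough to identify a property, and it covers only the values listed on this page.

```python
# Allowed values as listed in the tables above.
WHEN_MISSING = {"skip", "default", "calculate", "fail"}
WHEN_INVALID = {"fail", "warn", "skip", "default"}
TYPES = {
    "unknown", "boolean", "id", "integer", "datetime", "decimal",
    "html", "string", "uri", "lookup", "account",
}
# Assumption: any one of these keys is enough to identify the property.
IDENTIFIER_KEYS = ("name", "query_name", "id", "caption")


def check_value_map(value_map):
    """Return a list of problems found in one property value map (empty if none)."""
    problems = []
    prop = value_map.get("property", {})
    if not any(key in prop for key in IDENTIFIER_KEYS):
        problems.append("property has none of: " + ", ".join(IDENTIFIER_KEYS))
    if "type" in prop and prop["type"] not in TYPES:
        problems.append("unsupported type: %r" % (prop["type"],))
    for key, allowed in (("when_missing", WHEN_MISSING), ("when_invalid", WHEN_INVALID)):
        if key in value_map and value_map[key] not in allowed:
            problems.append("unsupported %s: %r" % (key, value_map[key]))
    return problems


print(check_value_map({"property": {"name": "Caption"}, "when_missing": "skip"}))
print(check_value_map({"property": {"name": "Caption", "type": "text"}, "when_invalid": "ignore"}))
```

The first call reports no problems; the second reports the unsupported type and the unsupported when_invalid action.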

Example JSON

{
    "name": "Box -> Sharepoint (Metadata)",
    "kind":"transfer",
    "transfer":{
        "transfer_type": "copy",
        "source": {
            "connection": {
                "id": "36405b306fe84df69d55e007ed27967e"
            },
            "target": {
                "path": "/Source"
            }
        },
        "destination": {
            "connection": {
                "id": "92edba5e132645dda035580bfb14a063"
            },
            "target": {
                "path": "/Metadata Test/Destination"
            }
        },
        "audit_level": "trace",
        "performance": {
            "parallel_writes": {
                "requested": 2
            }
        },
        "metadata_map": {
            "schemas": [ {
                "mappings": [ {
                    "source": {
                        "property": {
                            "name": "Text"
                        }
                    },
                    "destination": {
                        "property": {
                            "name": "Caption"
                        }
                    }
                }]
            }]
        }      
    },
    "schedule":{
        "mode":"manual"
    }
}

Example JSON: Box - O365

This example includes custom metadata and template metadata.

To import metadata into O365, the columns must first be defined in the library (i.e. cog → library settings → create column).

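Source (Box) Metadata: Each field is written as the template name followed by the metadata field name. Fields not listed for a file have no value.

file1
    custom_attribute1: custom attribute1 value1
file2
    custom_attribute1: custom attribute1 value2
    Legal | Affiliates: Talbot Underwriting Ltd
    Legal | Classification: Restricted
    Legal | Agreement Type: Europe Distributor
file3
    custom_attribute1: custom attribute1 value3
    Presales - RFI Response Archive | Initiative: Data Warehousing
file4
    custom_attribute1: custom attribute1 value4
    Legal | Affiliates: Ross Products; Abbot Diabetics Care
    Legal | Classification: Restricted
    Legal | Agreement Type: Partner
    Presales - RFI Response Archive | Initiative: Data Replication
file5
    Legal | Affiliates: AMES
    Legal | Classification: Restricted
    Legal | Agreement Type: Cloud
file6
    Legal | Affiliates: Agility Logistics Ltd
    Legal | Classification: Restricted
    Legal | Agreement Type: License
    Presales - RFI Response Archive | Initiative: Data Quality
file7
    Presales - RFI Response Archive | Initiative: Master Data Management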

Destination (Office 365) Metadata: The schemas are used for import in this order: custom, Presales, Legal. Fields not listed for a file have no value.

file1
    custom_attribute2: custom attribute1 value1
file2
    custom_attribute2: custom attribute1 value2
file3
    custom_attribute2: custom attribute1 value3
file4
    custom_attribute2: custom attribute1 value4
file5
    Affiliates: AMES
    Classification: Restricted
    Agreement Type: Cloud
file6
    Initiative: Data Quality
file7
    Initiative: Master Data Management

{
    "name": "metadata mapping BOX -> O365",
    "kind": "transfer",
    "transfer":
    {
        "metadata_map": {
            "schemas": [
                {
                    "mappings": [
                        {
                            "source": {
                                "property": {
                                    "name": "Initiative"
                                }
                            },
                            "destination": {
                                "property": {
                                    "name": "Initiative"
                                }
                            }
                        }
                    ],
                    "source": {
                        "id": "Presales - RFI Response Archive"
                    }
                },{
                    "mappings": [
                        {
                            "source": {
                                "property": {
                                    "name": "Affiliates"
                                }
                            },
                            "destination": {
                                "property": {
                                    "name": "Affiliates"
                                }
                            }
                        },{
                            "source": {
                                "property": {
                                    "name": "AgreementType"
                                }
                            },
                            "destination": {
                                "property": {
                                    "name": "Agreement Type"
                                }
                            }
                        },{
                            "source": {
                                "property": {
                                    "name": "Classification"
                                }
                            },
                            "destination": {
                                "property": {
                                    "name": "Classification"
                                }
                            }
                        }
                    ],
                    "source": {
                        "id": "Legal"
                    }
                },{
                    "mappings": [
                        {
                            "source": {
                                "property": {
                                    "name": "custom_attribute1"
                                }
                            },
                            "destination": {
                                "property": {
                                    "name": "custom_attribute2"
                                }
                            }
                        }
                    ]
                }
            ]
        },
        "source":
        {
            "connection":
            {
                "id": "a7410cdc3d0c4d649b4544d98388d726"
            },
            "target": {
                "item": {
                    "root": true,
                    "name": "Box -> O365 - Metadata"
                }
            }
        },
        "destination":
        {
            "connection":
            {
                "id": "a92c9e36b52d44488168bcbd3a8dcfd5"
            },
            "target": {
                "path": "/Documents/Box -> O365 - Metadata"
            }
        }
    },
    "schedule": {
        "mode": "manual"
    }
}

If multiple versions of a file exist and are uploaded during the same job run, only the most recent metadata is preserved, and it is applied to every version of the file uploaded during that run.
