Fusion Platform

Our solution for increasing the power and value of your data.

RSMB Fusion Platform

Your data is an asset, but how much of its value is frozen and unused? The value of your data comes from what you are able to do with it.

That's where RSMB's data fusion solution comes in: it uses robust statistical techniques to allow you to fuse together different datasets in order to create an enriched dataset that can deliver more powerful insights. This allows you to extract significantly more value from your existing data assets.

We have delivered solutions using our fusion process for many clients over a number of years, normally providing an end-to-end tailored service. We are now making this platform available for all to use through an easy-to-use web front end, along with an API for systems integration.

Our new platform makes the process of fusing your own data straightforward if your team has the necessary data and statistical skills to go through the steps below.

Alternatively, if you don't feel confident doing all or part of the process, we can manage it for you in whole or in part.

Where your integration needs regular updates, we may be able to set it up so that you can run the updates yourself. This relies on the linking variables and importance weights remaining relatively consistent over time.

If you operate data management platforms, planning databases or data analysis systems, we can also work with you to implement our fusion methodology in your systems via our easy-to-use REST API. This provides a seamless integration experience with the data and analytics systems of your choice, enabling data fusion in your own products for an enriched user experience.

What is Data Fusion?

Data fusion is a technique that allows you to integrate two (or more) datasets, based on linking variables that are common to all of them.

For example, one dataset may hold person data for a TV viewing panel and the other may hold data for a lifestyle survey. Each dataset has variables that are unique to it, but the two also have variables in common, and these are used for fusing the datasets together. Typical linking variables include demographics, media consumption data, product ownership, interests and lifestyle. These aren't chosen arbitrarily: statistical analysis is used to determine which are appropriate for the fusion. Some variables will turn out to be more important than others, and this is handled by weighting during the fusion process.
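As a rough illustration of how importance weights can influence matching, the sketch below scores how different a donor is from a recipient by adding up the weights of the linking variables on which they disagree. The variable names, the weights and the `weighted_distance` function are all assumptions made for this example; they are not taken from the RSMB platform, and the platform's actual distance measure may differ.

```python
def weighted_distance(recipient, donor, weights):
    """Sum the importance weights of the linking variables whose values differ."""
    return sum(w for var, w in weights.items() if recipient[var] != donor[var])


# Hypothetical linking variables and importance weights, for illustration only.
weights = {"age_band": 3.0, "sex": 2.0, "region": 1.0}

recipient = {"age_band": "25-34", "sex": "F", "region": "North"}
donor_a = {"age_band": "25-34", "sex": "F", "region": "South"}  # differs on region only
donor_b = {"age_band": "35-44", "sex": "F", "region": "North"}  # differs on age band only

print(weighted_distance(recipient, donor_a, weights))  # 1.0
print(weighted_distance(recipient, donor_b, weights))  # 3.0
```

Both donors disagree with the recipient on exactly one variable, but donor_a is the closer match because region carries less weight than age band in this made-up setup.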

There are two types of fusion: constrained and unconstrained.

  • In an unconstrained fusion every respondent in the recipient database is matched with the most similar respondent(s) in the donor database. Donors can be matched multiple times or not at all.
  • In a constrained fusion every recipient and every donor are matched exactly once, and the resultant ‘respondent’ file is larger because both donors and recipients have been fragmented and then matched. The original donor and recipient metric data are exactly preserved.
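The unconstrained case can be sketched as a nearest-neighbour match. The example below is a minimal illustration, assuming a simple weighted-mismatch distance and invented respondents; the function names and data are hypothetical and do not describe the platform's implementation.

```python
from collections import Counter


def mismatch(a, b, weights):
    """Weighted count of linking variables on which two respondents differ."""
    return sum(w for var, w in weights.items() if a[var] != b[var])


def unconstrained_fusion(recipients, donors, weights):
    """Map each recipient id to its closest donor id (ties go to the first donor listed)."""
    return {
        rid: min(donors, key=lambda did: mismatch(rec, donors[did], weights))
        for rid, rec in recipients.items()
    }


# Hypothetical data, for illustration only.
weights = {"age_band": 2.0, "sex": 1.0}
recipients = {
    "r1": {"age_band": "25-34", "sex": "F"},
    "r2": {"age_band": "25-34", "sex": "M"},
    "r3": {"age_band": "55+", "sex": "F"},
}
donors = {
    "d1": {"age_band": "25-34", "sex": "F"},
    "d2": {"age_band": "55+", "sex": "F"},
    "d3": {"age_band": "35-44", "sex": "M"},
}

matches = unconstrained_fusion(recipients, donors, weights)
usage = Counter(matches.values())
print(matches)  # {'r1': 'd1', 'r2': 'd1', 'r3': 'd2'}
print(usage["d1"], usage["d3"])  # 2 0
```

Here donor d1 is used twice and donor d3 is never used, which is the behaviour described for unconstrained fusion.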

The most appropriate fusion approach depends on the use case. If it's important to have an integrated dataset that exactly replicates the separate datasets, then constrained fusion is the solution. For many uses of integrated data this isn't necessary, and unconstrained fusion can be used.
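To show what fragmentation means in a constrained fusion, the sketch below pairs recipients and donors in list order and splits their weights so that each respondent's full weight is used exactly once. This is a deliberate simplification: it assumes integer weights, non-empty lists with equal weight totals, and a fixed pairing order, whereas a real constrained fusion would also choose the pairings to keep matched respondents as similar as possible. The function name and data are hypothetical.

```python
def constrained_fragments(recipients, donors):
    """Split (id, weight) lists into (recipient_id, donor_id, weight) fragments.

    Assumes both lists are non-empty and their weights sum to the same total.
    """
    fragments = []
    i = j = 0
    r_left = recipients[0][1]
    d_left = donors[0][1]
    while i < len(recipients) and j < len(donors):
        share = min(r_left, d_left)  # weight this fragment can carry
        fragments.append((recipients[i][0], donors[j][0], share))
        r_left -= share
        d_left -= share
        if r_left == 0:  # recipient fully allocated, move to the next one
            i += 1
            if i < len(recipients):
                r_left = recipients[i][1]
        if d_left == 0:  # donor fully allocated, move to the next one
            j += 1
            if j < len(donors):
                d_left = donors[j][1]
    return fragments


# Hypothetical weighted respondents; both files sum to 300.
recipients = [("r1", 100), ("r2", 200)]
donors = [("d1", 150), ("d2", 50), ("d3", 100)]

fragments = constrained_fragments(recipients, donors)
for row in fragments:
    print(row)
```

The fused file has four rows, more than either input file, yet the weights of the fragments for each recipient and each donor add back up to the original weights. That is why totals computed from the fused file agree with those from the separate datasets.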

Steps for a data fusion

The core steps of a fusion are the same whether we are managing your fusion for you or you're doing it yourself. Yellow blocks are steps on the platform; teal blocks are preparation and QC.

Ensure the Recipient and Donor datasets are compatible and likely to yield meaningful results.
Prepare the data. Ensure the datasets are clean, with no poor-quality or incomplete data.
Create Parameter Files, notably:
- a Hooks File that identifies the key attributes or elements that allow the data to be linked, and
- a Critical Cells File that identifies elements that are crucial for a high-quality fusion.
Create Importance Weights if required
Set up your project on the platform
Choose method of fusion: constrained or unconstrained
Upload files
Run Validation
Run Fusion
Quality Control Results
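Some of the preparation checks can be scripted before anything is uploaded. The sketch below is a hypothetical local check, not the platform's Run Validation step: it assumes each dataset has been loaded as a list of dictionaries and reports linking variables with missing values or with category codes that differ between the two files. The function name and the sample rows are invented for illustration.

```python
def check_linking_variables(recipient_rows, donor_rows, linking_vars):
    """Return a list of human-readable problems found in the linking variables."""
    problems = []
    for var in linking_vars:
        # Count blank or absent values in each file.
        for name, rows in (("recipient", recipient_rows), ("donor", donor_rows)):
            missing = sum(1 for row in rows if row.get(var) in (None, ""))
            if missing:
                problems.append(f"{name}: {var} missing in {missing} row(s)")
        # Compare the sets of category codes actually used in each file.
        rec_codes = {row.get(var) for row in recipient_rows} - {None, ""}
        don_codes = {row.get(var) for row in donor_rows} - {None, ""}
        if rec_codes != don_codes:
            problems.append(f"{var}: category codes differ between files")
    return problems


# Hypothetical rows, for illustration only.
recipient_rows = [
    {"age_band": "25-34", "sex": "F"},
    {"age_band": "35-44", "sex": ""},
]
donor_rows = [
    {"age_band": "25-34", "sex": "F"},
    {"age_band": "35-44", "sex": "M"},
]

problems = check_linking_variables(recipient_rows, donor_rows, ["age_band", "sex"])
for problem in problems:
    print(problem)
```

In this example the check flags the blank sex value in the recipient file and the resulting difference in category codes, both of which would need resolving during data preparation.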