AI can’t spot credit card fraud if it hasn’t seen enough fraudulent transactions. Self-driving cars won’t prevent crashes if they don’t get enough data on car accidents. The algorithms need to be trained, but these events are (thankfully) too rare to supply the data that training requires.
What’s the answer?
Enter the age of synthetic data.
Synthetic data, or, as I prefer to call it, Unreal World Data (see what I did there?), can accelerate development on many healthcare fronts, starting with orphan diseases, where data is costly, sparse and incomplete. In so many of those cases it has been smarter just to have a direct patient relationship. Of course, we’ve had synthetic control arms for a while now, but the technology is now advanced enough to simulate patients with the disease. Perhaps most excitingly, the data carries far fewer privacy concerns: it doesn’t describe real people, so it can be shared more openly with partners and amongst different teams. It can also help overcome bias or health equity concerns.
Whilst big tech has unsurprisingly muscled into this area (Meta acquired startup AI.Reverie at the end of 2021), pioneering RWE firm Aetion has just made its first acquisition, Replica Analytics, a leading synthetic data firm: https://lnkd.in/dnhG8Ucm. Syntegra, MDClone, Accenture and Phesi are also pushing forwards.
Gartner ambitiously predicts that synthetic data will account for 60% of the data used to develop AI and analytics projects by 2024. It starts this year.