A few weeks ago, I applied for a new Udacity program on Private AI. Funnily enough, it's co-organized by Facebook - one of the main reasons why Private AI is needed in the first place.

What is Private AI?

AI - or rather, Machine Learning - is built and trained on top of millions of data points. These data points are difficult to acquire and can be extremely expensive - unless you're at the source.

One reason many big tech companies are doing so well in this space is their unlimited access to proprietary data. In many cases - think of tagging your friends on FB/IG or solving a Google Captcha - their user base even does the labelling for them, a process that is otherwise extremely slow and expensive.

While tech companies have gotten a little more cautious about it, there's (almost) nothing stopping them from taking your data and training Machine Learning algorithms with it. And since data analysis is almost always part of that work, it means a Data Scientist is potentially looking through your images. Scary, right?

Additionally, and maybe even more importantly, you are not in charge. If the service you are using wants to grab your data and train Machine Learning-powered weaponry with it, it can. The research lab applying AI to prevent cancer or other diseases? It gets left out.

Private AI describes a set of methods, algorithms and encryption techniques that make it possible to train Machine Learning models without ever possessing the data or having a look at it. Potentially, your data doesn't even leave your device (e.g. in the case of Google Photos).
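To make the "data never leaves your device" idea concrete, here is a minimal, library-free sketch of federated averaging - the pattern behind on-device training. All names and numbers below are hypothetical, chosen purely for illustration:

```python
# Minimal federated averaging sketch (pure Python, no real library API).
# Each "device" fits a tiny linear model y = w * x on its own private data,
# then shares only the learned weight - never the data itself.

def local_weight(xs, ys):
    """Least-squares fit of w for y = w * x, computed on-device."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Private datasets that never leave their owners (made-up numbers):
device_data = [
    ([1.0, 2.0, 3.0], [2.1, 3.9, 6.2]),  # Alice's data, w is roughly 2
    ([1.0, 2.0, 4.0], [1.9, 4.2, 7.8]),  # Bob's data,   w is roughly 2
]

# Each device computes its model update locally...
local_weights = [local_weight(xs, ys) for xs, ys in device_data]

# ...and the server only ever sees the averaged model, not the data.
global_weight = sum(local_weights) / len(local_weights)
print(f"global model: y = {global_weight:.2f} * x")
```

The key property: the server aggregates models, not data - each device's raw examples stay local.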

The folks at OpenMined have now spent multiple years perfecting a framework, PySyft, that does exactly that.
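One of the encryption building blocks behind such frameworks is additive secret sharing, which lets several parties jointly compute on values (say, model updates) that none of them can individually read. The sketch below is plain Python for illustration only - it is not PySyft's API, and the modulus and function names are my own choices:

```python
import random

Q = 2**62  # a large modulus; all arithmetic happens modulo Q

def share(secret, num_workers=3):
    """Split an integer into random-looking shares that sum to it modulo Q.
    Any subset short of all the shares reveals nothing about the secret."""
    shares = [random.randrange(Q) for _ in range(num_workers - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares

def reconstruct(shares):
    """Only a party holding ALL shares can recover the secret."""
    return sum(shares) % Q

# Shared values can be added without ever being decrypted:
a_shares = share(25)
b_shares = share(17)
sum_shares = [(a + b) % Q for a, b in zip(a_shares, b_shares)]

assert reconstruct(sum_shares) == 42  # 25 + 17, computed on hidden values
```

Because addition works directly on the shares, gradients or weights from many users can be aggregated while every individual contribution stays hidden.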

Why is it needed?

Data regulations are getting stricter. While business owners hate them (because they often mean more work and stricter data processing), it's good news for consumers. Your data is getting safer, and a lot of things that were allowed just 1-2 years ago can't be done anymore (or at least not without your consent).

On the downside, it's also getting more difficult for small businesses and startups to acquire data - many of which are applying AI to good, fun and helpful causes. Nowadays it's normal to use public datasets or scrape your own from various sources. While that's a grey area today, it may soon not be allowed at all.

Private AI will make it possible for consumers (or anyone with a proprietary data source) to decide who to give their data to, without having to worry about it getting stolen or sold. In many cases, they will even get paid for it. Small businesses in particular will gain an ethical and comparably priced source of high-quality data.

Imagine getting a breast cancer scan - then making that scan available only to ethical research labs and getting paid for it. Today the situation looks different: your breast cancer scan is stored anyway, but often sold to the highest bidder and/or made available to everyone.

Selling your data might sound like something you're not interested in, but the stigma around making your data accessible will change once the consumer is in charge, the data is encrypted and safe, and everyone knows how much their data is truly worth.