Last week, Microsoft announced a new effort, the Open Data Campaign, to promote data sharing and usage. According to a blog post by Jennifer Yokoyama, the company's Chief Intellectual Property Counsel, the tech giant plans to roll out the Campaign in three steps.

First, it will publish a new set of guidelines on how the company shares information with other entities. Next, Microsoft will invest in tools that facilitate data sharing. Last but not least, over the next two years it will form 20 new partnerships around shared data with leading institutions in the open data movement, such as the Open Data Institute and The Governance Lab (GovLab) at the New York University Tandon School of Engineering. To kick things off, Microsoft will begin sharing data on broadband access via its Airband initiative. These data will eventually be combined with other datasets to help speed up broadband connectivity.

Reasons for data liberation

The primary aim of Microsoft’s data democratization plan is to address the growing divide in data ownership and to help organizations of all sizes realize the power of data and artificial intelligence (AI). The tech giant believes that over 50% of the data created by online interactions are collected by fewer than 100 companies worldwide, and that most people proficient in AI work only in the technology sector. As a result, the benefits of data and the new technologies that come with them appear to be enjoyed by some but not others.

PwC once predicted that 70% of the economic value produced by AI will go to either the US or China. While such a divide may be difficult to halt entirely, Microsoft calls for better synergy between the public and private sectors, urging both to open up their resources and expertise and make data sharing easier for the benefit of everyone. Microsoft is not alone in this liberation movement. The Organisation for Economic Co-operation and Development (OECD) reckons many countries may enjoy a 1-2.5% gain in GDP if data are shared more openly.

In fact, the ongoing Covid-19 pandemic has, to some extent, driven a collective embrace of data. On 16 March, researchers and leaders from the Allen Institute for AI, the Chan Zuckerberg Initiative, Georgetown University’s Center for Security and Emerging Technology, and the National Library of Medicine at the National Institutes of Health set up the COVID-19 Open Research Dataset (CORD-19).

The dataset contains more than 29,000 scientific articles, either published or posted on preprint servers. It’s believed to be the most comprehensive collection of literature on the coronavirus. Most notably, all articles come in a format that is easily digested by computers. This is intended to speed up the work of AI experts, in the hope that they will develop new solutions or techniques to battle the coronavirus and related diseases.

What does it mean in the long run?

Economists agree on the long-term benefit of wider data sharing. Unlike natural resources such as oil, data are not depleted no matter how many times they are used and reused. Nevertheless, to expedite data sharing, present data privacy and protection regulations may need to be strengthened to protect all parties from possible disputes. Likewise, certain software and platforms need to be standardized so that data can be shared smoothly.

From a corporate perspective, Microsoft’s move is partly altruistic, since it encourages non-commercial data sharing. At the same time, like IBM, Microsoft is perhaps keener on selling data-related services and software than on trading data itself. That means the Campaign may never receive support from companies that profit directly from data, such as Facebook and Google.


Author Bio

Hazel Tang: A science writer with a data background and an interest in current affairs, culture, and the arts; a no-med from an (almost) all-med family.