New trends identified by computers that scour web, satellite data
‘We are working with all kinds of data,’ said Dr Jon Blower, at the University of Reading, UK. ‘Smart machines can then help us find the needle in the haystack or, for example, identify interesting trends in petabytes of satellite imagery.’
It means, among other things, that new ways have been discovered to improve the efficiency of shipping by tracking ocean eddies, and to improve greenhouse gas emission estimates by tracking land-use change in the UK.
The technology taps into a growing trend to open up access to the vast amounts of data being collected every day by geographic information systems (GIS), including the EU’s Copernicus system, which observes the Earth using a family of so-called Sentinel satellites.
‘Satellite imagery forms a big part of it,’ said Dr Blower. ‘The Sentinel programme got going part-way through our project, so we were able to take advantage of that data. There’s also a lot of open geographic data that we can use, such as OpenStreetMap.’
His project is MELODIES, a pan-European initiative funded by the EU to develop ways of interpreting and analysing these vast data stores. The project team is doing so with a technology called linked data, which enables datasets to be connected and shared.
The Issue
All projects receiving Horizon 2020 funding have the obligation to make sure any peer-reviewed journal article they publish is openly accessible, free of charge.
In July, the EU updated the rules for its research funding to say that the data underlying publications should also be made available on an open-access basis by default, unless mitigating circumstances apply, such as intellectual property rights, personal data protection or national security concerns.
It’s part of the policy of Open Science championed by Carlos Moedas, European Commissioner for Research, Science and Innovation.
‘We want to show open data publishers that if they put the data out there, people will use it,’ he said.
In September, the European Commission proposed new copyright rules that will give universities, research institutes, and research-performing companies more legal rights to perform text and data mining. The proposed law would benefit those working with big data for public interest purposes.
Interoperability
Researchers at the coal face of open data say that linking information together and making vast datasets searchable is by far the biggest problem they encounter, whether they are start-ups, universities or research institutes.
That’s why a number of researchers are working specifically on linked data, creating tools that allow datasets of different sizes and formats to be connected and searched.
‘Linked data is a framework of tools and practices for exposing, sharing and connecting information,’ explained Jesús Estrada from the SmartOpenData project, which created a way of linking up a number of different European environmental datasets.
‘The key idea is that each data provider wants to publish information and this information is easily understandable by others.’
This is done by creating meaningful, or semantic, connections between datasets that can identify different representations of the same content. For example, a linked open data system would allow geographical data points from different datasets to be interpreted as ‘roads’ or ‘rivers’ and contain relationships, such as ‘road goes over river’, allowing people to obtain comparable or complementary information from different sources.
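To make the idea concrete, here is a minimal sketch in Python of how such semantic links might work. It is not the MELODIES or SmartOpenData implementation: the two datasets, every identifier (`agencyA:...`, `agencyB:...`, `ex:...`) and the helper functions are hypothetical, and a real system would more likely use RDF tooling than hand-rolled tuples. The only borrowed conventions are the standard `rdf:type` and `owl:sameAs` property names, used here as plain strings.

```python
# Minimal linked-data sketch. All datasets and identifiers are hypothetical.
# Each fact is a (subject, predicate, object) triple.

TYPE = "rdf:type"        # standard RDF property name, used here as a plain string
SAME_AS = "owl:sameAs"   # standard OWL property name, used here as a plain string

# Hypothetical dataset A: a road agency that knows a road crosses "some river".
dataset_a = [
    ("agencyA:road/12", TYPE, "ex:Road"),
    ("agencyA:road/12", "ex:goesOver", "agencyA:feature/77"),
]

# Hypothetical dataset B: a water authority describing the same river differently.
dataset_b = [
    ("agencyB:river/3", TYPE, "ex:River"),
    ("agencyB:river/3", "ex:waterQuality", "good"),
]

# The semantic link: two representations of the same real-world thing.
links = [("agencyA:feature/77", SAME_AS, "agencyB:river/3")]


def build_resolver(triples):
    """Return a function mapping each identifier to one representative
    per group of sameAs-linked identifiers (a small union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)
    return find


def merge(*datasets):
    """Combine datasets into one graph, rewriting linked identifiers
    to a shared representative and dropping the sameAs triples."""
    all_triples = [t for d in datasets for t in d]
    find = build_resolver(all_triples)
    return {(find(s), p, find(o)) for s, p, o in all_triples if p != SAME_AS}


def rivers_under_roads(graph):
    """List (road, river, water quality) for every road that goes over a river."""
    rivers = {s for s, p, o in graph if p == TYPE and o == "ex:River"}
    quality = {s: o for s, p, o in graph if p == "ex:waterQuality"}
    return sorted(
        (s, o, quality.get(o))
        for s, p, o in graph
        if p == "ex:goesOver" and o in rivers
    )


linked = rivers_under_roads(merge(dataset_a, dataset_b, links))
unlinked = rivers_under_roads(merge(dataset_a, dataset_b))
print(linked)
print(unlinked)
```

Without the link, neither dataset can answer the question on its own: one knows about the road and the other about the river. Once the two identifiers are declared to be the same thing, a single query can draw on both sources.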
The SmartOpenData project set up five pilots in which its system is used for environmental management. In Italy, for example, datasets on algae, sewage, pollutants from rivers, chemicals, groundwater monitoring and wastewater treatment have been combined to help authorities monitor overall water quality in Sicily.
The next step is to allow small- and medium-sized enterprises to build on this and take advantage of a growing market. Open data is projected to create 25 000 new jobs in the EU between 2016 and 2020, with market size growing from EUR 55.3 billion in 2016 to EUR 75.7 billion by 2020.