In this project we explored how semantic web technologies can be used to link together open data from various sources in hopes of providing some new insights.
As part of the Semantic Web course at the Vrije Universiteit in Amsterdam, Robin van der Markt and I worked on this project during one busy month. We wanted to explore similarities and differences between the Netherlands and Norway based on censuses from the 19th and 20th centuries. Both countries have made censuses from this period digitally available, but in different formats.
Based on the work done by the CEDAR project, many of the Dutch censuses are available in RDF. As this was not the case for the Norwegian data, we decided to convert the Norwegian datasets to RDF. Even though the Norwegian census from 1910 is available in JSON format online today, this did not fit our purpose, as there is a constraint on the number of results for each query. Thankfully the same data is available through the NAPP project.
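To give an idea of what such a conversion involves, here is a minimal sketch in Python that turns tabular census rows into RDF triples in N-Triples syntax. It is not the code we used in the project: the `example.org` namespace, the column names and the sample rows are all made up for illustration, and a real conversion would reuse an existing vocabulary and most likely an RDF library.

```python
import csv
import io

# Hypothetical namespace; the project's real vocabulary is not shown here.
BASE = "http://example.org/census/no/1910/"

# Invented sample rows with assumed column names, for illustration only.
SAMPLE_CSV = "id,occupation,municipality\n1,barber,Kristiania\n2,miner,Røros\n"


def escape_literal(value):
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_ntriples(row):
    """Turn one census row (a dict) into a list of N-Triples lines."""
    subject = f"<{BASE}person/{row['id']}>"
    return [
        f'{subject} <{BASE}occupation> "{escape_literal(row["occupation"])}" .',
        f'{subject} <{BASE}municipality> "{escape_literal(row["municipality"])}" .',
    ]


def convert(csv_text):
    """Convert CSV text with a header row into a flat list of N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        triples.extend(row_to_ntriples(row))
    return triples


if __name__ == "__main__":
    for line in convert(SAMPLE_CSV):
        print(line)
```

Each row becomes one subject with one triple per column, which is the simplest possible mapping; it stores occupations as plain strings rather than as links to a shared classification.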
With limited time, we decided to focus only on the occupational information in the censuses. Using the information available from the HISCO website, we were able to construct an “occupational link” between the two censuses, which differ considerably in format. This link was based on a dataset of more than 1600 different occupations from the HISCO website. But as the Dutch and Norwegian datasets were so different from each other, we ended up with a mere 146 occupations that we could find in both datasets.
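The idea behind the occupational link can be sketched as a simple intersection: if each census maps a shared occupation code to its own local label, only the codes present in both can be compared. The snippet below is a hedged illustration in Python, not our actual implementation. The codes (`H001` and so on) are placeholders rather than real HISCO codes, and the labels are illustrative examples only.

```python
def shared_occupations(dutch, norwegian):
    """Return the codes found in both mappings, with both local labels.

    Each argument maps an occupation code to that census's own label.
    """
    common = dutch.keys() & norwegian.keys()  # set intersection of the codes
    return {code: (dutch[code], norwegian[code]) for code in sorted(common)}


# Placeholder codes and illustrative labels; not taken from the real datasets.
DUTCH = {"H001": "kapper", "H002": "mijnwerker", "H003": "spoorwegbeambte"}
NORWEGIAN = {"H001": "barber", "H002": "gruvearbeider", "H004": "sykkelmaker"}

if __name__ == "__main__":
    for code, (nl_label, no_label) in shared_occupations(DUTCH, NORWEGIAN).items():
        print(code, nl_label, no_label)
```

Occupations that appear in only one census drop out of the result, which is why a large code list can shrink to a much smaller set of comparable occupations.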
The application is currently available online: http://chefdev.nl/semanticweb/. Here you’ll find similarities between the countries, such as clusters of miners in more rural areas and barbers living mainly in the big cities. You’ll also find differences: the Netherlands employed a higher number of people in the railways, while Norway (somewhat counterintuitively) had more people working as bicycle makers than the Netherlands. If you want to read about our project in more detail, the project report is available for download here: Final report SW 2013
If you are new to the semantic web, there are various sites we would recommend visiting, such as: