Information on how culture and cultural identities evolved in the past is 'hidden' in enormous amounts of unstructured data, such as texts written and printed over a period of more than 20 centuries, artefacts created over an even longer period, and audio-visual documents produced since the last quarter of the 19th century. In addition to these unstructured data, we have (semi-)structured data, some born structured (such as vital records) and some made structured by the labour of generations of researchers in the humanities and social sciences.
Until recently, research addressing questions about culture and identity depended on experts’ abilities to identify potentially relevant pieces of information in archives, libraries and museums. Because such research was extremely time-consuming, it was hardly possible to look at all data or test alternative analyses.
Digital data are accessible to a large number of researchers, and digital tools will enable them to verify the data selections and interpretations of other researchers. Researchers’ use of these tools (the Digital Turn) will make it possible to address, and at least partly answer, some of the challenges posed by Horizon 2020. Comparative analysis of texts and images covering a certain topic over time, even over centuries, will show how opinions gradually evolved and whether they developed differently in different regions, social strata, or religions. Perhaps equally important, data selections and interpretations will be grounded in a solid empirical basis that other researchers can verify, which will make research outcomes more valuable both for policy development and for future scholarly endeavours.
In order to profit from the potential offered by the Digital Turn, the following problems must be tackled: (1) there is no integrated approach for the humanities in dealing with digital data and tools; (2) existing data sets are not connected and tools often apply to idiosyncratic formats only; and (3) there is a lack of training of researchers and students in using digital methods to analyse large data sets.
Humanities researchers in the Netherlands are leading partners in the European infrastructure programmes CLARIN and DARIAH. In order to maintain and extend this leading position, they have joined forces in the Common Lab Research Infrastructure for the Arts and Humanities (CLARIAH) to offer a solution to these problems: an infrastructure (called the Common Lab) that provides researchers and research groups with integrated access to unprecedented collections of seamlessly interoperating digital research resources and innovative tools to process them in virtual workspaces, thus enabling data-intensive science in the humanities. The Common Lab will provide researchers with intelligent access methods for exploring resources and innovative ways of combining different resources into virtual collections. In this way, information that is hidden in unstructured textual and multimedia documents can be disclosed and, where useful, analysed in combination with structured databases.
The Common Lab will be easy to access and use for researchers with limited technical training. It will have an open structure that allows for the future addition of new data sets and tools (Open Access whenever possible). Dissemination activities, educational programmes and training sessions will enable researchers and students to acquaint themselves with new research methodologies. Companies and public organisations, especially in the top sector Creative Industry and in the cross-sectoral ICT Roadmap, support the Common Lab because it addresses problems they are also facing. An intensive outreach programme will help create opportunities for developing applications and services relevant for society or commerce.