A corpus is a fundamental tool for investigating the grammatical features of a language. It records linguistic variation and language use across different countries, creating a common base for different types of studies on spontaneous and semi-spontaneous data.
Although various studies have investigated the evolution of signs in Italy across geographic areas (diatopic variation) and over time (diachronic change), no national corpus had been developed before the PRIN project (Research Project of National Interest). The project was financed by the Ministry for Education, University and Research in November 2008. It lasted two years and was carried out in partnership by three universities: University of Urbino (then moved to La Sapienza University of Rome), Ca’ Foscari University of Venice and Bicocca University of Milan.
One of the main results of this project was the creation of the first national Corpus of LIS. The LIS Corpus is based on high-quality video recordings saved in mpg2 format. The large quantity of videos collected and the representative range of signers recorded were important factors in obtaining accurate analyses. Moreover, geographic and social factors were taken into account in building the corpus. The data were recorded in 10 cities covering Northern, Central and Southern Italy: Turin, Milan, Brescia, Bologna, Florence, Rome, Salerno, Bari, Catanzaro, and Ragusa. For the purposes of the project, only Deaf people were involved in the research; other people linked to Deaf society and culture, such as the hearing families of Deaf people or their interpreters, were not included. However, the participants were not limited to native signers (who make up between 5% and 10% of Deaf people in Italy): Deaf signers who mainly used LIS in everyday communication, despite having learnt sign language later in life, were also included. Other social factors taken into account were gender, deafness in the family, type of school attended, educational level, lifestyle with respect to the city or country where they live, and social status (in hearing communities and among Deaf people).
On average, 18 participants were selected in each city and divided into three age groups of 6: young, middle-aged, and old. Only Deaf researchers or collaborators took part in the recording sessions, in order to minimize the effect of the observer's paradox, namely the influence of the researchers on the linguistic choices of the signers. Furthermore, the sessions took place in locations that were familiar to and commonly frequented by the Deaf informants, in order to avoid an uncomfortable atmosphere and to allow more spontaneous productions.
Four different types of data were recorded: spontaneous conversations, individual narrations, dialogues, and picture-naming.
The spontaneous conversation section involved three Deaf people and lasted about 45 minutes. Free conversations are good resources for collecting frequent linguistic structures, but they are less useful for investigating the occurrence of specific constructions, in that they lack negative evidence.
Individual narrations consisted of individual storytelling lasting only a few minutes. The signer sat in front of another participant, whose function was to reduce the anxiety caused by the presence of the camera and to make the narration look more spontaneous.
The third section aimed at investigating the production of questions. Participants were therefore invited to ask each other questions in order to obtain detailed descriptions of a car accident. Although productions of this type are not completely spontaneous (in that there is a guideline to follow), the task is useful for eliciting specific linguistic structures, in this case wh-questions.
During the fourth task, participants were asked to provide the sign(s) for some pictures in order to explore linguistic variation among signers due to sociolinguistic variables such as age and geographic origin. For this task, signers were asked to produce all the signs they knew for the same picture. The pictures belonged to different semantic fields and sign types: colours, months, family words, compounds, words without signs, classifiers, signs expressed through dactylology (hand alphabet), initialized signs, diachronically evolved signs and diatopically evolved signs.
The data were annotated in a separate file using dedicated software called ELAN.
Figure: ELAN dialog box (recreated from Mantovan, 2015: 111)
ELAN is a piece of software created at the Max Planck Institute for Psycholinguistics in Nijmegen, the Netherlands. It runs on several operating systems and can be downloaded for free. ELAN allows the simultaneous analysis of four videos in the video viewer. Linguistic information can be hierarchically organised in the tier panel and then inserted in the annotation panel with personal classifications, depending on the specific research interests. In the upper right corner, the tabs panel allows users to visualize the annotations in various formats and to modify the volume and playback rate of the videos. Once the annotation was concluded, the data were exported to Excel to run the statistical analysis of the corpus.
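The step from tiered annotations to a spreadsheet can be illustrated with a minimal sketch. ELAN stores annotations in an XML-based file (.eaf), in which each tier contains time-aligned annotations that refer to shared time slots. The Python code below flattens such a file into rows that can be saved as CSV and opened in Excel. It is not the procedure used in the project, which may well have relied on ELAN's own export functions. The sample fragment, the tier names (`gloss`, `clause_type`), the glosses, and the function names `eaf_to_rows` and `rows_to_csv` are all hypothetical and serve only as illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, hand-written EAF fragment. Tier names, glosses and times
# are invented for illustration and are NOT taken from the LIS Corpus.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="850"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="900"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="1700"/>
  </TIME_ORDER>
  <TIER TIER_ID="gloss">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>CAR</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
          TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>ACCIDENT</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="clause_type">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a3"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>wh-question</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""

FIELDS = ["tier", "start_ms", "end_ms", "value"]


def eaf_to_rows(eaf_text):
    """Flatten time-aligned annotations into one dict per annotation."""
    root = ET.fromstring(eaf_text)
    # Map each time-slot ID to its offset in milliseconds.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
        if ts.get("TIME_VALUE") is not None
    }
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            value = ann.findtext("ANNOTATION_VALUE", default="") or ""
            rows.append({
                "tier": tier.get("TIER_ID"),
                "start_ms": slots.get(ann.get("TIME_SLOT_REF1")),
                "end_ms": slots.get(ann.get("TIME_SLOT_REF2")),
                "value": value.strip(),
            })
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text, which Excel can open directly."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(eaf_to_rows(SAMPLE_EAF)))
```

Each row pairs a tier name with a start time, an end time and an annotation value, so counts per tier or per value can then be computed with ordinary spreadsheet tools. The sketch covers only time-aligned annotations; annotations on dependent tiers that refer to a parent annotation rather than to time slots would need additional handling.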