Making an online dictionary for Central Australian sign languages

In Central Australia, sign languages are used alongside speech, gesture and other semiotic systems such as sand drawing. These sign languages have been described as ‘alternate’, as they are not generally the primary mode of communication in these communities but rather used instead of speech in particular cultural circumstances (Green & Wilkins, 2014; Kendon, 1988 [2013]). In this paper we discuss a sign language documentation and online resource development project for Indigenous sign languages from Central Australia. In particular we track our workflow from sign recording sessions through to the publication of selected video clips of sign in an online sign language dictionary (www.iltyemiltyem.com). This project represents the first comprehensive attempt to document sign language knowledge in the Central Australian region since Kendon’s research in the 1980s, and his in-depth analysis of the sign languages found in some Central Australian communities provides a foundation for the current research (Kendon, 1988 [2013]).


Introduction
In Central Australia, sign languages are used alongside speech, gesture and other semiotic systems such as sand drawing. These sign languages have been described as 'alternate', as they are not generally the primary mode of communication in these communities but rather used instead of speech in particular cultural circumstances (Green & Wilkins, 2014; Kendon, 1988 [2013]). In this paper we discuss a sign language documentation and online resource development project for Indigenous sign languages from Central Australia. In particular we track our workflow from sign recording sessions through to the publication of selected video clips of sign in an online sign language dictionary (www.iltyemiltyem.com). This project represents the first comprehensive attempt to document sign language knowledge in the Central Australian region since Kendon's research in the 1980s, and his in-depth analysis of the sign languages found in some Central Australian communities provides a foundation for the current research (Kendon, 1988 [2013]).
The web-based dictionary project is titled Iltyem-iltyem, an Anmatyerr term meaning 'signaling with hands, using handsigns'. The key purpose of Iltyem-iltyem is to support the maintenance, teaching and learning of sign languages in Central Australia by providing an accessible and contemporary online media resource. We aimed to create a media product using tools that are open-source, widely used, aesthetically pleasing and 'pertinent' (Nathan, 2006) to a community audience. A related aim was to design the dictionary so it could be used by sign and gesture researchers. The online repository of signs provides a refined and growing corpus that is searchable on several parameters.
Corpus and workflow design providing for multiple uses of data is consistent with best practice in language documentation work. As stated by Thieberger (2011, p. 463), "we now take it for granted that all documentation should include a media corpus, that various data sources can be made to work together, and that outcomes of linguistic work be created in an archival form with derived forms for presentation". Although archiving and future access to preservation copies of media and metadata is a central concern, in this paper we focus on our solutions for deriving website posts from complex sign data sets.

We have found a linear workflow cannot account for the multiple uses of the sign video data. Rather, a divergent workflow with two distinct paths meets the multiple needs identified in our workflow design.
The project was designed as a language documentation and publishing partnership between signers and speakers of Indigenous languages, linguists and multi-media designers. The working practices of the project team were established in collaboration with local leaders and structured around mentoring relationships within a skilled and culturally diverse group. This involved the development and testing of a prototype website, raising awareness of issues involved in internet publishing via a project blog 1 and ongoing review and consultation over the use and archiving of recorded material.
A group of Anmatyerr and Warlpiri speaker/signers from Ti Tree, 200 km north of Alice Springs, and Ngaatjatjarra speaker/signer Elizabeth Marrkilyi Ellis from Tjukurla in the Western Desert region participated in the early stages of the project. The project has also recorded sign at Wilora (Stirling), Artarre (Neutral Junction), Yuelamu (Mt Allan) and Utopia, and the website is currently being expanded with contributions from these communities. Many of the participants from these communities have long-standing experience working on education, training and language documentation projects (Green, 2003, 2010). Figure 1 shows the locations of communities involved in Iltyem-iltyem and the languages spoken there.
1. The blog appears on the front page of the Iltyem-iltyem website (http://iltyemiltyem.com). For a discussion of this as a form of digital outreach in language documentation, see Gawne (2015).
The website was designed to add to the considerable suite of language learning resources already developed for Indigenous communities in Central Australia by articulating with the semantic domains and graphics used in the IAD Press Picture Dictionary Series (see Green, 2003). The site was launched in Alice Springs in September 2013 and it contains close to 400 clips available for public view by registered users, who can browse and search across a range of categories.
All of us women are working on the sign project so that the children can learn. We are putting the signs on a website. If they open the site then they'll be able to see how signs are done. The elders taught us sign language - they handed it down to us. They held that knowledge from the Dreaming and they handed it over and passed it on to us. Now we want to pass it on to our children. We want to put our language on the web so that the children can learn sign language (Janie Pwerrerl Long, Hanson River, 29 June 2011).

Considerations in website design
There are a number of technical and cultural considerations related to the appearance of the website and its functionality as a community resource. A key consideration is that the website should suit the corpus and reflect the multimodality of the recordings (Kendon, 1988 [2013]). To date, the focus has been on recording sign knowledge from hearing signers, 3 and consequently the majority of the recordings are sign/speech composites which include a range of spoken languages: Anmatyerr, Kaytetye, Alyawarr, Warlpiri and Ngaatjatjarra. The online resource is designed to present both speech and sign in selections of audio-visual recordings of signers from across this range of language groups. Elicitation of sign was conducted in the spoken languages of the communities. In the annotation and analysis of sign language recordings we have thus identified the basic communicative unit as a 'sign utterance' comprising one or more signs, and with or without accompanying speech.
2. Janie Long Pwerrerl is the daughter of Lucky Long Peltharr, one of the Anmatyerr women at Ti Tree with whom Kendon worked on sign documentation in the 1980s.
3. We are yet to conduct studies of the ways that deaf individuals in the region acquire and use sign, although there is some anecdotal evidence that some traditional sign is used by Indigenous deaf with each other and with hearing members of their communities (see Adone & Maypilama; Bauer; Cooke & Adone; Maypilama & Adone; O'Reilly; Power).
The second issue relating to website design centers around challenges inherent in making a set of community resources that have coherence in terms of local language identities and affiliations. From the perspective of the signers, regional variations in sign 'identity' are based on a complex set of factors, but predominantly on the variety of speech (if used) of the signer and hence on their language and cultural identity. So, for example, an Alyawarr or a Kaytetye person may employ identical signs in most domains, yet speak distinct languages and belong to particular geographical areas within Central Australia. We identify instances of language-specific sign production (for example Alyawarr sign, Kaytetye sign, or Warlpiri sign) even though the sign systems used in some regions of Central Australia are essentially the same, apart from minor lexical differences (Kendon, 1988 [2013]). 4 As many Indigenous people in these communities are multilingual, it is also common for a particular signer to use sign in communicative contexts where one or another of several spoken languages predominates. A sign/speech composite thus may consist of a sign that is shared across the Central desert region, but coupled with regional and community specific varieties of speech: either one of the Arandic languages or one of the neighbouring languages such as Warlpiri. Both of these issues have important consequences for the implementation of a practical system for identifying particular signs (their 'Sign IDs'), and for the ways sign lexemes are represented on the website. We will return to the problem of Sign IDs below.

We designed a project workflow to select and extract sign utterances from longer recordings and to label individual signs within sign utterances, thus making the online resource searchable for these individual signs. Given the predominance of both speech and sign in sign utterances, we needed to display information about both in sign utterance clips. These issues have consequences for the treatment of the media and metadata throughout the annotation and online content building workflow.
Internet download speed was a technical constraint for the website design. Speed testing of an early prototype of the website at Ti Tree community in early 2012 helped to establish compression parameters for online video. 5 We found that extremely slow internet speeds at Ti [...] The Iltyem-iltyem website includes [...]
4. [...] (Kendon, 1988 [2013], p. 53-54).
5. Internet speed testing was conducted via http://www.speedtest.net in 2012. Testing of the Ti Tree School Internet showed the speed was slower than 85% of the rest of Australia. At that time, the speed over a Telstra Next-G mobile device used at Ti Tree was slower than 65% of the rest of Australia.

[...] for presentation on the website. There are two main approaches to undertaking this. The first is to select relevant sections in a media file by start and end time codes and then call them up as 'snippets' (e.g. by HTML5), and stream them from a host server which holds a repository of archival files which remain intact. The alternative is to create 'clips', a suite of secondary files that are then presented as independent items. Clips are small and more manageable for media streaming, especially over slow internet connections. This was the approach we chose for this project, even though creating clips can potentially lead to data management problems by replicating media and creating secondary files. 6 Although perhaps not regarded as best practice in language documentation (Thieberger, 2011), we found it to be appropriate in this context given the limits on internet speed (cf. Bowden & Hajek, 2006).
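The two delivery approaches can be contrasted in a short Python sketch. The first function builds a temporal media-fragment URL (the `#t=start,end` suffix that HTML5 media players commonly honour) pointing into an intact archival file; the second assembles an `ffmpeg` command that would write a separate, re-encoded clip. The file names, time codes and encoder settings (`libx264`, `-crf 28`, `aac`) are illustrative assumptions and are not the settings used by the project.

```python
def fragment_url(media_url: str, start: float, end: float) -> str:
    """Approach 1: address a time span inside an intact archival file."""
    # Temporal media fragment: seconds from the start of the file.
    return f"{media_url}#t={start:g},{end:g}"


def clip_command(source: str, start: float, end: float, out: str) -> list[str]:
    """Approach 2: build an ffmpeg command that writes a standalone clip."""
    return [
        "ffmpeg", "-i", source,
        "-ss", f"{start:g}", "-to", f"{end:g}",   # cut by start/end time codes
        "-c:v", "libx264", "-crf", "28",          # assumed compression settings
        "-c:a", "aac",
        out,
    ]


# Hypothetical file names and time codes, for illustration only.
snippet = fragment_url("https://example.org/session.mp4", 12.5, 15.0)
command = clip_command("session.mp4", 12.5, 15.0, "kwaty_water.mp4")
```

The command list could be passed to `subprocess.run` on a machine with `ffmpeg` installed. The trade-off matches the one described in the text: the fragment URL leaves a single archival file to manage but streams from a large source, whereas the clip is small but creates a secondary file that has to be tracked.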
To access this example, viewers are required to register to enter the dictionary component of the Iltyem-iltyem website.

(i) Record
The recording sessions follow a methodology outlined previously (Green et al., 2011), which attempts to adapt some of the conventions used in primary sign language recording to remote conditions. This has resulted in sign recordings that are consistent in terms of the lighting and [...]
We turn now to the featured sign utterance, the KWATY/WATER example. Figure 3 is a still taken from a video of a recording session in which this particular sign featured.
As discussed by Johnston (2010:125), the optimal number of annotation tiers for sign language corpora is "yet to be determined" and is very much a matter of "trial and error". At this stage we have not annotated any non-manual features of the signs in the corpus. In general facial expression, eyegaze and posture-shift have no formal position in the system of signs, and only play a role at the discourse level (Kendon, 1988 [2013], p. 113). However, it may be the case that Indigenous signers who have had exposure to other sign languages such as Auslan use mouthings and other facial expressions alongside manual signs.
12. Johnston (2014, p. 9) notes of the 60 or so tiers used in the standard ELAN template for the Auslan corpus that "most tiers [...]
Annotation is undertaken in a series of passes (Johnston, 2010, p. 116), initially segmenting the video footage into sign utterances, then proceeding to assign sign ID glosses, transcribe associated speech and develop free translations. Over time, continuing work on annotation and transcription of the corpus will enable searchability and recognition of patterns through the examination of large data sets. This will increase its value as a research tool to further explore various aspects of Central Australian sign languages. Although the ELAN template for this project allows for more detailed annotation of sign forms, in the initial stages the objective was to prepare as many sign examples as possible for export to the website.
In such instances, co-signing speech often serves to disambiguate what is being referred to in sign. In these circumstances, the question arises as to whether to identify the unique sign form with one Sign ID label and then describe the multiple meanings that occur in context, or alternatively to identify them individually as sign form/referent complexes. 13 For the purposes of the Iltyem-iltyem website, Sign IDs comprise two parts: the first based on a speech equivalent from the spoken language of the community, and the second an English gloss that approximates the meaning of the sign. Table 2 [...]
For a discussion of the determination of sign homonyms in Auslan see Johnston (2010, p. 124). A single sign form will be given a separate sign ID gloss if meanings are "completely distinct and unrelated".
Although sign forms could be given a unique and abstract value, 15 it is important to present sign glosses on the website in a way that is not overloaded with technical language and linguistic or abstract information. This enables intuitive searching of the website using English words and Indigenous language words. As our annotation of sign data is an on-going process, analysis of sign polysemies and preparation of clips for the website proceed hand in hand.
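The two-part Sign ID and the polysemy question can be made concrete with a small Python sketch. It assumes a Sign ID is rendered as `LOCAL/ENGLISH`, following the paper's KWATY/WATER example. The `form` field, the `labels_by_form` helper and the kin-term labels built here are hypothetical illustrations; they are not the project's actual schema or its published Sign IDs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SignID:
    """Two-part Sign ID: a local-language gloss plus an English gloss."""
    local_gloss: str      # speech equivalent in the community's spoken language
    english_gloss: str    # approximate English meaning
    form: str = ""        # hypothetical field: free-text note on the sign form

    @property
    def label(self) -> str:
        # Rendered in the style of the paper's example, KWATY/WATER.
        return f"{self.local_gloss.upper()}/{self.english_gloss.upper()}"


def labels_by_form(sign_ids):
    """Group Sign ID labels that share one sign form (a polysemy check)."""
    groups = defaultdict(list)
    for sign_id in sign_ids:
        groups[sign_id.form].append(sign_id.label)
    return dict(groups)


water = SignID("kwaty", "water")

# Anmatyerr kin terms that the paper reports are signed the same way.
chin_tap = "index finger taps chin"
kin = [
    SignID("angey", "father", chin_tap),
    SignID("awenh", "father's sister", chin_tap),
    SignID("aler", "man's child", chin_tap),
]
shared = labels_by_form(kin)
```

Grouping by form corresponds to the second option raised in the text, where each sign form/referent complex gets its own label. The first option would instead store one label per form and list the meanings that occur in context.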
For sign utterances selected for use on the website, a separate ELAN tier called 'export' is employed (see Figure 4). Metadata for these segments are entered into this tier, following a structured template that includes a range of metadata values. The metadata in the export tier for the KWATY/WATER example are shown in Figure 5.
[Figure caption: tab-delimited text file exported from ELAN with KWATY/WATER example highlighted]
(vi) Build website posts
We used WordPress to build the Iltyem-iltyem website. WordPress is an open source content management system, with a built-in capacity to publish online content with titles, text, embedded video, images and audio. The content can be categorized, searched and commented upon. People can register accounts with a WordPress site, and editorial functions can be restricted to registered members. The Iltyem-iltyem project has extended these features for better site usability and manipulation of sign data, using a combination of publicly available and custom plugins. 18 Figure 7 shows a screen capture of the KWATY/WATER example clip and metadata, as a published post on the Iltyem-iltyem website.
18. Plugins are packets of code that extend the core behavior of existing software.
[...] metadata and clips. The metadata is displayed on screen with the video clip, identifying the signer, the Sign IDs (Indigenous language and English), co-sign speech and speech translation. The signer's spoken language identity and the semantic categories the sign clip belongs to are displayed above the clip. The website is searchable across all these parameters: for example, viewers are able to search for all clips contributed by any given signer or recorded in a particular community, all the clips containing particular local language and English terms, and all clips [...]
Examples of on-line sign language dictionaries include the Auslan Signbank Dictionary (http://www.auslan.org.au), the British Sign Language dictionary (http://www.british-sign.co.uk/british-sign-language/dictionary/), and the New Zealand Sign Language dictionary (http://nzsl.vuw.ac.nz); see also McKee & McKee (2014). The Summer Institute of Linguistics (SIL) is currently developing and testing a program called SooSL (See Our Own Sign Language), designed to support the creation of video-based dictionaries for sign languages of the world (http://www.sil.org/about/news/new-technology-supports-language-development-signed-languages).
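Searching across several parameters at once can be sketched in Python over tab-delimited metadata of the kind exported from ELAN. The column names (`signer`, `community`, `language`, `sign_id`, `category`) and all sample rows are invented for illustration, apart from the KWATY/WATER label taken from the paper; the actual export template and the WordPress search implementation are not described in enough detail here to reproduce.

```python
import csv
import io

# Invented sample rows; the column names are assumptions for illustration.
SAMPLE_EXPORT = (
    "signer\tcommunity\tlanguage\tsign_id\tcategory\n"
    "Signer A\tTi Tree\tAnmatyerr\tKWATY/WATER\twater\n"
    "Signer A\tTi Tree\tAnmatyerr\tANGEY/FATHER\tkin\n"
    "Signer B\tTjukurla\tNgaatjatjarra\tEXAMPLE/WATER\twater\n"
)


def load_clips(text):
    """Parse tab-delimited text into one dict per clip."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def search(clips, **criteria):
    """Return clips whose fields contain every given value (case-insensitive)."""
    def matches(clip):
        return all(
            value.lower() in clip[name].lower()
            for name, value in criteria.items()
        )
    return [clip for clip in clips if matches(clip)]


clips = load_clips(SAMPLE_EXPORT)
from_ti_tree = search(clips, community="Ti Tree")
water_clips = search(clips, sign_id="water")
kin_by_signer_a = search(clips, signer="Signer A", category="kin")
```

Because each criterion narrows the result set, combining a signer with a semantic category behaves like the constrained searches the website offers.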

Figure 3: KWATY/WATER video still
Figure 4 shows the ELAN tier hierarchy of the Iltyem-iltyem project's annotation template. The top level of the tier hierarchy, called S-Utterance, marks a composite unit of sign and speech. The S-Utterance tier is the parent of a set of analysis tiers, such as RH-IDgloss, LH-IDgloss (where sign identification labels for signs made with the right and left hand are noted), and S-Speech, which is a transcript of the co-signing speech if it occurs. As discussed by Johnston (2010:125), the optimal number of annotation tiers for sign language corpora is "yet to be determined" and is very much a matter of "trial and error". 12
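The parent/child relationship between tiers can be sketched as a small Python data structure. The tier names come from the description above; the classes, the millisecond time values and the rule that a child annotation must fall inside its parent's time span are assumptions made for this illustration, and the speech value `kwaty` is a placeholder rather than a transcript from the corpus.

```python
from dataclasses import dataclass, field

PARENT_TIER = "S-Utterance"
CHILD_TIERS = ("RH-IDgloss", "LH-IDgloss", "S-Speech")


@dataclass
class Annotation:
    tier: str
    start_ms: int
    end_ms: int
    value: str


@dataclass
class SignUtterance:
    """One S-Utterance segment together with its dependent annotations."""
    start_ms: int
    end_ms: int
    children: list = field(default_factory=list)

    def add(self, annotation: Annotation) -> None:
        if annotation.tier not in CHILD_TIERS:
            raise ValueError(f"unknown child tier: {annotation.tier}")
        inside = (self.start_ms <= annotation.start_ms
                  <= annotation.end_ms <= self.end_ms)
        if not inside:
            raise ValueError("child annotation must fall within its parent")
        self.children.append(annotation)

    def values(self, tier: str) -> list:
        """All annotation values on one child tier, in insertion order."""
        return [a.value for a in self.children if a.tier == tier]


# Illustrative time codes and values only.
utterance = SignUtterance(start_ms=1000, end_ms=3200)
utterance.add(Annotation("RH-IDgloss", 1200, 2400, "KWATY/WATER"))
utterance.add(Annotation("S-Speech", 1100, 3000, "kwaty"))
```

Keeping the sign labels and the speech transcript under one utterance is what lets a single exported clip carry both kinds of information.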

Figure 4: ELAN annotation template used for the Iltyem-iltyem project

There are a number of challenges inherent in consistently and uniquely applying sign identification labeling. Ideally, a reference lexical database such as the Auslan Signbank is needed to do this effectively. Johnston suggests that the creation of a corpus without such a lexical database is "unlikely to succeed". In the Iltyem-iltyem project sign identification labels are developed heuristically throughout the annotation and analysis process. For the on-line dictionary, signs are labeled with a 'Sign ID' comprising a gloss (in the spoken language of the signer) and an English equivalent. This iterative process inevitably leads to revisions to the schema as the searchable corpus grows, and as new signs and variations to well known ones are identified.
The prevalence of sign polysemy poses particular problems in assigning unique Sign IDs. For example, in Central Australian Indigenous languages a range of kin terms is lexically differentiated in speech, yet there are fewer kin signs than spoken kin terms. An example is the sign for Anmatyerr spoken kin terms: angey 'father, father's brother', awenh 'father's sister', and aler 'man's child, a person's brother's child'. Each of these kin terms is signed the same way: a horizontally extended index finger taps the chin. A related situation exists for many flora and fauna terms where a single sign refers to more general categories or taxonomic groupings.

Workflow: capture and editing; annotation and online delivery; storage location
[...] onto USB devices, to ensure community members receive copies of highlights from their recording sessions soon after they occur. These are also used as consultation tools to contextualise discussions about online publishing and other uses of recorded sign language material. For some communities, where there is little or no internet access, DVD or USB-based copies may be the only feasible way to provide sign recordings.

Table 2: Sign IDs and sign forms for WATER in Anmatyerr, Warlpiri, Kaytetye, and Ngaatjatjarra (Green & Wilkins, 2014)
[...] semantic categories. The search function includes filters which allow searches to be constrained. The Iltyem-iltyem website enables various levels of access to content and material in both the front and back ends of the website. At its basic level it is open for subscriber access, which allows non-editable access to open material. There are also editor, researcher and administrator roles. This allows others who are not part of the project team to review, comment on and edit posts.
The Iltyem-iltyem website is unique in several ways. Although there are online dictionaries for some primary sign languages, Iltyem-iltyem is the first searchable online dictionary of an Australian Indigenous sign language. To our knowledge it is also the first of an 'alternate' sign language to include embedded video and to represent multiple 'alternate' sign languages in a single online repository accessible to both community and academic audiences.
[...] films showing sign language in use and demonstrate the multimodal nature of communicative practices in Central Australia. This includes recordings of sand stories, where sign is used along with drawing, speech and song (Green, 2014). We are planning for a project review in 2016 and at this time all consultants or their families will be able to add and/or withdraw material and reflect on the consequences of being 'online'. Post 2016, the website will be archived, but we also aim for it to continue as a live media product. 20
Sign language annotation is time consuming and requires specialist knowledge (Green & Wilkins, 2014). Building a corpus of time-aligned annotations linked to media provides access to a refined data set in which representative clips are labelled and searchable across a range of parameters. Extending this to an online repository of curated sign language material thus represents increased value for money for the research investment. The online repository enables researchers to work collaboratively on research questions, such as the degree to which sign languages in Central Australia differ across language groups, and what similarities and differences exist between these and Indigenous sign languages found further afield. 19