ILDA User Manual - Glossary of Terms

ILDA Glossary Table
Term Definition
Audio document

An audio primary source material hosted on ILDA as a “Document.”

Concordance orthography

A mapping of characters from one orthography to another.
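
As a rough illustration of what such a mapping does, the sketch below converts text from one orthography to another using a small lookup table. The character pairs and the names `CONCORDANCE` and `convert` are hypothetical, chosen only for this example; they are not ILDA's actual concordance or code.

```python
# Hypothetical concordance: source-orthography sequences mapped to
# target-orthography characters. These pairs are illustrative only.
CONCORDANCE = {"sh": "š", "ch": "č", "aa": "ā"}


def convert(text, concordance):
    """Rewrite text by replacing each mapped sequence, longest match first."""
    # Try longer source sequences before shorter ones so that a
    # multi-character sequence is not split into its parts.
    keys = sorted(concordance, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(concordance[key])
                i += len(key)
                break
        else:
            # No mapping applies here: keep the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


converted = convert("shaach", CONCORDANCE)
```

Characters with no entry in the table pass through unchanged, so a partial concordance still produces readable output.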


CSV

Comma-Separated Values. CSV is a common DSV format for exporting data because it is widely supported and allows information to be moved from one software application to another.

Digital Surrogate

A digital copy of a paper-based archival item.


DSV

Delimiter-Separated Values. Delimited text files are a common format for storing data because they are widely supported and allow information to be used easily in multiple applications. In a DSV file, data is separated into “columns” using a delimiter character.

You can think of a DSV file as a plain-text version of a spreadsheet, where each cell is separated by a specific character. ILDA accepts uploads of two types of delimited text files: Comma-Separated Values (CSV), in which data is separated by commas, and Tab-Separated Values (TSV), in which data is separated by tab characters.
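
To make the spreadsheet comparison concrete, the sketch below reads the same small table from a CSV string and from a TSV string using Python's standard `csv` module. The sample column names and values, and the helper name `read_dsv`, are assumptions made for this illustration; they do not describe ILDA's upload format.

```python
import csv
import io


def read_dsv(text, delimiter):
    """Parse delimited text into a list of rows, each row a list of cells."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Hypothetical sample data, not taken from ILDA.
csv_text = "page,line,phrase\n1,3,2\n"
tsv_text = "page\tline\tphrase\n1\t3\t2\n"

csv_rows = read_dsv(csv_text, ",")    # cells separated by commas
tsv_rows = read_dsv(tsv_text, "\t")   # cells separated by tab characters
```

Both calls yield the same rows; only the delimiter character differs between the two files.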


A set of archival material designated as a collection on ILDA. This term is user-defined; administrators of ILDA designate collections of materials in order to organize primary source materials within ILDA.


Saving a file into a different format.


A label assigned to a given morpheme that states the role of the morpheme in the grammar of the language.


IPA

International Phonetic Alphabet.


JPG

A common file format for digital images. ILDA accepts only JPG files for image uploads.


A basic unit of the vocabulary of a language. It may consist of a single word or may be more complex, encompassing a phrase.  

Line Numbers

Numbers in the margin of a page, designed to help researchers orient themselves as they look at a document.


Morpheme

The smallest unit of meaning in a language. A morpheme may be smaller than a word and may be joined to other morphemes to create or modify a word.

Open source

Software whose source code is publicly available. Open source software is typically free to use, with no subscription or payment required, and can often be downloaded from the internet.


Orthography

A conventional system for the written representation of a language.


PDF

Portable Document Format. PDF is a stable file format for viewing documents, and many PDF viewers allow users to edit pages. This makes it possible to redact sensitive information and add line numbers.


Phoneme

A sound or set of sounds that creates a contrast in meaning in a language.


Phrase

A phrase is an instance of the target language in the archival documentation that is used to create a phrase entry in ILDA. Along with page and line numbers, phrase numbers are used to uniquely identify all phrase entries in ILDA.

Primary source materials

Archival materials in their original state, as opposed to materials written about or derived from archival materials (e.g. published works, slip files).

Stem/Head word

A unit to which additional morphemes can attach in order to modify the meaning of the unit or even the nature of the resulting combination.

Target Language

The language being analyzed and documented in ILDA for the purpose of its revitalization.


Token

A token is an individual instance of a word or phrase in the target language.


TSV

Tab-Separated Values. TSV is a common DSV format for exporting data because it is widely supported and allows information to be moved from one software application to another.


Transcription

An orthographic representation of language data originally in audio format.


Transliteration

The process of converting a written record from one orthography to another.