Types of data that we write as tables

Tables, which can exist on disk as script files or archives but are accessed by our code in a unified way, were explained in The Table concept.

Here we discuss particular types of data, usually indexed by a string corresponding to the utterance or speaker, that we frequently access using Tables.

I/O with alignments

Alignments (see Alignments in Kaldi) are of type vector<int32>. An alignment represents a sequence of transition-ids (see Integer identifiers used by TransitionModel), one per frame, for a particular utterance.

Because an alignment is the same as vector<int32>, the types directly involved in reading and writing alignments are the typedefs Int32VectorWriter, SequentialInt32VectorReader, and RandomAccessInt32VectorReader. The convention in our scripts and program names is that "ali" is short for "alignment", so a set of alignments on disk might be called 0.ali, and we have command-line programs such as ali-to-pdf and ali-to-phones that deal with alignments. Many training programs read alignments, which are generated by a decoder in a separate stage.
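To make the data layout concrete, here is a minimal sketch that reads alignments from a text-form archive into a map from utterance-id to vector of transition-ids. It deliberately uses only the C++ standard library rather than the Table classes, and it assumes the archive was written in text mode, with one line per utterance consisting of the utterance-id followed by space-separated integers. The names Alignment, ParseAlignmentLine and ReadAlignmentArchive are hypothetical helpers for illustration and are not part of Kaldi; real code would normally go through SequentialInt32VectorReader or RandomAccessInt32VectorReader instead.

```cpp
#include <cassert>
#include <cstdint>
#include <exception>
#include <istream>
#include <limits>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// An alignment: one transition-id per frame of an utterance.
using Alignment = std::vector<int32_t>;

// Hypothetical helper (not part of Kaldi): parses one line of an assumed
// text-form archive, "<utterance-id> <id> <id> ...", into a key and an
// alignment. Returns false if the line has no key, or if any later token
// is not an integer that fits in int32.
bool ParseAlignmentLine(const std::string &line, std::string *key,
                        Alignment *ali) {
  std::istringstream iss(line);
  if (!(iss >> *key)) return false;  // empty line: no utterance-id
  ali->clear();
  std::string token;
  while (iss >> token) {
    std::size_t pos = 0;
    long long value = 0;
    try {
      value = std::stoll(token, &pos);
    } catch (const std::exception &) {
      return false;  // not a number, or out of range for long long
    }
    if (pos != token.size()) return false;  // trailing junk, e.g. "12x"
    if (value < std::numeric_limits<int32_t>::min() ||
        value > std::numeric_limits<int32_t>::max())
      return false;
    ali->push_back(static_cast<int32_t>(value));
  }
  return true;
}

// Hypothetical helper (not part of Kaldi): reads every line of the stream
// into a map indexed by utterance-id. Empty lines are skipped; any other
// malformed line makes the function return false.
bool ReadAlignmentArchive(std::istream &in,
                          std::map<std::string, Alignment> *alignments) {
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty()) continue;
    std::string key;
    Alignment ali;
    if (!ParseAlignmentLine(line, &key, &ali)) return false;
    (*alignments)[key] = std::move(ali);
  }
  return true;
}
```

A caller would open a file such as a text-mode copy of 0.ali with std::ifstream, pass the stream to ReadAlignmentArchive, and then look up an utterance-id in the resulting map. Holding everything in a map mirrors random access by key; processing each line as it is read would correspond to sequential access and avoids keeping the whole archive in memory.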