vertical file
A vertical file is a text file where each token (or word) is on a separate line. This format is typically used for text corpora and may contain additional metainformation (annotation). Vertical files are usually created from a prevertical format.
The first column contains tokens and structures; the other columns may contain part-of-speech tags, lemmas or other positional attributes. An example of a vertical file:
<doc genre="fiction" title="1984" author="G. Orwell">
<chapter no="1.1">
<p>
<s>
It	PP	it-d
was	VBD	be-v
a	DT	a-x
bright	JJ	bright-j
cold	JJ	cold-j
day	NN	day-n
in	IN	in-i
April	NP	April-n
<g/>
,	,	,-x
and	CC	and-c
the	DT	the-x
clocks	NNS	clock-n
were	VBD	be-v
striking	JJ	striking-j
thirteen	CD	thirteen-m
<g/>
.	SENT	.-x
</s>
...
</p>
...
</chapter>
</doc>

column 1: tokens and structures
column 2: part of speech tags
column 3: lempos attribute

See more details on how to prepare a text for the vertical format.
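
To illustrate how the three columns in the example might be consumed, here is a minimal Python sketch of a reader for such a file. It is not part of any official tooling. It assumes that columns are separated by tab characters, that any line enclosed in angle brackets and containing no tab is a structure, and that <g/> marks "no space before the next token". The names Token, read_vertical, detokenize and SAMPLE are hypothetical and chosen only for this illustration.

```python
from typing import Iterator, List, NamedTuple, Optional


class Token(NamedTuple):
    word: str               # column 1: the token itself
    tag: Optional[str]      # column 2: part-of-speech tag, if present
    lempos: Optional[str]   # column 3: lempos attribute, if present
    glued: bool             # True if a <g/> line came directly before


# A shortened sample in the same layout as the example above (tab-separated).
SAMPLE = "\n".join([
    '<doc genre="fiction" title="1984" author="G. Orwell">',
    "<s>",
    "It\tPP\tit-d",
    "was\tVBD\tbe-v",
    "April\tNP\tApril-n",
    "<g/>",
    ",\t,\t,-x",
    "</s>",
    "</doc>",
])


def read_vertical(text: str) -> Iterator[Token]:
    """Yield tokens from vertical text, skipping structure lines."""
    glue = False
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped == "<g/>":
            glue = True  # assumed meaning: next token attaches without a space
            continue
        # Simplification: treat tab-free <...> lines as structures.
        if stripped.startswith("<") and stripped.endswith(">") and "\t" not in line:
            continue
        cols: List[Optional[str]] = list(line.split("\t"))
        word, tag, lempos = (cols + [None, None])[:3]
        yield Token(str(word), tag, lempos, glue)
        glue = False


def detokenize(tokens: List[Token]) -> str:
    """Rebuild running text, inserting spaces except before glued tokens."""
    parts: List[str] = []
    for tok in tokens:
        if parts and not tok.glued:
            parts.append(" ")
        parts.append(tok.word)
    return "".join(parts)


if __name__ == "__main__":
    toks = list(read_vertical(SAMPLE))
    for tok in toks:
        print(tok)
    print(detokenize(toks))
```

The structure heuristic is deliberately crude: a real reader would also need to keep structure attributes (for example the doc title) and handle tokens that themselves look like tags.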