A vertical file is a text file in which each token (usually a word or punctuation mark) is on a separate line. This format is typically used for text corpora and may contain additional metainformation (annotation). Vertical files are usually created from a prevertical format.
The first column contains tokens and structures; the other columns may contain part-of-speech tags, lemmas, or other positional attributes. An example of a vertical file:
<doc genre="fiction" title="1984" author="G. Orwell">
<chapter no="1.1">
<p>
<s>
It PP it-d
was VBD be-v
a DT a-x
bright JJ bright-j
cold JJ cold-j
day NN day-n
in IN in-i
April NP April-n
<g/>
, , ,-x
and CC and-c
the DT the-x
clocks NNS clock-n
were VBD be-v
striking JJ striking-j
thirteen CD thirteen-m
<g/>
. SENT .-x
</s>
...
</p>
...
</chapter>
</doc>
column 1: tokens and structures
column 2: part of speech tags
column 3: lempos attribute
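To make the column layout concrete, here is a minimal Python sketch that reads a vertical file line by line and separates structure lines from token lines. It is an illustration only, not part of any official tool: the names `VerticalLine` and `parse_vertical` are hypothetical, and the sketch assumes that columns are separated by tabs and that any line that starts with `<` and ends with `>` is a structure.

```python
from typing import Iterable, Iterator, NamedTuple, Optional, Tuple


class VerticalLine(NamedTuple):
    kind: str                 # "structure" or "token"
    text: str                 # the raw tag, or the token itself (column 1)
    attrs: Tuple[str, ...]    # remaining columns; empty for structures


def parse_vertical(lines: Iterable[str],
                   sep: Optional[str] = "\t") -> Iterator[VerticalLine]:
    """Yield one VerticalLine per non-empty input line.

    Assumption: columns are tab-separated. Pass sep=None to split on
    any run of whitespace instead.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("<") and line.endswith(">"):
            # Structure line such as <doc ...>, <s>, </s> or <g/>
            yield VerticalLine("structure", line, ())
        else:
            cols = line.split(sep)
            yield VerticalLine("token", cols[0], tuple(cols[1:]))


# A short excerpt in the same three-column layout as the example above.
sample = (
    "<s>\n"
    "It\tPP\tit-d\n"
    "was\tVBD\tbe-v\n"
    "<g/>\n"
    ".\tSENT\t.-x\n"
    "</s>\n"
)

parsed = list(parse_vertical(sample.splitlines()))
tokens = [(item.text, *item.attrs) for item in parsed if item.kind == "token"]
structures = [item.text for item in parsed if item.kind == "structure"]
```

Here `tokens` holds one tuple per token line (token, part-of-speech tag, lempos) and `structures` holds the tag lines in their original order. A real corpus may use more columns or different attributes, so the tuple length should not be hard-coded.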
See more details on how to prepare a text for the vertical format.