
The i-Vocab file hierarchy is an idiosyncratic vocabulary stream for authors over time which attempts to be a "fingerprint" of the author, both by individual, workpiece, and date. First pass is to use common OCR engine and generate vocabulary file for author, article/book/thesis, noting date of creation. This generates i-Vocab base dict, which is not changed. Best guess is to say that base is words which are at .001 level? So Maslow's base would contain words such as "isomorphic," "Weltanschauung," "aggridant," "eupsychia," etc.YYYYMMDD.iv0 (zero) is base file (proposed) and successive gathers are YYYYMMDD.ivo (letter 'o'). The .ivo files can be gathered by month, year and day for ease of historical processing. Dated files with no content are deleted before writing the monthly and yearly files, and monthly files without content are also deleted. Native UNIX utilities and cron used to build repository structure. Key to making it work is consistent upkeep and discrete stacks by not only author but author-and-written-text. This way, new words like "orthomentor," "eularchy," and "Inodalyn" can be used to map sociograms over time, etc.
The base idiosyncratic vocabulary metafile is used for B-analysis and Petri-style addition to growing media, as well as linking to it's own unique protocol. The protocol allows for realtime, Internet-wide plug-in functionality. The author's iVocab Metafile can be attached to a process vi sockets, etc.
The i-Vocab base file produces a planar map of uniqueness and frequency, etc., and can be used as a "fingerprint" or "blueprint" for the author. This could predict future joinings to other authors and could provide for realtime analysis and display of linguistic tropisms as they develop.
Will post i-Vocab metafile for Abraham Maslow's paper "Theory Z" and his "Eupsychian Management" book. As words are added by scanning future works, they will contribute to the base file, and, as time permits, backward construction.
There will need to be a root dictionary, but idiosyncratic vocabularies can be caught and sampled now. The root dictionary is mostly obvious.