• A simple intertextual machine

    For some time, I’ve been interested in the similarities between two given texts. That similarity could be understood as textual (approximate string matching, longest common subsequence, etc.), language-based (translations), semantic (paraphrases, allusions, etc.), or ludic (think Derrida’s Glas). In an effort to resist my tendency to think up Digital Humanities chalupas (e.g. Neatline + Omeka + Juxta + Zotero + VoyantTools all rolled up into one), I’m trying to imagine the simplest block matcher possible.

    Let’s focus on the textual for a bit. Here’s what I want my tool to do for me:

    • Parse the texts for suggested matches using an approximate string matching algorithm fine-tuned for different versions of a literary work.
    • Allow me to tweak the results by selecting the appropriate boundaries for the blocks.
    • Allow me to name the individual blocks using unique IDs.
    • Store my choices in a database.
    • Take me to a bird’s-eye view of the two documents, to see where things have moved.
    • After I have done enough matching with several documents, show me the network of connections.
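
    To make the wish list above a little more concrete, here is a minimal sketch in Python of the first few steps: suggesting blocks, giving them unique IDs, storing them, and spotting which ones have traded places. Everything in it is an assumption made for illustration: the function names (suggest_blocks, store_blocks, crossings), the table layout, and the min_words threshold are all hypothetical. It also cheats on the matching: it only finds exact runs of shared words, where a real tool would want something fuzzier that tolerates spelling and punctuation variants.

```python
import sqlite3
import uuid


def suggest_blocks(text_a, text_b, min_words=3):
    """Suggest blocks of at least `min_words` words shared by both texts.

    Exact word matching only; a stand-in for a real approximate matcher.
    """
    words_a, words_b = text_a.split(), text_b.split()

    # Index every run of `min_words` words in A by its start position.
    index = {}
    for i in range(len(words_a) - min_words + 1):
        index.setdefault(tuple(words_a[i:i + min_words]), []).append(i)

    blocks = []
    j = 0
    while j <= len(words_b) - min_words:
        starts = index.get(tuple(words_b[j:j + min_words]))
        if not starts:
            j += 1
            continue
        i = starts[0]  # simplification: only the first occurrence in A
        size = min_words
        # Extend the match to the right for as long as the words agree.
        while (i + size < len(words_a) and j + size < len(words_b)
               and words_a[i + size] == words_b[j + size]):
            size += 1
        blocks.append({
            "id": uuid.uuid4().hex,  # unique ID for this block
            "a_start": i,            # word offset in text A
            "b_start": j,            # word offset in text B
            "size": size,            # length in words
            "content": " ".join(words_a[i:i + size]),
        })
        j += size
    return blocks


def store_blocks(conn, blocks):
    """Save the chosen blocks in a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blocks (id TEXT PRIMARY KEY, "
        "a_start INTEGER, b_start INTEGER, size INTEGER, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO blocks VALUES (:id, :a_start, :b_start, :size, :content)",
        blocks,
    )
    conn.commit()


def crossings(blocks):
    """Return ID pairs of blocks whose order differs between A and B."""
    return [
        (x["id"], y["id"])
        for x in blocks
        for y in blocks
        if x["b_start"] < y["b_start"] and x["a_start"] > y["a_start"]
    ]
```

    Calling suggest_blocks on two versions of a passage returns a list of candidate blocks; store_blocks(sqlite3.connect(":memory:"), blocks) saves them; and crossings(blocks) lists the pairs that have swapped order, which is roughly the raw material a bird’s-eye view would draw. Tweaking block boundaries by hand and linking blocks across many documents are left out entirely.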

    We already have a tool, Juxta, that could provide this functionality if we expand its capability to abstract matches and divorce it from the diff algorithm. The one addition we would need is the ability to give unique IDs to blocks and visualize them from a distance. Anyone up for tweaking Juxta?

