Mining Shakespeare

Via LISNews, a news item on a project of the kind I would love to be involved with someday: Mellon grant to fund project to develop data-mining software for libraries

In his winning project, titled "Web-based Text-Mining and Visualization for Humanities Digital Libraries," Unsworth [John Unsworth, Dean of the School of Library and Information Science at the University of Illinois at Urbana-Champaign] expects to produce software "for discovering, visualizing and exploring significant patterns across large collections of full-text humanities resources in digital libraries and collections." …

In traditional "search-and-retrieval" projects, scholars bring specific queries to collections of text and get back more or less useful answers to those queries, Unsworth said.

"By contrast, the goal of data-mining, including text-mining, is to produce new knowledge by exposing unanticipated similarities or differences, clustering or dispersal,
co-occurrence and trends."

And here’s the part that really caught my eye:

With data-mining tools, Unsworth said, you first select a body of material that you think is important in some way, next select features of those materials that you similarly think are important, and then "map the occurrence of those features in the selected materials to see whether patterns emerge. If patterns do emerge, you analyze them and from that analysis emerges — if you are lucky — new insights into the materials."

For example, in the planning grant for this project, members of his research team, using the full set of Shakespeare’s plays, selected five "circulation-of-characters" features (scenes, nodes, singles, loops, switches) as independent variables, and "genre" as the dependent variable; they then "attempted to order the plays by feature similarities and see how that corresponded, or didn’t, to genre," he said.

"There was one very interesting result, which was that Othello fell squarely in with the comedies. If I were to analyze this result, I’d ask a number of questions about the methods used to produce the results, but once satisfied that I was not looking at an artifact of the procedure itself, I would ask what it means that Othello has the structural features of comedy, and from there, an interesting journal article might emerge."

Years ago I heard a professor of classics give a lecture on Oedipus Rex and Othello in which he suggested that Othello was "structurally a comedy." I forget how he reached that conclusion (it had something to do with the way the plot of Othello centers on its protagonist being elaborately deceived). It’s fascinating to think about how the UIUC research team reached the same interpretation by such a different road. This kind of thing is what appeals to me about humanities computing.

Or, to pose a question that I heard posed this summer: what could we do if we had the entire nineteenth century online?

5 Responses to “Mining Shakespeare”

  1. Harrison says:

    Finally! A use for data mining other than to discover that men in their 30s buy beer and diapers at the same time, when shopping in the late afternoon.

  2. alan says:

    The Othello/comedy result is fascinating. I wonder if it’s connected with the similarity between ‘comedy duos’ such as ‘Laurel and Hardy’ and ‘Othello and Iago’. Perhaps Mozart’s Don Giovanni has a similar structure? Also, I think that of all Shakespeare’s tragedies Othello has the ‘purest’ tragic form, more akin to Greek tragedy, or opera, for that matter. The other three great tragedies have more elaborate structures and embrace a broader range of tragic themes. Othello is very focused on a single idea, and relies heavily on dramatic irony, as much good comedy does.

  3. Amanda says:

    Interesting that you’d compare it to Don Giovanni, since Don Giovanni has always struck me as so much in the mingled tragi-comedy vein, rather like what people like to say about Shakespeare’s tragedies usually not following a strict tragic form, for that matter. (Now I’m wishing I could recall the substance of a conversation a bunch of my friends and I once had during intermission at a performance of DG, which I vaguely remember touched on that; we were talking about the sudden eruption of revenge plot into seduction plot during the masked ball.)
    I agree with you on Othello being the singlest-minded of the major tragedies, though — the only one that approaches its tight focus on one group of characters and one plot is Macbeth, which doesn’t quite count with all its supernatural apparatus.

  4. alan says:

    I suppose it’s the sheer force of the tragic element as conveyed in the music of DG that makes me think of it as a tragedy. You’re quite right, it’s a tragi-comedy.

  5. Amanda says:

    Re the music of DG: absolutely. From the very first bars of the overture. But perhaps the thing about DG is its vacillation between tragic and comic extremes? Figaro and Così are more consistently comic, though both have darker undercurrents aplenty.
    Now I’m going to have to listen to all the Mozart-Da Ponte operas one after the other for purposes of comparison…