Text Mining - THATCamp Virginia 2012

Archive for the ‘Text Mining’ Category

20 Apr

Posted by
jet9r

Category
Mine your own business


Yesterday afternoon, Brad Pasanek and I decided to play at text-mining. We started working with MALLET and this GUI tool but were soon lost in the mine, buried in code, with nary a respirating canary, shafted.

Our proposal includes two potential approaches:

(1) A session could look at how a scholar might begin to use topic modeling in the humanities. What do those of us with limited technical nous need to know in order to begin this type of work? We imagine a walk-through, cooking-show-like presentation that goes from A (here are some texts) to B (here is a visualization). Between A and B there are many difficult and perilous interactions with shell scripts, MALLET extrusions, statistics, spreadsheets, and graphing tools. While we two are probably not capable of getting from A to B with elegance, flailing about in a group, roughing out a workflow, getting advice from sundry THATCampers, and making time for questions would be generally instructive, or so we submit.

(2) An alternative approach assumes some basic success with topic modeling and focuses instead on working with the cooked results. How can my-mine-mein data (we would bring something to the session and invite others to do the same) be interpreted, processed, and visualized? This secondary concern may even be included in the visualization session that has already been proposed.
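
As one small illustration of what working with cooked results might involve, here is a minimal Python sketch that reads the topic-keys file MALLET can write during training. It assumes each line holds three tab-separated fields (topic number, weight, top words). The function name, the Topic type, and the sample data are invented for illustration and are not real output from any corpus.

```python
from typing import List, NamedTuple


class Topic(NamedTuple):
    """One row of a topic-keys file (hypothetical container type)."""
    topic_id: int
    weight: float
    words: List[str]


def parse_topic_keys(text: str) -> List[Topic]:
    """Parse topic-keys text, assuming 'id<TAB>weight<TAB>word word ...' per line."""
    topics = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        topic_id, weight, words = line.split("\t", 2)
        topics.append(Topic(int(topic_id), float(weight), words.split()))
    return topics


# Made-up sample in the assumed format, not real MALLET output.
SAMPLE = "0\t0.25\tmine shaft canary coal\n1\t0.25\ttext topic model corpus\n"

for topic in parse_topic_keys(SAMPLE):
    # Show each topic number with its first three words.
    print(topic.topic_id, " ".join(topic.words[:3]))
```

From a list like this, the top words per topic can be pasted into a spreadsheet or handed to a graphing tool, which is roughly the step this second approach would start from.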

Both bits assume a willingness to wield the MALLET and do some topic modeling. We aim primarily at a how-to and hack-and-help, and not a discussion of the pros and cons of topic modeling or text-mining in general.

THATCamp VA

THATCamp Virginia 2012 is a regional THATCamp to be held on Friday and Saturday, April 20th and 21st, 2012, at the Scholars' Lab, University of Virginia Library, in Charlottesville, VA.

We'll do two workshops on Friday, April 20th—one on Neatline, our soon-to-be-launched spatial humanities project, and the other on do-it-yourself aerial photography, with helium balloons. Then the un-conference itself will run all day on Saturday, April 21.
