Analysis for wuthering.txt

1. Most frequent words in wuthering.txt

WORD OCCURANCE
heathcliff 240
linton 157
catherine 144
could 126
will 103
answer 80
master 80
edgar 78
time 76
house 76
hand 72
earnshaw 72
door 71
don 67
joseph 66
hindley 63
well 60
thought 60
reply 57
face 56

3. Part of speech tagging

Part of speech tagging is an interesting breed: mostly all longer texts split up into a quite constant array of nouns / verbs / etc. - no surprise here!

What's more interesting when you combine part of speech tagging with other forms of analysis. Would the occurences of only adjectives tell us more about the mood of a certain part of text, like a chapter? Certainly so! What about verbs? Do they present traces of action and happening?

Part of speech tagging becomes especially helpful when playing with n-grams and sentiment analysis, so for now just take our word: the application is ready to bring 100.300 English words for tagging, there can not be a lot more than that!

These features will be coming out soon on Underminer. Until then, part of speech tagging is displayed in a form of the good old boring piechart.

5. Sentiment analysis
Every novel's - we're thinking linear plots - primal statement is probably about consequences (that cursed human condition).

Are things going generally the right direction or they just keep getting worse? Using TextBlob's sentiment analysis, this chart aims to make a guess at the direction of the plot: by associating a value (y) between +1 (extremely good) and -1 (extremly unfornutane) for each sentence of the text (x), we are able to see how the plot progresses.

Please note: only those sentences are counted, where a sentiment could be detected - sentences with a value of zero are omitted from the chart.

ACCESS FILES

Click for processed data!