Underminer: Analysis for wuthering.txt

Analysis for wuthering.txt

1. Most frequent words in wuthering.txt

WORD	OCCURRENCES
heathcliff	240
linton	157
catherine	144
could	126
will	103
answer	80
master	80
edgar	78
time	76
house	76
hand	72
earnshaw	72
door	71
don	67
joseph	66
hindley	63
well	60
thought	60
reply	57
face	56
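
A table like the one above can be approximated with a simple word count. This is a minimal sketch, not Underminer's actual code: the tokenizer, the function name `most_frequent_words`, and the tiny stop-word list are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the list Underminer really uses is not documented here.
STOP_WORDS = {"the", "and", "a", "to", "of", "i", "in", "at", "he", "she",
              "it", "was", "not", "you", "his", "her"}

def most_frequent_words(text, top_n=20, stop_words=STOP_WORDS):
    """Return the top_n (word, count) pairs, ignoring case and stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(top_n)

sample = "Heathcliff and Catherine. Heathcliff was at the door; Catherine was not. Heathcliff!"
print(most_frequent_words(sample, top_n=2))
```

To run it on the full text, read the file first, for example `most_frequent_words(open("wuthering.txt", encoding="utf-8").read())`.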

2. Typical word occurrences per chapter

The chart shows the occurrences of words typical of the text (x) for each chapter (y).

Use the input field to add more words. This is not a free-text search, but every word present in the text appears in the autofill menu.

To remove a word, double-click its entry in the legend. A single click highlights the word.
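
The per-chapter counts behind a chart like this could be computed roughly as follows. This is a sketch under assumptions: it supposes that chapters start on lines beginning with `CHAPTER` (as in many plain-text editions), and the helper names `split_chapters` and `occurrences_per_chapter` are made up for this example.

```python
import re
from collections import Counter

def split_chapters(text):
    """Split on lines that start with 'CHAPTER' (an assumed heading format)."""
    parts = re.split(r"(?m)^CHAPTER\b.*$", text)
    # Anything before the first heading (title page, preface) is dropped.
    return [p for p in parts[1:] if p.strip()]

def occurrences_per_chapter(text, words):
    """Return {word: [count in chapter 1, count in chapter 2, ...]}."""
    result = {w.lower(): [] for w in words}
    for chapter in split_chapters(text):
        counts = Counter(re.findall(r"[a-z]+", chapter.lower()))
        for w in result:
            result[w].append(counts[w])
    return result

sample = """Title
CHAPTER I
Heathcliff met Joseph. Heathcliff left.
CHAPTER II
Joseph stayed by the door.
"""
print(occurrences_per_chapter(sample, ["Heathcliff", "Joseph"]))
```

Each list in the result is one line of the chart: one value per chapter for the given word.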


3. Part of speech tagging

Part of speech tagging is an odd breed: almost every longer text splits into a fairly constant mix of nouns, verbs, and so on. No surprise here!

Things get more interesting when you combine part of speech tagging with other forms of analysis. Would the occurrences of adjectives alone tell us more about the mood of a certain part of the text, such as a chapter? Certainly so! What about verbs? Do they carry traces of action and events?

Part of speech tagging becomes especially helpful when playing with n-grams and sentiment analysis, so for now just take our word for it: the application is ready to tag 100,300 English words, and there can't be many more than that!

These features are coming soon to Underminer. Until then, part of speech tagging is displayed as a good old boring pie chart.

4. Typical sentence length

The chart shows the average number of words per sentence (x) for each chapter (y).

Colors indicate the most common part of speech at the given word position.

Only sentences at least as long as the chapter's average sentence length are counted.
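
The length filter described above can be sketched like this. It is only an illustration: the sentence splitter is a naive punctuation-based one, the function names are invented for the example, and the part of speech coloring is left out because it needs a tagger.

```python
import re
from statistics import mean

def sentence_lengths(chapter_text):
    """Word count of each sentence, using a naive split on ., ! and ?."""
    sentences = re.split(r"(?<=[.!?])\s+", chapter_text.strip())
    return [len(s.split()) for s in sentences if s.strip()]

def long_sentences(chapter_text):
    """Return (average length, lengths of sentences at or above the average)."""
    lengths = sentence_lengths(chapter_text)
    if not lengths:
        return 0.0, []
    average = mean(lengths)
    return average, [n for n in lengths if n >= average]

sample = "I went out. The door was shut and the house was quiet. Why?"
average, kept = long_sentences(sample)
print(round(average, 2), kept)
```

In the sample, the three sentences have 3, 9 and 1 words, so only the 9-word sentence reaches the average and is kept.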

Part of speech

Noun

Pronoun

Adjective

Verb

Auxiliary-verb

Adverb

Preposition

Conjunction

Interjection

Unknown

5. Sentiment analysis

The primal statement of every novel (we're thinking of linear plots here) is probably about consequences, that cursed human condition.

Are things generally heading in the right direction, or do they just keep getting worse? Using TextBlob's sentiment analysis, this chart tries to guess the direction of the plot: by assigning a value (y) between +1 (extremely good) and -1 (extremely unfortunate) to each sentence of the text (x), we can see how the plot progresses.

Please note: only sentences where a sentiment could be detected are counted. Sentences with a value of zero are omitted from the chart.
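
The zero-filtering step can be sketched as below. The scoring function is pluggable: `textblob_polarity` uses TextBlob's `sentiment.polarity` (a float between -1.0 and 1.0) and needs the third-party `textblob` package, while `toy_score` is a made-up stand-in so the example runs without it. How Underminer itself wires this up is not shown on this page, so treat the structure as an assumption.

```python
def textblob_polarity(sentence):
    """Polarity in [-1.0, 1.0] from TextBlob (third-party: pip install textblob)."""
    from textblob import TextBlob  # imported lazily so the rest runs without it
    return TextBlob(sentence).sentiment.polarity

def sentiment_series(sentences, score=textblob_polarity):
    """Return (sentence index, polarity) pairs, skipping sentences scored 0."""
    series = []
    for index, sentence in enumerate(sentences):
        polarity = score(sentence)
        if polarity != 0:
            series.append((index, polarity))
    return series

def toy_score(sentence):
    """Hypothetical stand-in scorer, used only to demonstrate the filtering."""
    text = sentence.lower()
    if "happy" in text:
        return 0.5
    if "wretched" in text:
        return -0.5
    return 0.0

sample = ["She was happy.", "He opened the door.", "A wretched night followed."]
print(sentiment_series(sample, score=toy_score))
```

With TextBlob installed, calling `sentiment_series(sample)` uses the real polarity scores instead of the toy ones.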

6. Named entities

The chart shows every named entity recognized by Stanford University's Named Entity Tagger.

Entities are connected by links only if they appear within 500 words of each other in the text.

The width of a connecting link shows how frequently the two entities are mentioned together.
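
The linking rule can be sketched as a co-occurrence count. This is a guess at the mechanics, not the chart's real code: it assumes the tagger's output has already been reduced to (word position, entity name) pairs, and that "mentioned together" means two mentions at most 500 words apart. The function name `entity_links` is invented for the example.

```python
from collections import Counter

def entity_links(mentions, max_distance=500):
    """Count how often two different entities occur within max_distance words.

    mentions: iterable of (word_position, entity_name) pairs.
    Returns a Counter keyed by alphabetically sorted (name, name) tuples.
    """
    ordered = sorted(mentions)
    links = Counter()
    for i, (pos_a, name_a) in enumerate(ordered):
        for pos_b, name_b in ordered[i + 1:]:
            if pos_b - pos_a > max_distance:
                break  # later mentions are even further away
            if name_a != name_b:
                links[tuple(sorted((name_a, name_b)))] += 1
    return links

sample = [(10, "Heathcliff"), (120, "Catherine"), (400, "Heathcliff"),
          (1500, "Linton"), (1700, "Catherine")]
print(entity_links(sample))
```

The count for each pair would then drive the width of the link drawn between the two entities.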

Please note: this chart is still in development. It is only available for the demo text and will not be present in your custom text analysis.

Colors

Person

Location

Organization

Date

Time

Money
