Underminer: Analysis for madamebovary.txt

1. Most frequent words in madamebovary.txt

WORD	OCCURRENCES
emma	374
charles	317
madame	246
time	243
day	233
monsieur	229
bovary	226
hand	196
could	183
thought	182
good	170
homais	170
eye	167
love	166
long	159
room	156
leon	140
man	135
head	135
will	133
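
A ranking like the one above could be produced roughly as follows. This is a minimal sketch, not Underminer's actual code: the tokenizer, the tiny `STOP_WORDS` set and the `most_frequent_words` helper are all assumptions made for illustration. The table also seems to list words in a base form ("eye", "hand"); that step is skipped here.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "at",
              "was", "her", "his", "she", "he", "it"}


def most_frequent_words(text, top_n=20, stop_words=STOP_WORDS):
    """Return the top_n (word, count) pairs, ignoring case and stop words."""
    # Lowercase the text and keep alphabetic words, allowing one inner apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(top_n)


if __name__ == "__main__":
    sample = "Emma looked at Charles. Charles looked at the room, and Emma sighed."
    # Print in the same tab-separated layout as the table above.
    for word, count in most_frequent_words(sample, top_n=3):
        print(f"{word}\t{count}")
```

Passing a different `stop_words` set is the simplest way to tune which words count as "typical" for a text.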

2. Typical word occurrences per chapter

The chart shows the occurrences of words typical of the text (x) for each chapter (y).

Use the input field to add more words. This is not a free-text search, but every word present in the text is offered in the autofill menu.

You can also remove a word by double-clicking its entry in the legend. A single click highlights the given word.
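
The data behind such a chart can be sketched as one count per chapter for each tracked word. The example below is an illustration under stated assumptions: chapters are taken to begin at lines starting with "Chapter", and `split_chapters` and `occurrences_per_chapter` are hypothetical helper names, not part of Underminer.

```python
import re
from collections import Counter

# Assumed heading format: any line that begins with the word "Chapter".
CHAPTER_HEADING = re.compile(r"^chapter\b.*$", re.IGNORECASE | re.MULTILINE)


def split_chapters(text):
    """Split the text at chapter headings and return the chapter bodies."""
    parts = CHAPTER_HEADING.split(text)
    # parts[0] is whatever precedes the first heading (title page, preface).
    return [p for p in parts[1:] if p.strip()]


def occurrences_per_chapter(text, tracked_words):
    """Return {word: [count in chapter 1, count in chapter 2, ...]}."""
    tracked = [w.lower() for w in tracked_words]
    series = {w: [] for w in tracked}
    for chapter in split_chapters(text):
        counts = Counter(re.findall(r"[a-z]+", chapter.lower()))
        for w in tracked:
            series[w].append(counts[w])  # a Counter yields 0 for absent words
    return series
```

Adding a word through the input field would then amount to calling `occurrences_per_chapter` with one more entry in `tracked_words`.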

3. Part of speech tagging

Part of speech tagging is an interesting breed: almost all longer texts split into a fairly constant mix of nouns, verbs, etc. No surprise here!

What's more interesting is combining part of speech tagging with other forms of analysis. Would the occurrences of adjectives alone tell us more about the mood of a certain part of the text, like a chapter? Certainly so! What about verbs? Do they show traces of action and happening?

Part of speech tagging becomes especially helpful when playing with n-grams and sentiment analysis, so for now just take our word for it: the application is ready with 100,300 English words for tagging, and there cannot be many more than that!

These features are coming soon to Underminer. Until then, part of speech tagging is displayed in the form of a good old boring pie chart.
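
The mention of a ready-made word list suggests tagging by dictionary lookup; that reading is an assumption. Under it, the pie chart's slices could be computed as in this minimal sketch, where `LEXICON` is a toy stand-in with hypothetical entries and `tag_words` and `tag_shares` are illustrative names.

```python
import re
from collections import Counter

# Toy lexicon standing in for a real word-to-tag dictionary (hypothetical entries).
LEXICON = {
    "emma": "Noun", "room": "Noun", "she": "Pronoun", "long": "Adjective",
    "thought": "Verb", "could": "Auxiliary-verb", "slowly": "Adverb",
    "in": "Preposition", "and": "Conjunction", "oh": "Interjection",
}


def tag_words(text, lexicon=LEXICON):
    """Tag each word by dictionary lookup; words not in the lexicon get 'Unknown'."""
    words = re.findall(r"[a-z]+", text.lower())
    return [(w, lexicon.get(w, "Unknown")) for w in words]


def tag_shares(tagged):
    """Return each tag's share of all tagged words (the slices of the pie chart)."""
    counts = Counter(tag for _, tag in tagged)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}
```

A pure lookup ignores context ("thought" can be a noun as well as a verb), which is one reason a context-aware tagger may give different proportions.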

4. Typical sentence length

The chart shows the average number of words in a sentence (x) for each chapter (y).

Colors indicate the most common part of speech at the given word position.

Only sentences at least as long as the chapter's average sentence length are counted.
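
One way to read the three rules above is sketched below for a single chapter. It assumes the chapter is already split into sentences of (word, tag) pairs; the `sentence_profile` name and the choice to round the average to get the number of word positions are assumptions, not the tool's documented behaviour.

```python
from collections import Counter


def sentence_profile(tagged_sentences):
    """Summarise one chapter given as a list of sentences of (word, tag) pairs.

    Returns (average_length, tags), where tags[i] is the most common tag at
    word position i among sentences at least as long as the average.
    """
    lengths = [len(s) for s in tagged_sentences]
    if not lengths:
        return 0.0, []
    average = sum(lengths) / len(lengths)
    # Keep only sentences that reach the chapter's average length.
    long_enough = [s for s in tagged_sentences if len(s) >= average]
    tags = []
    # Every kept sentence has at least round(average) words, so indexing is safe.
    for position in range(round(average)):
        at_position = Counter(s[position][1] for s in long_enough)
        tags.append(at_position.most_common(1)[0][0])
    return average, tags
```

Running this once per chapter gives one bar per chapter, with `tags` supplying the colour of each word position.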

Part of speech

Noun

Pronoun

Adjective

Verb

Auxiliary-verb

Adverb

Preposition

Conjunction

Interjection

Unknown

5. Sentiment analysis

The primal statement of every novel (we're thinking of linear plots) is probably about consequences (that cursed human condition).

Are things generally going in the right direction, or do they just keep getting worse? Using TextBlob's sentiment analysis, this chart aims to guess the direction of the plot: by assigning each sentence of the text (x) a value (y) between +1 (extremely good) and -1 (extremely unfortunate), we can see how the plot progresses.

Please note: only sentences where a sentiment could be detected are counted. Sentences with a value of zero are omitted from the chart.
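
The per-sentence scoring and the zero filter can be sketched as follows. This is an illustration rather than Underminer's code: `textblob_polarity` and `polarity_series` are hypothetical helper names, the text is assumed to be split into sentences already, and TextBlob (a third-party package) is only imported when the default scorer is actually called.

```python
def textblob_polarity(sentence):
    """Score one sentence with TextBlob; polarity is a float in [-1.0, 1.0]."""
    # Imported lazily so the rest of the sketch runs without TextBlob installed.
    from textblob import TextBlob
    return TextBlob(sentence).sentiment.polarity


def polarity_series(sentences, score=textblob_polarity):
    """Return (sentence_index, polarity) pairs, omitting sentences that score zero."""
    series = []
    for index, sentence in enumerate(sentences):
        polarity = score(sentence)
        if polarity != 0:
            series.append((index, polarity))
    return series
```

Keeping the original sentence index as the x value preserves each sentence's position in the text even after the zero-valued ones are dropped. The `score` parameter also makes it easy to swap in another sentiment scorer for comparison.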

6. Named entities

The chart shows every named entity recognized by Stanford University's Named Entity Tagger.

Entities are connected by a link only if they occur within 500 words of each other in the text.

The width of a link shows how frequently the two entities are mentioned together.

Please note: this chart is still in development and is only available for the demo text. It will not be present in your custom text analysis.
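
The linking rule can be sketched as a windowed co-occurrence count. The example assumes the tagger's output has already been reduced to (word position, entity name) pairs; the `entity_links` helper and that input format are assumptions made for illustration.

```python
from collections import Counter


def entity_links(mentions, max_distance=500):
    """Count how often two different entities occur within max_distance words.

    mentions: list of (word_position, entity_name) pairs.
    Returns a Counter keyed by alphabetically sorted name pairs; the count
    can be mapped to the width of the link between the two entities.
    """
    ordered = sorted(mentions)
    links = Counter()
    for i, (pos_a, name_a) in enumerate(ordered):
        for pos_b, name_b in ordered[i + 1:]:
            if pos_b - pos_a > max_distance:
                break  # later mentions are even further away
            if name_a != name_b:
                links[tuple(sorted((name_a, name_b)))] += 1
    return links
```

Pairs that never fall within the window simply have no entry, so they get no link in the chart.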

Colors

Person

Location

Organization

Date

Time

Money
