labMTsimple.storyLab functions

labMTsimple.storyLab.copy_static_files()

Deprecated method to copy files from this module’s static directory into the directory where shifts are being made.

labMTsimple.storyLab.emotion(tmpStr, someDict, scoreIndex=1, shift=False, happsList=[])

Take a string and the happiness dictionary, and rate the string.

If shift=True, returns a vector; in that case happsList must also be supplied.
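As a rough illustration of what rating a string involves, the sketch below averages the scores of the dictionary words found in a piece of text. It is not the library's implementation: rate_text, the regular-expression tokenizer, the toy dictionary, and the choice to return None when nothing matches are all assumptions made for the example. The only thing borrowed from the signature above is the idea that each dictionary entry is a sequence with the score at position scoreIndex.

```python
import re


def rate_text(text, score_dict, score_index=1):
    """Frequency-weighted average score of the dictionary words in `text`.

    `score_dict` maps word -> sequence, with the score at `score_index`.
    Words that are not in the dictionary are ignored.
    """
    total_score = 0.0
    total_count = 0
    # Assumed tokenizer: lowercase, then keep runs of letters/apostrophes.
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in score_dict:
            total_score += float(score_dict[word][score_index])
            total_count += 1
    # Assumed behavior: signal "no scored words" with None.
    if total_count == 0:
        return None
    return total_score / total_count


# Toy dictionary: word -> (index, score). Values are made up.
toy_dict = {"happy": (0, 8.0), "sad": (1, 2.0), "day": (2, 6.0)}
score = rate_text("A happy, happy day", toy_dict)
```

Because every occurrence of a word contributes its score once, repeated words weigh more heavily in the average.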

labMTsimple.storyLab.emotionFileReader(stopval=1.0, lang='english', min=1.0, max=9.0, returnVector=False)

Load the dictionary of sentiment words.

stopval is our lens, $\Delta_h$. Reads the labMT dataset (which must be tab-delimited) into a dict with this lens applied.

With returnVector=True, returns tmpDict, tmpList, wordList. Otherwise, returns just the dictionary.
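The lens idea can be sketched in a few lines: read tab-delimited rows and keep only the words whose score lies at least stopval away from a neutral center. This is a minimal sketch, not the library's reader. The helper name read_scores, the two-column word/score layout, the center of 5.0, and the inclusive boundary are assumptions, and the rows are toy data.

```python
import csv
import io


def read_scores(handle, stopval=1.0, center=5.0):
    """Read `word<TAB>score` rows and keep the words outside the lens."""
    scores = {}
    for row in csv.reader(handle, delimiter="\t"):
        word, score = row[0], float(row[1])
        # Keep a word only if it is at least `stopval` away from `center`.
        if abs(score - center) >= stopval:
            scores[word] = score
    return scores


# Toy tab-delimited input standing in for a dictionary file.
sample = io.StringIO("laughter\t8.50\nthe\t4.98\nterrorist\t1.30\n")
kept = read_scores(sample, stopval=1.0)
```

With stopval=1.0, the near-neutral word is dropped and the two strongly scored words remain.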

labMTsimple.storyLab.emotionV(frequencyVec, scoreVec)

Given the frequency vector and the score vector, compute the happs.

Doesn’t use numpy, but equivalent to np.dot(freq,happs)/np.sum(freq).
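The equivalence stated above can be written out without numpy as follows. This is a sketch of the formula only: weighted_average is a hypothetical name, and returning None for an all-zero frequency vector is a choice made for this example, not documented behavior.

```python
def weighted_average(frequency_vec, score_vec):
    """Pure-Python version of np.dot(freq, happs) / np.sum(freq)."""
    total = sum(frequency_vec)
    if total == 0:
        # Assumed behavior: avoid dividing by zero for an empty text.
        return None
    weighted_sum = sum(f * s for f, s in zip(frequency_vec, score_vec))
    return weighted_sum / total


# Two occurrences of a word scored 8.0 and one of a word scored 2.0.
happs = weighted_average([2, 0, 1], [8.0, 5.0, 2.0])
```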

Same as copy_static_files, but makes symbolic links.

labMTsimple.storyLab.shift(refFreq, compFreq, lens, words, sort=True)

Compute a shift, and return the results.

If sort=True, returns the three sorted lists and sumTypes. Otherwise, returns just the two shift lists and sumTypes (the words don’t need to be sorted).
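For orientation, a word shift can be sketched as a per-word contribution to the difference in average score between two texts. The version below assumes the common definition, (score - reference average) times (change in normalized frequency). It is an illustration, not a reproduction of this function: word_shift is a hypothetical name, and the sketch returns a single ranked list rather than the sorted lists and sumTypes described above.

```python
def word_shift(ref_freq, comp_freq, scores, words):
    """Rank words by their contribution to the change in average score."""
    n_ref = float(sum(ref_freq))
    n_comp = float(sum(comp_freq))
    # Normalize raw counts to relative frequencies.
    p_ref = [f / n_ref for f in ref_freq]
    p_comp = [f / n_comp for f in comp_freq]
    # Average score of the reference text.
    h_ref = sum(p * s for p, s in zip(p_ref, scores))
    # Per-word contribution: (score - reference average) * frequency change.
    contributions = [
        (s - h_ref) * (pc - pr) for s, pc, pr in zip(scores, p_comp, p_ref)
    ]
    # Largest absolute contribution first.
    order = sorted(
        range(len(words)), key=lambda i: abs(contributions[i]), reverse=True
    )
    return [(words[i], contributions[i]) for i in order]


ranked = word_shift(
    ref_freq=[1, 1, 2],
    comp_freq=[2, 0, 2],
    scores=[8.0, 2.0, 5.0],
    words=["happy", "sad", "day"],
)
```

Under this definition the contributions sum to the difference between the comparison and reference averages, which is a handy sanity check.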

labMTsimple.storyLab.shiftHtml(scoreList, wordList, refFreq, compFreq, outFile, corpus='LabMT', advanced=False, customTitle=False, title='', ref_name='reference', comp_name='comparison', ref_name_happs='', comp_name_happs='', isare='')

Make an interactive shift for exploring and sharing.

The most insane-o piece of code here (lots of file copying, writing vectors into html files, etc).

Accepts a score list, a word list, two frequency files, and the name of the HTML file to generate.

Creates the HTML file and a directory called static that hosts the supporting .js and .css files.

labMTsimple.storyLab.shiftHtmlJupyter(scoreList, wordList, refFreq, compFreq, outFile, corpus='LabMT', advanced=False, customTitle=False, title='', ref_name='reference', comp_name='comparison', ref_name_happs='', comp_name_happs='', isare='', saveFull=True, selfshift=False, bgcolor='white')

Shifter that generates HTML in two pieces, designed to work inside of a Jupyter notebook.

Saves the file under the given name (with an .html extension) and also writes filename-wrapper.html. The wrapper file has the HTML headers and everything else needed to be a standalone file. The named HTML file is just the guts of the page, because the complete markup isn’t needed inside the notebook.

labMTsimple.storyLab.shiftHtmlPreshifted(scoreList, wordList, refFreq, compFreq, outFile, corpus='LabMT', advanced=False, customTitle=False, title='', ref_name='reference', comp_name='comparison', ref_name_happs='', comp_name_happs='', isare='')

Make an interactive shift for exploring and sharing.

The most insane-o piece of code here (lots of file copying, writing vectors into html files, etc).

Accepts a score list, a word list, two frequency files, and the name of the HTML file to generate.

Creates the HTML file and a directory called static that hosts the supporting .js and .css files.

labMTsimple.storyLab.stopper(tmpVec, score_list, word_list, stopVal=1.0, ignore=[], center=5.0)

Take a frequency vector, and 0 out the stop words.

Will always remove the nig* words.

Return the 0’ed vector.
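The core of a stopper can be sketched as below: zero the frequency of any word whose score is within stopVal of center, plus any word named in ignore. This is a simplified illustration under stated assumptions. zero_stop_words is a hypothetical name, the strict inequality at the boundary is a guess, the input is copied rather than modified in place, and the set of words that the real function always removes is not reproduced.

```python
def zero_stop_words(freq_vec, score_list, word_list,
                    stop_val=1.0, ignore=(), center=5.0):
    """Return a copy of `freq_vec` with stop words set to 0."""
    ignored = set(ignore)
    zeroed = list(freq_vec)  # work on a copy; the input is left untouched
    for i, (score, word) in enumerate(zip(score_list, word_list)):
        # Stop words: scores inside the lens, or explicitly ignored words.
        if abs(score - center) < stop_val or word in ignored:
            zeroed[i] = 0
    return zeroed


counts = [10, 4, 3]
zeroed = zero_stop_words(
    counts,
    score_list=[4.98, 8.5, 1.3],
    word_list=["the", "laughter", "terrorist"],
    ignore=["terrorist"],
)
```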

labMTsimple.storyLab.stopper_mat(tmpVec, score_list, word_list, stopVal=1.0, ignore=[], center=5.0)

Take a frequency vector, and 0 out the stop words.

A sparse-aware matrix stopper. Frequency vectors are rows: [i,:]

Will always remove the nig* words.

Return the 0’ed matrix, sparse.

labMTsimple.speedy sentiDict class

class labMTsimple.speedy.sentiDict(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

An abstract class to score them all.

makeListsFromDict()

Make lists from a dict, used internally.

makeMarisaTrie(save_flag=False)

Turn a dictionary into a marisa_trie.

matcherDictBool(word)

matcherDictBool(word) just checks if a word is in the dict.

matcherTrieBool(word)

matcherTrieBool(word) just checks if a word is in the list. Returns 0 or 1.

Works for both trie types. Only one is needed to make the plots. This is only used for coverage, so don’t worry about using it with a dict.

matcherTrieDict(word, wordVec, count)

Not sure what this one does.

matcherTrieMarisa(word, wordVec, count)

Not sure what this one does.

my_marisa = (<marisa_trie.RecordTrie object>, <marisa_trie.RecordTrie object>)

Declare this globally.

openWithPath(filename, mode)

Helper function for searching for files.

scoreTrieDict(wordDict, idx=1, center=0.0, stopVal=0.0)

Score a wordDict using the dict backend.

INPUTS:

-wordDict is our favorite hash table of word and count.
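Scoring a word-count table directly, without building a vector first, might look like the sketch below. It is an assumption-laden illustration rather than the method's actual code: score_word_counts is a hypothetical name, the score dictionary is a plain word-to-score mapping, and the way center and stopVal are combined (skip words with abs(score - center) < stopVal) is inferred from the parameter names.

```python
def score_word_counts(word_counts, scores, center=0.0, stop_val=0.0):
    """Count-weighted average score over the words present in `scores`."""
    total_score = 0.0
    total_count = 0
    for word, count in word_counts.items():
        score = scores.get(word)
        if score is None:
            continue  # word is not in the sentiment dictionary
        if abs(score - center) < stop_val:
            continue  # word falls inside the stop window
        total_score += score * count
        total_count += count
    # Assumed behavior: None when no word could be scored.
    return total_score / total_count if total_count else None


word_counts = {"happy": 2, "sad": 1, "xyzzy": 5}
toy_scores = {"happy": 8.0, "sad": 2.0, "the": 5.0}
avg = score_word_counts(word_counts, toy_scores)
```

With the default stop_val of 0.0 no word is filtered, so only dictionary membership decides which counts are used.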

scoreTrieMarisa(wordDict, idx=1, center=0.0, stopVal=0.0)

Score a wordDict using the marisa_trie backend.

INPUTS:

-wordDict is our favorite hash table of word and count.

stopper(tmpVec, stopVal=1.0, ignore=[])

Take a frequency vector, and 0 out the stop words.

Will always remove the nig* words.

Return the 0’ed vector.

wordVecifyTrieDict(wordDict)

Make a word vec from word dict using dict backend.

INPUTS:

-wordDict is our favorite hash table of word and count.
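Turning a word-count table into a fixed-length word vector can be sketched as follows. The helper word_vec_from_counts and the word_index mapping (word to position in the vector) are hypothetical stand-ins for whatever index the dict backend keeps internally.

```python
def word_vec_from_counts(word_counts, word_index):
    """Build a count vector ordered by `word_index` (word -> position)."""
    vec = [0] * len(word_index)
    for word, count in word_counts.items():
        position = word_index.get(word)
        if position is not None:
            vec[position] += count
        # Words outside the dictionary have no slot and are dropped.
    return vec


word_index = {"happy": 0, "sad": 1, "day": 2}
vec = word_vec_from_counts({"day": 3, "happy": 1, "xyzzy": 7}, word_index)
```

A vector in this form lines up position by position with a score list, so it can be passed to a vector-based scorer or stopper.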

wordVecifyTrieMarisa(wordDict)

Make a word vec from word dict using marisa_trie backend.

INPUTS:

-wordDict is our favorite hash table of word and count.

The following subclasses of the sentiDict class are available:

-LabMT

-ANEW

-LIWC07

-MPQA

-OL

-WK

-LIWC01

-LIWC15

-PANASX

-Pattern

-SentiWordNet

-AFINN

-GI

-WDAL

-EmoLex

-MaxDiff

-HashtagSent

-Sent140Lex

-SOCAL

-SenticNet

-Emoticons

-SentiStrength

-VADER

-Umigon

-USent

-EmoSenticNet

These don’t get the data attributes, so we’ll just leave them out.

labMTsimple.speedy sentiDict subclasses

class labMTsimple.speedy.LabMT(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

LabMT class.

Now takes the full name of the language.

class labMTsimple.speedy.ANEW(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

ANEW class.

loadDict(bananas, lang)

Load the corpus into a dictionary, straight from the origin corpus file.

class labMTsimple.speedy.LIWC07(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

This is the default, define it anyway

class labMTsimple.speedy.MPQA(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

MPQA class.

loadDict(bananas, lang)

Load the corpus into a dictionary, straight from the origin corpus file.

class labMTsimple.speedy.OL(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

loadDict(bananas, lang)

Load the corpus into a dictionary, straight from the origin corpus file.

class labMTsimple.speedy.WK(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.LIWC01(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.LIWC15(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.PANASX(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.Pattern(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.SentiWordNet(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.AFINN(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.GI(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.WDAL(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.EmoLex(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.MaxDiff(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.HashtagSent(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.Sent140Lex(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.SOCAL(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.SenticNet(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.Emoticons(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.SentiStrength(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.VADER(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.Umigon(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.USent(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.EmoSenticNet(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

labMTsimple.speedy sentiDict subclasses auto

class labMTsimple.speedy.AFINN(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.ANEW(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

ANEW class.

loadDict(bananas, lang)

Load the corpus into a dictionary, straight from the origin corpus file.

class labMTsimple.speedy.EmoLex(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.EmoSenticNet(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.Emoticons(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.GI(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.HashtagSent(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.LIWC(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

LIWC class.

loadDict(bananas, lang)

Load the corpus into a dictionary, straight from the origin corpus file.

class labMTsimple.speedy.LIWC01(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.LIWC07(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

This is the default, define it anyway

class labMTsimple.speedy.LIWC15(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.LabMT(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

LabMT class.

Now takes the full name of the language.

class labMTsimple.speedy.MPQA(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

MPQA class.

loadDict(bananas, lang)

Load the corpus into a dictionary, straight from the origin corpus file.

class labMTsimple.speedy.MaxDiff(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.OL(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

loadDict(bananas, lang)

Load the corpus into a dictionary, straight from the origin corpus file.

class labMTsimple.speedy.PANASX(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.Pattern(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.SOCAL(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.Sent140Lex(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.SentiStrength(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.SentiWordNet(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.SenticNet(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.USent(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.Umigon(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.VADER(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.WDAL(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')
class labMTsimple.speedy.WK(datastructure='dict', stopVal=0.0, bananas=False, loadFromFile=False, saveFile=False, lang='english')

labMTsimple.speedy.u(x)

Python 2/3 agnostic unicode function
