All functions

add_id()
Add id
add_multiwords()
Find multi-word expressions.
add_word_id()
Add word_id.
clean_text()
Clean text
confuse()
Get confusion matrix
cossim2dict()
Similarity of documents to a dictionary
detect_similar_words()
Detect similar words
drop_which()
Determine which similar terms to drop
filter_ntile()
Filter by ntile.
find_distinctive()
Find distinctive keywords
find_unique_id()
Find index of unique id in df.
get_ARPF()
Get Accuracy, Recall, Precision, and F1
get_F1()
Get F1 score for a DDR measure
get_combis()
Get combinations of keywords
get_corpus_representation()
Vector representation of a corpus.
get_hits()
Get occurrence frequency of words.
get_many_F1s()
Get F1 scores for many words or dictionaries.
get_many_F1s_by_group()
Get F1 for many by a grouping variable
get_many_RPFs()
Get Recall, Precision, F1 for many
get_prediction()
Get binary prediction
get_word_representations()
Get word representations.
normalize()
Normalize a vector
prepare_train_data()
Prepare text for fastText-model-training
remove_similar_words()
Remove too similar terms
repl_na()
Replace missing values.
simil_words2rep()
Cosine similarity between words and a given vector.
tw_annot
Annotated Tweets from German politicians.
tw_data
Tweets from German politicians.