Function reference
-
add_id()
- Add id
-
add_multiwords()
- Find multi-word expressions.
-
add_word_id()
- Add word_id.
-
clean_text()
- Clean text
-
confuse()
- Get confusion matrix
-
cossim2dict()
- Similarity of documents to a dictionary
-
detect_similar_words()
- Detect similar words
-
drop_which()
- Determine which similar terms to drop
-
filter_ntile()
- Filter by ntile.
-
find_distinctive()
- Find distinctive keywords
-
find_unique_id()
- Find index of unique id in df.
-
get_ARPF()
- Get Accuracy, Recall, Precision, and F1
-
get_F1()
- Get F1 score for a DDR measure
-
get_combis()
- Get combinations of keywords
-
get_corpus_representation()
- Vector representation of a corpus.
-
get_hits()
- Get occurrence frequency of words.
-
get_many_F1s()
- Get F1 scores for many words or dictionaries.
-
get_many_F1s_by_group()
- Get F1 for many by a grouping variable
-
get_many_RPFs()
- Get Recall, Precision, and F1 for many words or dictionaries
-
get_prediction()
- Get binary prediction
-
get_word_representations()
- Get word representations.
-
normalize()
- Normalize a vector
-
prepare_train_data()
- Prepare text for fastText model training
-
remove_similar_words()
- Remove terms that are too similar
-
repl_na()
- Replace missing values.
-
simil_words2rep()
- Cosine similarity between words and a given vector.
-
tw_annot
- Annotated Tweets from German politicians.
-
tw_data
- Tweets from German politicians.