Returns the cosine similarity between each word in a data frame and a given vector representation.

Usage

simil_words2rep(word_df, word_field = "words", rep, model)

Arguments

word_df

A data.frame containing a column with words or multiword expressions.

word_field

A character string indicating the name of the column in word_df that contains the words.

rep

A word-vector representation, stored as a numeric vector, matrix, or sparse matrix object. The length (for a vector) or ncol (for a matrix) of rep must equal the number of dimensions of the fastText model used.

model

A fastText model, loaded by fastrtext::load_model().
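
The dimension requirement on rep follows from how cosine similarity is computed: the dot product of two vectors divided by the product of their norms, which is only defined when both vectors have the same length. The following is a minimal base-R sketch of that calculation, not the package's implementation. The helper name cosine_to_rep and the toy 3-dimensional vectors are assumptions made for illustration; real fastText word vectors would take the place of word_vecs.

```r
# Hypothetical helper (not part of dictvectoR): cosine similarity between
# each row of `word_vecs` and a single reference vector `rep`.
cosine_to_rep <- function(word_vecs, rep) {
  rep <- as.numeric(rep)  # accept a plain vector or a one-row base matrix
  if (ncol(word_vecs) != length(rep)) {
    stop("rep must have as many entries as the word vectors have dimensions")
  }
  dots  <- as.numeric(word_vecs %*% rep)                  # dot products
  norms <- sqrt(rowSums(word_vecs^2)) * sqrt(sum(rep^2))  # product of norms
  dots / norms
}

# Toy vectors standing in for fastText word vectors (illustrative only)
word_vecs <- rbind(same       = c(1, 2, 3),
                   orthogonal = c(3, 0, -1),
                   opposite   = c(-1, -2, -3))
rep  <- c(1, 2, 3)
sims <- cosine_to_rep(word_vecs, rep)
```

In this sketch a vector pointing in the same direction as rep scores 1, an orthogonal one scores 0, and an opposite one scores -1. An all-zero vector would yield NaN because its norm is zero.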

Examples

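library(dictvectoR)
library(dplyr)  # provides the %>% pipe

# Load the small demo fastText model shipped with the package
model <- fastrtext::load_model(system.file("extdata",
                                           "tw_demo_model_sml.bin",
                                           package = "dictvectoR"))

# Build a corpus representation from the texts annotated with pop == 1
pop_rep <- tw_annot %>%
  dplyr::filter(pop == 1) %>%
  clean_text(text_field = "full_text") %>%
  get_corpus_representation(model = model)

# Score each word by its cosine similarity to that representation
words_df <- data.frame(words = c("coronadeutschland", "skandal"))
words_df$popsimil <- simil_words2rep(words_df,
                                     word_field = "words",
                                     rep = pop_rep,
                                     model = model)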