Draw a bigram network from morphological analysis data.

Usage

draw_bigram_network(df, draw = TRUE, ...)

bigram(df, group = "sentence", depend = FALSE, term_depend = NULL, ...)

trigram(df, group = "sentence")

bigram_depend(df, group = "sentence")

bigram_network(bigram, rand_seed = 12, threshold = 100, ...)

word_freq(df, big_net, ...)

bigram_network_plot(
  big_net,
  freq,
  ...,
  arrow_size = 5,
  circle_size = 5,
  text_size = 5,
  font_family = "",
  arrow_col = "darkgreen",
  circle_col = "skyblue",
  x_limits = NULL,
  y_limits = NULL,
  no_scale = FALSE
)

Arguments

df

A data frame containing the results of morphological analysis.

draw

A logical.

...

Extra arguments to internal functions.

group

A string naming the column that identifies sentences.

depend

A logical.

term_depend

A string naming the column of dependent terms to use for bigrams.

bigram

A result of bigram().

rand_seed

A numeric.

threshold

A numeric used as the threshold for bigram frequency.

big_net

A result of bigram_network().

freq

A numeric vector of word frequencies in the bigram network. Can be obtained with word_freq().

arrow_size, circle_size, text_size

A numeric.

font_family

A string.

arrow_col, circle_col

A string specifying the arrow and circle colors in the bigram network.

x_limits, y_limits

A pair of numerics specifying the axis range.

no_scale

A logical. If FALSE, the x and y axes are not drawn.
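
To make the df and group arguments more concrete, the following base R sketch counts adjacent word pairs within each group. It is an illustration only and not the package's implementation: the helper name count_bigrams, its word argument, and the toy data are hypothetical, and it assumes that bigrams are not meant to cross group (sentence) boundaries.

```r
# Hypothetical helper (not part of the package): count adjacent word pairs
# within each group, using base R only.
count_bigrams <- function(df, word = "lemma", group = "sentence") {
  # Split the words by group so that no pair spans two groups (assumption)
  words_by_group <- split(as.character(df[[word]]), df[[group]])
  pairs <- lapply(words_by_group, function(w) {
    # Pair each word with its successor; groups of one word yield no pairs
    data.frame(word_1 = head(w, -1), word_2 = tail(w, -1),
               stringsAsFactors = FALSE)
  })
  pairs <- do.call(rbind, pairs)
  # Sum a constant 1 per pair to obtain the frequency of each distinct pair
  counts <- aggregate(freq ~ word_1 + word_2,
                      data = cbind(pairs, freq = 1L), FUN = sum)
  # Most frequent pairs first
  counts <- counts[order(-counts$freq), ]
  rownames(counts) <- NULL
  counts
}

# Toy input: two sentences, one word per row
toy <- data.frame(
  lemma    = c("a", "b", "a", "b", "c", "a", "b"),
  sentence = c(1, 1, 1, 1, 2, 2, 2),
  stringsAsFactors = FALSE
)
count_bigrams(toy)
```

The result has the same three-column shape (word_1, word_2, freq) as the $bigram element shown in the Examples, which makes it easy to compare against the output of bigram() on the same data.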

Value

A list containing df (the input), bigram, freq (word frequencies), and gg (a ggplot2 object of the bigram network plot).

Examples


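# Simulate 50 sentences of 30 one-letter "lemmas" each
sentences <- 50
len <- 30
n <- sentences * len
x <- letters
# Weight earlier letters more heavily so that some bigrams occur often
prob <- (length(x):1) ^ 3
df <-
  tibble::tibble(
    lemma = sample(x = x, size = n, replace = TRUE, prob = prob),
    sentence = rep(seq(sentences), each = len))
draw_bigram_network(df)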

#> $df
#> # A tibble: 1,500 × 2
#>    lemma sentence
#>    <chr>    <int>
#>  1 a            1
#>  2 j            1
#>  3 f            1
#>  4 b            1
#>  5 a            1
#>  6 d            1
#>  7 e            1
#>  8 c            1
#>  9 h            1
#> 10 i            1
#> # ℹ 1,490 more rows
#> 
#> $bigram
#> # A tibble: 264 × 3
#>    word_1 word_2  freq
#>    <chr>  <chr>  <int>
#>  1 b      a         25
#>  2 a      c         22
#>  3 b      d         21
#>  4 e      b         21
#>  5 d      b         20
#>  6 a      a         19
#>  7 b      g         19
#>  8 a      b         18
#>  9 b      c         18
#> 10 c      b         18
#> # ℹ 254 more rows
#> 
#> $freq
#>  [1] 10 10 10 10 10 10 10 10  8  8  8  8  6  6
#> 
#> $gg

#>
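
The most frequent bigrams in the output above involve only the first few letters of the alphabet. This follows from the sampling weights in the example, as the short base R sketch below shows. The variable name share is introduced here for illustration only. With weights (26:1)^3, "a" alone has a share of about 14%, and the five letters "a" to "e" together account for roughly 57% of all draws.

```r
# Same weights as in the example: 26^3 for "a" down to 1^3 for "z"
x <- letters
prob <- (length(x):1) ^ 3

# sample() does not require prob to sum to one; normalize by hand
# to see each letter's share of the draws
share <- setNames(prob / sum(prob), x)

# Shares of the five most heavily weighted letters
round(head(share, 5), 3)

# Their combined share
sum(head(share, 5))
```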