My problem is this.
I have a string of terms sorted by their value and separated by commas:
text = "light, device emitting, light emitting, optical, light emitting, diode, electrode, photodetector, semiconductor, device emitter, device photodetector, resin, seal, device light, semiconductor device, light emitting device, compact light emitting device, compact light emitting device , compact device for lighting a light-emitting device, LED diode of a device, device for a photocell of a device, tightness of a device, emitting type, emitting light t, emitting light-emitting light-emitting light, light emission, sealing of the light device, optical transmitter, package assembly, photocell device, photosensitive, semiconductor electrode device, semiconductor photocell device, transmitting, transmitter, type of light,type of light emitting, light emitting diode "
The terms in the variable text can be split with the base function strsplit or with the function str_split from the stringr package.
library(stringr)
str_split = strsplit(text[1], ", ")
As we see, the object str_split consists of 40 separate terms.
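Since the spacing around the commas in text is irregular (for example "device , compact" and "light,type"), here is a minimal sketch of a split that tolerates it. The shortened string `text_demo` and the names `terms_base` and `terms_stringr` are made up for illustration; the regular expression `\\s*,\\s*` is one possible choice of separator, not the only one.

```r
# Shortened stand-in for `text`, with the same irregular spacing around commas.
text_demo <- "light, device emitting, light emitting , optical,type of light "

# Base R: split on a comma plus any surrounding whitespace.
# strsplit() returns a list with one element per input string, so take [[1]],
# then trim whitespace left at the very start or end of the string.
terms_base <- trimws(strsplit(text_demo, "\\s*,\\s*")[[1]])
print(terms_base)
print(length(terms_base))

# stringr equivalent, run only if the package is installed.
if (requireNamespace("stringr", quietly = TRUE)) {
  terms_stringr <- stringr::str_trim(
    stringr::str_split(text_demo, "\\s*,\\s*")[[1]]
  )
  print(identical(terms_base, terms_stringr))
}
```

Storing the result under a name other than str_split also avoids confusion with the stringr function of the same name.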
Now I would like to extract the first 10 non-duplicate terms.
Let pocket = {light, device emitting, emitting light, optical, light, diode, electrode, photodetector, semiconductor}
In the 1st iteration: light, device emitting, emitting light, optical, light, diode, electrode, photodetector, semiconductor.
The term “light” is a subset of “light emission”, so we remove the term “light” from the pocket and add the 11th term of the variable text, that is, emit a device.
Iterations 2 through 6 proceed in the same way: each time, a term that is a subset of another term is removed from the pocket, and the next term of the variable text (the 12th through the 16th) is added.
How can I do this in R?
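A minimal sketch of the pocket procedure described above, under some assumptions that are mine and not settled by the description: "subset" is taken to mean that every word of one term also occurs in the other term, an exact duplicate counts as redundant (the later copy is dropped), and the loop stops when no redundant term is left or the terms run out. The function names `term_words`, `is_subset`, `find_redundant` and `first_distinct_terms` are hypothetical.

```r
# Words of a term, lower-cased; anything that is not a letter or digit
# (spaces, hyphens) separates words. This is an assumption about "subset".
term_words <- function(term) {
  words <- strsplit(tolower(trimws(term)), "[^a-z0-9]+")[[1]]
  unique(words[nzchar(words)])
}

# TRUE if every word of `a` also occurs in `b`.
is_subset <- function(a, b) all(term_words(a) %in% term_words(b))

# Index of the first redundant pocket term, or NA if there is none.
# A term is redundant if its words are a proper subset of another pocket
# term, or if it has the same words as an earlier pocket term (duplicate).
find_redundant <- function(pocket) {
  for (i in seq_along(pocket)) {
    for (j in seq_along(pocket)) {
      if (i == j) next
      a_in_b <- is_subset(pocket[i], pocket[j])
      b_in_a <- is_subset(pocket[j], pocket[i])
      if (a_in_b && (!b_in_a || j < i)) return(i)
    }
  }
  NA_integer_
}

# Start with the first n terms; while a redundant term exists, remove it
# and append the next unused term (if any are left).
first_distinct_terms <- function(terms, n = 10) {
  pocket <- head(terms, n)
  nxt <- length(pocket) + 1
  repeat {
    i <- find_redundant(pocket)
    if (is.na(i)) break
    pocket <- pocket[-i]
    if (nxt <= length(terms)) {
      pocket <- c(pocket, terms[nxt])
      nxt <- nxt + 1
    }
  }
  pocket
}

# Small demonstration with a pocket of size 3.
demo_terms <- c("light", "device emitting", "light emitting",
                "optical", "light emitting", "diode")
print(first_distinct_terms(demo_terms, n = 3))
```

For the full problem the call would be `first_distinct_terms(terms, n = 10)`, where `terms` is the trimmed character vector obtained by splitting text. If the term list runs out before the pocket is free of redundant terms, the result has fewer than n entries.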