Is there a way to preserve the punctuation marks !, ?, " and ' from my text documents using the parameters of CountVectorizer or TfidfVectorizer in scikit-learn?
Thanks in advance.
You need to set the token_pattern parameter when instantiating the vectorizer. For instance:
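from sklearn.feature_extraction.text import CountVectorizer

# Keep words of two or more characters, plus the marks ! ? " ' as separate tokens
vect = CountVectorizer(token_pattern=r"(?u)\b\w\w+\b|!|\?|\"|\'")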
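To check which tokens a pattern like this produces before fitting anything, you can try it with the standard re module. This is only a rough sketch: it assumes the vectorizer's default behaviour of lowercasing the text and then applying token_pattern with a regex findall, and the helper name preview_tokens is made up for illustration.

```python
import re

# Same pattern as above: words of 2+ characters, or one of ! ? " '
TOKEN_PATTERN = r"(?u)\b\w\w+\b|!|\?|\"|\'"

def preview_tokens(text):
    """Approximate the default tokenization: lowercase, then regex findall.

    Hypothetical helper, not part of scikit-learn.
    """
    return re.findall(TOKEN_PATTERN, text.lower())

tokens = preview_tokens("Wow! Is it 'good'?")
print(tokens)
# ['wow', '!', 'is', 'it', "'", 'good', "'", '?']
```

Each punctuation mark comes out as its own token, so it gets its own column in the resulting matrix. Single-character words are still dropped, as with the default pattern.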
Source: https://habr.com/ru/post/1653187/