Longest common substring for more than two strings in PowerShell?

How can I find the substring that all strings in a string array have in common in PowerShell?

Example:

$Arr = "1 string first", "2 string second", "3 string third", "4 string fourth" 

Using this example, I want this to be returned:

 " string " 

I want to use this to find the part that a set of file names has in common and then strip that part from each name (for example, removing the artist name from a set of mp3 files), without having to specify manually which part to replace.

4 answers
    $arr = "qdfbsqds", "fbsqdt", "bsqda"
    $arr | %{
        $substr = for ($s = 0; $s -lt $_.length; $s++) {
            for ($l = 1; $l -le ($_.length - $s); $l++) {
                $_.substring($s, $l);
            }
        }
        $substr | %{$_.toLower()} | select -unique
    } | group | ?{$_.count -eq $arr.length} | sort {$_.name.length} | select -expand name -l 1
    # returns bsqd
  • create a list of all unique substrings of each input string
  • keep only the substrings that occur in every input string
  • sort the remaining substrings by length
  • return the last (i.e. longest) item of this list
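
The steps above enumerate every substring of every input, which grows quickly with string length. A leaner variant of the same idea is sketched below: it takes candidates only from the shortest string, longest first, and stops at the first one found in all inputs. The function name `Get-CommonSubstring` is hypothetical, and the lower-casing is an assumption made to mirror the `toLower()` call above.

```powershell
function Get-CommonSubstring {
    param([string[]]$Strings)

    # Work in lower case so matching is case-insensitive (assumption, see above).
    $lower = @($Strings | ForEach-Object { $_.ToLowerInvariant() })
    # Any common substring must fit inside the shortest input.
    $shortest = $lower | Sort-Object Length | Select-Object -First 1

    # Try candidate lengths from longest to shortest.
    for ($len = $shortest.Length; $len -ge 1; $len--) {
        for ($start = 0; $start -le $shortest.Length - $len; $start++) {
            $candidate = $shortest.Substring($start, $len)
            # Collect the inputs that do NOT contain the candidate.
            $misses = @($lower | Where-Object { -not $_.Contains($candidate) })
            if ($misses.Count -eq 0) { return $candidate }
        }
    }
    return ''
}

Get-CommonSubstring "qdfbsqds", "fbsqdt", "bsqda"
```

When several common substrings share the maximum length, this sketch returns the one that starts earliest in the shortest string, which may differ from the pipeline above.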

If the common part (artist name, etc.) is just a single word:

    $Arr = "1 string first", "2 string second", "3 string third", "4 string fourth"
    $common = $Arr | %{ $_.split() } | group | sort -property count | select -last 1 | select -expand name
    $common = " {0} " -f $common

Update:

An implementation that works for multiple words (finding the longest common substring of words):

    $arr = "1 string a first", "2 string a second", "3 string a third", "4 string a fourth"
    $common = $arr | %{
        $words = $_.split()
        $noOfWords = $words.length
        for($i=0;$i -lt $noOfWords;$i++){
            for($j=$i;$j -lt $noOfWords;$j++){
                $words[$i..$j] -join " "
            }
        }
    } | group | sort -property count,name | select -last 1 | select -expand name
    $common = " {0} " -f $common
    $common

Here is a "Longest Common Substring" function for two strings in PowerShell (based on the Wikibooks C# example):

    Function get-LongestCommonSubstring
    {
        Param(
            [string]$String1,
            [string]$String2
        )
        # Nothing to compare if either string is empty.
        # "return" is used instead of "Break", which would also exit a loop in the caller.
        if((!$String1) -or (!$String2)){return}
        # .Net Two dimensional Array:
        $Num = New-Object 'object[,]' $String1.Length, $String2.Length
        [int]$maxlen = 0
        [int]$lastSubsBegin = 0
        $sequenceBuilder = New-Object -TypeName "System.Text.StringBuilder"

        for ([int]$i = 0; $i -lt $String1.Length; $i++)
        {
            for ([int]$j = 0; $j -lt $String2.Length; $j++)
            {
                if ($String1[$i] -ne $String2[$j])
                {
                    $Num[$i, $j] = 0
                }else{
                    if (($i -eq 0) -or ($j -eq 0))
                    {
                        $Num[$i, $j] = 1
                    }else{
                        $Num[$i, $j] = 1 + $Num[($i - 1), ($j - 1)]
                    }
                    if ($Num[$i, $j] -gt $maxlen)
                    {
                        $maxlen = $Num[$i, $j]
                        [int]$thisSubsBegin = $i - $Num[$i, $j] + 1
                        if($lastSubsBegin -eq $thisSubsBegin)
                        {
                            #if the current LCS is the same as the last time this block ran
                            [void]$sequenceBuilder.Append($String1[$i]);
                        }else{
                            #this block resets the string builder if a different LCS is found
                            $lastSubsBegin = $thisSubsBegin
                            $sequenceBuilder.Length = 0 #clear it
                            [void]$sequenceBuilder.Append($String1.Substring($lastSubsBegin, (($i + 1) - $lastSubsBegin)))
                        }
                    }
                }
            }
        }
        return $sequenceBuilder.ToString()
    }

To use this for more than two strings, wrap it as follows:

    Function get-LongestCommonSubstringArray
    {
        Param(
            [Parameter(Position=0, Mandatory=$True)][Array]$Array
        )
        $PreviousSubString = $Null
        $LongestCommonSubstring = $Null
        foreach($SubString in $Array)
        {
            if($LongestCommonSubstring)
            {
                $LongestCommonSubstring = get-LongestCommonSubstring $SubString $LongestCommonSubstring
                write-verbose "Consecutive diff: $LongestCommonSubstring"
            }else{
                if($PreviousSubString)
                {
                    $LongestCommonSubstring = get-LongestCommonSubstring $SubString $PreviousSubString
                    write-verbose "first one diff: $LongestCommonSubstring"
                }else{
                    $PreviousSubString = $SubString
                    write-verbose "No PreviousSubstring yet, setting it to: $PreviousSubString"
                }
            }
        }
        Return $LongestCommonSubstring
    }

    get-LongestCommonSubstringArray $Arr -verbose

If I understand your question:

    $Arr = "1 string first", "2 string second", "3 string third", "4 string fourth"
    $Arr -match " string " | foreach {$_ -replace " string ", " "}
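
The snippet above hard-codes " string ", while the question asks to avoid naming the part manually. A minimal sketch of the removal step with a computed value follows; it assumes `$common` already holds the shared substring (it is set by hand here purely for illustration). Because `-replace` interprets its pattern as a regular expression, the text is escaped first, which matters for file names containing characters such as `(`, `.` or `[`.

```powershell
$Arr = "1 string first", "2 string second", "3 string third", "4 string fourth"
$common = " string "   # assumption: computed beforehand, set by hand for this sketch

# -replace treats its pattern as a regex, so escape the literal text first.
$pattern = [regex]::Escape($common)
$renamed = $Arr | ForEach-Object { $_ -replace $pattern, " " }
$renamed
```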

Source: https://habr.com/ru/post/901919/

