Notepad++: check for duplicate lines

Example

40,000+ lines of the following type:

GUID: 0981723409871243 

Search the whole file for duplicate GUIDs.

Example:

    GUID: 124432408213
    GUID: 08917234071423
    GUID: 0189742381
    GUID: 08917234071423
    GUID: 0817423423
    GUID: 124432408213

I have TextFX and Compare, but how would I find these? In the example there are two copies of 124432408213 and two of 08917234071423.

With 40,000 lines that may contain duplicates, I cannot easily spot them by eye. I need a way to find the duplicates.

It should be something like: take the text after "GUID:" on one line, search the following lines for it, then repeat for each GUID. I could write a special program to do this, but I would rather avoid that, since TextFX is quite powerful. I just don't see a way to do something like this.

I need to add a little more information here:

[block1] guid: ???? more info: ??? [/block1]

This is how each block is formatted.

2 answers

Use TextFX to sort the input lines, keeping duplicates. Then do a regular expression search with "Bookmark line" checked on the Mark tab. The search text should be ^(GUID:\s*\d+\r\n)\1 , then click "Mark All". Next use Menu => Search => Bookmark => Remove Unmarked Lines to delete everything except the duplicates, or use Menu => Search => Bookmark => Copy Bookmarked Lines and paste the wanted lines into another buffer. If four or more lines are identical, the steps above can leave one entry for every pair; another TextFX sort pass, this time removing duplicates, should remove the surplus.
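For readers who want to see what the sort-then-match steps do outside the editor, here is a minimal Python sketch of the same idea. It is an illustration under assumptions, not part of the original answer: the function name duplicated_guid_lines is hypothetical, and the pattern uses "\n" where the Notepad++ pattern uses "\r\n".

```python
import re

# Same idea as the Notepad++ pattern, with "\n" in place of "\r\n".
# It matches a GUID line that is immediately followed by an identical line.
PAIR = re.compile(r"^(GUID:\s*\d+\n)\1", re.MULTILINE)


def duplicated_guid_lines(lines):
    """Sort the lines, then return each GUID line that occurs more than once."""
    # Strip first so stray whitespace does not separate identical lines.
    text = "".join(line + "\n" for line in sorted(l.strip() for l in lines))
    hits = [m.group(1).rstrip("\n") for m in PAIR.finditer(text)]
    # Four or more identical lines give one hit per pair, so collapse them,
    # like the extra "sort removing duplicates" pass in the answer.
    return sorted(set(hits))


sample = [
    "GUID: 124432408213",
    "GUID: 08917234071423",
    "GUID: 0189742381",
    "GUID: 08917234071423",
    "GUID: 0817423423",
    "GUID: 124432408213",
]
print(duplicated_guid_lines(sample))
```

The set at the end plays the role of the final de-duplicating sort: without it, a GUID that appears four times would be reported twice.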

For the case of [block1] guid: ???? more info: ??? [/block1] the regex is more complex, but ^(\[block1\] guid:\s*\d+ more info:\s*\d+ \[/block1\]\r\n)\1 finds and marks the duplicates in:

    [block1] guid: 1234 more info: 5678 [/block1]
    [block1] guid: 1235 more info: 5678 [/block1]
    [block1] guid: 1235 more info: 5678 [/block1]
    [block1] guid: 1236 more info: 5678 [/block1]
    [block1] guid: 1236 more info: 5678 [/block1]
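If the blocks are not already sorted, or only the guid should be compared rather than the whole line, a small script is another option. The following is a rough sketch only: it assumes each block sits on one line and that both fields are plain digits, as in the sample above, and the name duplicate_guids_in_blocks is made up for illustration.

```python
import re
from collections import defaultdict

# Assumed layout: one block per line, digits-only guid and "more info" fields.
BLOCK = re.compile(
    r"\[block1\]\s*guid:\s*(\d+)\s+more info:\s*(\d+)\s*\[/block1\]"
)


def duplicate_guids_in_blocks(text):
    """Map each guid that appears more than once to its 1-based line numbers."""
    seen = defaultdict(list)
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = BLOCK.search(line)
        if match:
            seen[match.group(1)].append(lineno)
    return {guid: nums for guid, nums in seen.items() if len(nums) > 1}


blocks = """\
[block1] guid: 1234 more info: 5678 [/block1]
[block1] guid: 1235 more info: 5678 [/block1]
[block1] guid: 1235 more info: 5678 [/block1]
[block1] guid: 1236 more info: 5678 [/block1]
[block1] guid: 1236 more info: 5678 [/block1]
"""
print(duplicate_guids_in_blocks(blocks))
```

Unlike the backreference pattern, this keys on the guid alone, so two blocks with the same guid but different "more info" values are still reported.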

On Linux or similar, a command such as sort inputFileName | uniq -c | grep -v "^\s*1\s" or sort inputFileName | uniq -d should work, depending on which commands and options are available.
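Where those commands are not available, for example on a plain Windows machine, roughly the same result can be had from a few lines of Python. This is a sketch of an equivalent, not something from the original answer, and duplicate_counts is a hypothetical name.

```python
from collections import Counter


def duplicate_counts(lines):
    """Count identical lines and keep only those seen more than once.

    Roughly what `sort | uniq -c` followed by dropping the count-1 rows does.
    """
    counts = Counter(line.rstrip("\r\n") for line in lines)
    return {line: n for line, n in counts.items() if n > 1}


sample_lines = [
    "GUID: 124432408213\n",
    "GUID: 08917234071423\n",
    "GUID: 0189742381\n",
    "GUID: 08917234071423\n",
    "GUID: 0817423423\n",
    "GUID: 124432408213\n",
]
print(duplicate_counts(sample_lines))
```

To run it over a real file, an open file object can be passed directly, as in with open("inputFileName") as f: duplicate_counts(f). No sorting is needed because the counter groups identical lines wherever they occur.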


Although my answer may not help you directly: copy your lines into two new tabs, then use TextFX to sort tab 1 keeping duplicates and to sort tab 2 into a unique list. Then move tab 2 to the other view and, finally, use Compare.


Source: https://habr.com/ru/post/946593/

