I use Python's difflib library to find where two documents differ. The Differ().compare() method does this, but very slowly: for large HTML documents it is at least 100x slower than diff.
How can I efficiently determine where two documents differ in Python? (Ideally, I am after the position and the text itself, like what SequenceMatcher().get_opcodes() returns.)
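For reference, the kind of result the question describes can be sketched with SequenceMatcher working on whole lines instead of Differ. This is only an illustration: the helper name changed_regions and the sample lines are made up here, and whether it is fast enough for large HTML documents would have to be measured.

```python
import difflib

def changed_regions(a_lines, b_lines):
    """Return the (tag, i1, i2, j1, j2) opcodes for every non-equal run.

    a_lines[i1:i2] is the old text and b_lines[j1:j2] the new text.
    """
    # autojunk=False switches off difflib's "popular element" heuristic,
    # which can otherwise treat frequently repeated lines as junk.
    matcher = difflib.SequenceMatcher(None, a_lines, b_lines, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]

# Hypothetical sample input: one line replaced by two.
old = ["<html>\n", "<p>one</p>\n", "<p>two</p>\n", "</html>\n"]
new = ["<html>\n", "<p>one</p>\n", "<p>2</p>\n", "<p>three</p>\n", "</html>\n"]

for tag, i1, i2, j1, j2 in changed_regions(old, new):
    # Prints the position of each change together with the affected text.
    print(tag, (i1, i2), old[i1:i2], (j1, j2), new[j1:j2])
```

The indices are positions in the line lists, so they give the line ranges directly; a character offset can be derived by summing the lengths of the preceding lines.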
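    # Compare two files line by line and report every line that differs.
    # Lines are compared pairwise, so this does not resynchronise after an
    # inserted or deleted line, and it stops at the end of the shorter file.
    with open("file1.txt", "rb") as fa, open("file2.txt", "rb") as fb:
        pos = 0  # byte offset of the current line within file1.txt
        for count, (al, bl) in enumerate(zip(fa, fb), start=1):
            if al != bl:
                print("files differ on line %d, byte %d" % (count, pos))
            pos += len(al)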
Google's diff API has a Python implementation and may also be usable for HTML documents.
An ugly and stupid solution: if diff is faster, use it. Call it from Python via subprocess and parse the command's output for the information you need. It will not be as fast as diff on its own, but it may well be faster than difflib.
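A minimal sketch of that approach, assuming a POSIX-style diff command is available on PATH and produces its default ("normal") output format. The function names parse_normal_diff and diff_files are made up for this illustration.

```python
import re
import subprocess

# Hunk headers in diff's normal output look like "3c3", "5,7d4" or "8a9,10".
HUNK_RE = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")

def parse_normal_diff(output):
    """Turn normal-format diff output into (op, a1, a2, b1, b2) tuples.

    op is 'a' (add), 'c' (change) or 'd' (delete); line numbers are
    1-based and inclusive, exactly as diff prints them.
    """
    hunks = []
    for line in output.splitlines():
        m = HUNK_RE.match(line)
        if m:  # content lines start with '<', '>' or '---' and never match
            a1, a2, op, b1, b2 = m.groups()
            hunks.append((op, int(a1), int(a2 or a1), int(b1), int(b2 or b1)))
    return hunks

def diff_files(path_a, path_b):
    """Run the external diff command on two files and parse its output."""
    proc = subprocess.run(["diff", path_a, path_b],
                          capture_output=True, text=True)
    # diff exits with 0 for identical files, 1 for differences, >1 on trouble.
    if proc.returncode > 1:
        raise RuntimeError(proc.stderr)
    return parse_normal_diff(proc.stdout)
```

Calling diff_files("file1.txt", "file2.txt") would return the changed line ranges; the changed text itself can then be read from the two files using those ranges.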