I came up with a separation and conquest algorithm for this. Just wanted to know if this would work or not?
The first average is calculated from the integer range, that is (0+ (1 <32-1)) → 1, and then this idea is applied: the range of the number from the beginning to the middle or from the middle to the end will always be less than the numbers that we will consider, because we are considering billions of numbers, and there will certainly be some numbers that will be repeated, since the range of a 32-bit integer is much less than a billion numbers.
def get_duplicate(input, start, end):
while True:
mid = (start >> 1) + end - (end >> 1)
less_to_mid = 0
more_to_mid = 0
equal_to_mid = 0
for data in input:
data = int(data, 16)
if data < mid:
less_to_mid += 1
elif data == mid:
equal_to_mid += 1
else:
more_to_mid += 1
if equal_to_mid > 1:
return mid
elif mid-start < less_to_mid:
end = mid-1
elif end-mid < more_to_mid:
start = mid+1
with open("codes\output.txt", 'r+') as f:
content = f.read().split()
print(get_duplicate(content, 0, 1<<32-1))
I know that we can use a bit array, but I just want to get your views on this solution and if the implementation is buggy.