In the code snippet below, train_dataset, test_dataset, and valid_dataset are of type numpy.ndarray.
    def check_overlaps(images1, images2):
        images1.flags.writeable = False
        images2.flags.writeable = False
        print(type(images1))
        print(type(images2))
        start = time.clock()
        hash1 = set([hash(image1.data) for image1 in images1])
        hash2 = set([hash(image2.data) for image2 in images2])
        all_overlaps = set.intersection(hash1, hash2)
        return all_overlaps, time.clock()-start

    r, exectime = check_overlaps(train_dataset, test_dataset)
    print("# overlaps between training and test sets:", len(r), "execution time:", exectime)
    r, exectime = check_overlaps(train_dataset, valid_dataset)
    print("# overlaps between training and validation sets:", len(r), "execution time:", exectime)
    r, exectime = check_overlaps(valid_dataset, test_dataset)
    print("# overlaps between validation and test sets:", len(r), "execution time:", exectime)
But it gives the following error (formatted as code to make it readable):
    ValueError                                Traceback (most recent call last)
    <ipython-input-14-337e73a1cb14> in <module>()
         12     return all_overlaps, time.clock()-start
         13
    ---> 14 r, exectime = check_overlaps(train_dataset, test_dataset)
         15 print("# overlaps between training and test sets:", len(r), "execution time:", exectime)
         16 r, exectime = check_overlaps(train_dataset, valid_dataset)

    <ipython-input-14-337e73a1cb14> in check_overlaps(images1, images2)
          7     print(type(images2))
          8     start = time.clock()
    ----> 9     hash1 = set([hash(image1.data) for image1 in images1])
         10     hash2 = set([hash(image2.data) for image2 in images2])
         11     all_overlaps = set.intersection(hash1, hash2)

    <ipython-input-14-337e73a1cb14> in <listcomp>(.0)
          7     print(type(images2))
          8     start = time.clock()
    ----> 9     hash1 = set([hash(image1.data) for image1 in images1])
         10     hash2 = set([hash(image2.data) for image2 in images2])
         11     all_overlaps = set.intersection(hash1, hash2)

    ValueError: memoryview: hashing is restricted to formats 'B', 'b' or 'c'
Now the problem is that I don't even know what the error means, let alone how to go about correcting it. Any help, please?
The problem is that this method of hashing arrays only works with Python 2. Therefore, your code fails as soon as you try to compute hash(image1.data). The error message tells you that only memoryviews of the formats unsigned bytes ('B'), bytes ('b') or single bytes ('c') are supported, and I have not found a way to get such a view out of a np.ndarray without copying. The only way I came up with includes copying the array, which might not be feasible in your application depending on the amount of data. That being said, you can try to change your function to:
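    import time

    def check_overlaps(images1, images2):
        # time.clock() was removed in Python 3.8; time.perf_counter() replaces it.
        start = time.perf_counter()
        # tobytes() copies each image into an immutable, hashable bytes object.
        hash1 = set([hash(image1.tobytes()) for image1 in images1])
        hash2 = set([hash(image2.tobytes()) for image2 in images2])
        all_overlaps = set.intersection(hash1, hash2)
        return all_overlaps, time.perf_counter()-start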
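To see both the failure and the workaround in one place, here is a minimal self-contained sketch. The arrays `train` and `test` are hypothetical stand-ins for the real datasets (three and two tiny 2x2 float32 "images"), and the set comprehensions are just a more compact spelling of the same tobytes() idea, so treat it as an illustration under those assumptions, not as a tuned implementation.

```python
import time

import numpy as np


def check_overlaps(images1, images2):
    """Return the hashes found in both image stacks and the elapsed seconds."""
    start = time.perf_counter()
    # tobytes() copies each image into an immutable bytes object, which is hashable.
    hash1 = {hash(image.tobytes()) for image in images1}
    hash2 = {hash(image.tobytes()) for image in images2}
    return hash1 & hash2, time.perf_counter() - start


# Hypothetical stand-in data: 3 "images" of 2x2 float32 pixels each.
train = np.arange(12, dtype=np.float32).reshape(3, 2, 2)
# The second stack reuses train[1], so exactly one image is shared.
test = np.stack([train[1], np.full((2, 2), 99, dtype=np.float32)])

# Reproduce the original failure: the buffer of a float32 array is not a
# byte-format memoryview, so Python 3 refuses to hash it.
train.flags.writeable = False
view_format = train[0].data.format
try:
    hash(train[0].data)
    direct_hash_failed = False
except ValueError:
    direct_hash_failed = True

overlaps, elapsed = check_overlaps(train, test)
print("memoryview format:", view_format)
print("direct hash failed:", direct_hash_failed)
print("# overlaps between train and test:", len(overlaps))
```

Note that the returned set holds hash values, not images, so two different images could in principle collide; if exact duplicates matter, comparing the raw bytes of the candidates afterwards would rule that out.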