drro183 | Posted on June 4, 2012, 8:48 p.m.

I know it's pretty elementary, but I was asked for a solution to ensure there were no duplicates in an array. I suggested the following:

....Loop through array

........Traverse tree from the root // build the tree as you go

............If array_element > tree_node

................Descend right // insert as tree_node.right if no right child

............Else if array_element < tree_node

................Descend left // insert as tree_node.left if no left child

............Else

................// Match found...omit duplicate!

........End traversal

....End Loop

Thoughts? Worst case being n*ln(n), right?

Dragontamer5788 | Posted on June 4, 2012, 11:13 p.m.

Yup, as long as the tree stays balanced (a plain binary search tree degrades to O(n^2) on already-sorted input). Now the important thing is how easily you can do this in practice. In C++, it's as simple as:

std::set<int> seen; // naming the variable "set" would shadow the type

for (int i = 0; i < array_length; i++) {

    seen.insert(array[i]); // insert silently ignores duplicates

}

In C++, set is typically implemented as a Red/Black tree (a self-balancing binary search tree, which keeps insertion at O(log n) even in the worst case). In Java, you can use HashSet or TreeSet. In Python, use a set (or dict keys), and in PHP use array keys.

--

BlazBlue Main: Noel Vermilion. Chain revolver -> Bloom Trigger

http://img198.imageshack.us/img198/2153/screenshot1pg.png

drro183 | Posted on June 7, 2012, 6:03 p.m.

Thanks! I will definitely look into implementing a practical example. Pseudo-code was sufficient for now. Thanks again.

CC Ricers | Posted on June 8, 2012, 10:47 a.m.

Would you recommend this for floating-point types? For instance, I need to remove duplicate 3-dimensional vertices, and adding to the complexity, they need to be compared as a structure of three floats.

--

Fan of **GameMarx** Indie Game Reviews- where gamers control the means of production

movingshadow.com/Web_2002/Previous/Docs/Albums/tsung.htm

ShadowHunter | Posted on June 9, 2012, 10:41 p.m.

*Would you recommend this for floating-point types? For instance, I need to remove duplicate 3-dimensional vertices, and adding to the complexity, they need to be compared as a structure of three floats.*

If the vertices you're removing are exactly the same (exact equality is often problematic for float types), then you can use one of the trivial solutions for removing duplicates (std::unordered_set, or std::sort + std::unique).
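For a struct of three floats you have to supply the comparison yourself, since there's no built-in ordering or hash for it. A minimal sketch of the std::sort + std::unique route, assuming a hypothetical `Vertex` struct (adjust to your real type):

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical vertex type -- substitute your actual structure.
struct Vertex {
    float x, y, z;
};

// Remove exactly-equal duplicates: sort lexicographically, then std::unique
// collapses adjacent equal elements and erase trims the leftovers.
void removeExactDuplicates(std::vector<Vertex>& verts) {
    auto less = [](const Vertex& a, const Vertex& b) {
        return std::tie(a.x, a.y, a.z) < std::tie(b.x, b.y, b.z);
    };
    auto equal = [](const Vertex& a, const Vertex& b) {
        return a.x == b.x && a.y == b.y && a.z == b.z;
    };
    std::sort(verts.begin(), verts.end(), less);
    verts.erase(std::unique(verts.begin(), verts.end(), equal), verts.end());
}
```

Note this also reorders the vertices, which matters if index buffers point into the array.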

But if the duplicate vertices need to be detected within a certain distance of each other, then you need to use spatial hashing instead.

Consider a 3-dimensional grid where each cell is at least as wide as the tolerance distance: the maximum distance two vertices can be apart and still be regarded as equal. For each vertex you can generate its (x,y,z) grid location, and also the coordinates of any grid cells the vertex's "sphere of influence" may be overlapping. Because the grid cell width is greater than the tolerance distance, this operation is strictly O(1), and each vertex will occupy either 1, 2, 4, or 8 cells.

Now you can hash the set of cells a given vertex occupies, and do a proper distance check against all vertices that occupy those locations in the hash table. Vertices which are overlapping can be tagged, removed, etc.

This is a very fast way to determine overlaps/duplicates, and provided the grid size is chosen sensibly and the hash table isn't overcrowded, the running time is ~ O(n) for a set of n vertices.
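A minimal sketch of that scheme, simplified to check the 3x3x3 block of cells around each vertex's own cell (equivalent to the sphere-of-influence test when the cell width is at least the tolerance). `Vertex` and the key-packing scheme are illustrative assumptions, not from the post:

```cpp
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Vertex { float x, y, z; };  // hypothetical vertex type

// Pack integer cell coordinates into one 64-bit key. Collisions only add
// extra candidates to distance-check, so correctness is unaffected.
static uint64_t cellKey(int64_t cx, int64_t cy, int64_t cz) {
    return (uint64_t)(cx & 0x1FFFFF) | ((uint64_t)(cy & 0x1FFFFF) << 21)
         | ((uint64_t)(cz & 0x1FFFFF) << 42);
}

// Keep only vertices with no earlier-kept vertex within `tol`.
// Expected ~O(n) when the grid cell width equals the tolerance.
std::vector<Vertex> dedupWithinTolerance(const std::vector<Vertex>& verts,
                                         float tol) {
    std::unordered_map<uint64_t, std::vector<size_t>> grid;  // cell -> kept indices
    std::vector<Vertex> kept;
    float tol2 = tol * tol;
    for (const Vertex& v : verts) {
        int64_t cx = (int64_t)std::floor(v.x / tol);
        int64_t cy = (int64_t)std::floor(v.y / tol);
        int64_t cz = (int64_t)std::floor(v.z / tol);
        bool duplicate = false;
        // Proper distance check against kept vertices in neighboring cells.
        for (int dx = -1; dx <= 1 && !duplicate; dx++)
            for (int dy = -1; dy <= 1 && !duplicate; dy++)
                for (int dz = -1; dz <= 1 && !duplicate; dz++) {
                    auto it = grid.find(cellKey(cx + dx, cy + dy, cz + dz));
                    if (it == grid.end()) continue;
                    for (size_t i : it->second) {
                        float ex = kept[i].x - v.x;
                        float ey = kept[i].y - v.y;
                        float ez = kept[i].z - v.z;
                        if (ex * ex + ey * ey + ez * ez <= tol2) {
                            duplicate = true;
                            break;
                        }
                    }
                }
        if (!duplicate) {
            grid[cellKey(cx, cy, cz)].push_back(kept.size());
            kept.push_back(v);
        }
    }
    return kept;
}
```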

PTP2009 | Posted on June 9, 2012, 10:54 p.m.

Or you can just do the O(n^2) algorithm (check each element against every other element) and remove one of the pair whenever the distance is less than a specified threshold.
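For small inputs that's perfectly fine and hard to get wrong. A sketch, again with a hypothetical `Vertex` type:

```cpp
#include <vector>

struct Vertex { float x, y, z; };  // hypothetical vertex type

// Brute-force O(n^2) dedup: keep a vertex only if no already-kept
// vertex lies within `tol` of it.
std::vector<Vertex> dedupBruteForce(const std::vector<Vertex>& verts,
                                    float tol) {
    std::vector<Vertex> kept;
    float tol2 = tol * tol;  // compare squared distances, no sqrt needed
    for (const Vertex& v : verts) {
        bool duplicate = false;
        for (const Vertex& k : kept) {
            float dx = k.x - v.x, dy = k.y - v.y, dz = k.z - v.z;
            if (dx * dx + dy * dy + dz * dz <= tol2) {
                duplicate = true;
                break;
            }
        }
        if (!duplicate) kept.push_back(v);
    }
    return kept;
}
```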

DevsBro | Posted on June 13, 2012, 8:21 p.m.

Or you can sort in O(n log n) (or better), then just compare every pair of adjacent elements in O(n), so the total is dominated by the sort: O(n log n) for a comparison sort, or O(n) if a radix-style sort applies.
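For one-dimensional data that can be sketched like this (note each value is only compared against the previously kept one, so long chains of nearby values collapse to a single representative):

```cpp
#include <algorithm>
#include <vector>

// Sort, then drop each value within `tol` of the previously kept one.
// Works for 1-D data, where sorting puts near-equal values adjacent.
std::vector<float> dedupSorted(std::vector<float> vals, float tol) {
    std::sort(vals.begin(), vals.end());  // O(n log n)
    std::vector<float> kept;
    for (float v : vals)                  // single O(n) pass
        if (kept.empty() || v - kept.back() > tol)
            kept.push_back(v);
    return kept;
}
```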

--

http://img198.imageshack.us/img198/6457/rlonmaien.jpg http://i.imgur.com/7OKFr.jpg http://img846.imageshack.us/img846/8706/capturemo.png

CC Ricers | Posted on June 15, 2012, 8:31 a.m.

Yeah, I would definitely not prefer the O(n^2) algorithm, because I would be checking tens of thousands of vertices, and that would slow down quickly as the count grows.

That's the way I have it now. I improved it a bit by breaking the load into chunks, updating the screen after each chunk is done and remembering where it left off on each frame. If I can apply a better algorithm, I can make a loading screen that is much more fluid and responsive to the user, to show progress while reading the files.

--

Fan of **GameMarx** Indie Game Reviews- where gamers control the means of production

movingshadow.com/Web_2002/Previous/Docs/Albums/tsung.htm

ShadowHunter | Posted on June 16, 2012, 6:51 p.m.

Are the duplicates in your array exactly the same though? If so, you can just sort + remove, or if you need it even faster, use one of the unordered_set implementations.

PTP2009 | Posted on June 17, 2012, 7:50 a.m.

I don't think there's a way to sort 3-dimensional points such that neighboring ones will always appear next to each other in the sorted array. In general, yes, you could sort and remove neighbors, but I don't think that works with this data.