The trivial approach is something like this:
Code: Select all
for (uint32_t a = 0; a < max_objects; a++) {
for (uint32_t b = 0; b < max_objects; b++) {
collide(a, b);
}
}Code: Select all
for (uint32_t b = a; b < max_objects; b++)Fast forward through 35+ years of collision detection research papers. Feel free to read some of them. Graphics Gems volumes 1 through 5 would be a good place to start, followed by the proceedings of the various computer graphics conferences.
So, as I've outlined in previous posts, we're keeping our objects in a (conceptual) fixed-size array, preallocated for the maximum number of objects in the game. And we're indexing them based on morton code, effectively a linear encoding of the depth-first traversal of a quadtree[1] data structure. So, we can use the morton code to find candidates for collision detection, which massively improves average complexity to something like O(logn^2), although the worst case stays the same.
That's what we're going to do.
The gist of this is that, for every object, we need to find its morton code, the morton codes of a number of other cells that may contain candidates for collision detection, and *only* perform the relatively expensive collide() funcion on the (hopefully) tiny subset of all game objects with one of those morton codes.
Simple, right?
Actually, no. Not so simple.
So, let's assume, for simplicity's sake, that all our objects are bounded not, as usual, by axis-aligned rectangles, but by circles. We must choose our spatial subdivision for morton encoding (i.e. the size of our quadtree leaves) to be larger than the maximum object bounding circle radius.
So (and ignoring velocity, this is instantaneous detection only for the moment), any object that might realistically collide with our source object must either have its centroid in the same quadtree leaf as our source object, or one of its immediate neighbours.
So, we must perform collide() on all objects in the same quadtree leaf as our source object, and with any object in a neighbouring leaf whose bounding circle overlaps our quadtree leaf. The latter of these is not entirely obvious, but follows from the fact that collide() acts not only on the source object, but on the candidate object as well (which is what brings our worst case complexity down to O(nlogn)). The following diagram illustrates.

In this diagram, dealing with objects in the central cell (morton code 3), we need to collide objects A, B & C with each other, and each of them against D and E. No other objects impinge on cell 3, so they can be excluded early. Note how objects E and F both impinge on cell 2, so will need to be collision tested against G, but never need to be tested against each other.
Building the list of neighbour cells is not entirely trivial (although if we're tricky it's constant time), and iterating the objects in those neighbour cells is relatively expensive, so it makes sense to perform the tests on a cell-by-cell basis, rather than object-by-object. Remember, we have a fixed size quadtree with 0 <= x <= n objects per leaf node, rather than a floating size quadtree with a single object per leaf, thus each node comparison is potentially n to n.
So our algorithm looks like this.
Code: Select all
stop = first node.
do while stop exists
start = stop
stop = next node until stop.morton != start.morton
for each source in start <= source < stop
for each candidate in source < candidate < stop
collide source, candidate
end
end
for each morton in start.morton.neighbours
candidate-start = find node by morton
if candidate-start exists
candidate-end = candidate-start
candidate-end = next node until candidate-end.morton != morton
for each candidate in candidate-start <= candidate < candidate-end
if candidate.bounds overlaps start.morton
for each source in start <= source < stop
collide source, candidate
end
end
end
end
end
endSo, what's the hitch?
Mainly, it's start.morton.neighbours above. I've (deliberately) glossed over this above, as it's a little bit complex. How *do* we find the neighbours for a given morton code?
The most obvious obvious way, given that a morton code is an encoding of a quadtree, is to do a recursive search for each direction, going up and down the "tree" as necessary. That's ugly and painful, and you really don't want to do it.
The next most obvious way is to take the X and Y coordinates of each cell, and interleave them to generate a morton code. Fortunately, we already know at least one coordinate pair from the source object(s) in the current cell, so we don't need to de-interleave the current cell's X and Y, but it's still quite a lot of work (it's either done via lookup tables, which have a cache hit, or via bit twiddling hacks, which are a fair amount of work.
So why can't we just add or subtract numbers to / from the morton codes themselves? After all, they contain the x any y coordinates. And the answer to that is "we can, kinda". The problem, though, is that we need to add and subtract x and y portions separately. A good starting point in terms of relevant literature on this is "Finding Neighbours of Equal Size in Linear Quadtrees and Octrees in Constant Time" by Gunther Schrack.
In short, we can extract the X and Y components of a (16 bit) morton code by
Code: Select all
x = m & 0x5555
y = m & 0xaaaaCode: Select all
// For addition
x = m | 0xaaaa
y = m | 0x5555
// For subtraction
x = m & 0x5555
y = m & 0xaaaaCode: Select all
m = (x & 0x5555) | (y & 0xaaaa)There is a "gotcha" in this, though. You almost certainly have to do it in assembler, because although it all looks like "normal" maths, it's not, and you have no way of telling the compiler that. The compiler knows that adding -1 is the same as subtracting 1 and vice versa, and will happily blow away your carefully constructed sequence of operations without bothering to tell you. After all, the user doesn't care about the actual sequence of operations if the result is the same, right? Wrong. For example, the following (relatively inefficient) code to find the top right neighbour
Code: Select all
#define minus_one 0xffff
#define mask_x 0x5555
#define mask_y 0xaaaa
int16_t top_right(int16_t m) {
return (((m & mask_x) - minus_one) & mask_x) | (((m & mask_y) - 1) & mask_y);
}Code: Select all
00000000 <top_right>:
0: 3084ffff andi a0,a0,0xffff
4: 2403aaaa li v1,-21846
8: 00831024 and v0,a0,v1
c: 30845555 andi a0,a0,0x5555
10: 2442ffff addiu v0,v0,-1
14: 24840001 addiu a0,a0,1
18: 00431824 and v1,v0,v1
1c: 30845555 andi a0,a0,0x5555
20: 00641025 or v0,v1,a0
24: 03e00008 jr ra
28: 7c021620 seh v0,v0
Here's an implementation with inline assembly for toroidal space:
Code: Select all
uint8_t neighbours_toroidal (uint16_t m, uint16_t * r) {
uint16_t xc, yc, xp, xm, yp, ym;
__asm__ volatile (
"ori %4, %6, 0xaaaa\n"
"ori %5, %6, 0x5555\n"
"addiu %0, %4, 1\n"
"addiu %1, %4, -1\n"
"addiu %2, %5, 1\n"
"addiu %3, %5, -1\n"
"andi %0, %0, 0x5555\n"
"andi %1, %1, 0x5555\n"
"andi %4, %4, 0x5555\n"
"andi %2, %2, 0xaaaa\n"
"andi %3, %3, 0xaaaa\n"
"andi %5, %5, 0xaaaa\n"
: "=r" (xp), "=r" (xm), "=r" (yp), "=r" (ym), "=r" (xc), "=r" (yc) : "r" (m));
r[0] = xm | ym;
r[1] = xc | ym;
r[2] = xp | ym;
r[3] = xm | yc;
r[4] = xp | yc;
r[5] = xm | yp;
r[6] = xc | yp;
r[7] = xp | yp;
return 0;
}I'm not suggesting this is the be-all and end-all of collision detection, of course. For starters, it really needs adjusting to handle relative movement, but it's a decent start. On another platform, I'd be looking at SIMD implementations. For the PSP, we might be able to make collide() use the VFPU in some cases, but the loop itself has to be CPU based (unless there's some currently undocumented bitwise ops on the VFPU, of course).
[1] Quadtree because I only really care about 2 dimensions. But it all holds for octrees, you filthy "third dimensionist", you.
Advertising
