Conversation
I think this is ready if anyone is looking. I tested it all day. I will create a way to test millions of paths sometime tonight so I can check for memory leaks. The only other thing I might try is removing x and y from the actual node to save a little more memory. I saw someone suggest that, and it may be possible, though it won't be super easy: we use those values and a lot of the code is tied to them. So don't hold this back waiting for it haha. I ran the following test, which resulted in this output. Memory did not increase, which would have indicated a memory leak: my CPU usage increased about 0.5-1% to calculate the paths and memory remained where it started. After running 1m paths through this code (all at the same time) the result is this: the memory for the server actually decreased after the test, presumably because it freed memory on something else. I really doubt there is a memory leak at this point.
Results of changing the heuristic a little: ALSO... I did a clean install of the old TFS algorithm, as if we had reverted my last pull request. Here is the result: a pretty clear sign that this version is faster, uses less memory, and has no more memory leaks! NICE! The low success rate is due to the old value of maxSearchDist. It was only about 15 before we increased it to clientViewX+Y, and I am checking 9x9 around the player, so a lot of paths end up being too far away. That said, the old version didn't handle paths outside its range well and still tried to calculate the path, which means we did 250 iterations just to find out we couldn't make the path. Between a slower implementation and the absence of the pre-path conditions that are in my new code, it takes almost 4x as long to run the paths. In my last check on the old TFS algorithm I reduced the path sizes to 5x5 to try and get more successful paths. It was still slower than this new algorithm on half the size of paths.
I am just giving an opinion here... take it or leave it... but if you are going to continue to optimize and fine-hone this pathfinding algo, might I suggest you research JPS? It's supposed to be an enhancement to A* which supposedly dramatically increases performance, at least for sure on big grids/nodemaps. It may or may not be worth implementing for our much smaller use cases, I have no idea. Another possible optimization I thought of, but never really checked to see if it could be applied, was to limit the distance the node search continues away from the creature based on max viewport. Anyways, good luck and I hope you keep killing it on this work you are doing!
It looks like an optimized algorithm. I will try to benchmark it later. The problem with pathfinding CPU usage is not players walking, but monsters walking. It would also be good to print the total number of calculated 'steps to target', because it does not matter if the algorithm is faster if it calculates longer (not optimal) paths more often.
This is something that would make either algorithm slower. I understand the idea you are presenting, but if an algorithm runs faster, even if it is because of CPU cache, it would almost always run faster in a more "real world" environment too. I will mention that the original pathfinding system in TFS wasn't "slow", it just needed optimizations to how it worked. With just a few if statements in the original code it would have performed a lot better. No matter what, though, it will always be slower, because it checks 7 extra nodes for each 1 node in A*, since it doesn't utilize the target position.
In these tests, where I ran the exact same test on both algorithms, it is clear the new implementation is a lot faster. The high success rate is actually a testament to how good the pathfinder is. Making the two slower by having fails will not change that, but I can definitely do some tests to show it. The reason is this: in my implementation I can figure out that a path is not possible in 60-80 iterations, which is why I have the 120 iteration limit; it is actually more than it needs to be just because I wanted to give it some wiggle room. In the old version, 250 was the limit of closed nodes (not iterations) it could check, which limited it to somewhere between 15-20 sqm. The 120 iteration limit allows us to check paths, like in caves that do not link, up to around 30 tiles away. This would make it SEEM slower, but in reality it is checking many more nodes.

It is actually a place where I could optimize the code even more. I purposefully left out this optimization because I need to see Tibia's behavior in this area. I know their monsters will only check so far for a path, but I am not sure exactly how far that is. So, yes, in the specific case of paths not being found (they are too long to find) while the target is close enough that my code does not ignore the path, the two will look as though they are the same speed. In reality, again, mine checked 2-3x as many nodes in the same time. To fix this, all I would have to do is add a check on how far the node we are checking is from our startPos. If the nodes are 20 sqm away, for example, we could just terminate because the path is too long.

I know the algorithm takes the appropriate (not longer) path from testing it on monsters in real world scenarios, and just from knowing how A* finds its paths. It will never choose a path that is 14 sqm over a path that is 13 sqm.
In most cases you will have two paths that can be correct; how you have implemented the algorithm decides which one it takes, but it doesn't matter which one it picks as long as it picks one of the two. Both paths cover the same distance. The ONLY argument against A* is that the path looks more like a computer made it, which it did. It doesn't look very monster-like to walk into a wall, but the performance cost of not using A* is way too much of a drawback.
Check my original pull and the pull made for the memory leak problem. I tested distance on paths extensively. The path should always be the same distance as the distance between startPos and targetPos (maybe - 1, because we don't walk on top of the target). Trust me, I have worked on this for years; I know what is going on here very well. Lastly, if you want a test with more accurate results in all cases, add this: With: I added the check for paths that have maxSearchDist but not for ones that do not. We need to end those sooner too, I'm just not sure when.
I plan on looking at some of the "better" algorithms for sure. A* is kind of the go-to for Tibia because we don't calculate z position and the paths are pretty small. Any other algorithm is going to be slower than what you can get from a perfect A* implementation for Tibia. (Mine is not perfect. We could use a heap for openSet and utilize vector indices instead of a parent node pointer in AStarNode, just as examples to make it a little better even now.) However, if we wanted to make a more verbose pathfinder so we could delete a lot of the other game logic, like checking whether we can throw an item somewhere, which uses z position, going for one of the other algos might be better. Then again, maybe not, because of the small distance of paths. I will check out JPS for sure. You are exactly right about the node thing. I didn't see that before I recommended it to gesior for his tests lol. Nice eye. I purposefully left it out for now because idk Tibia's actual limit and 120 will only check around 30 sqm anyway.
@NRH-AA EDIT: before merging this PR, someone should also test how monsters that keep distance work after these changes, e.g. Warlock.
pos = {x = 20736, y = 0, z = 13 '\r'} startPos = {x = 139, y = 485, z = 13 '\r'} What is this? It looks like you made a creature that is 20k tiles away from the path it generated.
Alright, sounds like it is good enough then. If it stops the crash then let’s roll with it. |
It affects performance. It's 25% slower with 'new', but IDK how to make it work without 'new' and still compile with address sanitizer on Ubuntu 22.04.
I think we can add a null check inside NodeCompare to fix the crash until your change fixes it properly. If you have time, could you try this? In the meantime we can push this. It seems the issue is that we are somehow removing nodes at the same time they are being compared, resulting in a null reference.
I had it calling nodes.clear() too early, which was probably causing the invalid memory access.
@gesior We still need to increase the allocated size; * 7 wasn't enough. I picked that number "randomly"; I figured it was close to enough to handle any paths we were trying to get. That is actually the only problem that was happening.
@Shawak What I did to find out the real problem is described at the end of this comment. I tested if increasing
Before this PR (optimized with memory leak):
Before optimized path finding:
So with

What I did to find out the real problem: I wrote a simplified version of this problem:

```cpp
int main(int argc, const char** argv)
{
    AStarNodes nodes(1, 1);
    AStarNode* node1 = &nodes.nodes.back();
    std::cout << "n1:" << node1->x << "," << node1->y << std::endl;
    AStarNode* node2 = nodes.createNode(node1, 1, 2);
    std::cout << "n1:" << node1->x << "," << node1->y << std::endl;
    return 0;
}
```

You can run it with but if you uncomment: it will return a valid result: but if you add a 3rd element in Code that returns invalid results is always detected with
There are two things that need to be fine-tuned.
I had this, which is part of why it broke. I wanted to make it easy to increase the size of paths that can be drawn by just changing: The issue is that the number of iterations can't be the same as the reserveSize, otherwise it will always overflow. So it should be changed to:
Yes, your solution will definitely stop any crashes, which is perfect, but this is where I left off before I took my vacation. I didn't think about the fact that allowing the iterations to be equal to the reserve size would cause this problem. It still needs to be addressed, as it is unstable. The iterations should be used as a way to stop paths that are too large.
@NRH-AA IDK what is wrong in that screenshot. The algorithm is not supposed to find a way out of a labyrinth of any size. It must limit the CPU usage of a single monster. Also, walking too far could make a monster walk out of the player's screen (no players on screen) and make it go idle. The old algorithm (TFS 0.x) had the number of iterations - in our case also the number of nodes reserved - set by a variable passed to
That way the normal pathfinding of a monster uses less CPU, but if it fails to find a path, it can run a more complicated pathfinding.
@NRH-AA Right now the optimized algorithm uses 60% (!) less CPU than the old algorithm (5334 ms vs 13322 ms). Can we merge this PR? @ranisalt @ArturKnopik
Tomorrow I will review these changes and test them; if everything is ok I will merge.
Tested for a while, looks like it works, and the code also looks ok; in case of any problem we can revert these changes.
@NRH-AA For the record, JPS is not a different algo than A*; it's an enhancement for A*, one that keeps it from iterating over already known bad paths/nodes... which, if I read this PR correctly (the comments, not the code), you already figured out: such an alteration would indeed be a good optimization for this algo.
I made sure to ignore nodes we don't need to check again, but JPS would work a little differently. I think JPS works best if you already have the path data (like Google Maps): we know all routes that can be taken, which can be used to ignore routes we know will end up in a bad place. We can also jump to certain routes if we know that a node at some x,y found for a path going north (or whatever) will always come out at a specific spot. Maybe 100 paths all use that same route. That would be a hot spot, and we could automatically grab the required nodes and just jump to that point and start the pathfinding again. Doing this multiple times would for sure increase performance in that case, but without knowing the paths beforehand I don't think it's plausible.
I did not analyse JPS, but wikipedia says
so JPS won't work.



Performance enhancements to the pathfinding algorithm. Remove 'new' from node construction to help with memory problems.