stats update

General discussions about the website's layout and functionality
User avatar
Corfman Clan
Global Moderator
Posts: 914
Joined: January 17th, 2012, 12:21 am

Re: stats update

Post by Corfman Clan »

As many of you know, this has been a recurring problem throughout the past year. Initially, to get around it, I upped the timeout for updating the backcountry leader boards from 30 minutes to 75 minutes. Since then, I have upped the timeout to 120 minutes. I recently added a hint to the query that seems to have helped some. However, the whole stats update is taking several hours to complete, for example, 4.5 hours this morning. Basically, the problem has never been solved, just managed.

Updating the backcountry boards involves running an insert query 36 times: 4 region types (LonelyCache Wide, states, counties, DGP regions) by eight cache types plus the overall. Most of these 36 would run in under a couple minutes but sometimes one or two would take an extended amount of time, possibly timing out after 2 hours.

Over the past week, I have have been rewriting the DB queries that are used to update several of the various leaderboards. I started with the points leader boards and came up with a change that dramatically improved the time it took for those boards. I applied that technique to the backcountry boards too. I also updated the queries for the FDF boards and the Owner/Finder accolades statistics to follow suit. I am very happy with the results. Before the changes, updating stats on my laptop's test database was taking about 70-75 minutes. With the rewritten queries, the last update took just 16 minutes. Updating the backcountry leader boards went from taking over 30 minutes to completing in under 3.

At this time, I am very confident that these changes will fix the problem where our nightly stats update sometimes does not complete, or takes several hours to complete.

I'm waiting until after the annual leader boards are finalized Jan. 10 before I deploy this update. I'm confident the updates are correct, but the nightly updates have been completing fine lately and I don't want to jeopordize that in any way. Things may test out fine in a lab environment but fail on production.

It's actually very simple to verify these changes. First I run the stats update, then I make the change and run the stats update again. As long as the two sets of statistics were generated on the same day and I did not add/remove any caches or found logs, then the two sets should be identical. So I run a database query to compare them. If any differences are found then there is a problem, if they are identical, then the change is correct.
Image
rocketsciguy
Posts: 145
Joined: January 18th, 2012, 9:55 am

Re: stats update

Post by rocketsciguy »

Wow!
LonelyCache.com wrote:Statistics last updated Saturday, 11 January 2014 02:17:15 (UTC-07:00) Mountain Time (US & Canada)
Whatever you did really worked! I forgot what time you said the update scripts started to run, but if it was 2AM, it looks like you have an order-of-magnitude improvement in performance here! Glancing around the boards I usually look at, I don't notice any problems or anomalies. Great job! :D
User avatar
Corfman Clan
Global Moderator
Posts: 914
Joined: January 17th, 2012, 12:21 am

Re: stats update

Post by Corfman Clan »

Corfman Clan wrote:As many of you know, this has been a recurring problem throughout the past year. Initially, to get around it, I upped the timeout for updating the backcountry leader boards from 30 minutes to 75 minutes. Since then, I have upped the timeout to 120 minutes. I recently added a hint to the query that seems to have helped some. However, the whole stats update is taking several hours to complete, for example, 4.5 hours this morning. Basically, the problem has never been solved, just managed.
I think this can be moved to the solved bin. Last night statistics were updated in no more than 17 minutes and 15 seconds! :D
Corfman Clan wrote:Updating the backcountry boards involves running an insert query 36 times: 4 region types (LonelyCache Wide, states, counties, DGP regions) by eight cache types plus the overall. Most of these 36 would run in under a couple minutes but sometimes one or two would take an extended amount of time, possibly timing out after 2 hours.
I updated this query again, removing the need to worry about region type when executing the query. That dropped it down to only needing to run 9 times instead of 36. Of course, a lot more rows per execution is involved, but overall the same amount of data need to be produced and overall the data are traversed a lot less (75%?). Region type is a concern because for LonelyCache wide and state, backcountry boards are based on a cacher's top 20 caches while with county and DGP regions, it's 10 caches.
Corfman Clan wrote:At this time, I am very confident that these changes will fix the problem where our nightly stats update sometimes does not complete, or takes several hours to complete.
Now, with this resolved, I finally feel comfortable with the idea of expanding the LonelyCache territory. :)
Image
Post Reply