Archive for Downsizer For an ethical approach to consumption
|

jema
|
A mysql puzzleI think the major speed issue left on the board is the behaviour of "View latest Posts" which goes very slow at times.
You'd think this was a database issue, but when you view the index page the system is checking for new posts in each forum
I have optimized each of the tables I think are involved in these queries.
What is more, when the page is being created, I can't even see the mysqld deamon as a top process.
|
jema
|
| Quote: |
SELECT t.*, u.username, u.user_id, u2.username as user2, u2.user_id as id2, p.post_username, p2.post_username AS post_username2, p2.post_time, f.forum_name
FROM " . TOPICS_TABLE . " t, " . USERS_TABLE . " u, " . POSTS_TABLE . " p, " . POSTS_TABLE . " p2, " . USERS_TABLE . " u2, " . FORUMS_TABLE . " f
WHERE t.topic_poster = u.user_id
$forum_sql
AND p.post_id = t.topic_first_post_id
AND p2.post_id = t.topic_last_post_id
AND t.forum_id = f.forum_id
AND u2.user_id = p2.poster_id
AND t.topic_type <> " . POST_ANNOUNCE . "
$noreply_sql
$startdate_sql
$sticky_sql
$locked_sql
$moved_sql
$reg_sql
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC
LIMIT $limit
|
takes a good number of seconds and is the code we use on the main board to show the latest posts. I have zero doubt something similar is in "latest posts". This is the culprit all right.
|
bagpuss
|
do you know what version of mysql you are using
I think order bys tend slow querys so if there is another way to do it it would be good
|
jema
|
| bagpuss wrote: | do you know what version of mysql you are using
I think order bys tend slow querys so if there is another way to do it it would be good |
It is a 3.x. I think you have hit the nail almost on the head, those sorts of queries are speeded up in the server config by the:
key-buffer-size
variable, I need a decent size, but the mysql service fails to start if I set it above 12mb. I fancy it would want more like 128mb.
I think my support friends at fastservers tried this, and gave up when it failed to restart.
I am giving up for the night now, but I think cracking this will crack the issues.
|
bagpuss
|
if you are maintaining the server would you beable to upgrade the mysql instance to get some of the general speed ups and niftier featuresof 4.1?
it should be possible to increase it to 128MB though
at work our key-buffer-size is 1GB but our machines do tend to have at least 8GB of memory which tends to help
|
jema
|
| bagpuss wrote: | if you are maintaining the server would you beable to upgrade the mysql instance to get some of the general speed ups and niftier featuresof 4.1?
it should be possible to increase it to 128MB though
at work our key-buffer-size is 1GB but our machines do tend to have at least 8GB of memory which tends to help |
I have thought long and hard about upgrading to 4.x.
I have read a pundit suggesting anyone who does not should be taken out into the street and beaten around the head
But then again i care more about absolute stablity in a database than anything else, and wonder why 3.x is still the one that ships with Linux distributions.....
We have 2gb, and hence I have no idea (at the moment) why I cannot get the size up
|
dougal
|
Please pardon my probable use of incorrect terminology, but I think searching the forums *has* to be the *wrong* way of showing the "10 newest" on the front page.
IMHO there needs to be a ten-slot table for "the newest". Regard this as a circular buffer.
When any (every!) post is made, stick it's details into the new table, and mark it as newest. (A pointer?)
When the next post is made, stick it in the next slot and advance the pointer.
After using slot 10, wrap around to slot 1.
Yes, I know you have to hook into the posting entry code, but I can't believe thats a massive problem.
The big advantage comes when you ask for the 10 newest for the front page. Check the pointer and read out (reverse order) all 10.
Zero searching. Very little cpu load...
BTW, I believe did notice a couple of times (yesterday) that the front page 'latest' was sometimes a couple of *hours* out of date...
|
jema
|
| dougal wrote: | Please pardon my probable use of incorrect terminology, but I think searching the forums *has* to be the *wrong* way of showing the "10 newest" on the front page.
IMHO there needs to be a ten-slot table for "the newest". Regard this as a circular buffer.
When any (every!) post is made, stick it's details into the new table, and mark it as newest. (A pointer?)
When the next post is made, stick it in the next slot and advance the pointer.
After using slot 10, wrap around to slot 1.
Yes, I know you have to hook into the posting entry code, but I can't believe thats a massive problem.
The big advantage comes when you ask for the 10 newest for the front page. Check the pointer and read out (reverse order) all 10.
Zero searching. Very little cpu load...
BTW, I believe did notice a couple of times (yesterday) that the front page 'latest' was sometimes a couple of *hours* out of date...  |
Yesterday it was dead on the front page at a certain point. Ihad established this was the killer problem, using a caches resolved it a bit, but the cache system flopped as when the timeout on the cache hit, several queries would activate as several people hit the front page.
I decided it needed to be run as a chron job to refresh the cache, but that took a bit of time to sort.
As to your other point, yes rejigging phpbb entirely would do the trick, and would not be hard. On the other hand this query should not be crippling in the first place, and I think it is best to make the official code work if possible.
|
bagpuss
|
| jema wrote: |
But then again i care more about absolute stablity in a database than anything else, and wonder why 3.x is still the one that ships with Linux distributions.....
|
As far as stability goes the website my project runs averages about 100000 page impressions a day and it running 4.1.1 without issue. It took us a long time to upgrade from 3.x to 4 but it is better now that we have
something else you may find useful is this set of webpages
http://mysql.flupps.org/
from the guy who gave us our mysql training I think the ppt file has the most info in it
|
jema
|
| bagpuss wrote: | | jema wrote: |
But then again i care more about absolute stablity in a database than anything else, and wonder why 3.x is still the one that ships with Linux distributions.....
|
As far as stability goes the website my project runs averages about 100000 page impressions a day and it running 4.1.1 without issue. It took us a long time to upgrade from 3.x to 4 but it is better now that we have
something else you may find useful is this set of webpages
http://mysql.flupps.org/
from the guy who gave us our mysql training I think the ppt file has the most info in it |
pretty useful
I am still struggling to set the server variables to make it work faster out of the box, no luck yet.
There are always a number of approaches, and if we are not at absolutre crisis point, it does seem to me to make sense to use the least aggressive method first.
Now i am pretty sure where the problem is, I am pretty sure I could change the code to fix it, but then I'd miss out on seeing how the issue could be more trivially fixed, and that knowledge will come in handy some day.
BTW having read more, key_buffer_size and table_cache do seem adaquate and increasing them has no effect next stp is to use the explain statement from the PPt presentation.
|
bagpuss
|
| jema wrote: |
BTW having read more, key_buffer_size and table_cache do seem adaquate and increasing them has no effect next stp is to use the explain statement from the PPt presentation. |
explain is certainly useful
if it turns out you are doing full table scans then you can probably improve the query or the indexes to help matters
|
jema
|
| bagpuss wrote: | | jema wrote: |
BTW having read more, key_buffer_size and table_cache do seem adaquate and increasing them has no effect next stp is to use the explain statement from the PPt presentation. |
explain is certainly useful
if it turns out you are doing full table scans then you can probably improve the query or the indexes to help matters |
The query was crap and easily improved, but to make matter complicated, that query is not strictly the one that is now the problem, merely a query that did the same job as the ones that are
|
jema
|
and touch wood nobbled the other. we should be rocking now
|
dougal
|
| jema wrote: | ...Ihad established this was the killer problem, using a caches resolved it a bit, but the cache system flopped as when the timeout on the cache hit, several queries would activate as several people hit the front page.
I decided it needed to be run as a chron job to refresh the cache, but that took a bit of time to sort.
As to your other point, yes rejigging phpbb entirely would do the trick, and would not be hard. On the other hand this query should not be crippling in the first place, and I think it is best to make the official code work if possible. |
I still think that the architectural concept of "search the whole forum area for the 10 newest posts" is a profligate waste of compute resource...
And to do so *every time anyone* hits the front page seems very much more profligate, so yes, caching "fairly recent" results would be an improvement.
But, my feeling is that there are likely *very* many more front page hits than new posts.
So making it a *small* part of recording each post, rather than a *large* part of building the front page for each hit, seems doubly efficient.
That said, I can fully understand your wish to run code that has as few deviations from a public distribution as possible.
Presumably you have kept documentation of the why/what/how of all your tweaks, so that upgrading the scripts to a future distribution would be eased...
|
jema
|
| dougal wrote: | | jema wrote: | ...Ihad established this was the killer problem, using a caches resolved it a bit, but the cache system flopped as when the timeout on the cache hit, several queries would activate as several people hit the front page.
I decided it needed to be run as a chron job to refresh the cache, but that took a bit of time to sort.
As to your other point, yes rejigging phpbb entirely would do the trick, and would not be hard. On the other hand this query should not be crippling in the first place, and I think it is best to make the official code work if possible. |
I still think that the architectural concept of "search the whole forum area for the 10 newest posts" is a profligate waste of compute resource...
And to do so *every time anyone* hits the front page seems very much more profligate, so yes, caching "fairly recent" results would be an improvement.
But, my feeling is that there are likely *very* many more front page hits than new posts.
So making it a *small* part of recording each post, rather than a *large* part of building the front page for each hit, seems doubly efficient.
That said, I can fully understand your wish to run code that has as few deviations from a public distribution as possible.
Presumably you have kept documentation of the why/what/how of all your tweaks, so that upgrading the scripts to a future distribution would be eased...  |
The front page now uses a chron job that runs at 5 min intervals. Did that in the early hours of yesterday morning, most other bits of it are using a cache as well
Documentation is a little sparse
|
jema
|
It does seem the server defaults to a small join_buffer_size.
I have upped this to 24Mb, and to my eyes things seem faster.
I am determined that there must be this sort of answer. I do not know the full spec of the system we were on, but whilst I know it was a darn site faster with more bells and whistles, the way we have struggled under the recent load was still out of proportion with my expectations.
I really think that properly configured the server we are on should have at least "coped" with what was thrown at it, without the dozen or more optimisations made largely to downsizer itself.
|
jema
|
One other good sign is that the primary death sign we were seeing was the dreaded 98%+ cpu load.
When I was proposing the hosting, I was basing what would be charged and the basic premise that the server could manage downsizer and a few other sites as well, on the performance of the server I run forums on.
From the outset though the comparison seemed flawed as the downsizer forum used 6x as much cpu per hit This was at the time of heaviest load, and the forums system works very fast with that sort of load.
when I look at the figures today, hit for hit the cpu on both servers is now in line
|
|