[MTOS-dev] Performance

Alvar Freude alvar at a-blast.org
Sun Mar 9 15:57:11 PDT 2008


Hi,

-- Hirotaka Ogawa <hirotaka.ogawa at gmail.com> wrote:

> I think, adding a new published entry is a relatively large
> transaction.  I'd like to know analysis for more minute operations if
> you can.

so, now some more analysis ... :)

The following tests are made with the latest code from SVN and the dev 
branch. As the last time, my desciption is from the SQL point of view.


For all tables with auto_increment there are Sequences generated when 
using PostgreSQL; the next values from the sequences are queried manually 
and gthen inserted into the xxx_id field.

It would be better (and easyer) to change the datatype from INTEGER to 
SERIAL in this cases. This can be used in the same way as an 
auto_increment column in MySQL (with the exception, that you can have as 
much as you want, not only one).
This generates automagically a sequence and a default value of 
"nextval('sequence_name')" for this column.

Alternatively the sequence may created manually, but the default value 
should be changed to the nextval statement. With this it isn't necessary 
to do it manually.

  ...
  id INTEGER DEFAULT nextval('sequence_name') NOT NULL,
  ...



There are a lot of queries of the following style:

  SELECT xxx_id FROM mt_xxx WHERE <where> ORDER BY <order>
  SELECT <everything> FROM mt_xxx WHERE xxx_id IN (<result from above>);


The ORDER BY gets lost in the second query, the result is usually not in 
the same order then the IDs in the subselect!


And this could be written as the following, which is faster and the 
result is in the right order:

   SELECT <needed columns>
     FROM mt_xxx
    WHERE xxx_id IN (SELECT xxx_id FROM mt_xxx WHERE <where>)
 ORDER BY <order>


Which is the same as this simple statement:

   SELECT <needed columns>
     FROM mt_xxx
    WHERE <where>
 ORDER BY <order>


This is about twice as fast as the two separate statements.


Such statements seem to appear for nearly everything, e.g. mt_category, 
mt_author, mt_permission, mt_session, ...


When adding a category, the new category is inserted first, and after 
this there is an update for the category basename. (similar when 
inserting a new entry)

As the queries are all single transactions (AutoCommit), this is not only 
slow, but also dangerous: when the process crashes or gets killed after 
the insert and before the update, then the database consistency is broken.

In total there are 24 queries for adding a category and 26 "SELECT 
'DBD::Pg ping test'". I think the minimum would be two queries: one for 
the session data and one insert.



So, as a conclusion: I think the most important things for better 
performance are:

  * reducing the queries for in template processing

  * use transactions and guarantee data integrity


As far as I understand the mails here in the last days (ups, I did not 
read all), there is something on the way for this two things.

The next important things are IMHO:

  * better PostgreSQL support, e.g. change the manual sequence
    processing to SERIAL data type (like auto_increment in MySQL)
    or DEFAULT nextval('...')

  * concentrate the SQL queries and reduce redundant queries

  * Memory Leaks and FastCGI/mod_perl support


Ciao
  Alvar


-- 
** Alvar C.H. Freude, http://alvar.a-blast.org/
** http://www.assoziations-blaster.de/
** http://www.wen-waehlen.de/
** http://odem.org/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 194 bytes
Desc: not available
Url : http://www.sixapart.com/pipermail/mtos-dev/attachments/20080309/64382f33/attachment.bin 


More information about the MTOS-dev mailing list