[MTOS-dev] Queries on Very Large Tables

David Scott ds94103 at earthlink.net
Mon May 12 11:29:00 PDT 2008


Do you have command-line access to the database machine so that you can 
profile the query?

d

Mark Carey wrote:
> MT version: 4.1
>
> While troubleshooting some mt.cgi timeouts, in my Visitor Stats Pro
> plugin on a site, I have isolated the delay to a number of container
> tags.  The plugin adds two graphs to the dashboard for "Visitors" and
> "Views" (the graphs are *not* the problem in this case).  Below each
> graph, I have a list for  "recent visitors" and "recent views", each
> list of 10 most-recent objects is powered by a container tag
> (<mt:Visits>, <mt:Views>), which use fairly typical syntax.  For
> example a snippet from the "visits" routine:
>
> 	if ($blog_id) {
> 		$vs_terms->{blog_id} = $blog_id;
> 	}
> 	my $lastn = $args->{lastn} || 10;
> 	$vs_args->{limit} = $lastn;
> 	$vs_args->{sort} = 'created_on';
> 	$vs_args->{direction} = 'descend';
> 	use VisitorStats::Visitor;
> 	my $visitor_iter = VisitorStats::Visitor->load_iter($vs_terms, $vs_args);
>
> I timed this tag, and it took 41 seconds to render on this site.
>
> I should mention that the site in question has more than 3,000,000
> records in the Visitor table.  (The schema for this object includes
> indexes on blog_id and created_on).  I have not noticed such a delay
> on other sites with much fewer records, so my first thought it that
> the cause is the size of the table -- but the query seems relatively
> simple, asking for only 10 items sorted by date.
>
> The query that counts the data for graph, on the other hand, seems
> like a more complicated one, grouping the number of visits by date,
> for the most recent 120 days (there are over 2.5 million records in
> the past 120 days).  But this query takes about 5-6 seconds.
>
> So I am confused.  A few things come to mind as being different in each case:
>
> -one uses an iterator, the other a count with group_by
> -one happens as part of the template engine (being a template
> container tag), and the other happens independent of the template
> rendering process (i actually calculate the graph data via a schedule
> task, and thus, via cron)
>
> Thoughts?  I would have thought that pulling a very small number of
> recent records from a large table would be efficient, but I am not a
> database expert...
>
> Thanks,
>
> Mark
> _______________________________________________
> MTOS-dev mailing list
> MTOS-dev at sixapart.com
> http://www.sixapart.com/mailman/listinfo/mtos-dev
>
>
>   



More information about the MTOS-dev mailing list