What is happening in "Time accumulators: String conversion in mysql"?

Tuesday 29 June 2004 2:10:34 am - 23 replies

Tony Wood

Tuesday 29 June 2004 2:29:02 am

OK, found it:

eZDebug::accumulatorStart( 'mysql_conversion', 'mysql_total', 'String conversion in mysql' );
$sql =& $this->InputTextCodec->convertString( $sql );
eZDebug::accumulatorStop( 'mysql_conversion' );
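
For readers unfamiliar with time accumulators, here is a minimal sketch of the idea in Python. It is not eZDebug's implementation; the class and method names are hypothetical and only illustrate how start/stop pairs add up to the elapsed, count and average figures shown in the debug output.

```python
import time
from collections import defaultdict


class Accumulators:
    """Hypothetical stand-in for a debug time accumulator (not eZDebug itself)."""

    def __init__(self):
        self.elapsed = defaultdict(float)  # total seconds per key
        self.count = defaultdict(int)      # number of start/stop pairs per key
        self._started = {}                 # keys currently being timed

    def start(self, key):
        self._started[key] = time.perf_counter()

    def stop(self, key):
        self.elapsed[key] += time.perf_counter() - self._started.pop(key)
        self.count[key] += 1

    def report(self, key):
        n = self.count[key]
        average = self.elapsed[key] / n if n else 0.0
        return {"elapsed": self.elapsed[key], "count": n, "average": average}


acc = Accumulators()
for sql in ["SELECT 1", "SELECT 2", "SELECT 3"]:
    acc.start("mysql_conversion")
    converted = sql.encode("utf-8").decode("utf-8")  # placeholder for the timed work
    acc.stop("mysql_conversion")

print(acc.report("mysql_conversion"))
```

Every wrapped call adds one to the count, so a high count with a tiny average points at many cheap calls rather than one slow one.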

A list of what happens at each stage would still be useful.

-- tony

Tony Wood : twitter.com/tonywood
Vision with Technology
Experts in eZ Publish consulting & development

Power to the Editor!

Free eZ Training : http://www.VisionWT.com/training
eZ Future Podcast : http://www.VisionWT.com/eZ-Future

Tony Wood

Tuesday 03 August 2004 7:34:03 am

Are there any PHP compile options or system settings I can tweak to reduce the amount of time spent in this stage?

--tony

Alex Jones

Tuesday 10 August 2004 6:06:46 am

*bump*

I'm interested in the answer to this as well.

Alex

Alex
[ bald_technologist on the IRC channel (irc.freenode.net): #eZpublish ]

When in doubt, clear the cache.

Bård Farstad

Tuesday 10 August 2004 6:10:37 am

Tony, Alex,

When time is consumed in string conversion from MySQL, something is not optimally configured. That's why we added this accumulator. It can happen if, for example, you use a different character set in your database than in the rest of eZ publish. Ideally, and normally, this should be 0.

You should make all character sets the same. Some information about how to set character sets can be found here:

http://ez.no/ez_publish/documentation/configuration/configuration/language_and_charset/unicode_with_ez_publish
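
As a rough illustration of the "same character set" check (a Python sketch, not eZ publish code; the function name is hypothetical), charset names can be normalised before comparing them, so that spellings such as "utf8" and "utf-8" are not mistaken for a mismatch:

```python
import codecs


def conversion_required(internal_charset, database_charset):
    """Return True when the two charset names refer to different encodings."""
    # codecs.lookup() resolves aliases, so "utf8", "UTF-8" and "utf-8" compare equal.
    return codecs.lookup(internal_charset).name != codecs.lookup(database_charset).name


print(conversion_required("utf-8", "utf8"))        # False
print(conversion_required("utf-8", "iso-8859-1"))  # True
```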

--bård

Documentation: http://ez.no/doc

Tony Wood

Tuesday 10 August 2004 6:38:31 am

This is interesting.

All charsets are set to utf-8, and the MySQL 4.1.3 database is set to utf-8. Is there any way to debug this further?

What is it that takes the time? I have some old content in another charset, from before the conversion and from kernel/sql/common/cleandata.sql. Should I grep through the database and replace

<?xml version=\"1.0\" encoding=\"iso-8859-1\"?> with <?xml version=\"1.0\" encoding=\"utf-8\"?>

--tony

Georg Franz

Tuesday 10 August 2004 7:03:55 am

Hi,

On my installation (utf8 output, utf8 db, MySQL 4.1.2), I get the following on an uncached page:

String conversion in mysql | 2.2379 sec | 32.7103% | 14326 | 0.0002 sec

total queries: 288
string conversion calls: 14,326

I've found out that a "dummy" method is called (because, of course, a conversion from utf8 to utf8 isn't necessary). But IMHO it would be better to save those 14,326 dummy calls.
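
A quick check of those figures (a small Python calculation using only the numbers quoted in this post) shows how the per-call average and the number of conversion calls per query follow from the totals:

```python
elapsed_seconds = 2.2379   # total time in "String conversion in mysql"
conversion_calls = 14326   # accumulator count
total_queries = 288

average_per_call = elapsed_seconds / conversion_calls
calls_per_query = conversion_calls / total_queries

print(f"average per call: {average_per_call:.4f} sec")  # 0.0002 sec
print(f"calls per query:  {calls_per_query:.1f}")       # 49.7
```

So each call is cheap, but roughly fifty of them per query add up to a third of the page time.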

Kind regards,
Emil.

Best wishes,
Georg.

--
http://www.schicksal.com Horoskop website which uses eZ Publish since 2004

Tony Wood

Tuesday 10 August 2004 7:10:13 am

Found the problem... It was an override with the wrong charset. I have resolved this now.

Sorry for being a bit slow

--tony

Bård Farstad

Tuesday 10 August 2004 7:13:15 am

Great.

It can often be a time saver to check the ini settings using the admin tool. This will let you know whether you have an override or not; it has helped me many times.

--bård

Bård Farstad

Tuesday 10 August 2004 7:18:58 am

Emil, are you sure about those settings? eZ publish should not do character conversion if it uses the same character set internally and in the db.

-bård

Tony Wood

Tuesday 10 August 2004 7:24:37 am

I get this now:

String conversion in mysql | 0.5773 sec | 34.3655% | 2422 | 0.0002 sec

You're right, Emil: 2422 calls is a lot.

Bård Farstad

Tuesday 10 August 2004 7:32:28 am

Does this only happen when using UTF-8? I can do some testing on this tomorrow because this should not happen.

--bård

Georg Franz

Tuesday 10 August 2004 7:33:37 am

Bard:

yes, I mentioned that some time ago in the bug report:

http://ez.no/community/bug_reports/ezdbinterface_php_text_conversion

HTH,

Kind regards,
Emil.

Bård Farstad

Tuesday 10 August 2004 7:36:52 am

Ok, I will have a look into this tomorrow.

--bård

Tony Wood

Thursday 12 August 2004 3:50:44 am

Bard,

Just an update. With a few well-placed cache-block tags this has been reduced.

String conversion in mysql | 0.0074 sec | 2.7348% | 31 | 0.0002 sec

--tony

Bård Farstad

Thursday 12 August 2004 6:44:52 am

Tony,

thanks for the reminder ;)

You are absolutely correct. And it's a bug in eZ publish. Add the following lines

if ( !$this->OutputTextCodec->conversionRequired() || !$this->InputTextCodec->conversionRequired() )
{
    unset( $this->OutputTextCodec );
    unset( $this->InputTextCodec );
    $this->OutputTextCodec = null;
    $this->InputTextCodec = null;
}

to lib/ezdb/classes/ezdbinterface.php at line 131.

This fix goes directly into 3.5.0 (trunk) and 3.4.2.

Thanks ;)

--bård

Bård Farstad

Thursday 12 August 2004 6:50:56 am

This fix actually improved performance by 35% on a standard 3.4.1 installation on my computer when I used utf-8. So you will definitely notice it if you use unicode.

--bård

Tony Wood

Thursday 12 August 2004 7:39:51 am

I can confirm it has sped up our site, especially when people are logged in and we check for notifications, bookmarks, etc.

Thanks Bard

--tony

Bård Farstad

Thursday 12 August 2004 7:46:35 am

Tony, I've also added another fix in svn which makes it compatible with single byte characters as well. My first patch broke that.

Here is that new code:

            if ( $this->OutputTextCodec && $this->InputTextCodec )
            {
                if ( !$this->OutputTextCodec->conversionRequired() || !$this->InputTextCodec->conversionRequired() )
                {
                    unset( $this->OutputTextCodec );
                    unset( $this->InputTextCodec );
                    $this->OutputTextCodec = null;
                    $this->InputTextCodec = null;
                }
            }
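
The idea behind the patch can be sketched outside eZ publish as follows. This is a simplified Python illustration under stated assumptions: the `Codec` and `Database` classes are hypothetical, and a single codec stands in for the input/output pair. The codec is dropped once, at set-up time, when no conversion is needed, so the per-query path never makes the dummy call.

```python
class Codec:
    """Hypothetical byte-string codec that counts how often it is invoked."""

    def __init__(self, source, target):
        self.source, self.target = source, target
        self.calls = 0

    def conversion_required(self):
        return self.source != self.target

    def convert_string(self, data):
        self.calls += 1
        if not self.conversion_required():
            return data  # the "dummy" path: no work, but the call itself still costs time
        return data.decode(self.source).encode(self.target)


class Database:
    """Hypothetical database wrapper, loosely modelled on the patched logic."""

    def __init__(self, codec=None):
        # Guard against a missing codec first, then drop it if it would do nothing.
        if codec is not None and not codec.conversion_required():
            codec = None
        self.input_codec = codec

    def query(self, sql):
        if self.input_codec is not None:
            sql = self.input_codec.convert_string(sql)
        return sql


same = Codec("utf-8", "utf-8")
db = Database(same)
for _ in range(1000):
    db.query(b"SELECT 1")
print(same.calls)  # 0: the codec was dropped, so no dummy calls are made

different = Codec("utf-8", "iso-8859-1")
db2 = Database(different)
print(db2.query("Bård".encode("utf-8")))  # converted to single-byte latin-1
print(different.calls)  # 1
```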

--bård

Tony Wood

Thursday 12 August 2004 7:56:52 am

ok, changed...

One thing I have wanted to ask.

A lot of the code in eZ fixes the character storage mechanism to XML utf-8. Should this be the case? Should it not be "unicode" or "iso xxx", and should the default be iso?

I guess that a lot of users are using MySQL 4.0.x and so can only store single-byte data. The utf-8 tag signifies double-byte. Am I missing something?

--tony

Jan Borsodi

Thursday 12 August 2004 8:14:06 am

utf-8 is OK, although it would be faster if content were stored using the current internal charset (which removes the need for conversion).
We already do this for the XML datatype but it hasn't been implemented for the other datatypes yet.

Also, unicode is the character set but not an encoding, so it cannot be used for storage. However, unicode has several encodings defined:

utf-8: The most common in stored media. It uses 1 to 4 bytes per character (the original definition allowed up to 6), i.e. it is variable-width and works seamlessly with existing 8-bit string code. However, it is a bit slow due to the variable size.

ucs2: Stores each character using two bytes. Much faster since lookup is constant-time, and quite often used internally in programs. Unfortunately, doing this in PHP using PHP code only could quite easily be troublesome.

ucs4: Similar to ucs2 but uses four bytes (since the initial 2 bytes were not enough for all languages in the world; something like 21 bits is needed, I believe).

There are also other encodings (like the non-standard utf-7.5), but they are hardly used.
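
To make the size differences concrete, here is a small Python sketch. Python has no ucs2/ucs4 codecs, so `utf-16-le` and `utf-32-le` are used as stand-ins (an assumption of this example); note that utf-16 needs four bytes for characters outside the range that fixed two-byte ucs2 can represent.

```python
def encoded_sizes(character):
    """Return the number of bytes one character occupies in each encoding."""
    return {enc: len(character.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


# "a", "å", the euro sign, and a musical symbol outside the two-byte range
for ch in ["a", "\u00e5", "\u20ac", "\U0001d11e"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```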

So storing utf-8 in 8bit only databases is OK as long as you don't try to do text operations on them in the database.
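
A short Python sketch of why that caveat matters (illustrative only; the byte string stands in for what an 8-bit column would hold): the stored bytes round-trip intact, but byte-level length and truncation no longer match character-level results.

```python
name = "B\u00e5rd"               # "Bård", 4 characters
stored = name.encode("utf-8")    # what an 8-bit column would hold

round_trips = stored.decode("utf-8") == name
# Cutting at a byte offset can split the two-byte "å" in half.
truncated = stored[:2].decode("utf-8", errors="replace")

print(len(name), len(stored))    # character count vs. byte count
print(round_trips)
print(repr(truncated))
```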

--
Amos

Documentation: http://ez.no/ez_publish/documentation
FAQ: http://ez.no/ez_publish/documentation/faq
