Thanks. We are reviewing whether we need to move to PostgreSQL for multi-language sites that do not fit into a single character set. I'd prefer to stick with MySQL for Unicode (UTF-8) if possible.
I look forward to your results.
tia
Tony
Tony Wood : twitter.com/tonywood
Vision with Technology
Experts in eZ Publish consulting & development
I just had to test it right away. I upgraded the MySQL on my development machine (Ver 13.5 Distrib 4.1.0-alpha), changed the charset on all tables to UTF-8, and configured eZ publish to use UTF-8 in the database, templates, etc.
At first glance it seems to work. I wrote an article containing Norwegian, Russian and Chinese text, and it was stored and displayed correctly.
The only problem I found is that when we index Chinese text we split words only on spaces, which does not make sense for Chinese.
Of course it needs more testing, but I don't think we need to do much to fully support Unicode (UTF-8) with MySQL 4.1.
Please set up a test installation and report any problems you might have with this setup.
I just found out eZ's trick for indexing Chinese in the search tables: it splits Chinese text into single characters.
But I do not think that works well for Chinese, as no one searches by single characters rather than by words (Chinese words usually consist of two or more characters).
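To make the difference concrete, here is a minimal Python sketch of the two splitting strategies discussed above (on spaces, and one token per Chinese character). It is not eZ publish's indexing code; the function names and the simple CJK range check are assumptions for illustration. It also shows overlapping two-character tokens (bigrams), a common compromise that nobody in this thread has proposed and that is included only for comparison.

```python
import itertools


def is_cjk(ch):
    """Rough check: CJK Unified Ideographs block only (an assumption)."""
    return "\u4e00" <= ch <= "\u9fff"


def _runs(text):
    """Yield (kind, run) pairs; kind is 'cjk', 'space' or 'other'."""
    def kind(ch):
        if is_cjk(ch):
            return "cjk"
        if ch.isspace():
            return "space"
        return "other"

    for k, group in itertools.groupby(text, key=kind):
        yield k, "".join(group)


def split_on_spaces(text):
    """Whitespace splitting: a whole Chinese sentence becomes one token."""
    return text.split()


def split_single_chars(text):
    """Every Chinese character becomes its own token."""
    tokens = []
    for k, run in _runs(text):
        if k == "cjk":
            tokens.extend(run)
        elif k == "other":
            tokens.append(run)
    return tokens


def split_bigrams(text):
    """Overlapping two-character tokens for each run of Chinese characters."""
    tokens = []
    for k, run in _runs(text):
        if k == "cjk":
            if len(run) == 1:
                tokens.append(run)
            else:
                tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        elif k == "other":
            tokens.append(run)
    return tokens


if __name__ == "__main__":
    sample = "eZ 中文搜索"
    print(split_on_spaces(sample))
    print(split_single_chars(sample))
    print(split_bigrams(sample))
```

With bigrams, a search for a two-character word matches an indexed token directly, at the cost of a larger index and some false matches across word boundaries.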
I have MySQL 4.1 installed and am running eZ publish 3.1 from svn. I followed the instructions at http://ez.no/developer/ez_publish_3/documentation/installation_and_configuration/configuration/language_and_charset/unicode_with_ez_publish. The content is converted and exists in the db, but XML fields cannot be read by the admin interface; they just appear blank.
This is still the case after running the XML fix: php -C update/common/scripts/updatexmltext.php
Are there any further instructions for converting the content?
Tony
It also appears that the conversion as documented only converts the index to UTF-8. The fields in the tables still have the charset they were created with. Is there a script to change all the fields to charset UTF-8?
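In case it helps, a rough sketch of how such a script could look: it only generates the SQL, one ALTER TABLE statement per table, so the output can be reviewed before anything is run. It assumes the server supports ALTER TABLE ... CONVERT TO CHARACTER SET (check the manual for your exact 4.1 release, as early alphas may not). The function name is hypothetical and the two table names are only examples; the real list would come from SHOW TABLES. Back up first: the conversion re-encodes the data on the assumption that each column's current charset label is correct.

```python
def charset_conversion_sql(tables, charset="utf8"):
    """Return one ALTER TABLE statement per table name.

    Hypothetical helper: it builds SQL text only and never touches a database.
    """
    statements = []
    for table in tables:
        # Backtick-quote the identifier, doubling any embedded backtick.
        quoted = "`" + table.replace("`", "``") + "`"
        statements.append(
            f"ALTER TABLE {quoted} CONVERT TO CHARACTER SET {charset};"
        )
    return statements


if __name__ == "__main__":
    # Example table names only; list the real ones with SHOW TABLES.
    for stmt in charset_conversion_sql(
        ["ezcontentobject_attribute", "ezsearch_word"]
    ):
        print(stmt)
```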
tony
In my case (possibly all older v3 sites dating from the v3 beta; I don't know), some of the XML fields have an incorrect declaration, <?xml version="1.0" encoding=""?>, when it should be <?xml version="1.0" encoding="utf-8"?>. If you change each affected record, you'll get your content back.
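For anyone who wants to script the declaration fix rather than edit records by hand, here is a minimal Python sketch of the string replacement. It is not part of eZ publish; the function name is made up, and reading the rows from the database and writing them back is left out on purpose. It only fills in an empty encoding attribute in the XML declaration and leaves everything else untouched. Try it on a backup copy first.

```python
import re

# Matches an XML declaration at the start of the text whose encoding
# attribute is empty, e.g. <?xml version="1.0" encoding=""?>
_EMPTY_ENCODING = re.compile(r'''^(\s*<\?xml\b[^>]*?\bencoding=)(["'])\2''')


def fix_xml_declaration(xml_text, encoding="utf-8"):
    """Fill in an empty encoding="" in the XML declaration, if present.

    Hypothetical helper: text with a non-empty encoding, or with no
    declaration at all, is returned unchanged.
    """
    def _fill(match):
        prefix, quote = match.group(1), match.group(2)
        return f"{prefix}{quote}{encoding}{quote}"

    return _EMPTY_ENCODING.sub(_fill, xml_text, count=1)


if __name__ == "__main__":
    broken = '<?xml version="1.0" encoding=""?><section></section>'
    print(fix_xml_declaration(broken))
```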
Note: take care when using updatexmltext.php. It just deleted the XML data in my entries. Again, this may be just me.
I hope this helps someone
Tony