Share » Forums » Setup & design » data migration / data export

data migration / data export

data migration / data export

Tuesday 27 November 2007 3:15:02 am - 2 replies

Author Message

gabriele francescotto

Tuesday 27 November 2007 8:31:17 am

Hi Nicholas,

that example works only for the old eZ publish 2.x
for that version, I wrote a simpler PHP script to migrate all the contents to Drupal, that I can copy here.

But I have to say that, after the migration, I migrated again the contents to eZ publish (3.x version); eZ is definitely more powerful and complete than Drupal, there's no comparison.

gabriele

 


/**
 *
 *
 * this is a script to export data
 * from  eZ publish 2 into Drupal's postgres database
 *
 *
 * create a new database (testdata), containing ez publish's and drupal's tables
 NOTE: if your eZ publish database is set up with the character encoding (LATIN9),
you need to migrate the contents to the UTF-8/Unicode encoding. More information can be found in PostgreSQL documentation.

 * CREATE DATABASE testdata ENCODING='UTF8';

 * pg_dump --encoding=UTF8 ezpublish > ezpublish.sql
 * pg_dump drupal > drupal.sql
 * psql testdata < drupal.sql
 * psql testdata < ezpublish.sql
 *
 * practice command to extract ez publish's table list (tables to remove from the live database):
   psql -U drupaluser -W  -c "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'" drupal_database > tables.txt

 * run these queries:
 * drop table indexes_ez_drupal; // needed only if you made other migration tests before

*/

set_time_limit( 0 );

$sql = "
CREATE TABLE indexes_ez_drupal ( ez_id int4 NOT NULL, drupal_id int4 NOT NULL );
CREATE TABLE ez_cat ( ez_art int4 NOT NULL, ez_cat int4 NOT NULL, drupal_cat int4 NOT NULL );
CREATE TABLE ez_countries ( ez_art int4 NOT NULL, ez_country int4 NOT NULL, drupal_cat int4 NOT NULL );
ALTER TABLE node ALTER COLUMN title  TYPE varchar(255);
ALTER TABLE node_revisions ALTER COLUMN title  TYPE varchar(255);
";


/**************************************
* MAIN SETTINGS
**************************************/
$number_of_articles=30000;      // maximum number of articles parsed; the script can demand a lot of time to process all the articles
$import_abstract=true;          // not used
$import_body=true;              // not used

// it is possible to run the script with just an article, in order to
// to test the software or to analyse particular cases

$catch_exceptions=false;        // [true|false] true if you want to use the script with a single article, false to ignore this feature
$exception_article =94912;      // ez article ID

// these 2 variables can be useful in case the script stops
// for some reason

$starting_article=0;            // it must be the ID of firt article to process; the previous ones will be ignored
$remove_prev_imported = true;   // [true|false] true to reset the previous imported data, false to save them

$argv = $GLOBALS["HTTP_SERVER_VARS"]["argv"];
include_once( "classes/INIFile.php" );
include_once( "classes/eztemplate.php" );
include_once( "classes/ezhttptool.php" );
require_once('ezaddress/classes/ezcountry.php');
include_once( "ezarticle/classes/ezarticlecategory.php" );
include_once( "ezarticle/classes/ezarticlerenderer.php" );
#include_once( "lib/ezxml/classes/ezxml.php" );
#include_once( "lib/ezxml/classes/ezdomdocument.php" );
#include_once( "lib/ezxml/classes/ezdomnode.php" );
include_once( "ezuser/classes/ezobjectpermission.php" );
include_once( "classes/ezinifile.php" );
include_once( "ezimagecatalogue/classes/ezimage.php" );


$GLOBALS["DEBUG"] = true;
$Override_ParentCategory = 0;

//$ini = new INIFile( "site_custom.ini", false );
//$GLOBALS["INI_site.ini"] =& $ini;

$writePermGroup=-1;
$pathSeparator = '#!/!#';
//$pathSeparator = '-';
if ( $argv[1] != "" )
{
    for ( $i = 1; $i < count( $argv ); $i++ )
    {
        $option = preg_split( "#\=#is", $argv[$i] );
        $argument = $option[0];
        $value = $option[1];
        $replace = false;
        switch ( $argument )
        {
            case "-r":
                $replace = true;
                break;

            case "-s":
                $stripPathElements = $value;
                break;

            case "-sl":
                $stripPathElementsLive = $value;
                break;

            case "-q": print "

";
                return false;
                break;
            case "-p":
                $Override_ParentCategory = $value;
                break;

            case '-l':
            case '--language':
                $Language=$value;
               break;

            case "-f":
                $file = $value;
                break;

            case "--writepermgroup":
                $writePermGroup = $value;
                break;
        }
    }
} else {
        print "
        Please make sure that you run the following queries:

" . $sql . "

        If you just run them, please run again this script with the option '-q'.
";
}

/*#########################################################################################
                                    GLOBAL SETTINGS
#########################################################################################*/

//ARTICLES

/*#######################################################################################*/

$db =& eZDB::globalDatabase();
$category = new eZArticleCategory();


$db->array_query($articles, "select * from ezarticle_article limit $number_of_articles");
print "\n##############\n\nSelecting the ".count($articles)." articles to import in Drupal... please wait\n\n";
print " DONE\n\n";

print "\nLooking for exceptions (articles containing polls or forms)... please wait\n\n";
$db->array_query($articles_forms, "select id, name from ezarticle_article where contents like '%<form%'");
$db->array_query($articles_polls, "select id, name from ezarticle_article where contents like '%<poll%'");

$arr_exceptions = array();
        print "\n       The following articles contain FORMS and will be excluded from the migration: ";
foreach ($articles_forms as $form)
{
        print "
        " . $form[id] .": " . $form[name];
        array_push($arr_exceptions, $form[id]);
}

        print "\n\nThe following articles contain POLLS and will be excluded from the migration: ";
foreach ($articles_polls as $poll)
{
        print "
        " . $poll[id] .": " . $poll[name];
        array_push($arr_exceptions, $poll[id]);
}

print "\n";


/*  INIZIALIZZAZIONE NECESSARIA, DA EFFETTUARE DOPO OGNI TEST

delete from node;
delete from node_revisions;
delete from sequence;
delete sequence node_revisions_vid_seq;


###################################################*/

if ($remove_prev_imported) {
//$db->query('create table indexes_ez_drupal ( ez_id int4 NOT NULL, drupal_id int4 NOT NULL )');
$db->query('delete from indexes_ez_drupal');
$db->query('delete from localizernode');
$db->query('delete from node where nid>20');
$db->query('delete from term_data where tid>16');
$db->query('delete from term_node where nid>16');
$db->query('delete from term_hierarchy where tid>16');
$db->query('delete from ez_cat');
$db->query('delete from ez_countries');
$db->query('delete from node_revisions where nid>20');

}


function utf8_to_html ($data)
    {
    return preg_replace("/([\\xC0-\\xF7]{1,1}[\\x80-\\xBF]+)/e", '_utf8_to_html("\\1")', $data);
    }

function _utf8_to_html ($data)
    {
    $ret = 0;
    foreach((preg_split('//',strrev(chr((ord($data{0}) % 252 % 248 % 240 % 224 % 192) + 128) . substr($data, 1)))) as $k => $v)
        $ret += (ord($v) % 128) * pow(64, $k);
    return "&#$ret;";
    }

$body = '';
$title = '';
$pub_date = '';
$mod_date = '';
$images = array();
$author_id = '';
$external_url = '';
$keywords = '';
$country = ''; // category
$allow_comments = false;
$language = 'english';
$ispublished = true;



   print "

  > Importing articles: ";

if ($remove_prev_imported) $process_article = true; else $process_article=false;

foreach ($articles as $number => $art) {

  if (!$remove_prev_imported) {
        if($starting_article==$art[id]) $process_article=true;
  }
if ($process_article)
 if (!in_array($art[id], $arr_exceptions)) {
  if ($number < $number_of_articles) {


        $article =  new eZArticle ($art[id]);
        $article_id = $article->id();

        print "Processing Article : " . $article->name() . "  ID=" . $article->id();

        $rendered = new eZArticleRenderer ($article);
        $title = str_replace("\\", "", $article->name());
        $teaser = $rendered->renderIntro();
        $body_page = $rendered->renderPage();
        $body = $body_page[1];

        // images related to the selected article
        if (count($images)>0) {
                foreach ($images as $k=>$image) {
                        $img = $image['Image'];
                        $img_language = $article->language();
                        $img_caption = $img->caption();
                        $img_html_caption = $img->fullHTMLCaption( $img_language );
                        $img_path = "http://www.oneworldsee.org".$img->filePath();
                        $img_category = $img->categoryDefinition();
                        $img_name = $img->name();
                        if ($k==0) {
                                $teaser = "<div class=\"ezimage\"><img alt=\"".$img_name."\" src=\"".$img_path."\"/><div class=\"caption\">".$img_caption."</div></div>".$teaser;
                        } else {
                                $body = "<div class=\"ezimage\"><img alt=\"".$img_name."\" src=\"".$img_path."\" /><div class=\"caption\">".$img_caption."</div></div>".$body;
                        }
                }
        }



        $keywords = $article->keywords();
        $allCountries = eZCountry::getFlatList( false );
        $title = str_replace("'", "\'", $title);
        $title = str_replace("&quot;", "\"",$title);
        $title = str_replace("&amp;", "&",$title);
        $title = str_replace("&gt;", ">",$title);
        $title = str_replace("’’", "\"",$title);
        $title = str_replace("’", "\'",$title);
        $title = str_replace("‘", "\'",$title);
        $title = utf8_to_html($title);
$teaser = stripslashes(str_replace("\\", "", $teaser));
        $teaser = str_replace("'", "\'", $teaser);
        $teaser = str_replace("<br />", "
",$teaser);
        $teaser = str_replace("&quot;", "\"",$teaser);
        $teaser = str_replace("&gt;", ">",$teaser);
        $teaser = str_replace("&amp;", "&",$teaser);
        $teaser = str_replace("’’", "\"",$teaser);
        $teaser = str_replace("’", "\'",$teaser);
        $teaser = str_replace("‘", "\'",$teaser);
        $teaser = utf8_to_html($teaser);

        $body = str_replace("\\", "", $body);
        $body = str_replace("'", "\'", $body);
        $body = str_replace("<br />", "
", $body);
        $body = str_replace("&quot;", "\'", $body);
        $body = str_replace("&gt;", ">", $body);
        $body = str_replace("&amp;", "&", $body);
        $body = str_replace("’’", "\"", $body);
        $body = str_replace("’", "\'", $body);
        $body = str_replace("‘", "\'", $body);
        $body = utf8_to_html($body);

        $pub_date = $article->Published;
        $mod_date = $article->Modified;
        $images = $article->images();
        $author = $article->author();



        // USERS MAP: creates authors if they don't exist
        // manage groups and roles... TODO

        // INITIALIZE DRUPAL USERS, FROM EZ PUBLISH AUTHORS

        $user_id = $author->id();
        if ($user_id == '') $user_id=1;

        // find last drupal's userid
        $db->array_query($users, "SELECT last_value FROM users_uid_seq");
        $userid = $users[0]['last_value'];
        // catch ez's author and create a drupal user, includind the password
        if ($author->id()) {
                $db->array_query($authors_passwords, "SELECT password FROM ezuser_user WHERE id=". $author->id() );
                $author_password = $authors_passwords[0]['password'];
        } else $author_password=md5('owpass');
        $db->array_query($drupal_users, "SELECT name FROM users WHERE name='" . $author->login() . "'") ;
        if ( count($drupal_users)>0 ) {
                print "";
        } else { // insert a new author in drupal
                $db->query("INSERT INTO users ( name, signature, pass, mail, status, language )
                                VALUES ('".$author->login()."', '".$author->firstName().
                                " ".$author->lastName()."', '".$author_password."', '".
                                $author->email()."', 1, '".$short_language."' )");
        }

        if ($article->ispublished()==1) $ispublished=1; else $ispublished=0;
        $db->array_query($node_nid_seq, "SELECT last_value from node_nid_seq");
        $next_node_nid_seq= $node_nid_seq[0]['last_value'] + 1;

        $db->query("INSERT INTO node (vid, type, title, uid, status,
                        created,  changed, comment, promote, moderate, sticky)
                        VALUES ($next_node_nid_seq, 'story',
                        '$title', $userid , $ispublished, $mod_date, $pub_date, 2, 0, 0, 0)");
        $intro = $teaser . $link;
        // TODO revisions
        $sql_revision = "INSERT INTO node_revisions ( nid,vid,uid,title,body,teaser,log,timestamp,format) VALUES ($next_node_nid_seq, $next_node_nid_seq, $userid, '$title', '$body', '$intro', 'imported from OW eZ publish', $pub_date, 1)";
        $db->query($sql_revision);
        $db->query("INSERT INTO indexes_ez_drupal (ez_id, drupal_id) values ($article_id, $next_node_nid_seq)");


        // MANAGE EXTERNAL LINKS, ALTERNATIVE TO THE BODY
        $external_link = $article->linkobject();
        if ($article->haslinkurl()) {
                $link_name =str_replace("'", "\'", $external_link->name() );
                $link = '<br /><a class="external_link" target="_blank" href="' . str_replace("'", "\'", $external_link->url()) . '">' . $link_name . '</a>';
        }
                else
                $link = '';


        //take Drupal node_id
        $db->array_query($nextval_arr, "SELECT last_value from node_nid_seq");
        $nextval = $nextval_arr[0]['last_value'];

        // LOOK FOR THE ISOCODE : IsoCode TODO: shows always english
        $language = $article->language();
        $locale = new eZLocale($Language);
        $language_object = new eZLocale($article->language());
        $short_language = substr($language_object->IsoCode, 0, 2);
        $db->query("INSERT into localizernode (nid, locale, pid) values ($nextval, '".$short_language."', $nextval)");

        $categories = $article->categories(true);
        $countries = $article->countries(); // categories

        foreach ($categories as $category) {


          if ( substr($category->pathstring(), 0 , 22 ) == "/0/2373#2373/2481#2481" ) {
            $db->array_query($check, "select ez_cat from ez_cat where ez_cat=" .$category->id() );
            $categ_exists = false;

        // if the category doesn't exist, create a new one
            if ( count($check)==0 ) {
                        print " categ ez: ".$category->id() ."   ";
                $category_name = str_replace("'", "\'", $category->name());
                $category_name = str_replace("’", "\'", $category_name);
                $category_name = str_replace("‘", "\'", $category_name);
                $category_name = str_replace("&gl;", "<", $category_name);
                $category_name = str_replace("&gt;", ">", $category_name);
                $category_desc = str_replace("'", "\'", $category->description());
                $category_desc = str_replace("&gt;", ">", $category_desc);
                $category_desc = str_replace("&gl;", "<", $category_desc);
                $category_desc = str_replace("‘", "\'", $category_desc);
                $category_desc = str_replace("’", "\'", $category_desc);
        // if the category doesn't exist, check if there are drupal categories with the same name
                $db->array_query($check_name, "SELECT tid FROM term_data WHERE name='". $category_name ."'");
                if ( count($check_name)>0 ) {
                        $categ_exists=true;
                        // new drupal's term id
                        $drupal_cat = $check_name[0]['tid'];
                        $db->query("INSERT INTO ez_cat (ez_art, ez_cat, drupal_cat)
                                VALUES (".$article->id().", ".$category->id(). ", " . $drupal_cat . ")");
                        $db->array_query($check_country_name_node, "SELECT tid FROM term_node WHERE nid ='".$next_node_nid_seq."' AND tid='". $drupal_cat ."'");
                                                if (count($check_country_name_node)==0)
                        $db->query("INSERT into term_node values (". $next_node_nid_seq .", ". $drupal_cat . ")");
                }

// taxonomy


 if (!$categ_exists)
                        $db->query("INSERT into term_data ( vid, name, description) values (3, '" . $category_name . "', '" . $category_desc . "' )");
                $db->array_query($nextval_arr_term, "SELECT last_value from term_data_tid_seq");
                // new drupal's term id
                $drupal_cat = $nextval_arr_term[0]['last_value'];
                if (!$categ_exists)
                        $db->query("INSERT into term_hierarchy ( tid, parent) values ($drupal_cat , 0)");
                $db->query("INSERT into ez_cat (ez_art, ez_cat, drupal_cat) values (".$article->id().", ".$category->id(). ", " . $drupal_cat . ")");
            } else {

            // the category exists: find drupal's category
                $db->array_query($drupal_cat_arr, "SELECT drupal_cat FROM ez_cat WHERE ez_cat=".$category->id());
                $drupal_cat = $drupal_cat_arr[0]['drupal_cat'];

            }

        // insert drupal's node into drupal's category

                if (!$categ_exists) {
                        $db->array_query($check_country_name_node, "SELECT tid FROM term_node WHERE nid ='".$next_node_nid_seq."' AND tid='". $drupal_cat ."'");
                        if (count($check_country_name_node)==0)
                                $db->query("INSERT into term_node values (". $next_node_nid_seq .", ". $drupal_cat . ")");
                }
          }


        } // foreach $categories as $category

// ez countries as drupal taxonomy
        foreach ($countries as $country) {

                foreach ($allCountries as $c) {
                        if ($c->id()==$country) $cc = $c;
                }
                $db->array_query($check, "SELECT ez_country from ez_countries where ez_country=" . $country);
                $country_exists = false;
                if ( count($check)==0 ) {
                        $db->array_query($check_country_name, "SELECT tid FROM term_data WHERE name='". $cc->name() ."'");
                        if (count($check_country_name)>0) {
                                $country_exists = true;
                                $country_drupal_id=$check_country_name[0]['tid'];
                                $db->query("INSERT INTO ez_countries (ez_art, ez_country, drupal_cat) values (".$article->id().", ".$country. ", " . $country_drupal_id . ")");
                                $db->array_query($check_country_name_node, "SELECT tid FROM term_node WHERE nid ='".$next_node_nid_seq."' AND tid='". $country_drupal_id ."'");
                                if (count($check_country_name_node)==0)
                                        $db->query("INSERT INTO term_node VALUES (". $next_node_nid_seq .", ". $country_drupal_id . ")");
                        } else {
                                $country_exists = false;
                                $db->query("INSERT into term_data ( vid, name, description) values (2, '" . str_replace("'", "\'", $cc->name()) . "', 'Country' )");
                        $db->array_query($nextval_arr_term, "SELECT last_value from term_data_tid_seq");
                        $drupal_cat = $nextval_arr_term[0]['last_value'];
                        $db->query("INSERT into term_hierarchy ( tid, parent) values ($drupal_cat , 0)");
                        $db->query("INSERT into ez_countries (ez_art, ez_country, drupal_cat) values (".$article->id().", ".$country. ", " . $drupal_cat . ")");
                        $db->query("INSERT INTO term_node VALUES (". $next_node_nid_seq .", ". $drupal_cat . ")");
                        }
                } else {

                // the category exists: find drupal's category
                        $db->array_query($drupal_cat_arr, "SELECT drupal_cat FROM ez_countries WHERE ez_country=".$country);
                        $drupal_cat = $drupal_cat_arr[0]['drupal_cat'];
                // insert drupal's node into drupal's category
                        $db->array_query($check_country_name_node, "SELECT tid FROM term_node WHERE nid ='".$next_node_nid_seq."' AND tid='". $drupal_cat ."'");
                        if (count($check_country_name_node)==0)
                        $db->query("INSERT INTO term_node values (". $next_node_nid_seq .", ". $drupal_cat . ")");
                }

        }



        print "  Successfully processed!
";
  } // if
 } //if
} // for articles

 

www.opencontent.it
via Verdi 19, 38100 Trento

Paul Leclercq

Tuesday 27 November 2007 9:15:25 am

personnaly I would recommend using rss feeds to export your data.

There is a very good exemple on how to export your data using rss feeds on this page:
http://pwet.fr/blog/des_fils_rss_sur_mesure_dans_ez_publish

unfortunatly this is in french, but basically, all you need to do is create an overide for your pagelayout, and create an rss view. In which you will put the following code:

<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0">
<channel>
<language>fr-FR</language>
{$module_result.content}
</channel>
</rss>

You will then simply need to create overrides to display the various content:
design/<mon_design>/templates/node/view/rss_full.tpl

You will then need an import function to import your data in drupal.

this simple function will read the url you whish. All you will need to do the is get the info in the xml using the xml parser you would like and import the data in your database. This way you could

function getFlux($flux)
{
$handle = fopen($flux, "rb");
$contents = '';
while (!feof($handle)) {
$contents .= fread($handle, 8192);
}
fclose($handle);
return($contents);
}

You must be logged in to post messages in this topic!

36 542 Users on board!

Forums menu