Import XML Data Topic

Wednesday 07 December 2005 1:09:59 pm - 54 replies

Olivier Pierret

Monday 20 February 2006 12:41:06 am

I don't think so, Guillaume, because of the way data is extracted from the XML import file. The extracted data are text nodes, and if you have other XML tags inside your data they will not be treated as text nodes. Actually, the parser/XPath library I use in PHP is quite poor in my opinion. We should wait for PHP 5 to rewrite this extension in a better way.

You could try using escaped characters, but I am pessimistic.

Olivier

Samuel Sauder

Friday 31 March 2006 11:24:35 am

Another solution would be to add CDATA like this:

<keyword_s>
<![CDATA[
various <thing />s
]]>
</keyword_s>

It does come over to the text block. So then you have to clean out the CDATA on the eZ Publish side. (It would be sweet if the extension did that for you.)

Maybe this could be a way to get XML block support built in?
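To illustrate the clean-up step, here is a minimal sketch of stripping the CDATA wrapper from an imported string. The function name stripCdataWrapper is made up for this example and is not part of the extension; it only handles the first CDATA section it finds.

```php
<?php
// Hypothetical helper (not part of the extension): returns the text inside
// the first CDATA section of $text, or $text unchanged if there is none.
function stripCdataWrapper( $text )
{
    $begin = '<![CDATA[';
    $end   = ']]>';

    // strpos() can legitimately return 0, so compare against false explicitly
    $startPos = strpos( $text, $begin );
    if ( $startPos === false )
    {
        return $text;
    }

    $contentPos = $startPos + strlen( $begin );
    $endPos = strpos( $text, $end, $contentPos );
    if ( $endPos === false )
    {
        return $text;
    }

    // Drop the surrounding line breaks left over from the import file
    return trim( substr( $text, $contentPos, $endPos - $contentPos ) );
}

echo stripCdataWrapper( "<![CDATA[\nvarious <thing />s\n]]>" ), "\n";
```

Strings without a CDATA section are returned untouched, so the helper could be applied to every imported field without checking its type first.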

references:
http://staff.science.uva.nl/~kaper/webprog/naslag/PhpXPath/Php.XPathDocumentation.html#_translateAmpersand
http://www.w3schools.com/xml/xml_cdata.asp

Samuel Sauder

Friday 31 March 2006 11:33:35 am

Olivier,
I seem to have found a minor bug. Portions of the content tree disappear after I import a file. This is a temporary problem that lasts until I reset the cache or publish a node, and it only happens in the admin interface, in the left tree view. It is frustrating because I want to go to a node right after my import, but it's not there! It doesn't affect the public site.

Samuel Sauder

Wednesday 24 May 2006 11:44:07 am

Any idea if this is compatible with 3.8 ?

Philipp Kamps

Friday 16 June 2006 10:40:48 am

Based on Samuel's post, I patched the extension "Import XML Data Topic" to support the 'ezxmltext' datatype. It is a bit of a hack because I had to define a fake 'http' class in order to use the eZ XML validation. Here is the code:

Add this case statement in 'importXMLDatafunctioncollection.php':

case 'ezxmltext':
  if ( $xml_content_string ) // this check should be outside the 'switch' statement
  {
    // Get the embedded data by cutting out everything between the escape tokens
    $begin_escape_string = '<![CDATA[';
    $end_escape_string = ']]>';

    $start_pos = strpos( $xml_content_string, $begin_escape_string );
    $end_pos   = strpos( $xml_content_string, $end_escape_string );

    // strpos() returns 0 for a match at the very start, so compare against false
    if ( $start_pos !== false && $end_pos !== false )
    {
      $embedded_content = substr( $xml_content_string, $start_pos + strlen( $begin_escape_string ), $end_pos - $start_pos - strlen( $begin_escape_string ) );

      // eZ provides data validation for the datatype ezxmltext
      require_once( 'kernel/classes/datatypes/ezxmltext/handlers/input/ezsimplifiedxmlinput.php' );

      // but unfortunately the method expects an "http" object and the
      // 'contentobjectattribute',
      // so here is a lot of code just to make this function happy

      $fakeHttp = new FakeHttp( $embedded_content );

      $myhandler = new eZSimplifiedXMLInput(); // get the handler for input validation

      $myhandler->validateInput( $fakeHttp, '', $attribute ); // tons of magic happens here

      // I don't have to set the value on the attribute;
      // the function 'validateInput' does it already
      //$attribute->setAttribute( 'data_text', $embedded_content );

      // debug the $attribute object
      //echo "<pre>";
      //print_r( $attribute );
      //echo "</pre>";
    }
  }

  break;

add this class to the file

class FakeHttp
{

	var $data;

	function FakeHttp( $data )
	{
		$this->data = $data;
	}
	
	function postVariable ( $whatever )
	{
		return $this->data;
	}
	
	function hasPostVariable( $whatever )
	{
		return true;
	}
}
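
To show why such a stub is enough, here is a small self-contained sketch. The names FakeHttpSketch and readPostedText are invented for the example, and the class uses a PHP 5 style constructor, which is an assumption (the code above targets PHP 4). The point is only that a caller which goes through hasPostVariable() and postVariable() cannot tell the stub from a real request object.

```php
<?php
// Stand-in for the request object: every lookup returns the same payload.
// Same idea as the FakeHttp class, but with a PHP 5+ constructor (assumption).
class FakeHttpSketch
{
    public $data;

    public function __construct( $data )
    {
        $this->data = $data;
    }

    // Pretend every variable was posted
    public function hasPostVariable( $name )
    {
        return true;
    }

    // Whatever name is asked for, hand back the stored payload
    public function postVariable( $name )
    {
        return $this->data;
    }
}

// Hypothetical consumer that, like a validation routine, only talks to the
// request object through hasPostVariable() and postVariable().
function readPostedText( $http, $variableName )
{
    if ( !$http->hasPostVariable( $variableName ) )
    {
        return false;
    }
    return $http->postVariable( $variableName );
}

$http = new FakeHttpSketch( '<paragraph>Hello</paragraph>' );
echo readPostedText( $http, 'any_name_at_all' ), "\n";
```

Because the stub ignores the variable name, it only works when the caller reads a single value, which is the case for one attribute at a time.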

http://www.mugo.ca
Mugo Web, eZ Partner in Vancouver, Canada

Thomas Nunninger

Sunday 18 June 2006 5:29:53 am

Hi,

I think I found a small bug with importing user accounts. In importXMLDatafunctioncollection.php you write:

if ( $isAcountObject ) {
    //create new user
    $objectId = $contentObject->attribute( 'id' );
    if ( $userId > 0 ) {
        ...
    } else {
        eZDebug::writeError("error \$objectId invalid (value: $objectId)","ImportXMLDatafunctioncollection");
    }
....

I think you need to compare if

$objectId > 0

not

$userId > 0

Have a nice day

Thomas

Thomas Nunninger

Sunday 18 June 2006 5:41:00 am

Another thing: I can't see the point of the $i counter. If you import a lot of data you will fill up your memory, because the XML data of all the objects that were just saved is still held. Just replace

foreach ($paths_item as $path_item) {
    foreach($fieldNameList as $fieldName) {
        $fieldValue = $xPathEngine->wholeText("$path_item/$fieldName"."[1]");
        $parsedItems[$i][$fieldName]=$fieldValue;
        if ($fieldName=='account-login') $isAcountObject=true;
    }

with

foreach ($paths_item as $path_item) {
    $parsedItems = array();  // resets the parsed data 
    foreach($fieldNameList as $fieldName) {
        $fieldValue = $xPathEngine->wholeText("$path_item/$fieldName"."[1]");
        $parsedItems[$fieldName]=$fieldValue; // delete the [$i] everywhere
        if ($fieldName=='account-login') $isAcountObject=true;
    }

Have a nice day and of course: thanks for the extension - saves much work :-)

Thomas

Edit: Sorry, I didn't read the next lines:

$result['data']= $parsedItems;

But I don't know if you really want that feature in a mass import situation...
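
For a mass import, one option would be to handle each item as soon as it is parsed and keep only a counter for the result page. The following is just a sketch of that idea with made-up names (importOneByOne, printTitle) and plain arrays standing in for the XPath lookups; it is not the extension's code.

```php
<?php
// Hypothetical sketch: handle each parsed item as soon as it is built,
// instead of collecting every item in one big array.
function importOneByOne( $rawItems, $fieldNameList, $callback )
{
    $count = 0;
    foreach ( $rawItems as $rawItem )
    {
        $parsedItem = array(); // fresh array per item, the previous one is released
        foreach ( $fieldNameList as $fieldName )
        {
            // Missing fields become empty strings
            $parsedItem[$fieldName] = isset( $rawItem[$fieldName] ) ? $rawItem[$fieldName] : '';
        }
        call_user_func( $callback, $parsedItem ); // e.g. create and publish the object
        $count++;
    }
    return $count; // only a counter is kept, not the data
}

// Example callback: a real import would store the object here
function printTitle( $item )
{
    echo $item['title'], "\n";
}

$rawItems = array(
    array( 'title' => 'First',  'body' => 'a' ),
    array( 'title' => 'Second', 'body' => 'b' ),
);
echo importOneByOne( $rawItems, array( 'title', 'body' ), 'printTitle' ), "\n";
```

The trade-off is the one mentioned in the edit above: a template that lists all imported data would have to make do with the count.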

Andrew Kelly

Tuesday 25 July 2006 2:44:09 am

So,
is this extension available anywhere with the changes that Xavier made last year?

Andy

Vytautas Germanavičius

Tuesday 25 July 2006 10:45:44 am

Some parts of the code can be taken from cronjobs/rssimport.php

{set-block scope=root variable=cache_ttl}0{/set-block}

H-Works Agency

Wednesday 06 September 2006 5:30:28 am

Here are the problems I am facing with this module on 3.8.1 and 3.8.3. In fact, I never managed to make it work:

Fatal error: Cannot instantiate non-existent class: xpathengine in (...)/ezpublish-3.8.3/extension/importXMLData/modules/importXMLData/importXMLDatafunctioncollection.php on line 76
Fatal error: eZ publish did not finish its request

Then in Warnings :

Warning: PHP  	Sep 06 2006 14:28:22

main(lib/phpxpath/XPath.class.php): failed to open stream: No such file or directory

Here are the improvements I would suggest:

Make the "select" form in form.tpl automatically select all classes in current siteaccess.

Make module.ini.append.php configurable per siteaccess (at the moment it is only picked up when I modify the module's own file).

Move the "help" folder inside the "importXMLData" folder. Otherwise you have to change the folder layout, because when you decompress the archive you get importXMLData/importXMLData, which does not work.

Make the module use the translation system

Thanks for any help. This extension would be very useful to everyone if it worked :\

EZP 3.8.x - LAMP - Martin

EZP is Great

Xavier Dutoit

Wednesday 06 September 2006 5:58:27 am

Hi,

You have to download and install the xpath lib (php file).

And yes it works and yes that's useful.

X+

http://www.sydesy.com

Samuel Sauder

Wednesday 06 September 2006 6:01:31 am

Martin, my guess is that XPath is not loaded correctly in your eZ Publish installation. Download it from SourceForge and install it. The default location the extension looks in is ezpublish/lib/phpxpath/..
Hope that helps.

Anara Davletaliyeva

Thursday 14 September 2006 11:26:22 pm

Hi to all!
I have recently downloaded ImportXMLData.tar.
I think I have done everything correctly, according to the installation instructions, but when I import xml_sample.xml, the data that appears above is empty. Help me, please!

Samuel Sauder

Friday 22 September 2006 7:04:32 am

Anara,
There could be many reasons for this. Try opening the file you are importing in a browser, making sure it has an .xml file extension. This will show whether there is a parse error in your XML.

Also check for ezpublish and/or php errors that may lead to the source of the problem.
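
If a browser is not handy, the same well-formedness check can be sketched in PHP with the bundled XML parser functions. The helper name xmlParseError is made up for this example, and it assumes the xml extension is enabled.

```php
<?php
// Hypothetical helper: returns an empty string if $xml is well-formed,
// otherwise a short error message with the line number.
function xmlParseError( $xml )
{
    $parser = xml_parser_create();

    // Third argument true: this is the final (and only) chunk of data
    if ( xml_parse( $parser, $xml, true ) )
    {
        return '';
    }

    return xml_error_string( xml_get_error_code( $parser ) )
         . ' at line ' . xml_get_current_line_number( $parser );
}

$good = '<items><item><title>ok</title></item></items>';
$bad  = '<items><item><title>broken</item></items>';

echo xmlParseError( $good ) === '' ? "good: well-formed\n" : "good: error\n";
echo 'bad: ', xmlParseError( $bad ), "\n";
```

To check an import file, pass in its contents, for example via file_get_contents() on the path you are importing. This only checks that the XML parses; it says nothing about whether the element names match what the extension expects.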

*- pike

Monday 30 October 2006 2:37:03 pm

Hi

I needed quite a few extra features and started rebuilding some code this weekend. I ended up completely changing the PHP interface, the fetch call, the GUI and the ini file. So eventually I renamed the whole extension to avoid confusion. I have it locally here; it is not available for download anywhere yet.

The thing still supports the original datatypes from ImportXMLData, plus a few extra, .. ezxmltext and ezobjectrelationlist amongst others. But most importantly, adding support for other datatypes is really easy now.

So how to proceed ? I can send you (Olivier) the sources so you can look at it and see if you want to pick it up ? I can't find your email address or PM you in this forum, so you may contact me at pike-ez [AT] kw.nl, if you like ?

Alternatively, I could post it as a new contribution. But having 2 xmlimporters is confusing too.

there are some screenshots here
http://pike.kw.nl/browse/files/projects/pike/2006/ezpublish/XMLImport/screensnaps

$2c,
*pike

---------------
The class eZContentObjectTreeNode does.

Vytautas Germanavičius

Thursday 02 November 2006 9:58:12 am

I updated this extension so that it can update nodes if the data has already been imported.
It works fine for everything except images.
If I change an image, the page still shows the old one. This is because eZ does not automatically update the resized versions of an image when I update it.
Does anyone know how to force eZ to create new versions of the resized images?

{set-block scope=root variable=cache_ttl}0{/set-block}

*- pike

Thursday 02 November 2006 2:33:28 pm

Hi

Esu .. interesting .. how do you import images ? the version i had only imported numbers,strings,boolean,dates,..and user. i need support for files and images myself .. planned to write it.

& I'm curious: how do you recognize if one object was already imported ? do you keep a node_id or object_id in your xml, or some ini setting for a unique id ?

curiousss,
*-pike

---------------
The class eZContentObjectTreeNode does.

Vytautas Germanavičius

Thursday 02 November 2006 11:06:50 pm

I found an extension for importing images and took the code from there.
I did it this way: one of the attributes in the XML file (I specify which one in the ini file) contains the name of the image. So I upload the images to a directory via FTP, then start the import script.

The function to import an image:

function saveImage( $sourceImage, $originalImageFileName, $caption, &$contentObjectAttribute )
{
    include_once( "lib/ezutils/classes/ezdir.php" );
    $contentObjectAttributeID = $contentObjectAttribute->attribute( "id" );
    $version = $contentObjectAttribute->attribute( "version" );
    include_once( "kernel/common/image.php" );
    $image =& eZImage::create( $contentObjectAttributeID , $version );

    $image->setAttribute( "contentobject_attribute_id", $contentObjectAttributeID );
    $image->setAttribute( "version", $version );

    $sys =& eZSys::instance();
    $storage_dir = $sys->storageDirectory();
    $nameArray = explode( '.', $originalImageFileName );
    $ext = $nameArray[ count($nameArray ) - 1 ];
    $uniqueName = tempnam( $storage_dir . "/original/image/", "imp") . '.' . $ext;
    $uniqueNameArray = explode( '/', $uniqueName );
    $uniqueNameFile = $uniqueNameArray[ count( $uniqueNameArray ) - 1 ];
    $image->setAttribute( "filename", $uniqueNameFile );
    $image->setAttribute( "original_filename", $originalImageFileName );

    $mimeObj = new eZMimeType();
    $mime = $mimeObj->mimeTypeFor( false, $originalImageFileName );
    $image->setAttribute( "mime_type", $mime );
    $image->setAttribute( "alternative_text", $caption );
    $image->store();

    // Make sure the storage directories exist ($storage_dir was set above)
    $ori_dir = $storage_dir . '/' . "original/image";
    $ref_dir = $storage_dir . '/' . "reference/image";
    if ( !file_exists( $ori_dir ) )
    {
        eZDir::mkdir( $ori_dir, 0777, true);
    }
    if ( !file_exists( $ref_dir ) )
    {
        eZDir::mkdir( $ref_dir, 0777, true);
    }
    $source_file = $sourceImage;
    $target_file = $storage_dir . "/original/image/" . $uniqueNameFile;
    $reference_file = $storage_dir . "/reference/image/" . $uniqueNameFile;
    copy($source_file, $target_file );
    copy($source_file, $reference_file );
}

This function is good for a first-time import. If you try to UPDATE images with it, the resized images will not be updated. Can anyone tell me how to change this function so that they are?

{set-block scope=root variable=cache_ttl}0{/set-block}

Kåre Køhler Høvik

Thursday 02 November 2006 11:59:21 pm

Hi

Here's how we handle import of eZXMLType datatype in the RSS import.

function setEZXMLAttribute( &$attribute, &$attributeValue, $link = false )
{
    include_once( 'kernel/classes/datatypes/ezxmltext/handlers/input/ezsimplifiedxmlinputparser.php' );
    $contentObjectID = $attribute->attribute( "contentobject_id" );
    $parser = new eZSimplifiedXMLInputParser( $contentObjectID, false, 0 );

    $attributeValue = str_replace( "\r", '', $attributeValue );
    $attributeValue = str_replace( "\n", '', $attributeValue );
    $attributeValue = str_replace( "\t", ' ', $attributeValue );

    $document = $parser->process( $attributeValue );
    if ( !is_object( $document ) )
    {
        $cli =& eZCLI::instance();
        $cli->output( 'Error in xml parsing' );
        return;
    }
    $domString = eZXMLTextType::domString( $document );

    $attribute->setAttribute( 'data_text', $domString );
    $attribute->store();
}

See also: cronjobs/rssimport.php:490

Kåre Høvik

Kristof Coomans

Friday 03 November 2006 5:22:55 am

If the parser does not return an object then you can get any parsing errors with

$errors = $parser->getMessages();

and print them

foreach ( $errors as $error )
{
    $cli->output( $error );
}

independent eZ Publish developer and service provider | http://blog.coomanskristof.be | http://ezpedia.org
