Add sub attributes to a custom SolR field mapping

Wednesday 24 March 2010 8:08:21 am - 19 replies

Matthieu Sévère

Wednesday 24 March 2010 9:06:15 am

After further testing, getData() works, so it seems that is the method that returns data to Solr.

With the example above I have the array 1, 2, 3 in the index.

I'll try to see how to filter it

--
eZ certified developer: http://ez.no/certification/verify/346216

Matthieu Sévère

Wednesday 24 March 2010 9:32:48 am

Flash update :

I have my coordinates indexed. See the extract from the Solr admin below:

<arr name="subattr_location-latitude_f">
  <float>51.49998</float>
</arr>
<arr name="subattr_location-longitude_f">
  <float>-0.126944</float>
</arr>

But I get no results if I add the following filter to the eZ Find search:

'stockist/location/latitude:51.49998'

If I look at the debug output, I see that my search parameters are correct:

["Filter"]=>
  array(2) {
    [0]=>
    string(9) "path:1267"
    [1]=>
    string(35) "stockist/location/latitude:51.49998"
  }

But the query sent to Solr is wrong:

meta_path_si:1267 AND ( meta_contentclass_id_si:49 AND attr_location_t:51.49998 )

We see that the filter is on attr_location and not on subattr_location-latitude!

What am I missing?
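For reference, the field names quoted in this thread follow a visible pattern, which can be sketched as below. This is only an illustration of the naming convention as it appears in the posts (attr_location_t, subattr_location-latitude_f, and a _gpt suffix mentioned later for geopoint fields); the function names and the suffix table are hypothetical and are not eZ Find's own code.

```php
<?php
// Illustrative sketch only: reproduces the field-name pattern seen in this
// thread. The function names and the suffix table are assumptions, not
// part of the eZ Find API.

function sketchTypeSuffix( $type )
{
    // Suffixes observed in the thread: _t (text), _f (float), _gpt (geopoint)
    $suffixes = array( 'text' => 't', 'float' => 'f', 'geopoint' => 'gpt' );
    if ( !isset( $suffixes[$type] ) )
    {
        throw new InvalidArgumentException( "Unknown type: $type" );
    }
    return $suffixes[$type];
}

// Plain attribute field: attr_<identifier>_<suffix>
function sketchAttributeFieldName( $identifier, $type )
{
    return 'attr_' . $identifier . '_' . sketchTypeSuffix( $type );
}

// Subattribute field: subattr_<identifier>-<subattribute>_<suffix>
function sketchSubattributeFieldName( $identifier, $subattribute, $type )
{
    return 'subattr_' . $identifier . '-' . $subattribute . '_' . sketchTypeSuffix( $type );
}

echo sketchAttributeFieldName( 'location', 'text' ), "\n";
echo sketchSubattributeFieldName( 'location', 'latitude', 'float' ), "\n";
```

The query shown above ends up on the first kind of name, while the indexed coordinates live under the second kind, which is why the filter matches nothing.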

--
eZ certified developer: http://ez.no/certification/verify/346216

Matthieu Sévère

Wednesday 24 March 2010 1:50:21 pm

Woohoo, I succeeded!

OK, to sum up: to add subattribute filtering on a custom datatype, you need to:

  • Declare a new class for the custom mapping in ezfind.ini
  • Create the new class: ezfSolrDocumentFieldGmapLocation
  • Implement the getData() method: it should return your attribute and subattributes, using this syntax for the subattribute field names:
return array( 'attr_location_t' => $location->attribute('address'),
               'subattr_location-latitude_f' => $location->attribute('latitude'),
               'subattr_location-longitude_f' => $location->attribute('longitude'));

Finally, add the definition of the new subattributes to $subattributesDefinition
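
For the first step, a sketch of what the override could look like, assuming the mapping is declared through the CustomMap setting of the [SolrFieldMapSettings] block, as the handlers bundled with eZ Find are (check the ezfind.ini shipped with your version before relying on this):

```ini
[SolrFieldMapSettings]
CustomMap[ezgmaplocation]=ezfSolrDocumentFieldGmapLocation
```

The key is the datatype string and the value is the handler class name, so the class also has to be known to the autoload system.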

I'm not sure I've done it the right way but it works this way ;)

(If any eZ Find master passes by, feedback is more than welcome so I can see what I missed, thanks!)

I'll try to publish this on projects.ez.no

Cheers !

--
eZ certified developer: http://ez.no/certification/verify/346216

Paul Borgermans

Wednesday 24 March 2010 2:50:54 pm

Hi Matthieu

I am working on a custom handler as well for the ezgmaplocation datatype

However, I'm heading towards the dedicated Solr geospatial fields, as the dedicated boost/filter/... functions will land there soon.

Here is the code so far. It relies on schema definitions added in eZ Find 2.2 and remains untested, but it may shed some light:

<?php

class ezfSolrDocumentFieldGmapLocation extends ezfSolrDocumentFieldBase
{
    public static $subattributesDefinition = array( self::DEFAULT_SUBATTRIBUTE => 'text',
                                                    'coordinates' => 'geopoint' );

    
    const DEFAULT_SUBATTRIBUTE = 'address';

    function __construct( eZContentObjectAttribute $attribute )
    {
        parent::__construct( $attribute );
    }

 
    public function getData()
    {
        $data = array();
        $contentClassAttribute = $this->ContentObjectAttribute->attribute( 'contentclass_attribute' );
        $gmapObject = $this->ContentObjectAttribute->attribute( 'content' );

        foreach ( self::$subattributesDefinition as $name => $type )
        {
            switch ( $name )
            {
                case 'address':
                    $fieldValue = $gmapObject->attribute( 'address' );
                    break;
                case 'coordinates':
                    // "latitude,longitude" string for the geopoint field type
                    $fieldValue = $gmapObject->attribute( 'latitude' ) . ',' . $gmapObject->attribute( 'longitude' );
                    break;
                default:
                    // No value known for this subattribute: skip it
                    continue 2;
            }
            // getFieldName() returns the plain attribute field name for the default
            // subattribute, so index-time and query-time field names match
            $fieldName = self::getFieldName( $contentClassAttribute, $name );
            $data[$fieldName] = $fieldValue;
        }
        return $data;
    }

    public static function getFieldName( eZContentClassAttribute $classAttribute, $subAttribute = null, $context = null )
    {
        if ( $subAttribute and
             $subAttribute !== '' and
             array_key_exists( $subAttribute, self::$subattributesDefinition ) and
             $subAttribute != self::DEFAULT_SUBATTRIBUTE )
        {
            return parent::generateSubattributeFieldName( $classAttribute,
                                                          $subAttribute,
                                                          self::$subattributesDefinition[$subAttribute] );
        }
        else
        {
            return parent::generateAttributeFieldName( $classAttribute,
                                                       self::$subattributesDefinition[self::DEFAULT_SUBATTRIBUTE] );
        }
    }

    public static function getFieldNameList( eZContentClassAttribute $classAttribute, $exclusiveTypeFilter = array() )
    {

        $subfields = array();

        //   Handle first the default subattribute
        $subattributesDefinition = self::$subattributesDefinition;
        if ( !in_array( $subattributesDefinition[self::DEFAULT_SUBATTRIBUTE], $exclusiveTypeFilter ) )
        {
            $subfields[] = parent::generateAttributeFieldName( $classAttribute, $subattributesDefinition[self::DEFAULT_SUBATTRIBUTE] );
        }
        unset( $subattributesDefinition[self::DEFAULT_SUBATTRIBUTE] );

        //   Then handle all other subattributes
        foreach ( $subattributesDefinition as $name => $type )
        {
            if ( empty( $exclusiveTypeFilter ) or !in_array( $type, $exclusiveTypeFilter ) )
            {
                $subfields[] = parent::generateSubattributeFieldName( $classAttribute, $name, $type );
            }
        }
        return $subfields;
    }
    static function getClassAttributeType( eZContentClassAttribute $classAttribute, $subAttribute = null )
    {
        if ( $subAttribute and
             $subAttribute !== '' and
             array_key_exists( $subAttribute, self::$subattributesDefinition ) )
        {
            return self::$subattributesDefinition[$subAttribute];
        }
        else
        {
            return self::$subattributesDefinition[self::DEFAULT_SUBATTRIBUTE];
        }
    }
}
?>

The geopoint field type is declared as follows:

 <fieldType name="geopoint" class="solr.PointType" dimension="2" subFieldTypes="double"/>

I'll add this when finished/cleaned/tested to a project ezfind-utils which will contain more that did not make it into eZ Find 2.2.0

hth

Paul

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

André R.

Thursday 25 March 2010 1:07:03 am

"I'll add this when finished/cleaned/tested to a project ezfind-utils which will contain more that did not make it into eZ Find 2.2.0"

Why not make it part of ezgmaplocation?

eZ Online Editor 5: http://projects.ez.no/ezoe || eZJSCore (Ajax): http://projects.ez.no/ezjscore || eZ Publish EE http://ez.no/eZPublish/eZ-Publish-Enterprise-Subscription
@: http://twitter.com/andrerom

Nicolas Pastorino

Thursday 25 March 2010 1:10:48 am

"I'll add this when finished/cleaned/tested to a project ezfind-utils which will contain more that did not make it into eZ Find 2.2.0"

Why not make it part of ezgmaplocation?

+1

--
Nicolas Pastorino
Director Community - eZ
Member of the Community Project Board

eZ Publish Community on twitter: http://twitter.com/ezcommunity

t : http://twitter.com/jeanvoye
G+ : http://plus.tl/jeanvoye

Matthieu Sévère

Thursday 25 March 2010 1:59:40 am

Thank you Paul, this looks very helpful !!

But does this mean you use Solr 1.5 to take advantage of the Solr geospatial fields? (I think eZ Find 2.2 uses Solr 1.4.)

--
eZ certified developer: http://ez.no/certification/verify/346216

Nicolas Pastorino

Thursday 25 March 2010 2:21:48 am

Hi Matthieu,

Great to see you found your way.

This topic nicely illustrates the 'subattribute' feature introduced in eZ Find 2.1. As a rule of thumb, and to speed up any subattribute-handler development, good advice is to copy classes/ezfsolrdocumentfielddummyexample.php and start from there. It contains the standard skeleton both for overloading the index-time methods (getData() overloading the metaData() in datatypes) and for handling the subattributes of a datatype, as in your case here. This is pretty much what Paul did above.

Sorry for being late on this, but here is a version of the ezgmaplocation subattribute handler which relies on eZ Find 2.1 (meaning the coordinates are simple 'float' types, and no change is required in schema.xml). It is also a bit more compact than the other example above.

<?php

class ezfSolrDocumentFieldGmapLocation extends ezfSolrDocumentFieldBase
{
    /**
     * Contains the definition of subattributes for this given datatype.
     * This associative array takes as key the name of the field, and as value
     * the type. The type must be picked amongst the value present as keys in the
     * following array :
     * ezfSolrDocumentFieldName::$FieldTypeMap
     *
     * WARNING : this definition *must* contain the default attribute's one as well.
     *
     * @see ezfSolrDocumentFieldName::$FieldTypeMap
     * @var array
     */
    public static $subattributesDefinition = array( 'longitude'                => 'float',
                                                    self::DEFAULT_SUBATTRIBUTE => 'float' );

    /**
     * The name of the default subattribute. It will be used when
     * this field is requested with no subfield refinement.
     *
     * @see ezfSolrDocumentFieldGmapLocation::$subattributesDefinition
     * @var string
     */
    const DEFAULT_SUBATTRIBUTE = 'latitude';

    /**
     * @see ezfSolrDocumentFieldBase::__construct()
     */
    function __construct( eZContentObjectAttribute $attribute )
    {
        parent::__construct( $attribute );
    }

    /**
     * @see ezfSolrDocumentFieldBase::getData()
     */
    public function getData()
    {
        // Index latitude under the default attribute field and longitude
        // under its own subattribute field.
        $data = array();
        $contentClassAttribute = $this->ContentObjectAttribute->attribute( 'contentclass_attribute' );
        $gmapLocation = $this->ContentObjectAttribute->attribute( 'content' );
        $data[self::getFieldName( $contentClassAttribute, self::DEFAULT_SUBATTRIBUTE )] = $gmapLocation->attribute( 'latitude' );
        $data[self::getFieldName( $contentClassAttribute, 'longitude' )] = $gmapLocation->attribute( 'longitude' );
        return $data;
    }

    /**
     * @see ezfSolrDocumentFieldBase::getFieldName()
     */
    public static function getFieldName( eZContentClassAttribute $classAttribute, $subAttribute = null )
    {
        // Example of a filter using a subattribute: article/location/longitude
        if ( $subAttribute and
             $subAttribute !== '' and
             array_key_exists( $subAttribute, self::$subattributesDefinition ) and
             $subAttribute != self::DEFAULT_SUBATTRIBUTE )
        {
            // A subattribute was passed
            return parent::generateSubattributeFieldName( $classAttribute,
                                                          $subAttribute,
                                                          self::$subattributesDefinition[$subAttribute] );
        }
        else
        {
            // return the default field name here.
            return parent::generateAttributeFieldName( $classAttribute,
                                                       self::$subattributesDefinition[self::DEFAULT_SUBATTRIBUTE] );
        }
    }

    /**
     * @see ezfSolrDocumentFieldBase::getFieldNameList()
     */
    public static function getFieldNameList( eZContentClassAttribute $classAttribute, $exclusiveTypeFilter = array() )
    {
        // Generate the list of subfield names.
        $subfields = array();

        //   Handle first the default subattribute
        $subattributesDefinition = self::$subattributesDefinition;
        if ( !in_array( $subattributesDefinition[self::DEFAULT_SUBATTRIBUTE], $exclusiveTypeFilter ) )
        {
            $subfields[] = parent::generateAttributeFieldName( $classAttribute, $subattributesDefinition[self::DEFAULT_SUBATTRIBUTE] );
        }
        unset( $subattributesDefinition[self::DEFAULT_SUBATTRIBUTE] );

        //   Then handle all other subattributes
        foreach ( $subattributesDefinition as $name => $type )
        {
            if ( empty( $exclusiveTypeFilter ) or !in_array( $type, $exclusiveTypeFilter ) )
            {
                $subfields[] = parent::generateSubattributeFieldName( $classAttribute, $name, $type );
            }
        }
        return $subfields;
    }

    /**
     * @see ezfSolrDocumentFieldBase::getClassAttributeType()
     */
    public static function getClassAttributeType( eZContentClassAttribute $classAttribute, $subAttribute = null )
    {
        if ( $subAttribute and
             $subAttribute !== '' and
             array_key_exists( $subAttribute, self::$subattributesDefinition ) )
        {
            // If a subattribute's type is being explicitly requested :
            return self::$subattributesDefinition[$subAttribute];
        }
        else
        {
            // If no subattribute is passed, return the default subattribute's type :
            return self::$subattributesDefinition[self::DEFAULT_SUBATTRIBUTE];
        }
    }
}
?>
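
With a handler like this in place, the filter strings follow the class/attribute/subattribute:value form Matthieu used earlier. A minimal sketch of building them is shown below; buildSubattributeFilter() is a hypothetical helper written for this post, not an eZ Find function, and the values are passed through as-is without any escaping.

```php
<?php
// Hypothetical helper (not part of eZ Find): builds a filter string in the
// "class/attribute/subattribute:value" form used earlier in this thread.
function buildSubattributeFilter( $classIdentifier, $attributeIdentifier, $subattribute, $value )
{
    return $classIdentifier . '/' . $attributeIdentifier . '/' . $subattribute . ':' . $value;
}

// Filters as they would be handed to the search call's filter parameter
$filters = array(
    'path:1267',
    buildSubattributeFilter( 'stockist', 'location', 'latitude', '51.49998' ),
    buildSubattributeFilter( 'stockist', 'location', 'longitude', '-0.126944' ),
);

print_r( $filters );
```

As the class above reads, latitude is the default subattribute, so getFieldName() sends a latitude filter to the plain attribute field, while longitude goes to its own subattribute field.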

Thanks for sharing this, Matthieu.
I am so excited about the new features coming in eZ Find 2.2, which I hope will not be fragmented, as mentioned by Paul above.

Cheers!

--
Nicolas Pastorino
Director Community - eZ
Member of the Community Project Board

eZ Publish Community on twitter: http://twitter.com/ezcommunity

t : http://twitter.com/jeanvoye
G+ : http://plus.tl/jeanvoye

Matthieu Sévère

Thursday 25 March 2010 3:09:08 am

Thanks Nicolas !

I'm also excited about new features in 2.2 and I hope there will be a bit of doc or example ( ezfsolrdocumentfielddummyexample is very useful for developing new handlers!)

--
eZ certified developer: http://ez.no/certification/verify/346216

Paul Borgermans

Thursday 25 March 2010 3:46:25 am

"I'll add this when finished/cleaned/tested to a project ezfind-utils which will contain more that did not make it into eZ Find 2.2.0"

Why not make it part of ezgmaplocation?

Because we are in feature freeze now.

Having a separate extension to install alongside the official ones is a cleaner approach IMO. And it's also meant as a sandbox, with the idea of merging most if not all of it into the next major release of eZ Find. Also, the handler for ezgmaplocation should go with ezfind itself, as there is a stronger dependency on the eZ Find API than on the datatype here.

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

Paul Borgermans

Thursday 25 March 2010 4:00:04 am

<snip>

I am so excited about the new features coming in eZ Find 2.2, which I hope will not be fragmented, as mentioned by Paul above.

As explained above, we are in feature freeze and what will go in there will ultimately be merged with newer releases of the extensions.

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

Bertrand Dunogier

Thursday 25 March 2010 4:06:01 am

Having a separate extension to install alongside the official ones is a cleaner approach IMO. And it's also meant as a sandbox, with the idea of merging most if not all of it into the next major release of eZ Find. Also, the handler for ezgmaplocation should go with ezfind itself, as there is a stronger dependency on the eZ Find API than on the datatype here.

Hmmm... I have to disagree here. The ezgmaplocation indexing handler really depends on ezgmaplocation, not ezfind: if you use ezgmaplocation, content has to be indexed, whether you're using ezfind or not. If you enable ezfind, content will be indexed using ezfind; if you don't, well, it won't. Adding the handler to ezfind creates a strong coupling that has no reason to exist.

Bertrand Dunogier
eZ Systems Engineering, Lyon
http://twitter.com/bdunogier
http://gplus.to/BertrandDunogier

Paul Borgermans

Thursday 25 March 2010 4:16:14 am

Thank you Paul, this looks very helpful !!

But does this mean you use Solr 1.5 to take advantage of the Solr geospatial fields? (I think eZ Find 2.2 uses Solr 1.4.)

eZ Find 2.2.0 ships a carefully chosen Solr trunk which carries quite a few bug fixes and new filter functions (including ones for spatial search!). I had hoped a new Solr release would already be out, mainly for "near realtime search" (updating is still quite "expensive" in terms of CPU), "field collapsing" and the full set of geospatial features. But that will be for the next release of eZ Find.

Cheers

Paul

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

Nicolas Pastorino

Thursday 25 March 2010 4:21:21 am

"I'll add this when finished/cleaned/tested to a project ezfind-utils which will contain more that did not make it into eZ Find 2.2.0"

Why not make it part of ezgmaplocation?

Because we are in feature freeze now.

Having a separate extension to install alongside the official ones is a cleaner approach IMO. And it's also meant as a sandbox, with the idea of merging most if not all of it into the next major release of eZ Find. Also, the handler for ezgmaplocation should go with ezfind itself, as there is a stronger dependency on the eZ Find API than on the datatype here.

Reacting to the second point here:
when developing this feature, I made sure basic OOP/software design concepts were applied as much as possible. One of them is "Low coupling, High cohesion", which was deliberately implemented within the subattribute feature.

Concretely, this means that:

  • the subattribute handlers are used by eZ Publish only when eZ Find is activated
  • anyone developing their own extension can ship a set of subattribute handlers with it, if they want to functionally enhance the search-related behavior of their datatype when, and only when, eZ Find is enabled. Functionally speaking, these handlers belong to the specific extension rather than to the ezfind extension. They do not interfere in any way when eZ Find is not enabled.

This "Low coupling, High cohesion" thing, driven by the use case of "Enhancing the search experience with eZ Find, by refining a datatype's behavior", is what makes me vote for keeping extension-specific subattribute handlers outside the eZ Find extension.

My 2 cents :)
Cheers !

--
Nicolas Pastorino
Director Community - eZ
Member of the Community Project Board

eZ Publish Community on twitter: http://twitter.com/ezcommunity

t : http://twitter.com/jeanvoye
G+ : http://plus.tl/jeanvoye

Paul Borgermans

Thursday 25 March 2010 4:27:33 am

Having a separate extension to install alongside the official ones is a cleaner approach IMO. And it's also meant as a sandbox, with the idea of merging most if not all of it into the next major release of eZ Find. Also, the handler for ezgmaplocation should go with ezfind itself, as there is a stronger dependency on the eZ Find API than on the datatype here.

Hmmm... I have to disagree here. The ezgmaplocation indexing handler really depends on ezgmaplocation, not ezfind: if you use ezgmaplocation, content has to be indexed, whether you're using ezfind or not. If you enable ezfind, content will be indexed using ezfind; if you don't, well, it won't. Adding the handler to ezfind creates a strong coupling that has no reason to exist.

Well, the handler is dedicated to eZ Find (and its API which may change with releases), just as the ones for ezxml, ezmatrix, ezobjectrelation which are bundled today, so where should those go? In the kernel? The handlers with eZ Find replace the standard datatype calls for indexing ...

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

Bertrand Dunogier

Thursday 25 March 2010 4:37:37 am

Honestly, this doesn't prove much. Indeed, the handlers for complex, native eZ Publish datatypes are part of eZ Find. But there is a big difference between native & extension features. The question is interesting, though.

For kernel datatypes, we are indeed right to place the handlers in ezfind itself. Without ezfind, these make no sense in the kernel itself, and this is the choice that produces the weakest coupling, and that is good. But for extensions, I still have to disagree.

API changes are the wrong reason. We have managed to keep the eZ Publish "public" API backward compatible for years, and we have to do the same for eZ Find anyway, since third-party extensions might depend on it.

Bertrand Dunogier
eZ Systems Engineering, Lyon
http://twitter.com/bdunogier
http://gplus.to/BertrandDunogier

Paul Borgermans

Thursday 25 March 2010 4:41:09 am

<snip>

This "Low coupling, High cohesion" thing, driven by the use case of "Enhancing the search experience with eZ Find, by refining a datatype's behavior", is what makes me vote for keeping extension-specific subattribute handlers outside the eZ Find extension.

My 2 cents :)
Cheers !

Yes, for the contributed datatype extensions, these handlers belong within their extensions. ezgmaplocation and some others (enhanced selection, ...) may be merged with the kernel.

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans

Ivo Lukac

Thursday 10 June 2010 9:33:59 am

Just a note on this subject:

Currently there is an open bug in Solr: https://issues.apache.org/jira/browse/SOLR-1172

It is not possible to use geo point fields in Solr queries because of the problem with the dash ('-') in the generated field name, e.g. subattr_gmaps_location-coordinates_gpt

An ugly but quick workaround is to add an extra field and a copyField statement in schema.xml. The name of the extra field should not contain dashes.

http://www.linkedin.com/in/ivolukac
http://www.netgen.hr/eng/blog
http://twitter.com/ilukac

Paul Borgermans

Monday 14 June 2010 1:27:17 pm

Indeed, and geo searching in Solr has stalled a bit over the past weeks; we are still waiting for a bug-free LatLon field type, for example.

Cheers

Paul

eZ Publish, eZ Find, Solr expert consulting and training
http://twitter.com/paulborgermans
