I am using Lucene in PHP (the Zend Framework implementation, Zend_Search_Lucene). My problem is that I cannot search a field whose value is a number.
Here is the data in the index:
ts         | contents
-----------+-----------------
1236917100 | dog cat gerbil
1236630752 | cow pig goat
1235680249 | lion tiger bear
nonnumeric | bass goby trout
My problem: a query for "ts:1236630752" does not return any hits. However, a query for "ts:nonnumeric" returns a hit.
I store "ts" as a keyword field, which according to the documentation is "not tokenized, but is indexed and stored. Useful for non-text fields, e.g. date or url." I also tried making it a text field, but the behavior is the same, except that the query "ts:*" returns nothing when ts is a text field.
I am using Zend Framework 1.7 (the latest release, downloaded 3 days ago) and PHP 5.2.9. Here is my code:
<?php
set_include_path(realpath('../library') . PATH_SEPARATOR . get_include_path());
require_once('Zend/Loader.php');
Zend_Loader::registerAutoload();

define('SEARCH_INDEX', 'test_search_index');

if(file_exists(SEARCH_INDEX))
    foreach(scandir(SEARCH_INDEX) as $file)
        if(!is_dir($file))
            unlink(SEARCH_INDEX . "/$file");

$index = Zend_Search_Lucene::create(SEARCH_INDEX);

function add_to_index($index, $ts, $contents) {
    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::Keyword('ts', $ts));
    $doc->addField(Zend_Search_Lucene_Field::Text('contents', $contents));
    $index->addDocument($doc);
}

add_to_index($index, '1236917100', 'dog cat gerbil');
add_to_index($index, '1236630752', 'cow pig goat');
add_to_index($index, '1235680249', 'lion tiger bear');
add_to_index($index, 'nonnumeric', 'bass goby trout');

echo '<html><body><pre>';

function run_query($index, $query) {
    echo "Running query: $query\n";
    $hits = $index->find($query);
    echo 'Got ' . count($hits) . " hits.\n";
    foreach($hits as $hit)
        echo " ts='$hit->ts', contents='$hit->contents'\n";
    echo "\n";
}

run_query($index, 'pig');
run_query($index, 'ts:1236630752');
run_query($index, '1236630752');
run_query($index, 'ts:pig');
run_query($index, 'contents:pig');
run_query($index, 'ts:[1236630700 TO 1236630800]');
run_query($index, 'ts:*');
run_query($index, 'nonnumeric');
run_query($index, 'ts:nonnumeric');
run_query($index, 'trout');
Output:
Running query: pig
Got 1 hits.
 ts='1236630752', contents='cow pig goat'

Running query: ts:1236630752
Got 0 hits.

Running query: 1236630752
Got 0 hits.

Running query: ts:pig
Got 0 hits.

Running query: contents:pig
Got 1 hits.
 ts='1236630752', contents='cow pig goat'

Running query: ts:[1236630700 TO 1236630800]
Got 0 hits.

Running query: ts:*
Got 4 hits.
 ts='1236917100', contents='dog cat gerbil'
 ts='1236630752', contents='cow pig goat'
 ts='1235680249', contents='lion tiger bear'
 ts='nonnumeric', contents='bass goby trout'

Running query: nonnumeric
Got 1 hits.
 ts='nonnumeric', contents='bass goby trout'

Running query: ts:nonnumeric
Got 1 hits.
 ts='nonnumeric', contents='bass goby trout'

Running query: trout
Got 1 hits.
 ts='nonnumeric', contents='bass goby trout'
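
One way to narrow this down is to check whether the numeric value is actually in the index as a single term, by building the term query through the API instead of passing a query string to find(). The sketch below is only a diagnostic under assumptions: it assumes Zend Framework 1.x is on the include path (same bootstrap as the script above), and it writes to a hypothetical scratch directory under the system temp dir rather than to test_search_index. If the hand-built term query returns the document while the string query "ts:1236630752" does not, the difference lies in how the query string is parsed and analyzed, not in how the keyword field is indexed.

```php
<?php
// Diagnostic sketch; assumes Zend Framework 1.x is on the include path.
require_once('Zend/Loader.php');
Zend_Loader::registerAutoload();

// Hypothetical scratch index, kept separate from the test index above.
$index = Zend_Search_Lucene::create(sys_get_temp_dir() . '/ts_term_query_check');

$doc = new Zend_Search_Lucene_Document();
$doc->addField(Zend_Search_Lucene_Field::Keyword('ts', '1236630752'));
$doc->addField(Zend_Search_Lucene_Field::Text('contents', 'cow pig goat'));
$index->addDocument($doc);
$index->commit();

// Build the term query by hand so the query string parser is not involved.
// Zend_Search_Lucene_Index_Term takes the term text first, then the field name.
$term  = new Zend_Search_Lucene_Index_Term('1236630752', 'ts');
$query = new Zend_Search_Lucene_Search_Query_Term($term);
$hits  = $index->find($query);

echo 'Term query hits: ' . count($hits) . "\n";
foreach ($hits as $hit) {
    echo " ts='$hit->ts', contents='$hit->contents'\n";
}
```

If that turns out to be the cause, a likely alternative to try is switching the default analyzer to one that keeps digits before indexing and searching, for example Zend_Search_Lucene_Analysis_Analyzer::setDefault(new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive()). I have not verified how that interacts with the range query above.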