Drupal Search Funkiness

I've been noticing that the search feature of this Drupal blog has been acting up for a while: searching for "drupal" turns up only 4 items, but I've written many, many posts mentioning Drupal. I didn't think it was a big deal, but I've actually been getting emails and IMs asking me wtf wrt searching.

So, I dug a bit deeper. Turns out, Drupal is refusing to index my content when cron.php is called. It's called every hour, but the /admin/settings/search status indicator is stuck at:

[Image: Drupal Search Index Not Updating. Taken on 2006/06/07, showing the search index not updating, even though cron.php is called every hour (and I even manually triggered it several times) and the number of items to process is turned down to 10.]

Some poking around on the Drupal site didn't turn up anything useful. I'll keep poking around to hopefully find out wtf is going on with searching. It's a puzzler…

Update: Temporarily fixed. Something's definitely borked. It's only updating the first batch of nodes, even if cron.php is called multiple times. The hack fix involves editing search.module to allow larger batches so all nodes make it into the first run. I added a "2000" item to the $items array on line 217, then cleared the old index by clicking the "Re-index site" button. Manually called cron.php and let it chew, and now all nodes are properly indexed. No idea if I'll have to keep re-indexing. That would be an ugly hack…

Update the Second: Looks like everything's updating ok now… I'll try dropping the batch size back down to a sane value to see if it still works (or if it really is just indexing the first batch of records only)

Update the Third: Yeah. All's well now. New content is being automatically indexed, and all old content is properly indexed. Wonder what happened…

[Image: Drupal Search Funkiness, resolved: it's now 100% indexed. No idea what was wrong before…]
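
I never found the root cause here, but one way a batch indexer can get stuck on "the first batch only" is sketched below. This is a toy Python model, not Drupal's search.module: the node layout, the `select_batch` and `run_cron` helpers, and the "pointer" type are all made up for illustration (the pointer idea is borrowed from one of the comments further down). The assumption is that every run picks the oldest not-yet-indexed nodes, and that some of those nodes can never be marked as indexed.

```python
def select_batch(nodes, indexed, limit):
    """Pick the oldest `limit` nodes that are not yet marked as indexed."""
    pending = [n for n in nodes if n["nid"] not in indexed]
    return pending[:limit]


def run_cron(nodes, indexed, limit, indexable_types):
    """Simulate one cron run; return how many nodes were newly indexed."""
    done = 0
    for node in select_batch(nodes, indexed, limit):
        if node["type"] in indexable_types:
            indexed.add(node["nid"])
            done += 1
        # Any other node is skipped and never marked as indexed,
        # so the next run selects it again.
    return done


def build_nodes():
    """Hypothetical site: 3 un-indexable 'pointer' nodes, then 5 articles."""
    nodes = [{"nid": i, "type": "pointer"} for i in range(1, 4)]
    nodes += [{"nid": i, "type": "article"} for i in range(4, 9)]
    return nodes


def simulate(limit, runs, prefilter=False):
    """Return the number of nodes indexed on each of `runs` cron runs."""
    types = {"article", "listing"}
    nodes = build_nodes()
    if prefilter:
        # Filter by type in the selection itself, before the limit applies.
        nodes = [n for n in nodes if n["type"] in types]
    indexed = set()
    return [run_cron(nodes, indexed, limit, types) for _ in range(runs)]


if __name__ == "__main__":
    print(simulate(limit=3, runs=3))                  # [0, 0, 0]
    print(simulate(limit=5, runs=3))                  # [2, 2, 1]
    print(simulate(limit=3, runs=3, prefilter=True))  # [3, 2, 0]
```

With a limit of 3, every run re-selects the same three stuck nodes and nothing moves. Raising the limit past the number of stuck nodes lets the rest trickle through, which is roughly what my "2000" hack did. Filtering the selection itself gets everything indexed without a big batch.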

9 thoughts on “Drupal Search Funkiness”

  1. Thanks so much for putting me on the right track for this. Your suggestion of reducing the index batch size worked, eventually; it seems there’s also a delay while the cron_busy variable is true. I finally managed to get mine to restart reindexing, and I’m now using cron to call the cron.php script every 2 minutes with a batch of 20! I can’t use a larger one, as the max script execution time is specifically configured low on our server.

    Just wanted to let you & your readers know of another resolution for related issues: if you don’t have comments enabled, or a post has no comments, a node won’t be indexed (even if the search status page shows 100%). This is due to a combination of a NULL from a JOIN and MySQL 5.0.13 changing how GREATEST works when one of the values is NULL. The fix is buried here: http://drupal.org/node/139537 and involves editing node.module to replace c.last_comment_timestamp with COALESCE(c.last_comment_timestamp,0) in 4-5 separate GREATEST(…) statements.

    Note that I’m on 4.7, but this was still an issue in 5 and was only patched or worked around relatively recently.

  2. I had to manually run cron.php up to 7 different times to get it to 100% index all items. After the second time I manually ran cron.php and checked the indexing status, I started to notice that the number of items remaining was always half of what it previously was: something like 34 items left, 17 items left, 8 items left, 4 items left, 2 items left, then 0 items left.

    I consider this to be very odd and unintended behavior.

  3. Good work-around…I boosted it up a bit as I have 150k nodes so we’ll see if we get a time out 🙂

    Thank you

  4. I had this problem as well. Increasing the limit also solved the problem (temporarily). I had to find the root of the problem since the site was for CNN. Keep in mind this site is a highly customized Drupal environment via /sites/, but your problems may be caused by the same thing.

    We are only indexing two types of nodes, ‘listings’ and ‘articles’. Every time you create a listing or article in our system, it creates multiple ‘pointer’ nodes which we are not indexing. It would index the first couple of listings and articles fine, but then stop and never move forward on each following cron run. The limit was set to 500. What was happening is that every time it found the ‘pointer’ nodes and tried to index them, they wouldn’t be indexed. Then it would run again and select the same pointers. Only by increasing the limit to a size greater than the ‘pointer’ node count could it move forward. The problem is that the ‘pointer’ node count will forever be increasing, so upping the limit is not a good fix.

    What we had to do: Move the /nodes/ module to /sites/ and add a type select (AND (type = 'article' OR type = 'listing')) to the query in the function node_update_index(). Not elegant at all, but a must for our time constraints.

    Good luck!

  5. Eric, if you can, I would be glad to see how your final select line ended up with the clause at the end. I am having trouble implementing your solution properly.
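
The GREATEST/NULL issue from the first comment is easier to see in a tiny model. This is a sketch in Python rather than the actual SQL in node.module. `greatest_mysql`, `coalesce` and `needs_indexing` are hypothetical helpers that only mimic the documented behaviour: since MySQL 5.0.13, GREATEST() returns NULL if any argument is NULL, and a comparison against NULL is never true. The timestamps are made-up numbers.

```python
def greatest_mysql(*values):
    """Model of MySQL GREATEST() since 5.0.13: NULL if any argument is NULL."""
    if any(v is None for v in values):
        return None
    return max(values)


def coalesce(*values):
    """Model of SQL COALESCE(): the first non-NULL argument, else NULL."""
    for v in values:
        if v is not None:
            return v
    return None


def needs_indexing(created, changed, last_comment, last_run, use_coalesce):
    """Would a node be selected for indexing after `last_run`?

    `last_comment` is None when the JOIN finds no comment data for the node.
    """
    if use_coalesce:
        last_comment = coalesce(last_comment, 0)
    newest = greatest_mysql(created, changed, last_comment)
    # In SQL, `NULL > last_run` is not true, so the row is filtered out.
    return newest is not None and newest > last_run


if __name__ == "__main__":
    # Node changed at t=200, last index run at t=150, no comment data.
    print(needs_indexing(100, 200, None, 150, use_coalesce=False))  # False
    print(needs_indexing(100, 200, None, 150, use_coalesce=True))   # True
    # A node that does have a comment timestamp is selected either way.
    print(needs_indexing(100, 200, 300, 150, use_coalesce=False))   # True
```

Wrapping the nullable column in COALESCE(…, 0) means a missing comment timestamp simply loses the comparison instead of wiping out the whole GREATEST() result.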
