Full-text search on MongoDB GridFS?

Say I want to store PDF or ePub files using MongoDB GridFS. Is it possible to perform full-text search on the contents of those files?

+6
2 answers

You cannot currently perform a full-text search in MongoDB: http://www.mongodb.org/display/DOCS/Full+Text+Search+in+Mongo

You can vote for this feature here: https://jira.mongodb.org/browse/SERVER-380

Mongo is more of a scalable general-purpose data store, and as yet it does not have any full-text search support. Depending on your use case, you could use standard b-tree indexes on an array of all the words in the text, but that will not give you stemming, fuzzy matches, etc.
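The word-array approach mentioned above could be sketched roughly as follows. The field names (`file_id`, `filename`, `keywords`) and the helper functions are hypothetical, and the plain text is assumed to have been extracted from the PDF or ePub already. The pymongo calls in the trailing comments are shown only for orientation and are not executed here.

```python
import re


def extract_keywords(text):
    """Lowercase the text and return its distinct words, sorted."""
    return sorted(set(re.findall(r"\w+", text.lower())))


def build_search_doc(file_id, filename, text):
    """Build a metadata document whose 'keywords' array can be indexed.

    The field names are illustrative, not a MongoDB convention.
    """
    return {
        "file_id": file_id,
        "filename": filename,
        "keywords": extract_keywords(text),
    }


def keyword_query(words):
    """Match documents whose 'keywords' array contains every given word."""
    return {"keywords": {"$all": [w.lower() for w in words]}}


# With pymongo and a running server (assumed, not run here):
#   coll.create_index("keywords")   # ordinary index over the array field
#   coll.insert_one(build_search_doc(file_id, "book.pdf", plain_text))
#   coll.find(keyword_query(["mongodb", "search"]))
```

This only supports exact word matches, which is the limitation noted above: "searching" will not match "search" unless you normalize the words yourself before storing them.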

However, I would recommend combining MongoDB with a Lucene-based application (Elasticsearch is a popular one). You can store all your data in MongoDB (binary data, metadata, etc.) and then index the plain text of your documents in Lucene. Or, if your use case is purely full-text search, you might consider using Elasticsearch instead of MongoDB.
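A minimal sketch of that split, under stated assumptions: `fs` is anything with a GridFS-style `put(data, filename=...)` method that returns a file id, and `InMemoryIndex` is a hypothetical stand-in for a real Lucene or Elasticsearch client, used here only so the example runs without a server. Extracting plain text from the PDF or ePub is left to the caller.

```python
class InMemoryIndex:
    """Hypothetical stand-in for an external search engine client."""

    def __init__(self):
        self.docs = {}

    def add(self, doc_id, filename, text):
        # Store the set of lowercase words for naive exact-word lookup.
        self.docs[doc_id] = {
            "filename": filename,
            "words": set(text.lower().split()),
        }

    def search(self, word):
        # Return the ids of all documents that contain the word.
        return sorted(
            doc_id
            for doc_id, doc in self.docs.items()
            if word.lower() in doc["words"]
        )


def store_and_index(fs, search_index, data, filename, plain_text):
    """Store the binary file in GridFS, then index its text externally.

    The GridFS file id is used as the search document id, so a search
    hit can be resolved back to the stored file.
    """
    file_id = fs.put(data, filename=filename)
    search_index.add(doc_id=str(file_id), filename=filename, text=plain_text)
    return file_id
```

Using the GridFS id as the key in the search index is one possible design choice: the search engine holds only text and a pointer, while MongoDB remains the system of record for the binary data.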

Update (April 2013): MongoDB 2.4 now supports a basic full-text index! Below are some useful resources.

http://docs.mongodb.org/manual/applications/text-search/

http://docs.mongodb.org/manual/reference/command/text/#dbcmd.text

http://blog.mongohq.com/blog/2013/01/22/first-week-with-mongodb-2-dot-4-development-release/
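As a rough sketch of what using such a text index might look like from Python: the helpers below build the index specification and the query filter, and `ensure_and_search` applies them to any object with pymongo-style `create_index` and `find` methods. Note the assumptions: the `$text` query operator shown here belongs to later MongoDB releases (2.4 itself exposed search through the `text` command linked above), and a text index covers string fields of ordinary documents, so the extracted text of a GridFS file would have to be stored in such a field (called `content` here, a hypothetical name).

```python
def text_index_spec(field):
    """Index specification for a text index on one string field."""
    # "text" is the index type string (the value of pymongo.TEXT).
    return [(field, "text")]


def text_query(terms):
    """Filter matching documents whose text index matches the terms."""
    return {"$text": {"$search": terms}}


def ensure_and_search(collection, field, terms):
    """Create the text index if needed, then run a text search.

    'collection' is assumed to behave like a pymongo Collection.
    """
    collection.create_index(text_index_spec(field))
    return list(collection.find(text_query(terms)))


# With pymongo and a running server (assumed, not run here):
#   hits = ensure_and_search(db.file_text, "content", "gridfs search")
```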

+3

Not with the MongoDB API, as far as I know. GridFS seems to be designed more like a simplified file system, with APIs that provide straightforward key-value semantics. On their project ideas page, they list two things that would help you if they existed in a finished state:

  • GridFS FUSE, which would allow you to mount GridFS as a local file system and then index it as you would index files on your disk
  • Real-time full-text search integration with tools like Lucene and Solr. There are several GitHub and Bitbucket projects you can check out.

Also check out ElasticSearch. I have seen some integration with Mongo, but I'm not sure how much has been done to make use of GridFS (there is a mention of support for attachments in GridFS, but I have not worked with it enough to know for sure). Maybe you will be the one to build it and then open-source it? It should be a fun adventure.

0

Source: https://habr.com/ru/post/915142/

