First, a disclosure: I am the Product Manager for Lucidworks Fusion.
You seem to already know that Fusion works with Solr (one or more Solr clusters or instances), using Solr to store and query data. Fusion's goal is to make it easier to use Solr, integrate Solr, and build sophisticated solutions on top of Solr. Some of the things Fusion provides that many people find useful for this include:
- Connectors and a connector framework. Bare Solr gives you a good API and the ability to push certain types of files from the command line. Fusion comes with several built-in data source connectors that retrieve data from various types of systems, process it as needed (including parsing, transforming, and mapping fields), and send the results to Solr. These connectors cover shared document stores (cloud and on-premises), relational databases, NoSQL data stores, HDFS, enterprise applications, and a very powerful and highly customizable web crawler.
- Security integration. Solr does not have authentication or authorization (although as of version 5.2, released this week, it has a plug-in API and a basic Kerberos implementation for authentication). Fusion wraps the Solr APIs with a secured version. Fusion has clean integrations with LDAP, Active Directory, and Kerberos for authentication. It also has a fine-grained authorization model for managing and configuring Fusion and Solr. The authorization model can also automatically link LDAP/AD group memberships with the access control lists gathered by Fusion connector data sources, so that document-level access controls from the source systems are enforced when you run search queries.
- Pipeline processing model. Fusion provides a pipeline model with modular stages (available both through the API and in the GUI) to simplify defining and editing data and document transformations. This is similar to Unix shell pipelines. For example, at index time you can include stages for defining field mappings, computing new fields, aggregating documents, pulling in data from other sources, etc., before writing to Solr. At query time you can do the same, along with query transformation, triggering and returning the results of other analytics, and applying security filtering.
- Admin GUI. Fusion has a web interface for viewing and configuring all of the above (as well as basic Solr configuration). We think this is convenient for people who want to use Solr but don't use it regularly enough to remember how to use the APIs, configuration files, and command-line tools.
- Sophisticated search features. Using the pipeline model described above, Fusion includes (and simplifies the use of) some richer search-based components, including natural language processing and entity extraction, and real-time signals-driven relevance adjustment. We intend to provide more of these in the future.
- Analytics processing. Fusion includes and integrates Apache Spark for running deeper analytics against data stored in Solr (or on its way into Solr). Although Solr implicitly includes certain data analytics capabilities, that is not its primary purpose. We use Apache Spark to drive Fusion's signal extraction and relevance adjustment, and we expect to expose APIs so that users can easily run other processing there.
- Other: many useful features, such as a dashboarding UI; a basic search interface with manual relevance adjustment; simpler monitoring; job management and scheduling; real-time alerting with email integration; etc.
Many of the above, of course, can be built or written against Solr without Fusion, but we believe that providing this kind of enterprise integration will be valuable to many people.
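To make the pipeline idea concrete, here is a minimal sketch in plain Python of index-time stages chained like a shell pipe, plus a query-time security filter built from a user's group memberships. This is only an illustration of the concept, not Fusion's API or configuration format: the function names (`map_fields`, `compute_field`, `run_pipeline`, `security_filter`) and the field names (`title_t`, `body_t`, `body_length_i`, `acl_ss`) are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Doc = Dict[str, object]
Stage = Callable[[Doc], Doc]


def map_fields(mapping: Dict[str, str]) -> Stage:
    """Return a stage that renames fields according to `mapping`."""
    def stage(doc: Doc) -> Doc:
        return {mapping.get(name, name): value for name, value in doc.items()}
    return stage


def compute_field(name: str, fn: Callable[[Doc], object]) -> Stage:
    """Return a stage that adds a new field computed from the document."""
    def stage(doc: Doc) -> Doc:
        return {**doc, name: fn(doc)}
    return stage


def run_pipeline(stages: Iterable[Stage], doc: Doc) -> Doc:
    """Pass the document through each stage in order, like a shell pipe."""
    for stage in stages:
        doc = stage(doc)
    return doc


def security_filter(groups: List[str], acl_field: str = "acl_ss") -> str:
    """Build a Solr-style filter query limiting results to the given groups.

    `acl_field` is a hypothetical field holding each document's allowed groups.
    """
    if not groups:
        # Let the caller decide how to treat users with no group memberships.
        raise ValueError("at least one group is required")
    quoted = " OR ".join(
        '"{}"'.format(g.replace("\\", "\\\\").replace('"', '\\"')) for g in groups
    )
    return "{}:({})".format(acl_field, quoted)


# Hypothetical index pipeline: rename source fields, then add a computed field.
index_pipeline = [
    map_fields({"Title": "title_t", "Body": "body_t"}),
    compute_field("body_length_i", lambda d: len(str(d.get("body_t", "")))),
]

doc = run_pipeline(
    index_pipeline, {"id": "1", "Title": "Hello", "Body": "Solr and Fusion"}
)
fq = security_filter(["engineering", "staff"])
```

The resulting `doc` could then be sent to Solr as an update, and `fq` passed as a filter query alongside the user's search. Each stage is a plain function from document to document, so stages can be reordered or swapped without touching the others.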