RavenDB: expected performance when querying millions of documents

I was able to load several million documents with the embedded version of RavenDB, pretty smooth!

Now I'm trying to query these items, and the performance is not what I expected. I was hoping for near-instant results, but the query takes more than 18 seconds on a fairly powerful machine.

Below you will find my naive code.

Note: I have resolved this, and the final code is at the bottom of the post. The takeaway is that you need indexes, they must be of the right type, and RavenDB must be made aware of them. I am VERY satisfied with the performance and quality of the records returned through the query mechanism.

Thanks Steven

    using (var store = new EmbeddableDocumentStore { DataDirectory = @"C:\temp\ravendata" }.Initialize())
    {
        using (IDocumentSession session = store.OpenSession())
        {
            var q = session.Query<Product>().Where(x => x.INFO2.StartsWith("SYS")).ToList();
        }
    }

    [Serializable]
    public class Product
    {
        public decimal ProductId { get; set; }
        ....
        public string INFO2 { get; set; }
    }

EDIT

I added this class

    public class InfoIndex_Search : AbstractIndexCreationTask<Product>
    {
        public InfoIndex_Search()
        {
            Map = products => from p in products
                              select new { Info2Index = p.INFO2 };

            Index(x => x.INFO2, FieldIndexing.Analyzed);
        }
    }

and changed the calling method this way.

    using (var store = new EmbeddableDocumentStore { DataDirectory = @"C:\temp\ravendata" }.Initialize())
    {
        // Tell Raven to create our indexes.
        IndexCreation.CreateIndexes(Assembly.GetExecutingAssembly(), store);

        List<Product> q = null;
        using (IDocumentSession session = store.OpenSession())
        {
            q = session.Query<Product>().Where(x => x.INFO2.StartsWith("SYS")).ToList();
            watch.Stop();
        }
    }

But the search still takes 18 seconds. What am I missing? On a side note, there are quite a few new files in the C:\temp\ravendata\Indexes\InfoIndex%2fSearch folder, although not as many as when I inserted the data; they seem to have disappeared after I ran this code several times while trying to query. Should IndexCreation.CreateIndexes(Assembly.GetExecutingAssembly(), store); be called before the insert, and only then?

EDIT1

Using this code, I was able to get the query to return almost instantly, but it seems you can only run it once, which raises the question: where should it be run, and what are the proper initialization procedures?

    store.DatabaseCommands.PutIndex("ProdcustByInfo2", new IndexDefinitionBuilder<Product>
    {
        Map = products => from product in products
                          select new { product.INFO2 },
        Indexes = { { x => x.INFO2, FieldIndexing.Analyzed } }
    });

EDIT2: working example

    static void Main()
    {
        Stopwatch watch = Stopwatch.StartNew();
        int q = 0;

        using (var store = new EmbeddableDocumentStore { DataDirectory = @"C:\temp\ravendata" }.Initialize())
        {
            // Create the index only if it does not exist yet.
            if (store.DatabaseCommands.GetIndex("ProdcustByInfo2") == null)
            {
                store.DatabaseCommands.PutIndex("ProdcustByInfo2", new IndexDefinitionBuilder<Product>
                {
                    Map = products => from product in products
                                      select new { product.INFO2 },
                    Indexes = { { x => x.INFO2, FieldIndexing.Analyzed } }
                });
            }
            watch.Stop();
            Console.WriteLine("Time elapsed to create index {0}{1}", watch.ElapsedMilliseconds, System.Environment.NewLine);

            watch = Stopwatch.StartNew();
            using (IDocumentSession session = store.OpenSession())
            {
                q = session.Query<Product>().Count();
            }
            watch.Stop();
            Console.WriteLine("Time elapsed to query for products values {0}{1}", watch.ElapsedMilliseconds, System.Environment.NewLine);
            Console.WriteLine("Total number of products loaded: {0}{1}", q, System.Environment.NewLine);

            // Populate the database only when it is empty.
            if (q == 0)
            {
                watch = Stopwatch.StartNew();
                var productsList = Parsers.GetProducts().ToList();
                watch.Stop();
                Console.WriteLine("Time elapsed to parse: {0}{1}", watch.ElapsedMilliseconds, System.Environment.NewLine);
                Console.WriteLine("Total number of items parsed: {0}{1}", productsList.Count, System.Environment.NewLine);

                watch = Stopwatch.StartNew();
                productsList.RemoveAll(_ => _ == null);
                watch.Stop();
                Console.WriteLine("Time elapsed to remove null values {0}{1}", watch.ElapsedMilliseconds, System.Environment.NewLine);
                Console.WriteLine("Total number of items loaded: {0}{1}", productsList.Count, System.Environment.NewLine);

                watch = Stopwatch.StartNew();
                int batch = 0;
                var session = store.OpenSession();
                foreach (var product in productsList)
                {
                    batch++;
                    session.Store(product);

                    // Save in batches of 128 and start a fresh session each time.
                    if (batch % 128 == 0)
                    {
                        session.SaveChanges();
                        session.Dispose();
                        session = store.OpenSession();
                    }
                }
                session.SaveChanges();
                session.Dispose();
                watch.Stop();
                Console.WriteLine("Time elapsed to populate db from collection {0}{1}", watch.ElapsedMilliseconds, System.Environment.NewLine);
            }

            watch = Stopwatch.StartNew();
            using (IDocumentSession session = store.OpenSession())
            {
                q = session.Query<Product>().Where(x => x.INFO2.StartsWith("SYS")).Count();
            }
            watch.Stop();
            Console.WriteLine("Time elapsed to query for term {0}{1}", watch.ElapsedMilliseconds, System.Environment.NewLine);
            Console.WriteLine("Total number of items found: {0}{1}", q, System.Environment.NewLine);
        }

        Console.ReadLine();
    }
2 answers

First, do you have an index covering INFO2?

Secondly, see Daniel Lang's post "Searching on string properties in RavenDB" here:

http://daniellang.net/searching-on-string-properties-in-ravendb/

If that helps, here is how I created the index:

    public class LogMessageCreatedTime : AbstractIndexCreationTask<LogMessage>
    {
        public LogMessageCreatedTime()
        {
            Map = messages => from message in messages
                              select new { MessageCreatedTime = message.MessageCreatedTime };
        }
    }

And this is how I added it at runtime:

    private static DocumentStore GetDatabase()
    {
        DocumentStore documentStore = new DocumentStore();
        try
        {
            documentStore.ConnectionStringName = "RavenDb";
            documentStore.Initialize();

            // Tell Raven to create our indexes.
            IndexCreation.CreateIndexes(typeof(DataAccessFactory).Assembly, documentStore);
        }
        catch
        {
            documentStore.Dispose();
            throw;
        }
        return documentStore;
    }

In my case, I did not have to query the index explicitly; it was simply used when I queried as usual.


As Bob suggests, you must make sure you create indexes in Raven that cover the fields you want to query.

Raven is pretty fast and will take you quite far without much setup. However, as soon as you start dealing with large numbers of documents or want something other than the default behavior, you will find that you need static indexes.

There are many examples of setting up and using indexes in Raven.
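
One such setup could look roughly like the sketch below. It is not taken from the question. The index class name Products_ByInfo2 is hypothetical, the Product class is trimmed to the two properties shown in the question, and the namespaces assume an older RavenDB client of the same era as EmbeddableDocumentStore (they may differ in other versions). The point it illustrates is that the field emitted by Map carries the same name (INFO2) as the property used in the query, and that the query names the index explicitly.

```csharp
using System;
using System.Linq;
using Raven.Abstractions.Indexing;   // FieldIndexing
using Raven.Client;                  // IDocumentSession
using Raven.Client.Embedded;         // EmbeddableDocumentStore
using Raven.Client.Indexes;          // AbstractIndexCreationTask

// Trimmed version of the Product class from the question.
public class Product
{
    public decimal ProductId { get; set; }
    public string INFO2 { get; set; }
}

// Hypothetical static index. The anonymous type exposes a field named INFO2,
// matching the property the query filters on.
public class Products_ByInfo2 : AbstractIndexCreationTask<Product>
{
    public Products_ByInfo2()
    {
        Map = products => from p in products
                          select new { p.INFO2 };

        Index(x => x.INFO2, FieldIndexing.Analyzed);
    }
}

public static class Program
{
    public static void Main()
    {
        using (var store = new EmbeddableDocumentStore { DataDirectory = @"C:\temp\ravendata" }.Initialize())
        {
            // Register the index with the store.
            new Products_ByInfo2().Execute(store);

            using (IDocumentSession session = store.OpenSession())
            {
                // Name the index explicitly instead of relying on a dynamic one.
                int count = session.Query<Product, Products_ByInfo2>()
                                   .Where(p => p.INFO2.StartsWith("SYS"))
                                   .Count();

                Console.WriteLine("Total number of items found: {0}", count);
            }
        }
    }
}
```

Keep in mind that Raven builds indexes in the background, so a query issued right after a new index is created over millions of documents may return stale (incomplete) results until indexing catches up.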


Source: https://habr.com/ru/post/1403106/

