Summary. Most of the examples I've seen in MDX connections are related to relatively small sets, for example, tens or hundreds of elements. But I am also in the fact that I want to try to join (in particular, "non-empty connection") sets, each of which contains thousands or tens of thousands of units, and so far it does not work. I am wondering if this can be done, or if I might have to use something other than Mondrian / OLAP.
To be specific, I have a cube that registers interactions between Firms (n = 7000) and Clients (n = 27000). At present, both the Company and the Client are completely flat hierarchies; there is a level of "All" and "individual", without other levels. There is a table of basic facts and separate size tables for companies and customers.
My users, at least, want to receive summary reports on these lines, combining all non-empty interactions between Firms and Clients:
select [Measures].[Amount] on columns, NonEmptyCrossJoin([Firm].Children, [Client].Children) on rows from MyCube
But this request and its variations do not work in my Mondrian test setup. Either I get an OutOfMemoryException (on a 2GB Java heap), or Java seems to spend an incredibly long time in the file mondrian.rolap.RolapResult $ AxisMember.mergeTuple (TupleCursor). (I could provide a more complete stack trace if that would help.) “Impossibly long” I mean that Java will be distracted by the request for hours and hours before I give up.
At first, I expected that the above query will be executed normally, because conceptually this can be done somewhat efficiently, simply by executing the SQL query in these lines:
select Firm, Client, Sum(Amount) as n from fact, firm, client where fact.firmid = firm.firmid and fact.clientid = client.clientid group by Firm, Client
(In fact, if I execute something similar directly in MySql, it does not take more than 15 seconds to execute).
But from the debug logs, Mondrian does not seem to be trying this optimization. Instead, it seems to make the connection inside, and in a way that is especially slow. I set mondrian.native.crossjoin.enable = true in my mondrian.properties files, but this is not like one of the types of connections that Mondrian can "make native". (If I turn on mondrian.native.unsupported.alert = ERROR, then I will get the corresponding exception.)
I have to wonder if I need to prevent users from trying to join such large sizes / sets, or maybe Mondrian is not the tool I'm looking for here. But maybe I'm just doing something wrong.