lowest-level data in the cube? I.E., before it is aggregated?
What I'm trying to do is find all of the rows where an amount is >
1000. However, when I create a calculated member or constrain in
any way, it compares to the aggregated amount. I have tried a few
ways, but they all seem to act only on the aggregated data.
Thoughts?
BTW, the SQL would be:
Select sum(measure), count(measure)
from fact_table
where measure > 1000
Thanks,
Doug
Just to understand your issue:
For example, you have sales at a daily level, some of which are <1000 some over 1000. When you roll these up to the week, month, etc , they aggregate as expected.For a given higher-level time period, do you want to sum the sales of all the days with sales<1000, and also count the number of these days? (eg in a given week where only two days are <1000 the sum might return 750 and the count 2)
It's not actually trivial:
The way I would do it would be to use DESCENDANTS to return a set of the lowest level members that are underneath your current member, then FILTER that set to pick out those where [measures].[sales] <1000 then finally SUM the values from that set.
This is a bit cumbersome to write, but will work. Check out BOL for the syntax, (or I'll post a code sample when I can check BOL to make sure I get the syntax right!)
Anyone knows a more efficient way of doing this I would appreciate it.
Richard R.
|||Thanks Richard. Your suggestion was definitely helpful, but
ultimately it ended up being way too slow because I need to span
hundreds of thousands of records, and to do that it essentially meant
creating the cartesian that described them! Anyway, what I've
done for now is to create another dimension that has ranges what the
sum of the measures are. I.E. sales range (0-49, 50-99, 100-499,
etc.). and amm using those to constrain. I guess this makes
sense, since it is a constraint, not a measure, but it is definitely
frustrating to have to do it this way. The major drawback is that
I have to now create a dimension attribute for every type of sale, and
combination of sales types, to be able to properly analyze the
data.
If anyone finds an easier way to do this, please let me know!
Thanks,
Doug|||
Hi again Doug,
I don't know if this is will be useful or not in real life, but I've found a feature in AS2005 which automatically clusters continuous variables into groups for use as dimensions - which is pretty much exactly what you want.
Create a new dimension using your fact table as the source, and your sales value as the dimension attribute. On the properties box of the attribute is a wonderfully-named property "DiscretizationBucketCount", and underneath is is "DiscretizationMethod".
The first determines the number of buckets the values will be separated into (eg 7 groups), the second the method (Automatic, Even-spaced, Clustered)
I tried it on a 14 million fact table row, asking it to cluster the value spend attribute into 7 buckets automatically, and the dimension processed in a few minutes (Fast P4 laptop 1GB RAM). The resulting clusters were not that elegant, as they had the decimal points on the range names (eg 0-15.79, 15.89 - 99.34), but they are good enough for the evaluation cubes I'm delivering to the customer tomorrow :-D
I'll dig a bit more and see if the names or clusters can be frigged manually. It's all XML behind the scenes, so I'm sure it can.
Anyone out there want to REALLY show off their expertise?
Regards,
Richard
No comments:
Post a Comment