Wednesday, March 21, 2012

How does clustering work?

Hi,

I have data like this:

Studid Date Perf

001 01/01/2008 90

001 02/01/2008 89

001 03/02/2008 91

002 01/01/2008 75

002 02/01/2008 79

002 03/02/2008 69

I set Perf as PREDICT. When I run the query

"SELECT * FROM [Cluster_Model]"

I get

Perf

82.

Can anyone explain how clustering works, and how to write a query that groups the values by Studid?

Based on the query result, it seems that your model uses Perf as a Predictable column.

First, how to solve the clustering problem:

Make sure the Studid column is used as an Input attribute (and not a key). If you only want to use Studid to build clusters, then ignore all other columns.

Once the model is trained, the cluster of a new data point can be determined with a query like:

SELECT Cluster() FROM [Cluster_Model] PREDICTION JOIN <new data>

Note that Cluster() is a function computed by the model on top of new data, and not a predictable variable.

If you need to see the distance from the new data point to all the clusters, then the query should look like:

SELECT PredictHistogram( Cluster() ) FROM [Cluster_Model] PREDICTION JOIN <new data>

Now, here is what your original query does, and why it returns that result:

The query executes a prediction, based on the model, against new data (this is general DMX syntax, not specific to clustering). The prediction is executed for all the predictable attributes of the mining model (*), i.e. Perf, and because the query has no PREDICTION JOIN, the new data is empty (all attributes are in the missing state).

Now, here is how Clustering prediction works: the algorithm computes the distances from the input data point to all the clusters, then predicts the target attribute by using a weighted average between the distributions of the target attribute across all the clusters.
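To make the weighted average concrete, here is a rough Python sketch using the six Perf values from the question. It is only an illustration, not the actual Microsoft Clustering implementation: it assumes (hypothetically) that the model ended up with one cluster per student, and that with empty input each cluster's weight is simply its share of the training cases.

```python
# Perf values from the question, grouped by Studid.
perf_by_student = {
    "001": [90, 89, 91],
    "002": [75, 79, 69],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Assumption: one cluster per student, so each cluster's mean Perf
# is that student's mean Perf.
cluster_means = {sid: mean(vals) for sid, vals in perf_by_student.items()}

# Assumption: with no input data, the membership weight of each cluster
# is its share of the training cases.
total_rows = sum(len(vals) for vals in perf_by_student.values())
cluster_weights = {sid: len(vals) / total_rows for sid, vals in perf_by_student.items()}


def predict_perf(weights, means):
    """Weighted average of the per-cluster means of the target attribute."""
    return sum(weights[c] * means[c] for c in means)


predicted = predict_perf(cluster_weights, cluster_means)
print(round(predicted, 2))  # 82.17
```

Under these assumptions the prediction is just the overall average of Perf, which is consistent with the 82 returned by the original query.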

If you want to cluster by Studid only, but still make predictions for Perf (based on the distribution of Perf in the clusters), then make sure that Studid is Input and Perf is Predict Only.

Hope this helps
