Association rules with a pandas DataFrame

I have a data frame like this:

import pandas as pd
df = pd.DataFrame(data=[980,169,104,74], columns=['Count'], index=['X,Y,Z', 'X,Z','X','Y,Z'])

       Count
X,Y,Z    980
X,Z      169
X        104
Y,Z       74

I want to be able to extract association rules from this data. I saw that the Apriori algorithm is the usual reference for this. I also found that the Orange data mining library is well known in the field.

The problem is that in order to use AssociationRulesInducer I first need to create a file containing all the transactions. Since my data set is really huge (20 columns and 5 million rows), it would be too expensive to write all this data to a file and read it back in with Orange.

Do you have any ideas on how I can use the existing DataFrame structure to mine association rules?

1 answer

The new Orange3-Associate add-on for the Orange data mining suite seems to include widgets and code that mine frequent itemsets (and, from these, association rules) even from sparse arrays or lists of lists, which may work for you.

With 5M rows, it would be nice if it did. :)
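
If the table in the question is already aggregated (each index value is a distinct basket and Count is how many times it occurs, which is an assumption on my part), support and confidence can also be computed straight from those counts without expanding anything to one row per transaction. Below is a minimal brute-force sketch in plain Python, not Apriori and not the Orange API. The helper names itemset_support and rules are made up for illustration, and the dict stands in for df['Count'].to_dict(). It enumerates every subset of every basket, so it is only reasonable when baskets are short; with 20 items in a single basket that is about a million subsets.

```python
from collections import Counter
from itertools import combinations

# Aggregated basket counts from the question.
# With pandas this would be df['Count'].to_dict() (assumption: the index
# holds comma-separated items and Count is the number of occurrences).
counts = {'X,Y,Z': 980, 'X,Z': 169, 'X': 104, 'Y,Z': 74}


def itemset_support(counts):
    """Map each itemset (frozenset) to the number of transactions containing it."""
    support = Counter()
    for key, n in counts.items():
        items = sorted(item.strip() for item in key.split(','))
        # Every non-empty subset of a basket occurs in all n copies of it.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                support[frozenset(subset)] += n
    return support


def rules(support, min_confidence):
    """Yield (antecedent, consequent, support_count, confidence) tuples."""
    for itemset, n in support.items():
        if len(itemset) < 2:
            continue
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                confidence = n / support[antecedent]
                if confidence >= min_confidence:
                    yield antecedent, itemset - antecedent, n, confidence


support = itemset_support(counts)
total = sum(counts.values())  # total number of transactions
found = list(rules(support, min_confidence=0.9))

for antecedent, consequent, n, confidence in found:
    print(sorted(antecedent), '->', sorted(consequent),
          'support=%.3f' % (n / total), 'confidence=%.3f' % confidence)
```

For larger baskets, a real frequent-itemset miner with a minimum-support cutoff (such as the one in the add-on above) would be the better fit, since it prunes rare itemsets instead of enumerating all of them.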


Source: https://habr.com/ru/post/1615421/

