R
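Finding suitable rules for new data using arules
I am using R (and the arules package) to mine transactions for association rules. What I want to do is build the rules and then apply them to new data.
For example, say I have many rules, one of which is the canonical {Beer=YES} -> {Diapers=YES}. I then get new transaction data in which one record contains beer but not diapers. How can I identify the rules whose LHS is satisfied but whose RHS is not yet satisfied?
Example: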
    install.packages("arules")
    library(arules)
    data("Groceries")
    # generate rules, omitting the second record
    rules <- apriori(Groceries[-2],
                     parameter = list(supp = 0.05, conf = 0.2, target = "rules"))
The generated rules are:
    > inspect(rules)
      lhs                   rhs                support    confidence lift
    1 {}                 => {whole milk}       0.25554200 0.2555420  1.000000
    2 {yogurt}           => {whole milk}       0.05603010 0.4018964  1.572722
    3 {whole milk}       => {yogurt}           0.05603010 0.2192598  1.572722
    4 {rolls/buns}       => {whole milk}       0.05664023 0.3079049  1.204909
    5 {whole milk}       => {rolls/buns}       0.05664023 0.2216474  1.204909
    6 {other vegetables} => {whole milk}       0.07484238 0.3867578  1.513480
    7 {whole milk}       => {other vegetables} 0.07484238 0.2928770  1.513480
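The second transaction suggests that this customer, who bought yogurt but not whole milk, should probably be sent a coupon for milk. How can I find the rules in "rules" that apply to a new transaction?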
    > LIST(Groceries[2])
    [[1]]
    [1] "tropical fruit" "yogurt"         "coffee"
The key is the is.subset function from the same package.
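To make the check concrete before bringing in arules, here is a minimal package-free sketch in base R. The list `toy_rules` is a hypothetical stand-in that copies the first four rules from the question by hand; it is not an arules object, and the variable names are only for illustration.

```r
# Hand-copied stand-ins for the first four rules in the question (not arules objects).
toy_rules <- list(
  list(lhs = character(0), rhs = "whole milk"),
  list(lhs = "yogurt",     rhs = "whole milk"),
  list(lhs = "whole milk", rhs = "yogurt"),
  list(lhs = "rolls/buns", rhs = "whole milk")
)
basket <- c("tropical fruit", "yogurt", "coffee")

# A rule applies when every LHS item is in the basket
# (an empty LHS matches any basket) ...
lhs_matches <- vapply(toy_rules, function(r) all(r$lhs %in% basket), logical(1))
# ... and at least one RHS item is still missing from it.
rhs_missing <- vapply(toy_rules, function(r) !all(r$rhs %in% basket), logical(1))

applicable <- lhs_matches & rhs_missing
print(which(applicable))
```

Only the first two toy rules pass both tests, which is the same pair of conditions that is.subset evaluates on the real rule set.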
Here is the code:
    basket <- Groceries[2]
    # find all rules whose LHS is a subset of the current basket
    # (sparse = FALSE requests a dense logical result; recent arules versions
    # return a sparse matrix by default)
    rulesMatchLHS <- is.subset(rules@lhs, basket, sparse = FALSE)
    # ... and whose RHS is NOT a subset of the current basket
    # (so that some items are left as potential recommendations)
    suitableRules <- as.vector(rulesMatchLHS & !is.subset(rules@rhs, basket, sparse = FALSE))
    # here they are
    inspect(rules[suitableRules])
    # now extract the RHS items of all matching rules ...
    recommendations <- unique(unlist(LIST(rules[suitableRules]@rhs)))
    # ... and remove all items which are already in the basket
    recommendations <- setdiff(recommendations, unlist(LIST(basket)))
    print(recommendations)
And the resulting output:
    > inspect(rules[suitableRules])
      lhs         rhs          support   confidence lift
    1 {}       => {whole milk} 0.2555420 0.2555420  1.000000
    2 {yogurt} => {whole milk} 0.0560301 0.4018964  1.572722
    > print(recommendations)
    [1] "whole milk"
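If the same lookup is needed for many baskets, the steps can be wrapped in a small function. The following is only a sketch: `recommend_items` is a hypothetical helper name, not part of arules. It assumes an arules version whose is.subset accepts the `sparse` argument, and it orders the suggestions by rule confidence, which is a design choice of this sketch rather than something the answer prescribes.

```r
library(arules)

# Hypothetical helper (not part of arules): items suggested for a single basket,
# ordered by the confidence of the rule that suggests them.
recommend_items <- function(rules, basket) {
  stopifnot(length(basket) == 1)  # exactly one transaction at a time
  # LHS fully contained in the basket
  lhs_ok  <- as.vector(is.subset(lhs(rules), basket, sparse = FALSE))
  # RHS not yet contained in the basket
  rhs_new <- !as.vector(is.subset(rhs(rules), basket, sparse = FALSE))
  hits <- rules[lhs_ok & rhs_new]
  if (length(hits) == 0) return(character(0))
  hits <- sort(hits, by = "confidence")  # highest confidence first
  # unique RHS items, minus anything already in the basket
  setdiff(unlist(LIST(rhs(hits))), unlist(LIST(basket)))
}

data("Groceries")
rules <- apriori(Groceries[-2],
                 parameter = list(supp = 0.05, conf = 0.2, target = "rules"))
recommend_items(rules, Groceries[2])
```

For the second Groceries transaction this should agree with the recommendation printed above. To score a whole set of new transactions, one option is to call the helper once per transaction, for example with `lapply(seq_along(new_data), function(i) recommend_items(rules, new_data[i]))`, where `new_data` is a placeholder for your own transactions object.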