This lab report presents the results of experiments conducted to evaluate two different methods, HUI-PR and EFIM, for discovering high utility itemsets using various real datasets. The computational time, the number of high utility itemsets (HUIs) found, and the quantity of candidate sets generated were compared to assess the performance of these methods.
In this experiment, we compared the HUI-PR method with the EFIM method on multiple datasets. The goal was to determine which method is more efficient in discovering high utility itemsets.
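To make the term concrete, the following minimal sketch enumerates high utility itemsets by brute force on a toy database. It is not HUI-PR or EFIM. The data, the function names, and the convention that the minimum utility equals the threshold ratio times the total utility of the database are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its utility
# in that transaction (for example quantity times unit profit).
TRANSACTIONS = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "d": 4},
]


def itemset_utility(itemset, transactions):
    """Sum the utility of `itemset` over every transaction containing all its items."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] for item in itemset)
    return total


def brute_force_huis(transactions, threshold_ratio):
    """Return {itemset: utility} for every itemset whose utility meets the threshold."""
    # Assumed convention: minimum utility = ratio * total database utility.
    total_utility = sum(sum(tx.values()) for tx in transactions)
    min_utility = threshold_ratio * total_utility
    items = sorted({item for tx in transactions for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions)
            if utility >= min_utility:
                result[itemset] = utility
    return result


huis = brute_force_huis(TRANSACTIONS, threshold_ratio=0.30)
for itemset, utility in huis.items():
    print(itemset, utility)
```

In this toy data the itemset (b, c, d) qualifies although none of its subsets does, so utility cannot be pruned the way support is in frequent itemset mining. Exhaustive enumeration is exponential in the number of items, which is why the candidate count and running time are the quantities compared in this report.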
The experiments were performed on seven real datasets: Chess, Connect, Retail, Connect2x, Chess30x, BMS4x, and Mushroom20x. They were chosen to cover small, medium, and large scales and both dense and sparse types, so that the comparison spans a range of dataset sizes, densities, and transaction lengths. The suffixed datasets appear to be scaled-up copies of a base dataset: Connect2x has exactly twice as many transactions as Connect, and Chess30x has exactly 30 times as many as Chess. Table 4.1 provides detailed characteristics of these datasets.
Dataset | # of Transactions | # of Items | Avg Transaction Length | Max Transaction Length | Type | Scale |
---|---|---|---|---|---|---|
Chess | 3196 | 76 | 37 | 37 | Dense | Small |
Connect | 67557 | 129 | 43 | 43 | Dense | Small |
Retail | 88162 | 16470 | 10 | 76 | Sparse | Medium |
Connect2x | 135114 | 129 | 43 | 43 | Dense | Large |
Chess30x | 95880 | 76 | 37 | 37 | Dense | Large |
BMS4x | 238408 | 497 | 3 | 267 | Sparse | Large |
Mushroom20x | 162400 | 119 | 23 | 23 | Dense | Large |
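The Dense and Sparse labels can be related to the numbers in the table with a simple density measure. The sketch below uses average transaction length divided by the number of distinct items, a common proxy. The report does not say how the Type column was assigned, so this measure and the function name `density` are assumptions.

```python
# Rows copied from Table 4.1:
# (dataset, transactions, distinct items, average transaction length).
TABLE_4_1 = [
    ("Chess", 3196, 76, 37),
    ("Connect", 67557, 129, 43),
    ("Retail", 88162, 16470, 10),
    ("Connect2x", 135114, 129, 43),
    ("Chess30x", 95880, 76, 37),
    ("BMS4x", 238408, 497, 3),
    ("Mushroom20x", 162400, 119, 23),
]


def density(avg_length, n_items):
    """Fraction of the item vocabulary that an average transaction contains."""
    return avg_length / n_items


DENSITY = {name: density(avg_len, n_items) for name, _, n_items, avg_len in TABLE_4_1}

for name, value in DENSITY.items():
    print(f"{name:<12} {value:.2%}")
```

On this measure the datasets labelled Dense fall between roughly 19% and 49%, while Retail and BMS4x stay below 1%, which matches the Type column.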
The computational time of HUI-PR and EFIM was compared on the Connect, Chess, and Retail datasets.
HUI-PR ran faster than EFIM, with the gain most pronounced on datasets with a large number of transactions. HUI-PR reduces the number of transactions considered at each level, using a pruning hash table to eliminate unnecessary transactions. Figure 4.1 compares the computational time of HUI-PR and EFIM for different threshold ratios.
For example, on the 'Connect' dataset at a threshold ratio of 28.90%, HUI-PR took 1830.87 seconds, while EFIM took 1927.95 seconds, a reduction of about 5%.
Similarly, on the 'Retail' dataset at a threshold ratio of 0.03%, HUI-PR took 5718.36 seconds, while EFIM took 7370.33 seconds, a reduction of about 22%. In both cases HUI-PR was faster than EFIM, by a modest margin on Connect and a larger one on Retail.
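The size of the gap in these two examples can be worked out directly from the reported times. The short sketch below does the arithmetic. The helper names `relative_saving` and `speedup` are illustrative, and only the two data points quoted above are used.

```python
# Reported running times in seconds: (HUI-PR, EFIM), keyed by dataset and threshold ratio.
TIMINGS = {
    ("Connect", "28.90%"): (1830.87, 1927.95),
    ("Retail", "0.03%"): (5718.36, 7370.33),
}


def relative_saving(hui_pr_s, efim_s):
    """Fraction of EFIM's running time that HUI-PR saves."""
    return (efim_s - hui_pr_s) / efim_s


def speedup(hui_pr_s, efim_s):
    """How many times faster HUI-PR is than EFIM."""
    return efim_s / hui_pr_s


for (dataset, ratio), (hui_pr_s, efim_s) in TIMINGS.items():
    print(
        f"{dataset} @ {ratio}: "
        f"saving {relative_saving(hui_pr_s, efim_s):.1%}, "
        f"speedup {speedup(hui_pr_s, efim_s):.2f}x"
    )
```

Two data points do not establish a trend across threshold ratios; Figure 4.1 remains the reference for the full comparison.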
The number of high utility itemsets (HUIs) found by HUI-PR and EFIM was compared on the Connect, Chess, and Retail datasets. Both methods discovered the same number of HUIs at every threshold ratio tested. Table 4.2 presents the number of HUIs found for various threshold ratios for each dataset. This indicates that HUI-PR is as effective as EFIM in discovering HUIs.
We compared the number of candidate sets generated by HUI-PR and EFIM. HUI-PR produced fewer candidate sets than EFIM, because it reduces them through transaction pruning techniques and a pruning hash table. Figure 4.3 illustrates the comparison of candidate sets between HUI-PR and EFIM, and Table 4.3 shows the total number of transactions pruned in HUI-PR for different datasets and threshold ratios. Fewer candidate sets mean fewer unnecessary utility computations.
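The report does not describe HUI-PR's pruning procedure or its hash table in detail, so the following is not that procedure. It is a minimal sketch of one standard pruning idea from utility mining, transaction-weighted utility (TWU) pruning, shown only to illustrate how removing items and transactions shrinks the search. The toy data, function names, and minimum utility value are assumptions.

```python
from collections import defaultdict

# Hypothetical toy database: item -> utility within each transaction.
TRANSACTIONS = [
    {"a": 5, "b": 2, "e": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"e": 2},
]


def twu_per_item(transactions):
    """For each item, sum the total utility of every transaction that contains it."""
    twu = defaultdict(int)
    for tx in transactions:
        tx_utility = sum(tx.values())
        for item in tx:
            twu[item] += tx_utility
    return dict(twu)


def prune_transactions(transactions, min_utility):
    """Drop items whose TWU is below min_utility, then drop emptied transactions."""
    twu = twu_per_item(transactions)
    kept = []
    for tx in transactions:
        # TWU is an upper bound on the utility of any itemset containing the item,
        # so an item below the threshold cannot belong to a high utility itemset.
        reduced = {item: u for item, u in tx.items() if twu[item] >= min_utility}
        if reduced:
            kept.append(reduced)
    return kept


pruned = prune_transactions(TRANSACTIONS, min_utility=12)
print(len(TRANSACTIONS), "->", len(pruned), "transactions")
```

Here item `e` has a TWU of 10, below the assumed minimum utility of 12, so it is removed everywhere and the transaction that contained only `e` disappears. Every later candidate is then evaluated against a smaller database.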
We also compared HUI-PR with state-of-the-art algorithms, including HUI-Miner, HUP-Miner, FHM, FHM+, and d2HUP. The results indicated that HUI-PR outperforms these algorithms significantly. For instance, for the 'Connect' dataset, HUI-PR performed over 100 times better than HUI-Miner, HUP-Miner, and FHM, and nearly 50 times better than d2HUP. Similar performance improvements were observed for the 'Chess' dataset.
The experimental results show that HUI-PR is a highly efficient method for discovering high utility itemsets compared to EFIM and other state-of-the-art algorithms. It reduces computational time, generates fewer candidate sets, and matches EFIM in the number of HUIs discovered. These findings make HUI-PR a promising algorithm for mining high utility itemsets in large and dense datasets.
In conclusion, the experimental results demonstrate the effectiveness of the HUI-PR method in discovering high utility itemsets. It outperforms EFIM and other state-of-the-art algorithms in terms of computational efficiency while achieving the same results in HUI discovery. HUI-PR has the potential to be a valuable tool in data mining applications that require the identification of high utility itemsets in various datasets.
Comparative Analysis of HUI-PR and EFIM for High Utility Itemset Discovery. (2024, Jan 17). Retrieved from https://studymoose.com/document/comparative-analysis-of-hui-pr-and-efim-for-high-utility-itemset-discovery