Comparative Analysis of HUI-PR and EFIM for High Utility Itemset Discovery

Abstract

This lab report presents experiments that evaluate two methods, HUI-PR and EFIM, for discovering high utility itemsets (HUIs) on several real datasets. The methods were compared on three measures: computational time, the number of HUIs found, and the number of candidate sets generated.

1. Introduction

In this experiment, we compared the HUI-PR method with the EFIM method on multiple datasets. The goal was to determine which method is more efficient in discovering high utility itemsets.

The experiments were conducted on different datasets with varying characteristics, including dataset size, density, and itemset types.

2. Methodology

2.1 Datasets

The experiments were performed on a range of real datasets, including Chess, Connect, Retail, Connect2x, Chess30x, BMS4x, and Mushroom20x. These datasets were chosen to represent both small and large datasets with different characteristics. Table 4.1 provides detailed characteristics of these datasets.

Dataset	# of Transactions	# of Items	Avg Length	Max Length	Type	Scale
Chess	3196	76	37	37	Dense	Small
Connect	67557	129	43	43	Dense	Small
Retail	88162	16470	10	76	Sparse	Medium
Connect2x	135114	129	43	43	Dense	Large
Chess30x	95880	76	37	37	Dense	Large
BMS4x	238408	497	3	267	Sparse	Large
Mushroom20x	162400	119	23	23	Dense	Large
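
The Dense and Sparse labels in the table can be related to a simple density figure. The sketch below assumes one common convention, density = average transaction length divided by the number of distinct items. The report does not state how its labels were assigned, so the `density` helper and this definition are illustrative assumptions; the numbers are copied from the table.

```python
# Dataset characteristics copied from the table above:
# name -> (number of distinct items, average transaction length, type label)
datasets = {
    "Chess": (76, 37, "Dense"),
    "Connect": (129, 43, "Dense"),
    "Retail": (16470, 10, "Sparse"),
    "BMS4x": (497, 3, "Sparse"),
    "Mushroom20x": (119, 23, "Dense"),
}


def density(num_items: int, avg_length: float) -> float:
    """Fraction of the item universe that an average transaction contains.

    Assumed definition (not taken from the report): avg_length / num_items.
    """
    return avg_length / num_items


for name, (num_items, avg_length, label) in datasets.items():
    print(f"{name:12s} density = {density(num_items, avg_length):7.2%}  ({label})")
```

Under this assumed definition, every dataset labelled Dense has an average transaction covering more than 10% of the items, while the Sparse ones cover less than 1%.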

3. Results

3.1 HUI-PR versus EFIM

3.1.1 Comparison of Computational Time

The computational time of HUI-PR and EFIM was compared using the Connect dataset, Chess dataset, and Retail dataset.

HUI-PR demonstrated significant improvements in computational time, primarily for datasets with a large number of transactions. HUI-PR effectively reduced the number of transactions at each level, utilizing a pruning hash table to eliminate unnecessary transactions. Figure 4.1 illustrates the comparison of computational time between HUI-PR and EFIM for different threshold ratios.

For example, for the 'Connect' dataset with a threshold ratio of 28.90%, HUI-PR took 1830.87 seconds, while EFIM took 1927.95 seconds.

Similarly, at a threshold ratio of 0.03% for the 'Retail' dataset, HUI-PR took 5718.36 seconds, while EFIM took 7370.33 seconds. In both reported cases HUI-PR finished sooner than EFIM, and the margin was wider on Retail than on Connect.
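
The two timing pairs quoted above can be turned into a speedup factor and a relative time saving. The sketch below uses only the figures given in the text; the function names `speedup` and `time_saved` are illustrative.

```python
# Running times in seconds as quoted in the text:
# (dataset, threshold ratio) -> (HUI-PR time, EFIM time)
timings = {
    ("Connect", "28.90%"): (1830.87, 1927.95),
    ("Retail", "0.03%"): (5718.36, 7370.33),
}


def speedup(hui_pr_s: float, efim_s: float) -> float:
    """How many times faster HUI-PR ran than EFIM."""
    return efim_s / hui_pr_s


def time_saved(hui_pr_s: float, efim_s: float) -> float:
    """Fraction of EFIM's running time that HUI-PR saved."""
    return (efim_s - hui_pr_s) / efim_s


for (dataset, ratio), (hui_pr_s, efim_s) in timings.items():
    print(
        f"{dataset} @ {ratio}: "
        f"speedup {speedup(hui_pr_s, efim_s):.2f}x, "
        f"time saved {time_saved(hui_pr_s, efim_s):.1%}"
    )
```

With these inputs the speedup is about 1.05x (roughly 5% less time) on Connect and about 1.29x (roughly 22% less time) on Retail.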

3.1.2 Comparison of HUIs

The number of high utility itemsets (HUIs) found by HUI-PR and EFIM was compared using the Connect, Chess, and Retail datasets. Both methods discovered the same number of HUIs at every threshold ratio tested. Table 4.2 presents the number of HUIs found for various threshold ratios for each dataset. This indicates that HUI-PR is as effective as EFIM in discovering HUIs.
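
To make the quantity being counted concrete, here is a minimal brute-force sketch of what an HUI is. It uses the standard definition: the utility of an itemset is the sum, over the transactions that contain it, of the utilities of its items, and an itemset is an HUI when that sum reaches a minimum utility. This is neither HUI-PR nor EFIM. The toy database, the `min_util` value, and the function names are hypothetical, and the exhaustive search is only practical for tiny inputs, for example as a reference when checking that two miners agree.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to the utility
# (e.g. quantity x unit profit) that item contributes in that transaction.
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "d": 2},
]


def itemset_utility(itemset, db):
    """Sum the utilities of `itemset` over every transaction that contains it."""
    total = 0
    for t in db:
        if all(item in t for item in itemset):
            total += sum(t[item] for item in itemset)
    return total


def brute_force_huis(db, min_util):
    """Enumerate every non-empty itemset and keep those with utility >= min_util."""
    items = sorted({item for t in db for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, db)
            if utility >= min_util:
                result[itemset] = utility
    return result


huis = brute_force_huis(transactions, min_util=20)
print(huis)
```

On this toy input the result is `{('a',): 20, ('a', 'c'): 22}`. The itemset {a, c} qualifies although c alone has utility 10, which is why utility cannot be pruned the way support is in frequent itemset mining.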

3.1.3 Comparison of Candidate Sets

We compared the candidate sets generated by HUI-PR and EFIM. HUI-PR produced fewer candidate sets than EFIM because it reduces them through transaction pruning techniques and a pruning hash table, which avoids unnecessary computation. Figure 4.3 illustrates the comparison of candidate sets between HUI-PR and EFIM, and Table 4.3 shows the total number of transactions pruned in HUI-PR for different datasets and threshold ratios.
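
The report does not spell out HUI-PR's pruning rules or the layout of its hash table, so the following is not the HUI-PR procedure. It is a sketch of a generic, widely used idea in utility mining that shows how pruning can shrink a database: an item whose transaction-weighted utilization (TWU, the summed utility of all transactions containing it) is below the minimum utility cannot belong to any HUI, so it can be dropped, and transactions that become empty can be discarded. The toy data, `min_util`, and function names are hypothetical; a Python dict stands in for the per-item lookup table.

```python
def item_twu(db):
    """Map each item to the total utility of the transactions that contain it."""
    weights = {}
    for t in db:
        transaction_utility = sum(t.values())
        for item in t:
            weights[item] = weights.get(item, 0) + transaction_utility
    return weights


def prune_database(db, min_util):
    """Repeatedly drop items with TWU < min_util and discard emptied transactions."""
    current = [dict(t) for t in db]  # work on copies, leave the input untouched
    while True:
        weights = item_twu(current)
        weak = {item for item, w in weights.items() if w < min_util}
        if not weak:
            return current
        reduced_db = []
        for t in current:
            reduced = {item: u for item, u in t.items() if item not in weak}
            if reduced:  # keep only transactions that still hold an item
                reduced_db.append(reduced)
        current = reduced_db


# Hypothetical toy database: item -> utility within the transaction.
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "d": 2},
    {"e": 3},
]

pruned = prune_database(transactions, min_util=20)
print(len(transactions), "->", len(pruned), "transactions")
```

Here item e has a TWU of 3, so it is removed and its transaction disappears, leaving four of the five transactions. The loop recomputes TWU after each pass because removing items lowers transaction utilities, which can expose further items to pruning.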

3.2 Comparison with State-of-the-Art Algorithms

We also compared HUI-PR with state-of-the-art algorithms, including HUI-Miner, HUP-Miner, FHM, FHM+, and d2HUP. The results indicated that HUI-PR outperforms these algorithms significantly. For instance, for the 'Connect' dataset, HUI-PR performed over 100 times better than HUI-Miner, HUP-Miner, and FHM, and nearly 50 times better than d2HUP. Similar performance improvements were observed for the 'Chess' dataset.

4. Discussion

The experimental results show that HUI-PR is a highly efficient method for discovering high utility itemsets compared to EFIM and other state-of-the-art algorithms. It reduces computational time, generates fewer candidate sets, and performs as well as EFIM in terms of HUI discovery. These findings make HUI-PR a promising algorithm for mining high utility itemsets in large and dense datasets.

5. Conclusion

The experimental results demonstrate the effectiveness of the HUI-PR method in discovering high utility itemsets. It outperforms EFIM and other state-of-the-art algorithms in terms of computational efficiency while finding the same number of HUIs as EFIM. HUI-PR has the potential to be a valuable tool in data mining applications that require the identification of high utility itemsets.

Comparative Analysis of HUI-PR and EFIM for High Utility Itemset Discovery. (2024, Jan 17). Retrieved from https://studymoose.com/document/comparative-analysis-of-hui-pr-and-efim-for-high-utility-itemset-discovery