Dynamic itemset hiding under multiple support thresholds
Öztürk, Ahmet Cumhur, author.
Data sharing is commonly performed between organizations for mutual benefit. However, if confidential knowledge is not hidden before the data is published, it may pose a threat to security and privacy. Privacy-preserving frequent itemset mining is the process of hiding sensitive itemsets so that they cannot be discovered by any frequent itemset mining algorithm. The privacy constraint in sensitive itemset hiding is the sensitive threshold: if the support of a given sensitive itemset is below its sensitive threshold, the itemset is considered uninteresting and is therefore hidden. One way to decrease the support of sensitive itemsets below a predefined sensitive threshold is to delete items from a set of transactions. This type of frequent itemset sanitization is called distortion-based frequent itemset hiding. The main focus of this thesis is hiding sensitive itemsets while taking multiple sensitive thresholds into account, in both static and dynamic environments. Three distortion-based frequent itemset hiding algorithms are proposed: Pseudo Graph Based Sanitization (PGBS), Itemset Oriented Pseudo Graph Based Sanitization (IPGBS), and DynamicPGBS. PGBS and IPGBS are designed for static environments, while DynamicPGBS is designed for dynamic environments. The main objective of all three algorithms is to hide every sensitive itemset while causing minimum distortion to the non-sensitive knowledge and data in the resulting sanitized database.
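The general idea of distortion-based hiding with a separate threshold per sensitive itemset can be illustrated with a minimal Python sketch. This is not PGBS, IPGBS, or DynamicPGBS, whose details are not given in this abstract; it is a naive greedy baseline written only for illustration. The function names (support, naive_hide), the sample data, and the rule for choosing which item to delete are all assumptions.

```python
from typing import Dict, FrozenSet, List, Set


def support(itemset: FrozenSet[str], transactions: List[Set[str]]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def naive_hide(
    transactions: List[Set[str]],
    thresholds: Dict[FrozenSet[str], int],
) -> List[Set[str]]:
    """Return a sanitized copy in which each sensitive itemset has support
    strictly below its own threshold.

    Illustrative baseline only: it makes no attempt to minimize the
    distortion of non-sensitive itemsets. Assumes every sensitive itemset
    is non-empty and every threshold is at least 1.
    """
    sanitized = [set(t) for t in transactions]  # leave the input untouched
    for itemset, threshold in thresholds.items():
        for t in sanitized:
            if support(itemset, sanitized) < threshold:
                break  # this itemset is already hidden
            if itemset <= t:
                # Hypothetical victim choice: delete the alphabetically
                # smallest item so the transaction stops supporting it.
                t.discard(min(itemset))
    return sanitized


# Hypothetical example data: two sensitive itemsets with different thresholds.
database = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
sensitive = {frozenset({"a", "b"}): 2, frozenset({"b", "c"}): 3}
sanitized_db = naive_hide(database, sensitive)
```

Because deleting items can only lower supports, itemsets hidden earlier in the loop stay hidden while later ones are processed. A practical algorithm would also choose the transactions and items to modify so that as few non-sensitive itemsets as possible lose support, which is the objective the abstract states for the proposed algorithms.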
