Tuesday, June 4, 2019

Privacy Preserving Data Mining in Partitioned Databases

silence Preserving Data archeological site in Partitioned DatabasesA survey Privacy preserving info mine in horizont every(prenominal)(prenominal)y partitioned selective informationbasesKomal Kapadia, Ms.Raksha Chauhan_______________________________________________________________________________________________________Abstract Privacy preserving data excavation techniques are introduced with the aim of extract the relevant knowledge from the large occur of data while protecting the sensible information at the same time. The success of data excavation relies on the availability of high quality data. To tick quality of data mining, effective information sharing between organizations becomes a vital requirement in todays society. Privacy preserving data mining deals with hiding an individuals sensitive identity without sacrificing the usability of data. Whenever we are concerning with data mining, Security is measure issue while extracting data. Privacy Preserving Data Minin g concerns with the protective covering of data and provide the data on demand as well as amount of data that is required.Index Terms data mining, secretiveness preserving, error correction code steganography, randomized rejoinder technique._______________________________________________________________________________________________INTRODUCTIONData mining techniques have been spaciously utilise in some(prenominal) areas especially for strategic decision making. The main bane of data mining is to protective covering and screen of data residing in large data stores. Some of the information considered as reclusive and secret can be bought out with advanced data mining tools. Different research efforts are under way to address this problem of privacy preserving and preserving security. The privacy term has wide range of different meanings. For example, in the context of the health insurance accountability and portability act privacy rule, privacy means the individuals abil ity to take care who has the access to personal health care information. In organization, privacy means that it involves the definition of policies stating which information is collected, how it is used, how customers are involved and informed in this process. We can considering privacy as Individuals desire and ability to keep certain information about themselves hidden from some others. Privacy preserving data mining refers to the area of data mining that seeks to safeguard sensitive information from unsolicited disclosure. Historically, issues related to PPDM were first studied by the national statistical agencies interested in collecting cliquish social and economical data, such as census and tax records, and making it available for analysis by public servants, companies, and researchers. Building accurate socio-economical stylishels is vital for headache planning and public policy. Yet, t here(predicate) is no way of knowing in advance what models may be needed, nor is it f easible for the statistical agency to perform all data processing for everyone, playing the role of a trusted third political party. Instead, the agency provides the data in a sanitized form that allows statistical processing and protects the privacy of individual records, solving a problem known as privacy preserving data publishing. There are many methods for preserving the privacy. In this surevey many methods try to compute the answer to the mining without revealing any additional information about user privacy.Progress in scientific research depends on the sharing and availability of information and ideas. But the researchers are mainly focusing on preserving the security or privacy of individuals. This issue leads to an emerging research area, privacy preserving data mining. For privacy preserving data mining, many authors proposed many technologies. The main aim of this paper is, to develop efficient methodology to find privacy preserving.LITERATURE bailiwickWe have studied some of the related work for the privacy preserving in horizontally partitioned databases. Existing work for privacy preserving in horizonatally partitioned database has different types of techniques.TYPES OF PRIVACY PRESERVING TECHNIQUES cheat honest partyWithout trusted partyWith trusted partyIn without trusted party to each one party leave calculate their own partial tone support and add their own random second and sends the result to the next party in the ring so that the other party will never know the result of others and in at last the initiator party will tell the result that is global support.In trusted party each party will calculate their partial support and send to the trusted party and add the own random number and send to the next coming post in the ring so that other party will never know the result of other parties after that trusted party will disclose the result and send to all puts that presents in the ring.Fig. 1 Framework of privacy preserving data minin g5SECURE MULTIPARTY COMMUNICATIONApproximately all Privacy Preserving data mining techniques rely on Secure multi party communication communications protocol. Secure multi party communication is defined as a computation protocol at the last part of which no party involved knows anything else except its own inputs the outcome, i.e. the view of each party during the execution can be effectively simulated by the input and output of the party. Secure multi party communication has commonly concentrated on two models of security. The semi-honest model as addes that every party follows the rule of the protocol, but is exempt to later use what it sees during execution of the protocol. The malicious model assumes that parties can arbitrarily cheat and such cheating will not compromise moreover security or the outcome, i.e. the results from the malicious party will be correct or the malicious party will be detected. Most of the Privacy Preserving data mining techniques assume an intermediate model, Preserving Privacy with non-colluding parties. A malicious party May dishonest the results, but will not be able to learn the private data of other parties without colluding with another party.(1)MHS algorithmic program FOR HORIZONTALLY PARTITION DATABASEM. Hussein et al.s Scheme (MHS) was introduced to improve privacy or security and try to castrate communication cost on increasing number of sites. Behind this main idea was to use effective cryptosystem and rearrange the communication path. For this, two sites were discovered. This algorithm whole shebang with minimum 3 sites. One site acts as Data Mining Initiator and other site as a Data Mining Combiner. Rests of other sites were called client sites. This scenario was able to decrease communication time. Fig. shows MHS algorithm.The working of the algorithm is as followsThe initiator generates RSA public key and a private key. It sends the public key to combiner and all other client sites.2. Each site, except initiator computes frequent item sterilise and local support for each frequent itemset using Local Data Mining .3. all Client sites encrypt their computed data using public key and send it to the combiner.4. The combiner merges the received data with its own encrypted data, encrypts it again and sends it to initiator to find global link rules.5. Initiator decrypts the received data using the private key. Then it merges its own local data mining data and computes to find global results.6. Finally, it finds global stand rules and sends it to all other sites.Fig.2 MHS algorithm11(2) EMHS ALGORITHM FOR HORIZONTALLY PARTITION DATABASEEnhanced M. Hussein et al.s Scheme (EMHS) was introduced to improve privacy and reduce communication cost on increasing number of sites. This algorithm also works with minimum 3 sites. One site acts as Data Mining Initiator and other site as a Data Mining Combiner. Rests of other sites were called client sites . But this algorithm works on the concept of MFI (Maxim al patronage Itemset) instead of Frequent Itemset.a) MFI (Maximal Frequent Itemset) A Frequent Itemset which is not a subset of any other frequent itemset is called MFI. By using MFI, communication cost is reduced .b) RSA (Rivest, Shamir, Adleman) Algorithm one of the widely used public key cryptosystem. It is based on keeping factoring product of two large prime numbers secret. Breaking RSA encryption is tough.(3)MODIFIED EMHS ALGORITHM FOR HORIZONTALLY PARTIOTION DATABASEIn this technique, they used modified EMHS algorithm for improving its efficiency by using Elliptic curve cryptography. Here Elgamal cryptography technique is used which is of ECC for homomorphic encryption.ELLIPTIC CURVE CRYPTOGRAPHYElliptic curve cryptography provides public cryptosystem based on the descrete logarithm problem over integer modulo a prime. Elliptic curve cryptosystem requires much shorter key length to provide a security level same as RSA with larger key length. In this elgamal cryptography is used.ELGAMAL CRYPTOGRAPHYa)A wishes to exchange message M with B9.b) B first chooses Prime Number p, Generator g and private key x.c)B computes its Public Key Y = gx mod p and sends it to A.d) Now A chooses a random number k.e) A calculates one time key K = Yk mod p.f) A calculates C1 = gk mod p and C2 = M*K mod p and sends (C1,C2) to B.g) B calculates K = C1x mod ph) B calculates K-1 = inverse of K mod p i)B recovers M = K-1 * C2 mod pj) Thus, Message M is exchanged between A and B securely.In this system, Elgamal cryptography paillier cryptosystem is used. Here, Elgamal cryptography is used for security purpose. Compared to EMHS algorithm here performance is better in terms of computation time.RANDOMIZED RESPONSE TECHNIQUEIn this technique, here mainly focused on CK secure sum in randomized response technique for privacy preserving. Here, the multi party transaction data who discover frequent item sets with minimum support.In the randomized response technique, consider the data s ets I = I1, I2, I3In and the random number or noise part are denoted by R= R1, R2, R3 Rn, the new set of records are denoted by I1+R1, I2+R2 .In+Rn and after that take a partial support Pij = Pi1, Pi2..Pin so that partial support isP ij=I+R10I=Pij-R10In Randomized response secure sum technique, secure sum each site will determine their own data value and send to forerunner site that near to original site and this goes on till the original site collects all the value of data after that the parent site will determine the global support.CK SECURE SUM ALGORITHM10Step1-Consider parties P1, P2, P3Pn.Step2-Each party will generate their own random number R1, R2.RNStep3-Connect the parties in the ring (P1, P2, P3PN) and let P1 is aprotocol initiator.Step4-Let RC=N, and Pij=0 (RC is round counter and Pij is partial support)Step5-Partial support P1 site calculating by using following formulaPsij = Xij.support Min support * DB + RN1 RNnStep6-Site P2 computes the PSj for each item received t he list using theList using the formula,PSij= PSij + Xij. Support minimum support * DB +Rn1-Rn (i-1)Step7-While RC =0 begin for j=1 to N dobegin for I=1 to N doStep8-P1 exchange its position to P (j+1) mod N andRC=RC-1endStep9-Party P1 allowance the result PijStep10-EndIn ck secure sum technique, mainly focused on for computing global support within a scenario of homogeneous database and provides the high security to the database and hacking of data is zero. expirationIn this paper we reviewed five privacy preserving technique in horizontally partitioned database. In MHS algorithm RSA cryptography is used. In EMHS algorithm, by using MFI approach trueness is high compared to MHS. Modified EMHS algorithm used elgamal technique so privacy is high than EMHS technique. Randomized response technique provides high security to the database compared to other techniques. In future we can compute less number of rounds instead of n number of rounds. Here , we can use encryption technique for encrypting random number and sends it to the predecessor.REFERENCES1 Neelamadhab Padhy, Dr. Pragnyaban Mishra Rasmita Panigrahi. The Survey of Data Mining Applications and Feature Scope. 2012 IJCSEIT.2 Xinjun qi, Mingkui zong. An overview of privacy preserving data mining. 2011 ICESE.3 Kishori pawar, Y.B. gurav. Overview of privacy in horizontally distributed databases. 2014 IJIRAE.4 Manish Sharma, Atul chaudhary , Manish mathuria Shalini chaudhary. A review study on the privacy preserving data mining techniques and approaches.. 2013 IJCST.5 Shweta taneja, shashank khanna, sugandha tilwalia, ankita. A review on privacy preserving data mining techniques and research challenges. 2014 IJCSIT.6 Jayanti dansana, Raghvendra kumar Jyotirmayee rautaray. Techniques for privacy preserving association rule mining in distributed database. 2012 IJCSITS.7 Xuan canh nguyen, Tung anh cao. An heighten scheme for privacy preserving association rules minig on horizonatally distributed databases. 2 012 IEEE.8 Manish Sharma, Atul chaudhary, Manish mathuria, Shalini chaudhary Santosh kumar. An efficient approach for privacy preserving in data mining. 2014 IEEE.9 Rachit v. Adhvaryu, Nikunj h. Domadiya. Privacy preserving in association rule mining on horizontally partitioned database. 2014 IJARCET.10 Jayanti Dansana , Raghvendra Kumar , Debadutta Dey. Privacy preservation in horizontally partitioned databases using randomized response technique. 2013 IEEE.11 Rachit v. Adhvaryu, Nikunj h. Domadiya, Research Trends in Privacy Preserving in Association Rule Mining (PPARM) On Horizontally Partitioned Database. 2014 IJEDR.12 Agrawal D. Aggarwal C. C. On the Design and Quantification of Privacy-Preserving Data Mining Algorithms.ACM PODS Conference, 2002.13 D.W.Cheung,etal.,Ecient Mining of Association Rules in Distributed Databases, IEEE Trans. Knowledge and Data Eng., vol. 8, no. 6, 1996,pp.911-922.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.