SlideShare une entreprise Scribd logo
1  sur  6
Télécharger pour lire hors ligne
Procedia Computer Science 20 (2013) 414 – 419
1877-0509 © 2013 The Authors. Published by Elsevier B.V.
Selection and peer-review under responsibility of Missouri University of Science and Technology
doi:10.1016/j.procs.2013.09.295
ScienceDirect
Available online at www.sciencedirect.com
Complex Adaptive Systems, Publication 3
Cihan H. Dagli, Editor in Chief
Conference Organized by Missouri University of Science and Technology
2013- Baltimore, MD
A Comparative Analysis of Data Privacy and Utility Parameter
Adjustment, Using Machine Learning Classification as a Gauge
Kato Mivulea
and Claude Turnerb
a*
ab
Bowie State University, 14000 Jericho Park Road, Bowie, MD, 20715
Abstract
During the data privacy process, the utility of datasets diminishes as sensitive information such as personal identifiable
information (PII) is removed, transformed, or distorted to achieve confidentiality. The intractability of attaining an equilibrium
between data privacy and utility needs is well documented, requiring trade-offs, and further complicated by the fact that making
such trade-offs also remains problematic. Given such complexity, in this paper, we endeavor to empirically investigate what
parameters could be fine-tuned to achieve an acceptable level of data privacy and utility during the data privacy process, while
making reasonable trade-offs. Therefore, we present the comparative classification error gauge (Comparative x-CEG) approach, a
data utility quantification concept that employs machine learning classification techniques to gauge data utility based on the
classification error. In this approach, privatized datasets are passed through a series of classifiers, each of which returns a
classification error, and the classifier with the lowest classification error is chosen; if the classification error is lower or equal to a
set threshold then better utility might be achieved, otherwise, adjustment to the data privacy parameters are made to the chosen
classifier. The process repeats x times until the desired threshold is reached. The goal is to generate empirical results after a range
of parameter adjustments in the data privacy process, from which a threshold level might be chosen to make trade-offs. Our
preliminary results show that given a range of empirical results, it might be possible to choose a tradeoff point and publish
privacy compliant data with an acceptable level of utility.
Keywords: Data privacy; utility; machine learning classifiers; comparative data analysis
1. Introduction
During the privacy preserving process, the utility of a dataset a measure of how useful a privatized dataset is to
the user of that dataset diminishes as sensitive data such as PII is removed, transformed, or distorted to achieve
confidentiality [1, 2, 3]. Yet, finding equilibrium between privacy and utility needs remains intractable, necessitating
trade-offs [4, 5, 6]. For example, by using the suppression of data that is removing or deleting sensitive attributes
before publication of data, privacy might be guaranteed to an extent that the PII and other sensitive data is removed.
* Corresponding author. Tel.: 443-333-6169
E-mail address: mivulek0220@students.bowiestate.edu
Available online at www.sciencedirect.com
© 2013 The Authors. Published by Elsevier B.V.
Selection and peer-review under responsibility of Missouri University of Science and Technology
415Kato Mivule and Claude Turner / Procedia Computer Science 20 (2013) 414 – 419
However, a researcher using such a published dataset might find it difficult to fully account for some categorical
entries despite confidentiality guarantees provided to that privatized dataset. In another example involving numerical
data, employing high noise perturbation levels to provide privacy for sensitive numerical data might render the
privatized dataset useless to a researcher as too much noise distorts the original traits of the data. Given such
intricacy, in this paper, we attempt to empirically explore what parameters could be adjusted to attain an adequate
level of data privacy and utility during the data privacy process, while making practical trade-offs. We present the
comparative classification error gauge (Comparative x-CEG) approach, a data utility quantification model that uses
machine learning classification techniques to gauge data utility based on the classification error. The remaining part
of this paper is ordered as follows. Section 2 discusses background and related work. Section 3 presents our
methodology and experiment. In section 4, we present preliminary results. Finally, in section 5, provides concluding
remarks.
2. Background and Related Work
The task of achieving a satisfactory level of data utility while preserving privacy is a well-documented NP-hard
problem that would necessitate trade-offs [7, 8]. Additionally, the problem is further complicated by the fact that
attaining such trade-offs also remains intractable. Researchers have observed that the problem of preserving
confidentiality while publishing useful statistical data is extensive and that the trade-off between the privacy and the
utility of any anonymization technique, would largely depend on the level of background knowledge
about a particular dataset, with higher levels indicating that it would be impossible to achieve any such trade-offs [9,
10, 11]. Li and Li (2009) indicated in their study that it is not promising to equate privacy and utility as datasets get
distorted when they are privatized, leading to a decline of utility [12]. Even with the latest state of the art data
privacy algorithms like differential privacy, confidentiality is guaranteed but at a major loss of data utility [13]. As
Dwork (2006) concisely put it [14],
utility; perfect utility can b . In other
words, the more confidential data is made to be, the more likely that the privatized data will be become useless and
decline in utility.
The privacy definition problem: On the other hand, there happens to be no specific standard metric to define
privacy, as Katos, Stowell, and Bedner (2011) observed, that privacy is a human and socially driven characteristic
comprised of human traits such as acuities and sentiments [15]. Dayarathna (2011) stated that to wholly comprehend
the notion of data privacy, an all-inclusive methodology to outlining data privacy should include the legal, technical,
and ethical facets [16]. This point is additionally exemplified by Spiekermann (2012) who noted that one of the
difficulties in designing and engineering data privacy is that the idea of privacy is fuzzy, frequently confused with
data security, and as a result, very problematic to implement [17]. Adding to this point, Friedewald, Wright,
Gutwirth, and Mordini (2010), in their research on the legal aspects of data privacy stated that since privacy is an
evolving and shifting complex multi-layered notion, being described and treasured otherwise by various people; and
that empirical studies are needed to assess how different people define and value privacy [18]. As Mathews and
Harel (2011) observed, the human element remains a key factor and as such data privacy is intrinsically tied to
fuzziness and evolutions of how individuals view privacy, and the same applies to data utility; that is, what
information about themselves that individuals determine as fit to share with others [19]. Therefore, it becomes
problematic to create a generalized data privacy and utility solution; however, different individuals and entities will
have different data privacy needs and thus tailored data privacy solutions. Given the complexities of defining
privacy, quantifying data utility is likewise problematic. However, various methods have been employed to
enumerate data utility by basically quantifying the statistical differences between the original and privatized
datasets, such as, the relative query error metric, research value metric, discernibility data metric, classification error
metric, the Shannon entropy, and the information loss metric (mean square error) [ 20, 21, 22, 23, 24, 25]. However,
in this paper, we suggest using the machine learning as a gauge, by using the classification error to adjust data
privacy parameters until a desired threshold is attained.
2.1. Essential Terms
While a number of data privacy algorithms exist, in this paper, the subsequent methods are used. Noise addition:
is a data privacy algorithm in which random numbers selected from a normal distribution with zero mean and a
standard deviation are added to data for obfuscation, and is expressed by [26, 27]:
416 Kato Mivule and Claude Turner / Procedia Computer Science 20 (2013) 414 – 419
(1)
X is the original numerical data and is the set of random numbers (noise) with a distribution that is
added to X, and Z is the privatized dataset [26, 27]. Multiplicative noise: random numbers whose mean μ= 1 and a
small variance are produced, and then multiplied to the original data, generating confidential data, expressed as
follows: [27, 28]:
(2)
Where is the original data; is the random numbers with mean μ= 1 and a small variance ; is the
confidential data after multiplying and [27, 28]. Logarithmic noise: makes a logarithmic modification of the
original data as shown below [26, 28]:
(3)
Random values are then produced and added to the data that underwent a logarithmic alteration . Lastly, the
confidential data, , is generated as given in Equation (4) [26, 28]:
(4)
Note that the original data is represented by ; is the logarithmic altered data values; and is the confidential
data.
3. Methodology
The Comparative x-CEG approach: We present the Comparative Classification Error Gauge (x-CEG) approach.
This approach is motivated by a need to investigate how a dataset that is subjected to various data privacy
algorithms performs under different machine classifiers. Empirical results from this process are then gathered and a
comparative analysis is performed to determine the best classifier. In the comparative x-CEG approach, a data
privacy algorithm is applied on a dataset, where are the various data privacy procedures, as
shown in Fig 1. The privatized datasets is generated, where are the new privatized datasets with
each corresponding to . The privatized datasets are passed through a series of machine learning
classifiers, where is the different machine learning classifiers. Statistical properties of the
privatized datasets, after applying the various data privacy algorithms are quantified. The classification
error from each classifier is then measured and the classifier with the lowest classification error is chosen.
If the classification error of that chosen classifier is lower or equal to a set threshold, then better utility might be
achieved; otherwise, an adjustment of the data privacy parameters of and or the machine learning parameters of
the chosen classifier. , is made. The results are resent to the classifier and a measure of the classification error is
done again. The procedure replicates until a desired classification error using the chosen classifier is achieved,
indicating a better utility for the user of that privatized dataset. The advantages of this approach is that multiple
checks for utility using various classifiers is achieved but also a more in-depth comparative analysis is done and as
such a knowledge of which machine learning classifier works best on which dataset. Secondly, since a large dataset
of empirical data is generated, choosing a reasonable threshold (trade-off) point becomes feasible.
4. Experiment
This section presents the experiment setup and preliminary results to be used. The Iris Fisher multivariate dataset
from the UCI repository [29], with both numeric and categorical data was used. One hundred and fifty data points
were used as the original dataset. The dataset consisted of numeric attributes: sepal length, sepal width, petal length,
and petal width. It also included one categorical attribute with three classes, Iris-Setosa, Iris-Versicolor, and Iris-
Virginica. Three data privacy algorithms were applied on the original data set; namely, noise addition, logarithmic
noise, and multiplicative noise. The comparative x-CEG algorithm was applied, using KNN, Neural Nets, Decision
Trees, AdaBoost, and Naïve Bayes classification methods. MATLAB and RapidMiner were employed as tools in
this experiment. The original Iris dataset with 150 records was imported into MATLAB and then vertically
partitioned with the first partition containing the continuous data and the second partition containing categorical data
of class labels. Data privacy algorithms were applied on the continuous portion of the data, and the categorical
417Kato Mivule and Claude Turner / Procedia Computer Science 20 (2013) 414 – 419
portion was used in classification of the data. The datasets were then set to RapidMiner, and a series of machine
learning classification techniques was applied on both the original Iris data and the various privatized Iris datasets,
allowing for an implementation of the Comparative x-CEG concept.
Fig. 1. The Comparative x-CEG procedure
4.1. Results
Fig. 2. (a) The comparative x-CEG classification error results; (b) The suggested tradeoff point (threshold) at 0.2 classification error
418 Kato Mivule and Claude Turner / Procedia Computer Science 20 (2013) 414 – 419
Fig. 3. Statistical distribution for the original data and privatized data using noise addition, logarithmic, and multiplicative privacy
4.2. Discussion
As shown in Fig. 2. (a)., using the original Iris dataset as a benchmark, the lowest classification error was 0.04 for
the original data using KNN, with 0.06 for AdaBoost Ensemble for the highest classification error. For a normal
original Iris dataset without privacy, classification error is less than 1 percent. However, after applying noise
addition privacy method, the classification error is at average 0.3, approximately 30 percent misclassification with
the non-adjusted mean and variance. Still, after adjusting noise addition parameters, with the mean set to 0 and
variance at 0.1, the classification error drops on average to 0.04, similar to the original dataset. While such close
results might be appealing on a data utility basis, the privacy of such a dataset is significantly diminished as the
dataset becomes similar to the original. Additionally, the classification error for both logarithmic and multiplicative
noise is on average 0.4, approximately 40 per cent misclassification. On the privacy basis, both logarithmic and
multiplicative noise provides better privacy. And, as shown in Fig. 3, in the scatter plots, it is very difficult to
separate or cluster data after applying logarithmic and multiplicative noise. However, the utility of such datasets is
not appealing when approximately 40 percent of the data is misclassified. As illustrated in Fig. 2. (b)., after
generating empirical data, a preferred threshold or trade-off point classification error is set at 0.2, approximately 20
percent misclassification. However, this could be lower, depending on user privacy and utility needs.
5. Conclusion
Employing the Comparative x-CEG methodology by adjusting data privacy parameters could generate adequate
empirical data to assist in selecting a trade-off point for preferred data privacy and utility levels. However, more
rigorous empirical studies are needed to further test this hypothesis. Furthermore, finding the optimal balance
between privacy and utility needs remains an intractable problem, and, as such, for future work, we seek to study
optimal trade-off points based on larger empirical datasets, and on a case by case basis. Also, we plan to conduct
analytical and empirical work comparing various data privacy algorithms not covered in this study.
419Kato Mivule and Claude Turner / Procedia Computer Science 20 (2013) 414 – 419
References
1. K. Mivule, C. Turner, and S.-
Computer Science, 2012, vol. 12, pp. 176 181.
2. rence
542.
3. M. Sramka, R. Safavi-naini, J. -oriented Framework for Measuring Privacy and Utility in Data
4. R. C.-W. Wong, A. W.-C. Fu, K. Wan
33rd international conference on Very large data bases, pp. 543 554, 2007.
5. - of the twenty third ACM SIGMOD-
SIGACT-SIGART symposium on Principles of database systems PODS 04, 2004, pp. 223 228.
6. -
conference on Management of data SIGMOD 07, 2007, pp. 67 78.
7. -
vol. 39, pp. 633 662, 2010.
8. Y. W. Y. Wang and X. W. X. Wu, Approximate inverse frequent itemset mining: privacy, complexity, and approximation. 2005.
9. -
annual ACM symposium on Symposium on theory of computing STOC 09, p. 351, 2008.
10. 80,
2010.
11. between , pp. 531 542, 2007.
12. ACM SIGKDD, 9, pp. 517 526, 2009.
13. -utility tradeoff for multi-dimensional
Privacy in Statistical Databases, Vol. 6344., Josep Domingo-Ferrer and Emmanouil Magkos, Eds. Springer, 2011, pp. 187 199.
14. eneel, V. Sassone, and
I. Wegener, Eds. Springer, 2006, pp. 1 12.
15.
Computer Science Volume 6514, 2011, pp. 123 139.
16. JICLT, vol. 6, no. 4, pp. 194 206, 2011.
17.
18. acy, data protection and emerging sciences and technologies: towards a
67, Mar. 2010.
19. hods for statistical disclosure limitation and methods for assessing
29, 2011.
20. - Dissertation, The University of
Texas at San Antonio, 2012.
21.
ACM SIGHIT symposium on International health informatics - 436.
22. R. J. Bayardo and R. - 228, 2005.
23. , 288, 2002.
24. M. H. Dunham(b), Data Mining, Introductory and Advanced Topics. Upper Saddle River, New Jersey: Prentice Hall, 2003, pp. 58 60.
25. A. Oganian and J. Domingo-
Journal of the United Nations Economic Commission for Europe, vol. 4, no. 18, pp. 345 353., 2001.
26.
Research Methods, American Statistical Association,, 1986, vol. Jay Kim, A, no. 3, pp. 370 374.
27. 71, 2012.
28. -01,
Statistical Research
29. -
and Computer Science., Irvine, CA, 2013.

Contenu connexe

Tendances

Using Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data MiningUsing Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data Mining14894
 
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data MiningPerformance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Miningidescitation
 
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...IRJET Journal
 
78201919
7820191978201919
78201919IJRAT
 
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMPRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMIJNSA Journal
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...IJDKP
 
Paper id 212014109
Paper id 212014109Paper id 212014109
Paper id 212014109IJRAT
 
Cluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesCluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesEditor IJMTER
 
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and ApproachesA Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches14894
 
Privacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data SetsPrivacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data SetsIJERA Editor
 
Enabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in CloudEnabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in CloudIOSR Journals
 
Privacy preservation techniques in data mining
Privacy preservation techniques in data miningPrivacy preservation techniques in data mining
Privacy preservation techniques in data miningeSAT Publishing House
 
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...cscpconf
 
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...Editor IJMTER
 
Privacy preserving dm_ppt
Privacy preserving dm_pptPrivacy preserving dm_ppt
Privacy preserving dm_pptSagar Verma
 
A Study of Usability-aware Network Trace Anonymization
A Study of Usability-aware Network Trace Anonymization A Study of Usability-aware Network Trace Anonymization
A Study of Usability-aware Network Trace Anonymization Kato Mivule
 
Ppback propagation-bansal-zhong-2010
Ppback propagation-bansal-zhong-2010Ppback propagation-bansal-zhong-2010
Ppback propagation-bansal-zhong-2010girivaishali
 

Tendances (20)

Using Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data MiningUsing Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data Mining
 
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data MiningPerformance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
 
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
 
F046043234
F046043234F046043234
F046043234
 
78201919
7820191978201919
78201919
 
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMPRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
 
Paper id 212014109
Paper id 212014109Paper id 212014109
Paper id 212014109
 
Cluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesCluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for Databases
 
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and ApproachesA Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
 
Privacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data SetsPrivacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data Sets
 
Enabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in CloudEnabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in Cloud
 
Privacy preservation techniques in data mining
Privacy preservation techniques in data miningPrivacy preservation techniques in data mining
Privacy preservation techniques in data mining
 
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...
AN EFFICIENT SOLUTION FOR PRIVACYPRESERVING, SECURE REMOTE ACCESS TO SENSITIV...
 
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...
 
Privacy preserving dm_ppt
Privacy preserving dm_pptPrivacy preserving dm_ppt
Privacy preserving dm_ppt
 
Ijnsa050202
Ijnsa050202Ijnsa050202
Ijnsa050202
 
A Study of Usability-aware Network Trace Anonymization
A Study of Usability-aware Network Trace Anonymization A Study of Usability-aware Network Trace Anonymization
A Study of Usability-aware Network Trace Anonymization
 
Hy3414631468
Hy3414631468Hy3414631468
Hy3414631468
 
Ppback propagation-bansal-zhong-2010
Ppback propagation-bansal-zhong-2010Ppback propagation-bansal-zhong-2010
Ppback propagation-bansal-zhong-2010
 

En vedette

17.mengadministrasi server dalam_jaringan
17.mengadministrasi server dalam_jaringan17.mengadministrasi server dalam_jaringan
17.mengadministrasi server dalam_jaringanAn Atsa
 
Wmit introduction 2012 english
Wmit introduction 2012 englishWmit introduction 2012 english
Wmit introduction 2012 englishgmesmatch
 
춘천MBC 정보통신공사업 소개
춘천MBC 정보통신공사업 소개춘천MBC 정보통신공사업 소개
춘천MBC 정보통신공사업 소개chmbc
 
Реальные углы обзора видеорегистраторов
Реальные углы обзора видеорегистраторовРеальные углы обзора видеорегистраторов
Реальные углы обзора видеорегистраторовarsney
 
Comparison between different marketing plans
Comparison between different marketing plansComparison between different marketing plans
Comparison between different marketing plansAji Subramanyan
 
June 2013 IRMAC slides
June 2013 IRMAC slidesJune 2013 IRMAC slides
June 2013 IRMAC slidesAlistair Croll
 
Resolution Independence - Preparing Websites for Retina Displays
Resolution Independence - Preparing Websites for Retina DisplaysResolution Independence - Preparing Websites for Retina Displays
Resolution Independence - Preparing Websites for Retina Displayssteveschrab
 
Wmit introduction 2012 english slideshare
Wmit introduction 2012 english slideshareWmit introduction 2012 english slideshare
Wmit introduction 2012 english slidesharegmesmatch
 
Baker Business Bootcamp
Baker Business BootcampBaker Business Bootcamp
Baker Business BootcampLGLG Ministry
 
Top Firefox Addons
Top Firefox AddonsTop Firefox Addons
Top Firefox Addonstechnobz
 
Book Design by Jason Gonzales
Book Design by Jason GonzalesBook Design by Jason Gonzales
Book Design by Jason GonzalesJason Gonzales
 
PROFESSIONAL LEARNING NETWORKS- MASS CUE 2013
PROFESSIONAL LEARNING NETWORKS- MASS CUE 2013PROFESSIONAL LEARNING NETWORKS- MASS CUE 2013
PROFESSIONAL LEARNING NETWORKS- MASS CUE 2013Jennifer L. Scheffer
 
Mechanical engineering
Mechanical engineeringMechanical engineering
Mechanical engineeringElavarasan S
 
HumanCloud - Trace
HumanCloud - TraceHumanCloud - Trace
HumanCloud - Traceutkarsh_hcbs
 

En vedette (20)

17.mengadministrasi server dalam_jaringan
17.mengadministrasi server dalam_jaringan17.mengadministrasi server dalam_jaringan
17.mengadministrasi server dalam_jaringan
 
Wmit introduction 2012 english
Wmit introduction 2012 englishWmit introduction 2012 english
Wmit introduction 2012 english
 
춘천MBC 정보통신공사업 소개
춘천MBC 정보통신공사업 소개춘천MBC 정보통신공사업 소개
춘천MBC 정보통신공사업 소개
 
Реальные углы обзора видеорегистраторов
Реальные углы обзора видеорегистраторовРеальные углы обзора видеорегистраторов
Реальные углы обзора видеорегистраторов
 
About P&T
About P&TAbout P&T
About P&T
 
Comparison between different marketing plans
Comparison between different marketing plansComparison between different marketing plans
Comparison between different marketing plans
 
AM01PRO
AM01PROAM01PRO
AM01PRO
 
June 2013 IRMAC slides
June 2013 IRMAC slidesJune 2013 IRMAC slides
June 2013 IRMAC slides
 
Resolution Independence - Preparing Websites for Retina Displays
Resolution Independence - Preparing Websites for Retina DisplaysResolution Independence - Preparing Websites for Retina Displays
Resolution Independence - Preparing Websites for Retina Displays
 
Wmit introduction 2012 english slideshare
Wmit introduction 2012 english slideshareWmit introduction 2012 english slideshare
Wmit introduction 2012 english slideshare
 
Presentt
PresenttPresentt
Presentt
 
Baker Business Bootcamp
Baker Business BootcampBaker Business Bootcamp
Baker Business Bootcamp
 
Carta mordiscon
Carta mordisconCarta mordiscon
Carta mordiscon
 
Top Firefox Addons
Top Firefox AddonsTop Firefox Addons
Top Firefox Addons
 
Book Design by Jason Gonzales
Book Design by Jason GonzalesBook Design by Jason Gonzales
Book Design by Jason Gonzales
 
PROFESSIONAL LEARNING NETWORKS- MASS CUE 2013
PROFESSIONAL LEARNING NETWORKS- MASS CUE 2013PROFESSIONAL LEARNING NETWORKS- MASS CUE 2013
PROFESSIONAL LEARNING NETWORKS- MASS CUE 2013
 
Oumh1103 bab 4
Oumh1103 bab 4Oumh1103 bab 4
Oumh1103 bab 4
 
Vocab dict
Vocab dictVocab dict
Vocab dict
 
Mechanical engineering
Mechanical engineeringMechanical engineering
Mechanical engineering
 
HumanCloud - Trace
HumanCloud - TraceHumanCloud - Trace
HumanCloud - Trace
 

Similaire à A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Using Machine Learning Classification as a Gauge

78201919
7820191978201919
78201919IJRAT
 
Data Anonymization Process Challenges and Context Missions
Data Anonymization Process Challenges and Context MissionsData Anonymization Process Challenges and Context Missions
Data Anonymization Process Challenges and Context Missionsijdms
 
A Comparative Study on Privacy Preserving Datamining Techniques
A Comparative Study on Privacy Preserving Datamining  TechniquesA Comparative Study on Privacy Preserving Datamining  Techniques
A Comparative Study on Privacy Preserving Datamining TechniquesIJMER
 
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...IJECEIAES
 
Evaluation of a Multiple Regression Model for Noisy and Missing Data
Evaluation of a Multiple Regression Model for Noisy and Missing Data Evaluation of a Multiple Regression Model for Noisy and Missing Data
Evaluation of a Multiple Regression Model for Noisy and Missing Data IJECEIAES
 
Privacy Preserving in Cloud Using Distinctive Elliptic Curve Cryptosystem (DECC)
Privacy Preserving in Cloud Using Distinctive Elliptic Curve Cryptosystem (DECC)Privacy Preserving in Cloud Using Distinctive Elliptic Curve Cryptosystem (DECC)
Privacy Preserving in Cloud Using Distinctive Elliptic Curve Cryptosystem (DECC)ElavarasaN GanesaN
 
Performance analysis of perturbation-based privacy preserving techniques: an ...
Performance analysis of perturbation-based privacy preserving techniques: an ...Performance analysis of perturbation-based privacy preserving techniques: an ...
Performance analysis of perturbation-based privacy preserving techniques: an ...IJECEIAES
 
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATIONPRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATIONcscpconf
 
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishing
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishingIEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishing
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishingIEEEFINALYEARSTUDENTPROJECTS
 
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishingIEEEMEMTECHSTUDENTSPROJECTS
 
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...Kumar Goud
 
Efficient technique for privacy preserving publishing of set valued data on c...
Efficient technique for privacy preserving publishing of set valued data on c...Efficient technique for privacy preserving publishing of set valued data on c...
Efficient technique for privacy preserving publishing of set valued data on c...ElavarasaN GanesaN
 
Framework to Avoid Similarity Attack in Big Streaming Data
Framework to Avoid Similarity Attack in Big Streaming Data Framework to Avoid Similarity Attack in Big Streaming Data
Framework to Avoid Similarity Attack in Big Streaming Data IJECEIAES
 
Indexing based Genetic Programming Approach to Record Deduplication
Indexing based Genetic Programming Approach to Record DeduplicationIndexing based Genetic Programming Approach to Record Deduplication
Indexing based Genetic Programming Approach to Record Deduplicationidescitation
 
Back to Basics - Firmware in NFV security
Back to Basics - Firmware in NFV securityBack to Basics - Firmware in NFV security
Back to Basics - Firmware in NFV securityLilminow
 
Review of the Implications of Uploading Unverified Dataset in A Data Banking ...
Review of the Implications of Uploading Unverified Dataset in A Data Banking ...Review of the Implications of Uploading Unverified Dataset in A Data Banking ...
Review of the Implications of Uploading Unverified Dataset in A Data Banking ...ssuser793b4e
 

Similaire à A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Using Machine Learning Classification as a Gauge (20)

78201919
7820191978201919
78201919
 
130509
130509130509
130509
 
130509
130509130509
130509
 
Data attribute security and privacy in Collaborative distributed database Pub...
Data attribute security and privacy in Collaborative distributed database Pub...Data attribute security and privacy in Collaborative distributed database Pub...
Data attribute security and privacy in Collaborative distributed database Pub...
 
Data Anonymization Process Challenges and Context Missions
Data Anonymization Process Challenges and Context MissionsData Anonymization Process Challenges and Context Missions
Data Anonymization Process Challenges and Context Missions
 
A Comparative Study on Privacy Preserving Datamining Techniques
A Comparative Study on Privacy Preserving Datamining  TechniquesA Comparative Study on Privacy Preserving Datamining  Techniques
A Comparative Study on Privacy Preserving Datamining Techniques
 
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
 
Evaluation of a Multiple Regression Model for Noisy and Missing Data
Evaluation of a Multiple Regression Model for Noisy and Missing Data Evaluation of a Multiple Regression Model for Noisy and Missing Data
Evaluation of a Multiple Regression Model for Noisy and Missing Data
 
Privacy Preserving in Cloud Using Distinctive Elliptic Curve Cryptosystem (DECC)
Privacy Preserving in Cloud Using Distinctive Elliptic Curve Cryptosystem (DECC)Privacy Preserving in Cloud Using Distinctive Elliptic Curve Cryptosystem (DECC)
Privacy Preserving in Cloud Using Distinctive Elliptic Curve Cryptosystem (DECC)
 
Performance analysis of perturbation-based privacy preserving techniques: an ...
Performance analysis of perturbation-based privacy preserving techniques: an ...Performance analysis of perturbation-based privacy preserving techniques: an ...
Performance analysis of perturbation-based privacy preserving techniques: an ...
 
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATIONPRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
PRIVACY PRESERVING CLUSTERING IN DATA MINING USING VQ CODE BOOK GENERATION
 
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishing
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishingIEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishing
IEEE 2014 JAVA DATA MINING PROJECTS M privacy for collaborative data publishing
 
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing
2014 IEEE JAVA DATA MINING PROJECT M privacy for collaborative data publishing
 
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
 
Efficient technique for privacy preserving publishing of set valued data on c...
Efficient technique for privacy preserving publishing of set valued data on c...Efficient technique for privacy preserving publishing of set valued data on c...
Efficient technique for privacy preserving publishing of set valued data on c...
 
Framework to Avoid Similarity Attack in Big Streaming Data
Framework to Avoid Similarity Attack in Big Streaming Data Framework to Avoid Similarity Attack in Big Streaming Data
Framework to Avoid Similarity Attack in Big Streaming Data
 
Indexing based Genetic Programming Approach to Record Deduplication
Indexing based Genetic Programming Approach to Record DeduplicationIndexing based Genetic Programming Approach to Record Deduplication
Indexing based Genetic Programming Approach to Record Deduplication
 
Back to Basics - Firmware in NFV security
Back to Basics - Firmware in NFV securityBack to Basics - Firmware in NFV security
Back to Basics - Firmware in NFV security
 
Review of the Implications of Uploading Unverified Dataset in A Data Banking ...
Review of the Implications of Uploading Unverified Dataset in A Data Banking ...Review of the Implications of Uploading Unverified Dataset in A Data Banking ...
Review of the Implications of Uploading Unverified Dataset in A Data Banking ...
 
B04010610
B04010610B04010610
B04010610
 

Plus de Kato Mivule

Cancer Diagnostic Prediction with Amazon ML – A Tutorial
Cancer Diagnostic Prediction with Amazon ML – A TutorialCancer Diagnostic Prediction with Amazon ML – A Tutorial
Cancer Diagnostic Prediction with Amazon ML – A TutorialKato Mivule
 
Kato Mivule - Towards Agent-based Data Privacy Engineering
Kato Mivule - Towards Agent-based Data Privacy EngineeringKato Mivule - Towards Agent-based Data Privacy Engineering
Kato Mivule - Towards Agent-based Data Privacy EngineeringKato Mivule
 
Lit Review Talk by Kato Mivule: A Review of Genetic Algorithms
Lit Review Talk by Kato Mivule: A Review of Genetic AlgorithmsLit Review Talk by Kato Mivule: A Review of Genetic Algorithms
Lit Review Talk by Kato Mivule: A Review of Genetic AlgorithmsKato Mivule
 
Lit Review Talk by Kato Mivule: Protecting DNA Sequence Anonymity with Genera...
Lit Review Talk by Kato Mivule: Protecting DNA Sequence Anonymity with Genera...Lit Review Talk by Kato Mivule: Protecting DNA Sequence Anonymity with Genera...
Lit Review Talk by Kato Mivule: Protecting DNA Sequence Anonymity with Genera...Kato Mivule
 
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a GaugeAn Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a GaugeKato Mivule
 
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a GaugeAn Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a GaugeKato Mivule
 
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...Kato Mivule
 
Kato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance ComputingKato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance ComputingKato Mivule
 
Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...
Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...
Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...Kato Mivule
 
Kato Mivule: An Overview of Adaptive Boosting – AdaBoost
Kato Mivule: An Overview of  Adaptive Boosting – AdaBoostKato Mivule: An Overview of  Adaptive Boosting – AdaBoost
Kato Mivule: An Overview of Adaptive Boosting – AdaBoostKato Mivule
 
Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...
Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...
Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...Kato Mivule
 
Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...
Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...
Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...Kato Mivule
 
Towards A Differential Privacy Preserving Utility Machine Learning Classifier
Towards A Differential Privacy Preserving Utility Machine Learning ClassifierTowards A Differential Privacy Preserving Utility Machine Learning Classifier
Towards A Differential Privacy Preserving Utility Machine Learning ClassifierKato Mivule
 
A Robust Layered Control System for a Mobile Robot, Rodney A. Brooks; A Softw...
A Robust Layered Control System for a Mobile Robot, Rodney A. Brooks; A Softw...A Robust Layered Control System for a Mobile Robot, Rodney A. Brooks; A Softw...
A Robust Layered Control System for a Mobile Robot, Rodney A. Brooks; A Softw...Kato Mivule
 
Two Pseudo-random Number Generators, an Overview
Two Pseudo-random Number Generators, an Overview Two Pseudo-random Number Generators, an Overview
Two Pseudo-random Number Generators, an Overview Kato Mivule
 
Applying Data Privacy Techniques on Published Data in Uganda
Applying Data Privacy Techniques on Published Data in UgandaApplying Data Privacy Techniques on Published Data in Uganda
Applying Data Privacy Techniques on Published Data in UgandaKato Mivule
 
Utilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an OverviewUtilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an OverviewKato Mivule
 

Plus de Kato Mivule (17)

Cancer Diagnostic Prediction with Amazon ML – A Tutorial
Cancer Diagnostic Prediction with Amazon ML – A TutorialCancer Diagnostic Prediction with Amazon ML – A Tutorial
Cancer Diagnostic Prediction with Amazon ML – A Tutorial
 
Kato Mivule - Towards Agent-based Data Privacy Engineering
Kato Mivule - Towards Agent-based Data Privacy EngineeringKato Mivule - Towards Agent-based Data Privacy Engineering
Kato Mivule - Towards Agent-based Data Privacy Engineering
 
Lit Review Talk by Kato Mivule: A Review of Genetic Algorithms
Lit Review Talk by Kato Mivule: A Review of Genetic AlgorithmsLit Review Talk by Kato Mivule: A Review of Genetic Algorithms
Lit Review Talk by Kato Mivule: A Review of Genetic Algorithms
 
Lit Review Talk by Kato Mivule: Protecting DNA Sequence Anonymity with Genera...
Lit Review Talk by Kato Mivule: Protecting DNA Sequence Anonymity with Genera...Lit Review Talk by Kato Mivule: Protecting DNA Sequence Anonymity with Genera...
Lit Review Talk by Kato Mivule: Protecting DNA Sequence Anonymity with Genera...
 
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a GaugeAn Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
 
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a GaugeAn Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
 
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Usin...
 
Kato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance ComputingKato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance Computing
 
Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...
Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...
Literature Review: The Role of Signal Processing in Meeting Privacy Challenge...
 
Kato Mivule: An Overview of Adaptive Boosting – AdaBoost
Kato Mivule: An Overview of  Adaptive Boosting – AdaBoostKato Mivule: An Overview of  Adaptive Boosting – AdaBoost
Kato Mivule: An Overview of Adaptive Boosting – AdaBoost
 
Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...
Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...
Kato Mivule: COGNITIVE 2013 - An Overview of Data Privacy in Multi-Agent Lear...
 
Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...
Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...
Kato Mivule: An Investigation of Data Privacy and Utility Preservation Using ...
 
Towards A Differential Privacy Preserving Utility Machine Learning Classifier
Towards A Differential Privacy Preserving Utility Machine Learning ClassifierTowards A Differential Privacy Preserving Utility Machine Learning Classifier
Towards A Differential Privacy Preserving Utility Machine Learning Classifier
 
A Robust Layered Control System for a Mobile Robot, Rodney A. Brooks; A Softw...
A Robust Layered Control System for a Mobile Robot, Rodney A. Brooks; A Softw...A Robust Layered Control System for a Mobile Robot, Rodney A. Brooks; A Softw...
A Robust Layered Control System for a Mobile Robot, Rodney A. Brooks; A Softw...
 
Two Pseudo-random Number Generators, an Overview
Two Pseudo-random Number Generators, an Overview Two Pseudo-random Number Generators, an Overview
Two Pseudo-random Number Generators, an Overview
 
Applying Data Privacy Techniques on Published Data in Uganda
Applying Data Privacy Techniques on Published Data in UgandaApplying Data Privacy Techniques on Published Data in Uganda
Applying Data Privacy Techniques on Published Data in Uganda
 
Utilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an OverviewUtilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an Overview
 

Dernier

Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingsocarem879
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...KarteekMane1
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Milind Agarwal
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 

Dernier (20)

Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processing
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 

A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Using Machine Learning Classification as a Gauge

  • 1. Procedia Computer Science 20 (2013) 414 – 419 1877-0509 © 2013 The Authors. Published by Elsevier B.V. Selection and peer-review under responsibility of Missouri University of Science and Technology doi:10.1016/j.procs.2013.09.295 ScienceDirect Available online at www.sciencedirect.com Complex Adaptive Systems, Publication 3 Cihan H. Dagli, Editor in Chief Conference Organized by Missouri University of Science and Technology 2013- Baltimore, MD A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Using Machine Learning Classification as a Gauge Kato Mivulea and Claude Turnerb a* ab Bowie State University, 14000 Jericho Park Road, Bowie, MD, 20715 Abstract During the data privacy process, the utility of datasets diminishes as sensitive information such as personal identifiable information (PII) is removed, transformed, or distorted to achieve confidentiality. The intractability of attaining an equilibrium between data privacy and utility needs is well documented, requiring trade-offs, and further complicated by the fact that making such trade-offs also remains problematic. Given such complexity, in this paper, we endeavor to empirically investigate what parameters could be fine-tuned to achieve an acceptable level of data privacy and utility during the data privacy process, while making reasonable trade-offs. Therefore, we present the comparative classification error gauge (Comparative x-CEG) approach, a data utility quantification concept that employs machine learning classification techniques to gauge data utility based on the classification error. In this approach, privatized datasets are passed through a series of classifiers, each of which returns a classification error, and the classifier with the lowest classification error is chosen; if the classification error is lower or equal to a set threshold then better utility might be achieved, otherwise, adjustment to the data privacy parameters are made to the chosen classifier. The process repeats x times until the desired threshold is reached. The goal is to generate empirical results after a range of parameter adjustments in the data privacy process, from which a threshold level might be chosen to make trade-offs. Our preliminary results show that given a range of empirical results, it might be possible to choose a tradeoff point and publish privacy compliant data with an acceptable level of utility. Keywords: Data privacy; utility; machine learning classifiers; comparative data analysis 1. Introduction During the privacy preserving process, the utility of a dataset a measure of how useful a privatized dataset is to the user of that dataset diminishes as sensitive data such as PII is removed, transformed, or distorted to achieve confidentiality [1, 2, 3]. Yet, finding equilibrium between privacy and utility needs remains intractable, necessitating trade-offs [4, 5, 6]. For example, by using the suppression of data that is removing or deleting sensitive attributes before publication of data, privacy might be guaranteed to an extent that the PII and other sensitive data is removed. * Corresponding author. Tel.: 443-333-6169 E-mail address: mivulek0220@students.bowiestate.edu Available online at www.sciencedirect.com © 2013 The Authors. Published by Elsevier B.V. Selection and peer-review under responsibility of Missouri University of Science and Technology
  • 2. 415Kato Mivule and Claude Turner / Procedia Computer Science 20 (2013) 414 – 419 However, a researcher using such a published dataset might find it difficult to fully account for some categorical entries despite confidentiality guarantees provided to that privatized dataset. In another example involving numerical data, employing high noise perturbation levels to provide privacy for sensitive numerical data might render the privatized dataset useless to a researcher as too much noise distorts the original traits of the data. Given such intricacy, in this paper, we attempt to empirically explore what parameters could be adjusted to attain an adequate level of data privacy and utility during the data privacy process, while making practical trade-offs. We present the comparative classification error gauge (Comparative x-CEG) approach, a data utility quantification model that uses machine learning classification techniques to gauge data utility based on the classification error. The remaining part of this paper is ordered as follows. Section 2 discusses background and related work. Section 3 presents our methodology and experiment. In section 4, we present preliminary results. Finally, in section 5, provides concluding remarks. 2. Background and Related Work The task of achieving a satisfactory level of data utility while preserving privacy is a well-documented NP-hard problem that would necessitate trade-offs [7, 8]. Additionally, the problem is further complicated by the fact that attaining such trade-offs also remains intractable. Researchers have observed that the problem of preserving confidentiality while publishing useful statistical data is extensive and that the trade-off between the privacy and the utility of any anonymization technique, would largely depend on the level of background knowledge about a particular dataset, with higher levels indicating that it would be impossible to achieve any such trade-offs [9, 10, 11]. Li and Li (2009) indicated in their study that it is not promising to equate privacy and utility as datasets get distorted when they are privatized, leading to a decline of utility [12]. Even with the latest state of the art data privacy algorithms like differential privacy, confidentiality is guaranteed but at a major loss of data utility [13]. As Dwork (2006) concisely put it [14], utility; perfect utility can b . In other words, the more confidential data is made to be, the more likely that the privatized data will be become useless and decline in utility. The privacy definition problem: On the other hand, there happens to be no specific standard metric to define privacy, as Katos, Stowell, and Bedner (2011) observed, that privacy is a human and socially driven characteristic comprised of human traits such as acuities and sentiments [15]. Dayarathna (2011) stated that to wholly comprehend the notion of data privacy, an all-inclusive methodology to outlining data privacy should include the legal, technical, and ethical facets [16]. This point is additionally exemplified by Spiekermann (2012) who noted that one of the difficulties in designing and engineering data privacy is that the idea of privacy is fuzzy, frequently confused with data security, and as a result, very problematic to implement [17]. Adding to this point, Friedewald, Wright, Gutwirth, and Mordini (2010), in their research on the legal aspects of data privacy stated that since privacy is an evolving and shifting complex multi-layered notion, being described and treasured otherwise by various people; and that empirical studies are needed to assess how different people define and value privacy [18]. As Mathews and Harel (2011) observed, the human element remains a key factor and as such data privacy is intrinsically tied to fuzziness and evolutions of how individuals view privacy, and the same applies to data utility; that is, what information about themselves that individuals determine as fit to share with others [19]. Therefore, it becomes problematic to create a generalized data privacy and utility solution; however, different individuals and entities will have different data privacy needs and thus tailored data privacy solutions. Given the complexities of defining privacy, quantifying data utility is likewise problematic. However, various methods have been employed to enumerate data utility by basically quantifying the statistical differences between the original and privatized datasets, such as, the relative query error metric, research value metric, discernibility data metric, classification error metric, the Shannon entropy, and the information loss metric (mean square error) [ 20, 21, 22, 23, 24, 25]. However, in this paper, we suggest using the machine learning as a gauge, by using the classification error to adjust data privacy parameters until a desired threshold is attained. 2.1. Essential Terms While a number of data privacy algorithms exist, in this paper, the subsequent methods are used. Noise addition: is a data privacy algorithm in which random numbers selected from a normal distribution with zero mean and a standard deviation are added to data for obfuscation, and is expressed by [26, 27]:
  • 3. 416 Kato Mivule and Claude Turner / Procedia Computer Science 20 (2013) 414 – 419 (1) X is the original numerical data and is the set of random numbers (noise) with a distribution that is added to X, and Z is the privatized dataset [26, 27]. Multiplicative noise: random numbers whose mean μ= 1 and a small variance are produced, and then multiplied to the original data, generating confidential data, expressed as follows: [27, 28]: (2) Where is the original data; is the random numbers with mean μ= 1 and a small variance ; is the confidential data after multiplying and [27, 28]. Logarithmic noise: makes a logarithmic modification of the original data as shown below [26, 28]: (3) Random values are then produced and added to the data that underwent a logarithmic alteration . Lastly, the confidential data, , is generated as given in Equation (4) [26, 28]: (4) Note that the original data is represented by ; is the logarithmic altered data values; and is the confidential data. 3. Methodology The Comparative x-CEG approach: We present the Comparative Classification Error Gauge (x-CEG) approach. This approach is motivated by a need to investigate how a dataset that is subjected to various data privacy algorithms performs under different machine classifiers. Empirical results from this process are then gathered and a comparative analysis is performed to determine the best classifier. In the comparative x-CEG approach, a data privacy algorithm is applied on a dataset, where are the various data privacy procedures, as shown in Fig 1. The privatized datasets is generated, where are the new privatized datasets with each corresponding to . The privatized datasets are passed through a series of machine learning classifiers, where is the different machine learning classifiers. Statistical properties of the privatized datasets, after applying the various data privacy algorithms are quantified. The classification error from each classifier is then measured and the classifier with the lowest classification error is chosen. If the classification error of that chosen classifier is lower or equal to a set threshold, then better utility might be achieved; otherwise, an adjustment of the data privacy parameters of and or the machine learning parameters of the chosen classifier. , is made. The results are resent to the classifier and a measure of the classification error is done again. The procedure replicates until a desired classification error using the chosen classifier is achieved, indicating a better utility for the user of that privatized dataset. The advantages of this approach is that multiple checks for utility using various classifiers is achieved but also a more in-depth comparative analysis is done and as such a knowledge of which machine learning classifier works best on which dataset. Secondly, since a large dataset of empirical data is generated, choosing a reasonable threshold (trade-off) point becomes feasible. 4. Experiment This section presents the experiment setup and preliminary results to be used. The Iris Fisher multivariate dataset from the UCI repository [29], with both numeric and categorical data was used. One hundred and fifty data points were used as the original dataset. The dataset consisted of numeric attributes: sepal length, sepal width, petal length, and petal width. It also included one categorical attribute with three classes, Iris-Setosa, Iris-Versicolor, and Iris- Virginica. Three data privacy algorithms were applied on the original data set; namely, noise addition, logarithmic noise, and multiplicative noise. The comparative x-CEG algorithm was applied, using KNN, Neural Nets, Decision Trees, AdaBoost, and Naïve Bayes classification methods. MATLAB and RapidMiner were employed as tools in this experiment. The original Iris dataset with 150 records was imported into MATLAB and then vertically partitioned with the first partition containing the continuous data and the second partition containing categorical data of class labels. Data privacy algorithms were applied on the continuous portion of the data, and the categorical
  • 4. 417Kato Mivule and Claude Turner / Procedia Computer Science 20 (2013) 414 – 419 portion was used in classification of the data. The datasets were then set to RapidMiner, and a series of machine learning classification techniques was applied on both the original Iris data and the various privatized Iris datasets, allowing for an implementation of the Comparative x-CEG concept. Fig. 1. The Comparative x-CEG procedure 4.1. Results Fig. 2. (a) The comparative x-CEG classification error results; (b) The suggested tradeoff point (threshold) at 0.2 classification error
  • 5. 418 Kato Mivule and Claude Turner / Procedia Computer Science 20 (2013) 414 – 419 Fig. 3. Statistical distribution for the original data and privatized data using noise addition, logarithmic, and multiplicative privacy 4.2. Discussion As shown in Fig. 2. (a)., using the original Iris dataset as a benchmark, the lowest classification error was 0.04 for the original data using KNN, with 0.06 for AdaBoost Ensemble for the highest classification error. For a normal original Iris dataset without privacy, classification error is less than 1 percent. However, after applying noise addition privacy method, the classification error is at average 0.3, approximately 30 percent misclassification with the non-adjusted mean and variance. Still, after adjusting noise addition parameters, with the mean set to 0 and variance at 0.1, the classification error drops on average to 0.04, similar to the original dataset. While such close results might be appealing on a data utility basis, the privacy of such a dataset is significantly diminished as the dataset becomes similar to the original. Additionally, the classification error for both logarithmic and multiplicative noise is on average 0.4, approximately 40 per cent misclassification. On the privacy basis, both logarithmic and multiplicative noise provides better privacy. And, as shown in Fig. 3, in the scatter plots, it is very difficult to separate or cluster data after applying logarithmic and multiplicative noise. However, the utility of such datasets is not appealing when approximately 40 percent of the data is misclassified. As illustrated in Fig. 2. (b)., after generating empirical data, a preferred threshold or trade-off point classification error is set at 0.2, approximately 20 percent misclassification. However, this could be lower, depending on user privacy and utility needs. 5. Conclusion Employing the Comparative x-CEG methodology by adjusting data privacy parameters could generate adequate empirical data to assist in selecting a trade-off point for preferred data privacy and utility levels. However, more rigorous empirical studies are needed to further test this hypothesis. Furthermore, finding the optimal balance between privacy and utility needs remains an intractable problem, and, as such, for future work, we seek to study optimal trade-off points based on larger empirical datasets, and on a case by case basis. Also, we plan to conduct analytical and empirical work comparing various data privacy algorithms not covered in this study.
  • 6. 419Kato Mivule and Claude Turner / Procedia Computer Science 20 (2013) 414 – 419 References 1. K. Mivule, C. Turner, and S.- Computer Science, 2012, vol. 12, pp. 176 181. 2. rence 542. 3. M. Sramka, R. Safavi-naini, J. -oriented Framework for Measuring Privacy and Utility in Data 4. R. C.-W. Wong, A. W.-C. Fu, K. Wan 33rd international conference on Very large data bases, pp. 543 554, 2007. 5. - of the twenty third ACM SIGMOD- SIGACT-SIGART symposium on Principles of database systems PODS 04, 2004, pp. 223 228. 6. - conference on Management of data SIGMOD 07, 2007, pp. 67 78. 7. - vol. 39, pp. 633 662, 2010. 8. Y. W. Y. Wang and X. W. X. Wu, Approximate inverse frequent itemset mining: privacy, complexity, and approximation. 2005. 9. - annual ACM symposium on Symposium on theory of computing STOC 09, p. 351, 2008. 10. 80, 2010. 11. between , pp. 531 542, 2007. 12. ACM SIGKDD, 9, pp. 517 526, 2009. 13. -utility tradeoff for multi-dimensional Privacy in Statistical Databases, Vol. 6344., Josep Domingo-Ferrer and Emmanouil Magkos, Eds. Springer, 2011, pp. 187 199. 14. eneel, V. Sassone, and I. Wegener, Eds. Springer, 2006, pp. 1 12. 15. Computer Science Volume 6514, 2011, pp. 123 139. 16. JICLT, vol. 6, no. 4, pp. 194 206, 2011. 17. 18. acy, data protection and emerging sciences and technologies: towards a 67, Mar. 2010. 19. hods for statistical disclosure limitation and methods for assessing 29, 2011. 20. - Dissertation, The University of Texas at San Antonio, 2012. 21. ACM SIGHIT symposium on International health informatics - 436. 22. R. J. Bayardo and R. - 228, 2005. 23. , 288, 2002. 24. M. H. Dunham(b), Data Mining, Introductory and Advanced Topics. Upper Saddle River, New Jersey: Prentice Hall, 2003, pp. 58 60. 25. A. Oganian and J. Domingo- Journal of the United Nations Economic Commission for Europe, vol. 4, no. 18, pp. 345 353., 2001. 26. Research Methods, American Statistical Association,, 1986, vol. Jay Kim, A, no. 3, pp. 370 374. 27. 71, 2012. 28. -01, Statistical Research 29. - and Computer Science., Irvine, CA, 2013.