Privacy and Purpose Constraints #15
Comments
This reads like an uncommon use of "purpose limitation". Assurance that no party can acquire data of a certain kind would be access control (or data minimization). Purpose limitation typically refers to an attestation that even though a party can access some data, they will limit the purposes that they use it for. We should pursue both: limiting the data that's accessible and getting promises that the data that is shared is only used for the specific purpose for which it is provided. But I think we shouldn't confuse terms of art.
@bmayd - thanks for the suggestion. Splitting out "Purpose Limitation" and "Privacy" makes sense to me, as "Purpose Limitation" isn't de facto private. I don't think we can drop Privacy entirely though. I'm less sure about adding "Verifiable Input" and "Auditability". "Verifiable Input" seems like a tactic for achieving correctness, and "Auditability" seems like a tactic for trusting that a system is purpose limited. Curious what others think here.
@npdoty, I'm not sure it's that far off, though the use of "data outputs" might be a bit confusing, and "information" would be a bit more clear. Something like: Purpose Limited: The API (and surrounding system) provides sufficient guarantees to allow the user agent to trust that, given the API's output, no party can learn any information beyond the intended result.
I support the intent of your statement. You may be interested in the goal of developing functionality that will produce only a limited set of data, where restricting the output data is the intended purpose of the system. However, "purpose limitation" is a term for a well-known concept in privacy law that refers to something different: a party documenting in advance what purpose it will use data for, promising only to use it for that purpose, and collecting data that could be used for multiple things but only using it for that limited purpose.
I believe we are utilizing that well-known concept in privacy law intentionally. The only difference is that, because we are working on a web standard and not a law with some form of enforcement, we cannot rely on a promise and must instead rely on technical guarantees. The standard documents in advance the purpose for which the data is used (i.e., attribution measurement); data is collected which, without the technical constraints, could be used for more than that purpose; and the data is then used only for that limited purpose (because of the technical constraints).
@npdoty Appreciate your callout; agree terms of art should not be confused as it leads to unnecessary disambiguation overhead. What do you think of "Purpose Constraint" as an alternative?
Although I think both are worth pursuing, I think this group ought to consider promises out of scope. However, technical and data-based evidence that can be used to affirm or refute promise claims should be in scope. For example, we can't enforce a promise that data won't be tampered with, but we can sign the data so that tampering can be detected.
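As an aside for readers, here is a minimal sketch of the kind of tamper-evidence mentioned above, using an HMAC over a report. It is illustrative only: the key handling, function names, and report format are assumptions, not part of any proposal discussed in this thread.

```python
import hmac
import hashlib

# Illustrative only: assumes a shared key whose distribution is handled elsewhere.
SECRET_KEY = b"example-shared-key"

def sign_report(report: bytes, key: bytes = SECRET_KEY) -> bytes:
    """Return an HMAC-SHA256 tag over the report bytes."""
    return hmac.new(key, report, hashlib.sha256).digest()

def verify_report(report: bytes, tag: bytes, key: bytes = SECRET_KEY) -> bool:
    """True only if the report matches the tag (constant-time comparison)."""
    expected = hmac.new(key, report, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

report = b'{"conversion_value": 3}'
tag = sign_report(report)
assert verify_report(report, tag)             # untouched report verifies
assert not verify_report(report + b"x", tag)  # any modification is detected
```

A deployed system would more likely use asymmetric signatures so that verifiers do not hold the signing key; the HMAC version is shown only because it fits in a few self-contained lines and makes the point that tampering can be detected even though it cannot be prevented.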
@bmayd I disagree that this causes confusion or leads to unnecessary disambiguation overhead. I am using this term of art intentionally. Here's a definition for purpose limitation:
This is our intention, with the only addition being that we are aiming to additionally provide technical guarantees to enforce it. @npdoty are you arguing that adding these technical guarantees makes this definition no longer compatible?
@eriktaubeneck yes, I still find the use of the term here confusing and in conflict with its meaning elsewhere. Typically, I think a purpose in this context would be "measurement of purchases resulting from advertising", not "to calculate a result of a certain class of function in aggregate form with a differentially private guarantee". The purpose of the data collected, or how it's being used, is not specified in explicit terms to the user as part of the technical guarantee of the cryptographic design of the system.

It can be very helpful to privacy goals to provide some technical guarantees, and existing data protection and privacy law is familiar with that concept. For example, the principle in the GDPR following purpose limitation is data minimisation (sic), which includes limiting the data that is collected or processed by a party (for example, the data output of a privacy-preserving measurement aggregate calculation). That's the proposed contribution of this work -- that the data output is minimized such that the recipient never learns or accesses the individual user data at all. Providing technical designs that limit the amount of output data, and thus how it can subsequently be used in ways that might harm people's privacy, is a valuable contribution; it's just something that we use different terms for than purpose limitation.

Generally we would like to design Web technology that is more closely driven by use cases and not so easily re-used or abused for other purposes, and we should come up with some terminology for that. I've occasionally used "fit for purpose" in describing that idea informally, but that's also not quite right.
This makes sense; however, I think we are mixing up the threat model for the PATCG/WG and the specific private measurement spec. For the threat model, our goal is to consider proposals which offer technical means of purpose limitation (for lack of a better name at the moment). For the private measurement spec, the purpose is something like "differentially private measurement of conversions resulting from advertising impressions." In that sense, the goal of the threat model is to help evaluate whether a specific proposal actually provides purpose limitation of its specified purpose through technical guarantees.
I think this depends on what is meant by "the data collected". For the private measurement spec, if it's the (typically encrypted) data leaving the client, then I'd argue that it is in fact specified as the purpose above (and the system limits it to that). If "the data collected" is the aggregates which come out of the private computation system, then I agree that it's not specified. That said, I think you can always make that argument, and it's turtles all the way down.
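To make "differentially private measurement of conversions" slightly more concrete for readers of this thread, here is a minimal sketch of the Laplace mechanism applied to a conversion count. The clipping bound, epsilon value, and function names are illustrative assumptions, not taken from any specific proposal.

```python
import random

def laplace_noise(scale: float) -> float:
    # A Laplace(0, scale) sample is the difference of two Exponential samples with mean `scale`.
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def dp_conversion_sum(per_user_values, clip: float = 1.0, epsilon: float = 1.0) -> float:
    """Sum per-user conversion values and release only a noisy aggregate.

    Clipping each user's contribution to `clip` bounds the sensitivity, so
    Laplace noise with scale = clip / epsilon yields an epsilon-DP result.
    """
    clipped = [min(max(v, 0.0), clip) for v in per_user_values]
    return sum(clipped) + laplace_noise(clip / epsilon)

# Illustrative use: the noisy aggregate is released; individual values are not.
print(dp_conversion_sum([1, 0, 1, 1, 0, 1], clip=1.0, epsilon=1.0))
```

The point relevant to this discussion is that the guarantee is enforced by the mechanism itself rather than by a promise about how the output will later be used.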
@eriktaubeneck I find Privacy difficult to make meaningful or verifiable assertions about and was endeavoring to find an alternative that could be technically addressed. APIs can make assertions about, and report on, data inputs, processing, and outputs, but regarding the assurance that "no party can learn any information beyond the intended result":
I think suggesting what might or might not be learned is problematic -- an API can only know what data is reported; it can't know how the data is applied, and in most cases the intent is presumably to use the outputs to inform an understanding of a larger context. The question then becomes: are there cases in which increased understanding of the larger context could constitute a violation of privacy? I don't think it is a question that can be answered in the context of the API or a standard, but rather see it as a policy matter.
In the context of the initial statement, I was including them because I believe they are goals that must be achieved in order for the model to be considered reliable and trustworthy in the face of adversaries seeking to corrupt it or apply it in violation of stated terms.
I agree with @npdoty that "purpose limitation" is confusing in this context. I think the definition of "purpose" is broad enough to extend to post-processing of the private output of the API, which we probably don't want to constrain once it's outside of the API's protection.
I sense the miscommunication here is due to relativity: an API like ARA or IPA, built for attribution, is relatively much more purpose constrained than a more general-purpose API. I suggest that it is a noble aim to set ourselves on a course for purpose limitations even if the purposes are still quite broad.
Having read through the various responses above and looked at definitions of "purpose limitation" online, I'm still inclined to agree with @npdoty that this is an unconventional use of a term that has a domain-specific meaning and as such could be confusing. His description captures the conventional meaning.
In other words, it refers to data reuse that is possible, but should not happen, while in our context the intent is to identify that data reuse is not possible and cannot happen due to technical limitations. (Is this bike shedding?)
I'm not sure this is bike shedding - I think clarity around the way we talk to each other is critical to having confidence that when we say we agree, we are actually in agreement. In this vein, avoiding the use of any term in a way such that any significant contributor to the discussion believes there is conflation or confusion seems the right thing to do.

What I believe we're talking about is technical measures in the design of the system (and crucially its data outputs) which inhibit the possible purposes to which data that exits the system can be put. We need a fairly short and clear phrase for this if it's going to be a criterion we want to assess proposals against.

One reason not to conflate it with the common understanding of purpose limitation as articulated above is that we may also want to leverage that concept (for instance by recommending that implementing parties make suitable public attestations, even auditably) where we cannot inhibit certain unwanted uses for data outputs by technical means and system / protocol design alone.
Yup, I think making a concerted effort to be specific, concise and consistent when defining key terms within the problem domain leads to more productive discussions and better, faster alignment. Perhaps "usage constrained"? |
I like this, though I might avoid using "purposes" and instead use something like "application". "Technical proscription of data application" could just be "technical application proscription" or even "proscription" in colloquial usage.
+1
The reason I used 'inhibition' is that it accepts that although some things can be completely prevented by technical design, some things may only be possible to make harder (and ideally less powerful also). I'm not sure 'proscription' conveys that - 'constrained' would work for me. However, at this point we may now indeed be bike shedding, so I won't object if others are happy.
Originally posted by @bmayd in #14 (comment). Moving to a new issue since it's a distinct issue from the PR opened in issue 14.