Overview

PrivacyStreams simply explained.

PrivacyStreams is a framework for privacy-friendly personal data processing. It provides easy-to-use APIs in Android for accessing and processing various types of personal data. It focuses on two challenges in accessing and processing personal data:

As an example of both issues, a sleep monitoring app might only need microphone access to check the loudness of the current environment. The developer would have to write a lot of code to record and process audio using MediaRecorder, and end users might be concerned because the app has full access to the data collected by the microphone.

PrivacyStreams is designed to address these issues. Its main features include:

For example, with PrivacyStreams, a sleep monitor can access and process audio data from the microphone with a few lines of code:


uqi.getData(Audio.recordPeriodic(10*1000, 2*60*1000), Purpose.HEALTH("monitoring sleep")) // Record 10 seconds of audio periodically, with a 2-minute interval between records.
    .setField("loudness", AudioOperators.calcLoudness("audio_data")) // Add a custom field "loudness" holding each record's audio loudness.
    .onChange("loudness", callback); // Invoke the callback with the loudness value whenever "loudness" changes.

Apps developed with PrivacyStreams can be easily analyzed and verified to address privacy concerns of users.

Microphone is used by this app to calculate loudness periodically.

- Verified by PrivacyStreams.

Installing PrivacyStreams

To use PrivacyStreams in your Android app, please add the following line to the build.gradle file under the app module.

dependencies {
    compile 'io.github.privacystreams:privacystreams-android-sdk:0.1.7'
    ...
}

That’s it!

Some types of data and operations may need extra installation and/or configuration. For example:

You may also want to check whether your installation is successful.

Quick examples

Before going into details, let’s take a quick look at what it is like to use PrivacyStreams for personal data processing.

Getting microphone loudness periodically

First, let’s review the sleep monitor example from the Overview section. We used the following code to get audio loudness periodically.


uqi.getData(Audio.recordPeriodic(10*1000, 2*60*1000), Purpose.HEALTH("monitoring sleep")) // Record 10 seconds of audio periodically, with a 2-minute interval between records.
    .setField("loudness", AudioOperators.calcLoudness("audio_data")) // Add a custom field "loudness" holding each record's audio loudness.
    .onChange("loudness", callback); // Invoke the callback with the loudness value whenever "loudness" changes.

UQI stands for “unified query interface”, which is the only interface in PrivacyStreams for accessing all kinds of personal data.

The first parameter of UQI.getData() is called a “Provider”, which declares the data we want to access. In the example, Audio.recordPeriodic() provides audio data by recording from the microphone periodically. The second parameter specifies the purpose of the personal data access. In the example, the purpose is “monitoring sleep”, in the HEALTH category.

UQI.getData() will produce a stream of data items. In this example, each item represents an audio record. The format of an audio record is shown as follows:

Reference | Name | Type | Description
Audio.TIME_CREATED | "time_created" | Long | The timestamp of when this item is created. It is a general field for all items.
Audio.TIMESTAMP | "timestamp" | Long | The timestamp of when the audio/record was generated.
Audio.AUDIO_DATA | "audio_data" | AudioData | The abstraction of audio data. The value is an AudioData instance.

This means each Audio item has three pre-defined fields: "time_created", "timestamp" and "audio_data". Below is an example of an Audio item:


// An example of an Audio Item.
{
    Long "time_created": 1489528276655,
    Long "timestamp": 1489528266640,
    AudioData "audio_data": <AudioData@12416728>
}

Each data type has a list of providers that can produce that type of data. In this example, the provider is Audio.recordPeriodic(), which provides a live stream of periodically generated audio record items.

Type Reference & Description
PStreamProvider Audio.recordPeriodic(long durationPerRecord, long interval)
Provides a live stream of Audio items. Audio is recorded from the microphone periodically: each Audio item covers durationPerRecord milliseconds, and recording pauses for interval milliseconds before the next item starts. For example, recordPeriodic(1000, 4000) will record audio from 0s-1s, 5s-6s, 10s-11s, …
- durationPerRecord: the time duration of each audio record, in milliseconds.
- interval: the time interval between each two records, in milliseconds.
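To make the timing concrete, the following sketch computes the recording windows implied by the description above. It is plain Java rather than PrivacyStreams code; the class and method names (RecordSchedule, windows) are hypothetical and only illustrate the arithmetic, under the assumption that each record starts durationPerRecord + interval milliseconds after the previous one started.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PrivacyStreams: it only illustrates the
// timing described for Audio.recordPeriodic(durationPerRecord, interval).
class RecordSchedule {
    // Returns {startMs, endMs} pairs for the first `count` records.
    static List<long[]> windows(long durationPerRecord, long interval, int count) {
        List<long[]> result = new ArrayList<>();
        long period = durationPerRecord + interval; // start-to-start distance
        for (int i = 0; i < count; i++) {
            long start = i * period;
            result.add(new long[] {start, start + durationPerRecord});
        }
        return result;
    }

    public static void main(String[] args) {
        // Mirrors recordPeriodic(1000, 4000): prints 0s-1s, 5s-6s, 10s-11s
        for (long[] w : windows(1000, 4000, 3)) {
            System.out.println(w[0] / 1000 + "s-" + w[1] / 1000 + "s");
        }
    }
}
```

With the values from the sleep monitor example (10 seconds of audio, 2-minute interval), the same arithmetic gives one record roughly every 130 seconds.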

The list of all available data types and corresponding providers can be found here.

The second line, .setField("loudness", AudioOperators.calcLoudness("audio_data")), transforms the stream produced by the first line. Specifically, it adds a new custom field “loudness” to each audio record item, indicating the loudness (dB) of the audio.


// An example of Audio Item after setting "loudness" field.
{
    Long "time_created": 1489528276655,
    Long "timestamp": 1489528266640,
    AudioData "audio_data": <AudioData@12416728>,
    Double "loudness": 30.0
}

The loudness value is calculated using a built-in operator calcLoudness(). You can find the list of all built-in operators here. Developers can also customize their own operators.
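This page does not say how calcLoudness() derives the dB value. As a rough illustration only, one common approach is to take the root mean square (RMS) of the PCM samples and convert it to decibels with 20·log10. The sketch below assumes 16-bit PCM samples and a full-scale reference, so its results are zero or negative; the example item above shows a positive value, so the built-in operator presumably uses a different reference level. LoudnessSketch and rmsDb are hypothetical names.

```java
// Illustrative only: a common RMS-based loudness estimate, not
// necessarily what AudioOperators.calcLoudness() does internally.
class LoudnessSketch {
    // Returns loudness in dB relative to full scale (0 dB = maximum amplitude).
    static double rmsDb(short[] samples) {
        if (samples.length == 0) {
            return Double.NEGATIVE_INFINITY; // no signal at all
        }
        double sumSquares = 0.0;
        for (short s : samples) {
            double normalized = s / 32768.0; // map 16-bit PCM to [-1, 1)
            sumSquares += normalized * normalized;
        }
        double rms = Math.sqrt(sumSquares / samples.length);
        return 20.0 * Math.log10(rms); // log10(0) yields -Infinity for silence
    }

    public static void main(String[] args) {
        short[] halfScale = {16384, -16384, 16384, -16384};
        System.out.println(rmsDb(halfScale) + " dB"); // roughly -6.02 dB
    }
}
```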

The third line, .onChange("loudness", callback), outputs the items with a callback. Specifically, it monitors the value of “loudness”, and fires a callback once the loudness value changes. To get the code to work, you will need to define what the callback is. A working example is shown as follows:


// Make sure you have included the following audio permission tag in the manifest:
// <uses-permission android:name="android.permission.RECORD_AUDIO" />

// Define a callback to handle loudness changes.
// The type parameter must match the field type: "loudness" is a Double.
Callback<Double> callback = new Callback<Double>() {
    @Override
    protected void onInput(Double loudness) {
        System.out.println("Current loudness is " + loudness + " dB.");
        // ...
    }
};

uqi.getData(Audio.recordPeriodic(10*1000, 2*60*1000), Purpose.HEALTH("monitoring sleep"))
    .setField("loudness", AudioOperators.calcLoudness("audio_data"))
    .onChange("loudness", callback);

Getting recently called contacts

Here is another example: getting a list of recently called phone numbers.


List<String> recentCalledNumbers = 
    uqi.getData(Call.getLogs(), Purpose.SOCIAL("finding your recent called contacts."))
       .filter("type", "outgoing")    // Only keep the call logs whose "type" field is "outgoing"
       .sortBy("timestamp")           // Sort the call logs according to "timestamp" field, in ascending order
       .reverse()                     // Reverse the order, now the most recent call log comes first
       .limit(10)                     // Keep the most recent 10 logs
       .asList("contact");            // Output the values of "contact" field (the phone numbers) to a list

The above query accesses the call logs with Call.getLogs() and processes the call logs with functions like filter, sortBy, etc.
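For readers more familiar with java.util.stream, the pipeline above is roughly analogous to the following plain-Java sketch over mock call logs. It does not use PrivacyStreams at all: the CallLogSketch class, its helper methods and the sample phone numbers are hypothetical, and each item is modelled as a simple Map from field name to value.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in: items are plain maps, not PrivacyStreams items.
class CallLogSketch {
    static Map<String, Object> call(String contact, String type, long timestamp) {
        Map<String, Object> item = new HashMap<>();
        item.put("contact", contact);
        item.put("type", type);
        item.put("timestamp", timestamp);
        return item;
    }

    // Most recent outgoing contacts first, at most `limit` of them.
    static List<String> recentOutgoing(List<Map<String, Object>> logs, int limit) {
        List<Map<String, Object>> sorted = logs.stream()
            .filter(item -> "outgoing".equals(item.get("type")))             // like filter("type", "outgoing")
            .sorted(Comparator.comparingLong(
                (Map<String, Object> item) -> (Long) item.get("timestamp"))) // like sortBy("timestamp")
            .collect(Collectors.toList());
        Collections.reverse(sorted);                                         // like reverse()
        return sorted.stream()
            .limit(limit)                                                    // like limit(10)
            .map(item -> (String) item.get("contact"))                       // like asList("contact")
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> logs = Arrays.asList(
            call("555-0100", "outgoing", 1000L),
            call("555-0101", "incoming", 2000L),
            call("555-0102", "outgoing", 3000L));
        System.out.println(recentOutgoing(logs, 10)); // [555-0102, 555-0100]
    }
}
```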

The pre-defined fields in a Call item include:

Reference | Name | Type | Description
Call.TIME_CREATED | "time_created" | Long | The timestamp of when this item is created. It is a general field for all items.
Call.TIMESTAMP | "timestamp" | Long | The timestamp of when the phone call happened.
Call.CONTACT | "contact" | String | The contact (phone number or name) of the phone call.
Call.DURATION | "duration" | Long | The duration of the phone call, in milliseconds.
Call.TYPE | "type" | String | The type of the phone call: “incoming”, “outgoing” or “missed”.

Note that each “Reference” is a constant equivalent to its “Name”, i.e. filter(Call.TYPE, "outgoing") is the same as filter("type", "outgoing").

About permissions. Accessing call logs requires the READ_CALL_LOG permission in Android. To use the above code, you need to add the permission tag to AndroidManifest.xml and handle the exception thrown if the user denies the permission request. For example:

In AndroidManifest.xml:

...
<uses-permission android:name="android.permission.READ_CALL_LOG" />

<application
           android:theme="@style/AppTheme"
           ...

In Java code:


try {
    List<String> recentCalledNumbers = 
        uqi.getData(Call.getLogs(), Purpose.SOCIAL("finding your closest friends."))
           .filter("type", "outgoing")  // Only keep the outgoing call logs
           .sortBy("timestamp")         // Sort the call logs according to timestamp, in ascending order
           .reverse()                   // Reverse the order, now the most recent call log comes first
           .limit(10)                   // Keep the most recent 10 logs
           .asList("contact");          // Output the values of CONTACT field (the phone numbers) to a list
} catch (PSException e) {
    if (e.isPermissionDenied()) {
        String[] deniedPermissions = e.getDeniedPermissions();
        ...
    }
}

That’s it! More details about exception handling are given in the Exceptions and Permissions section.

PrivacyStreams API

This section explains the PrivacyStreams APIs in detail, using a more complicated example.

Suppose we want to do the following programming task with PrivacyStreams:

The code to do the task with PrivacyStreams is as follows:


String mostCalledContact = 
     uqi.getData(Call.getLogs(), Purpose.SOCIAL("finding closest contact."))  // get a stream of call logs
        .filter(TimeOperators.recent("timestamp", 365*24*60*60*1000L))  // keep the call logs from the last 365 days (1000L forces long arithmetic; the int product would overflow)
        .groupBy("contact")  // group by "contact" field (phone number)
        .setGroupField("#calls", StatisticOperators.count())  // create "#calls" field as the number of grouped call logs in each group
        .select(ItemsOperators.getItemWithMax("#calls"))  // select the item with largest "#calls"
        .getFirst("contact");  // get the "contact" field of the item
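Conceptually, the grouping steps behave like the following plain-Java sketch, which counts calls per contact and then picks the contact with the highest count. It is an analogy built on java.util.stream, not the PrivacyStreams implementation; MostCalledSketch and mostCalled are hypothetical names, and the input is simplified to a list holding the "contact" value of each call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical analogy for groupBy + count + "item with max" using Java streams.
class MostCalledSketch {
    // Returns the contact appearing most often, or empty if there are no calls.
    static Optional<String> mostCalled(List<String> contactsOfCalls) {
        Map<String, Long> callsPerContact = contactsOfCalls.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting())); // group and count
        return callsPerContact.entrySet().stream()
            .max(Map.Entry.comparingByValue()) // entry with the largest count
            .map(Map.Entry::getKey);           // keep only the contact
    }

    public static void main(String[] args) {
        List<String> calls = Arrays.asList("alice", "bob", "alice", "carol", "alice", "bob");
        System.out.println(mostCalled(calls).orElse("none")); // alice
    }
}
```

Returning an Optional makes the empty-call-log case explicit; how the library itself reports an empty result is not covered here.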

Unified query interface (UQI)

In PrivacyStreams, all types of personal data can be accessed and processed through the unified query interface (UQI).

UQI.getData(Provider, Purpose)[.transform(Transformation)]*.output(Action)

The query describes a PrivacyStreams pipeline, which is a sequence of three types of functions, including:

The Transformation and Action functions build on many operators, including comparators, arithmetic operators, etc. For example, filter() is a Transformation, and it can take the TimeOperators.recent() operator as its parameter, meaning it only keeps the items whose TIMESTAMP field value is a recent time.
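One way to picture an operator is as a small function that a Transformation applies to each item. The sketch below models a recent-style operator as a java.util.function.Predicate over map-based items. The names OperatorSketch and recent, the Map representation, and the explicit nowMs parameter (added so the behaviour is deterministic) are assumptions for illustration and not the PrivacyStreams API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of an operator passed to a filter-like transformation.
class OperatorSketch {
    // Builds a predicate that accepts items whose `field` timestamp lies
    // within `durationMs` before `nowMs`.
    static Predicate<Map<String, Object>> recent(String field, long durationMs, long nowMs) {
        return item -> {
            Object value = item.get(field);
            if (!(value instanceof Long)) {
                return false; // missing or non-timestamp field: drop the item
            }
            long timestamp = (Long) value;
            return timestamp <= nowMs && nowMs - timestamp <= durationMs;
        };
    }

    public static void main(String[] args) {
        long dayMs = 24 * 60 * 60 * 1000L; // long literal avoids int overflow
        long now = 1000L * dayMs;
        Map<String, Object> item = new HashMap<>();
        item.put("timestamp", now - 10 * dayMs);
        System.out.println(recent("timestamp", 365 * dayMs, now).test(item)); // true
    }
}
```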

In addition to the functions, a query also requires a Purpose parameter that states the purpose of the data access. In the example, the purpose of accessing call logs is “finding closest contact”, in the SOCIAL category. We suggest you carefully explain the purposes in your app, because doing so helps users understand why your app needs the data and thus improves the privacy transparency of your app. Several purpose categories (such as Purpose.ADS(..), Purpose.SOCIAL(..), etc.) are available for you to select from.

PrivacyStreams pipeline

The figure below shows an overview of the PrivacyStreams pipeline:

PrivacyStreams overview

The basic data types in PrivacyStreams include Item and Stream.

The pipeline of the running example is illustrated as follows (note that some field names are simplified and the field values are mocked):

A pipeline illustration of the code in the example.

Reusing streams

(Deprecated)

Sometimes you may need to reuse a stream for different actions. For example, in the above example, if we also want to get the phone number that has the longest total duration of calls, we may need to reuse the call log stream.

We provide a method reuse(int) to support stream reusing, where the int parameter means the number of reuses.


MStreamInterface streamToReuse = 
              uqi.getData(Call.getLogs(), Purpose.SOCIAL("finding your closest contact."))
                 .filter(recent("timestamp", 365*24*60*60*1000L))  // 1000L forces long arithmetic; the int product would overflow
                 .groupBy("contact")
                 .reuse(2);  // reuse current stream twice.
        
String mostCalledContact = 
    streamToReuse.setGroupField("#calls", count())
                 .select(getItemWithMax("#calls"))
                 .getField("contact");
                 
String longestCalledContact = 
    streamToReuse.setGroupField("durationOfCalls", sum("duration"))
                 .select(getItemWithMax("durationOfCalls"))
                 .getField("contact");

Non-blocking pipeline

A UQI query that returns a value is called a blocking pipeline: it blocks execution until the result is available.

In Android, non-blocking pipelines might be more common. A non-blocking pipeline will NOT pause the code execution, and its result will be returned asynchronously.

PrivacyStreams provides many non-blocking Actions (such as forEach, onChange, ifPresent, etc.) for building non-blocking pipelines. You can find all non-blocking actions here.
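The difference between the two styles can be illustrated without PrivacyStreams, using java.util.concurrent.CompletableFuture from the Java standard library. In this sketch, PipelineStyles, slowQuery, blocking and nonBlocking are hypothetical names, and the 50 ms sleep stands in for a slow data query; the library's own threading model is not described on this page and may differ.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical illustration of blocking vs. non-blocking result delivery.
class PipelineStyles {
    // Stand-in for a slow data query.
    static int slowQuery() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 42;
    }

    // Blocking style: the caller waits until the value is available.
    static int blocking() {
        return slowQuery();
    }

    // Non-blocking style: returns immediately; the callback receives the value later.
    static CompletableFuture<Void> nonBlocking(Consumer<Integer> callback) {
        return CompletableFuture.supplyAsync(PipelineStyles::slowQuery).thenAccept(callback);
    }

    public static void main(String[] args) {
        System.out.println("blocking result: " + blocking());
        CompletableFuture<Void> pending =
            nonBlocking(value -> System.out.println("callback got: " + value));
        System.out.println("query started, caller keeps running");
        pending.join(); // only so the demo does not exit before the callback fires
    }
}
```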

Exceptions and Permissions

Sometimes the pipelines may fail due to exceptions, such as InterruptedException, PermissionDeniedException, etc.

In PrivacyStreams, exception handling is straightforward for both blocking and non-blocking pipelines.

Handling exceptions in blocking pipelines

For blocking pipelines, simply put your query in a try block and catch PSException. For example:


try {
    result = uqi.getData(...).transform(...).output(...); // A blocking pipeline.
} catch (PSException e) {
    System.out.println(e.getMessage());
}

Handling exceptions in non-blocking pipelines

For non-blocking pipelines, simply override the onFail(PSException e) method in your result handler. For example:


 uqi.getData(...)
    .transform(...)
    .ifPresent(..., new Callback<Object>() {
        @Override
        protected void onInput(Object result) {
            ...
        }
        
        @Override
        protected void onFail(PSException e) {
            System.out.println(e.getMessage());
        }
    });
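The two-method callback shape used above can be sketched in plain Java as follows. This is a hypothetical model for illustration only: CallbackSketch, ResultCallback and run are invented names, a generic Exception stands in for PSException, and the real library may deliver results and failures differently (for example, on another thread).

```java
import java.util.function.Supplier;

// Hypothetical model of a result handler with success and failure hooks.
class CallbackSketch {
    abstract static class ResultCallback<T> {
        protected abstract void onInput(T result);

        // Default: ignore failures unless the caller overrides this method.
        protected void onFail(Exception e) { }
    }

    // Runs the query and routes the outcome to exactly one callback method.
    static <T> void run(Supplier<T> query, ResultCallback<T> callback) {
        T result;
        try {
            result = query.get();
        } catch (RuntimeException e) {
            callback.onFail(e);
            return;
        }
        callback.onInput(result);
    }

    public static void main(String[] args) {
        ResultCallback<String> printer = new ResultCallback<String>() {
            @Override
            protected void onInput(String result) {
                System.out.println("result: " + result);
            }

            @Override
            protected void onFail(Exception e) {
                System.out.println("failed: " + e.getMessage());
            }
        };
        run(() -> "ok", printer);
        run(() -> { throw new IllegalStateException("permission denied"); }, printer);
    }
}
```

Keeping onFail optional means simple handlers stay short, at the cost of silently ignored failures when it is not overridden.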

Permission configuration

In Android, access to personal data is controlled by a permission-based access control mechanism. Android apps need to declare permissions in AndroidManifest.xml. On Android 6.0+, apps must also request permissions at runtime, which includes checking whether permissions are granted, prompting users to grant them and handling users’ access control decisions. With the standard Android APIs, these steps are often tedious to implement.

In PrivacyStreams, permission configuration can be much easier. Follow the steps below:

  1. Write your pipeline, and catch the exception;
  2. Print the exception, and you will see which permissions are needed;
  3. Add the needed permissions to AndroidManifest.xml.

That’s it. PrivacyStreams will automatically generate a dialog to ask users for permissions. If the requested permissions are not granted, a PSException will be thrown.

Debugging and testing

PrivacyStreams provides some simple interfaces to support debugging and testing.

Mocking data source

You can use TestItem and MockItem classes for debugging and testing.

Printing the streams

Most data types support serialization, so you can print the streams to see what they contain.

Read more

API Docs

For more information about PrivacyStreams APIs, please refer to:

News & Posts