Getting Started

The following guide will help you to get started using Connect. Once you have your project ID and API key and you have decided how to model your events, you can start pushing events and executing queries.

Installing the SDK

Installing the SDK is easy if you're using either Gradle or Maven.

Gradle

Add the following to your build.gradle:

repositories {
    mavenCentral()
}
dependencies {
    implementation 'io.getconnect:connect-client-android:1.+'
}

Maven

Add the following dependency to your pom.xml:

<dependency>
  <groupId>io.getconnect</groupId>
  <artifactId>connect-client-android</artifactId>
  <version>1.3</version>
</dependency>

Permissions

You must ensure that your Android app has the INTERNET permission to allow the SDK to push events to the Connect API. Make sure you have specified this in your AndroidManifest.xml:

<uses-permission android:name="android.permission.INTERNET"/>

Creating a client

To start pushing or queuing events for delivery to Connect, you must create a Connect client:

AndroidConnectClient client = new AndroidConnectClient(getBaseContext(), "PROJECT_ID", "PUSH_API_KEY");

Each client is bound to a specific project. If you wish to push to multiple projects, simply construct multiple clients.

Pushing events

Once you have created a client, you can start pushing events.

Single event

To push a single event asynchronously to Connect:

// Construct the event
HashMap<String, Object> event = new HashMap<String, Object>();
event.put("product", "banana");
event.put("quantity", 5);
event.put("totalCost", 14.75);

// Push the event asynchronously to Connect
client.pushAsync("productsSold", event, new ConnectCallback() {
    public void onSuccess() {
        // Called if the event was successfully pushed
    }
    public void onFailure(ConnectException e) {
        e.printStackTrace();
    }
});

To push a single event synchronously to Connect:

// Construct the event
HashMap<String, Object> event = new HashMap<String, Object>();
event.put("product", "banana");
event.put("quantity", 5);
event.put("totalCost", 14.75);

// Push the event synchronously to Connect
client.push("productsSold", event);

Maps and arrays can be nested inside the root event map to allow for nesting of properties.
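
As a minimal sketch of nesting, the following builds an event whose root map contains a nested map. The class name, the buildEvent() helper and the customer properties are hypothetical and only serve as illustration. When querying, a nested property like this one would be addressed with a dot, as in customer.country.

```java
import java.util.HashMap;
import java.util.Map;

final class NestedEventExample {

    // Builds an event whose root map contains a nested "customer" map.
    static Map<String, Object> buildEvent() {
        // Nested map describing the (hypothetical) customer who bought the product
        Map<String, Object> customer = new HashMap<String, Object>();
        customer.put("id", 1234);
        customer.put("country", "AU");

        // Root event map, same shape as the examples above
        Map<String, Object> event = new HashMap<String, Object>();
        event.put("product", "banana");
        event.put("quantity", 5);
        event.put("totalCost", 14.75);
        event.put("customer", customer); // nested map
        return event;
    }

    public static void main(String[] args) {
        Map<String, Object> event = buildEvent();
        // With the SDK on the classpath, the event would be pushed as shown above:
        // client.push("productsSold", event);
        System.out.println(event);
    }
}
```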

Queuing events

You can also queue events for pushing later to Connect. This is useful if you are collecting many events and wish to push them periodically or on a specific trigger.

Queuing events simply pushes the event into the configured EventStore (see Configuring event stores).

To queue an event:

// Construct the event
HashMap<String, Object> event = new HashMap<String, Object>();
event.put("product", "banana");
event.put("quantity", 5);
event.put("totalCost", 14.75);

// Add the event to the queue
client.add("productsSold", event);

Periodically or on a specific trigger, you must call pushPending() or pushPendingAsync(), which pushes the queued events to Connect in a batch:

// Push the queued events to Connect
client.pushPendingAsync(new ConnectBatchCallback() {
    public void onSuccess(Map<String, Iterable<EventPushResponse>> details) {
        // Details will contain the success status of each event
    }
    public void onFailure(ConnectException e) {
        e.printStackTrace();
    }
});
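
For the periodic case, one possible approach is a background scheduler from the Java standard library. The sketch below does not use the SDK at all: PeriodicPushExample, schedule() and demo() are hypothetical names, and the Runnable is where a call to client.pushPendingAsync(...) would go in a real app. On Android you may prefer a platform scheduling mechanism instead.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class PeriodicPushExample {

    // Runs pushTask repeatedly, with a fixed delay between runs, on one background thread.
    static ScheduledExecutorService schedule(Runnable pushTask, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(pushTask, period, period, unit);
        return scheduler;
    }

    // Demo: returns true if the task ran at least `runs` times within five seconds.
    static boolean demo(final int runs) {
        final CountDownLatch latch = new CountDownLatch(runs);
        ScheduledExecutorService scheduler = schedule(new Runnable() {
            public void run() {
                // In a real app, call client.pushPendingAsync(...) here.
                latch.countDown();
            }
        }, 50, TimeUnit.MILLISECONDS);
        try {
            return latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow(); // stop the background thread
        }
    }

    public static void main(String[] args) {
        System.out.println("task ran three times: " + demo(3));
    }
}
```

Note that scheduleWithFixedDelay stops rescheduling a task once it throws, so the task should catch its own exceptions.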

Batches of events

You can also push multiple events to multiple collections in a single call:


// Construct the events
HashMap<String, Object> event1 = new HashMap<String, Object>();
event1.put("product", "banana");
event1.put("quantity", 5);
event1.put("totalCost", 14.75);

HashMap<String, Object> event2 = new HashMap<String, Object>();
event2.put("product", "carrot");
event2.put("quantity", 2);
event2.put("totalCost", 4.00);

// Create the batch (collection name is the key)
HashMap<String, Map<String, Object>[]> batch = new HashMap<String, Map<String, Object>[]>();
batch.put("productsSold", new Map[]{event1, event2});

client.pushBatchAsync(batch, new ConnectBatchCallback() {
    public void onSuccess(Map<String, Iterable<EventPushResponse>> details) {
        // details will contain the success status of each event
    }
    public void onFailure(ConnectException e) {
        e.printStackTrace();
    }
});

Exception handling

Synchronous pushes throw exceptions on failure, so you should either ignore or handle those exceptions gracefully.

Specifically, the following exceptions can be thrown when pushing events synchronously:

InvalidEventException - the event being pushed is invalid (e.g. invalid event properties)
ServerException - a server-side exception occurred in Connect's API
ConnectException - a generic exception (e.g. a network failure)

Configuring event stores

To queue events, the SDK uses an EventStore to store and retrieve events for queuing and later pushing, respectively.

By default, AndroidConnectClient uses FileEventStore to store events temporarily on the filesystem (in Android's cache directory). This store is persistent and will guarantee delivery even in the event of app/device failure.

Bulk importing events

Currently, this SDK does not support bulk importing events.

However, you can use the HTTP API to run bulk imports if you need.

Restrictions on pushing

There are a number of restrictions on the properties you can use in your events, as well as limitations on querying, which influence how you should structure your events.

Refer to restrictions in the modeling your events section.

Reliability of events

You can ensure delivery of events reliably by queuing the events and configuring event stores. You should then handle the response from pushPending() or pushPendingAsync() to verify that all the events were successfully pushed.

client.pushPendingAsync(new ConnectBatchCallback() {
    public void onSuccess(Map<String, Iterable<EventPushResponse>> details) {
        for (String collection : details.keySet()) {
            for (EventPushResponse eventResponse : details.get(collection)) {
                // eventResponse will contain the details about the success of the event.
            }
        }
    }
    public void onFailure(ConnectException e) {
        e.printStackTrace();
    }
});

Events also allow a custom ID to be sent in the event document which will prevent duplicates (i.e. guarantees idempotence even if the event is delivered multiple times). For example:

// Construct the event
HashMap<String, Object> event = new HashMap<String, Object>();
event.put("product", "banana");
event.put("quantity", 5);
event.put("totalCost", 14.75);

// Set an ID on the event to prevent duplicates
event.put("id", "1849506679");

// Push the event to Connect
client.pushAsync("productsSold", event, new ConnectCallback() {
    public void onSuccess() {}
    public void onFailure(ConnectException e) {
        e.printStackTrace();
    }
});
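
For the ID to prevent duplicates, it has to stay the same across every delivery attempt, so it makes sense to assign it once when the event is created rather than when it is pushed. The following sketch shows one way to do that with a random UUID. EventIdExample and withId() are hypothetical names, and it is an assumption that any unique string is acceptable as an ID.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

final class EventIdExample {

    // Assigns a random unique "id" to the event unless it already has one.
    static Map<String, Object> withId(Map<String, Object> event) {
        if (!event.containsKey("id")) {
            event.put("id", UUID.randomUUID().toString());
        }
        return event;
    }

    public static void main(String[] args) {
        Map<String, Object> event = new HashMap<String, Object>();
        event.put("product", "banana");
        event.put("quantity", 5);

        withId(event);                    // ID assigned once, when the event is created
        Object firstId = event.get("id");
        withId(event);                    // e.g. before a retry: the existing ID is kept
        System.out.println(firstId.equals(event.get("id")));
    }
}
```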

Timestamps

All events have a single timestamp property which records when the event being pushed occurred. Events cannot have more than one date/time property. If you feel you need more than one date/time property, you probably need to reconsider how you're modeling your events.

Querying

You can only run time interval queries or timeframe filters on the timestamp property. No other date/time property in an event is supported for querying.

By default, if no timestamp property is sent with the event, the SDK will use the current date and time as the timestamp of the event.

The timestamp, however, can be overridden to, for example, accommodate historical events or maintain accuracy of event times when events are queued. For example:

// Construct the event
HashMap<String, Object> event = new HashMap<String, Object>();
event.put("product", "banana");
event.put("quantity", 5);
event.put("totalCost", 14.75);

// Set the event's timestamp
event.put("timestamp", new Date());

// Push the event asynchronously to Connect
client.pushAsync("productsSold", event, new ConnectCallback() {
    public void onSuccess() {
        // Called if the event was successfully pushed
    }
    public void onFailure(ConnectException e) {
        e.printStackTrace();
    }
});

Timezones

Timestamps are always recorded in UTC. If you supply a timestamp in a timezone other than UTC, it will be converted to UTC. When you query your events, you can specify a timezone so things like time intervals will be returned in local time.
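
To illustrate the conversion, the sketch below creates a java.util.Date for a local time in Brisbane (UTC+10, no daylight saving) and formats the same instant in UTC. It uses only the Java standard library; TimestampUtcExample and its helpers are hypothetical names, and how the SDK itself serializes dates is not shown here. The point is only that a Date represents an instant, independent of the timezone it was created in.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class TimestampUtcExample {

    // Creates a Date for a wall-clock time in the given timezone.
    // `month` is a Calendar constant such as Calendar.MARCH.
    static Date localTime(String zoneId, int year, int month, int day, int hour, int minute) {
        Calendar calendar = Calendar.getInstance(TimeZone.getTimeZone(zoneId), Locale.US);
        calendar.clear();
        calendar.set(year, month, day, hour, minute, 0);
        return calendar.getTime();
    }

    // Formats the instant as an ISO 8601 string in UTC.
    static String toUtcString(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(date);
    }

    public static void main(String[] args) {
        // 8:30 pm on 1 March 2015 in Brisbane
        Date timestamp = localTime("Australia/Brisbane", 2015, Calendar.MARCH, 1, 20, 30);
        System.out.println(toUtcString(timestamp)); // prints 2015-03-01T10:30:00Z
    }
}
```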

Querying events

Currently, this SDK does not support querying events. However, you have the following options to query:

HTTP API - send queries and receive results directly via JSON
JavaScript SDK - query events and build visualizations client-side in the browser
.NET SDK - query events using the fluent .NET query syntax

We're really looking forward to supporting querying in all SDKs soon!

Exporting events

Currently, this SDK does not support exporting events.

However, you can use the HTTP API to perform exports as required.

Deleting collections

Currently, this SDK does not support deleting collections.

However, you can use one of the following methods to delete collections if required:

Projects and keys

Connect allows you to manage multiple projects under a single account so that you can easily segregate your collections into logical projects.

You could use this to separate analytics for entire projects, or to implement separation between different environments (e.g. My Project (Prod) and My Project (Dev)).

To start pushing and querying your event data, you will need both a project ID and an API key. This information is available to you via the admin console inside each project under the "Keys" tab:

Screenshot of project keys in Connect admin console

By default, you can choose from four different types of keys, each with their own specific use:

Push/Query Key - you can use this key to both push events and execute queries.
You should only use this key in situations where it is not possible to isolate merely pushing or querying.
Push Key - you can only use this key to push events.
You should use this key in your apps where you are tracking event data, but do not require querying.
Query Key - you can only use this key to execute queries.
You should use this key in your reporting interfaces where you do not wish to track events.
Master Project Key - you can use this key to execute all types of operations on a project, including pushing, querying and deleting collections.
Keep this key safe - it is intended for very limited use and definitely should not be included in your main apps.

You must use your project ID and desired key to begin using Connect:

AndroidConnectClient client = new AndroidConnectClient(getBaseContext(), "YOUR_PROJECT_ID", "YOUR_API_KEY");

Security

Security is a vital component to the Connect service and we take it very seriously. It is important to consider how to ensure your data remains secure.

API Keys

API keys are the core security mechanisms by which you can push and query your data. It is important to keep these keys safe by controlling where these keys exist and who has access to them.

Each key can either push, query or both. The most important key is the Project Master Key which can perform all of these actions, as well as administrative functions such as deleting data. Read more about the keys in the "Projects and keys" section.

Keeping API Keys Secure

You should carefully consider when and which API keys to expose to users.

Crucially, you should never expose your Project Master Key to users or embed it in client applications. If this key does get compromised, you can reset it.

If you embed API keys in client applications, you should consider these keys as fully accessible to anyone having access to that client application. This includes both mobile and web applications.

Pushing events securely

While you can use a Push Key to prevent clients from querying events, you cannot restrict the collections or events clients can push to the API. Unfortunately, this is the nature of tracking events directly client-side and opens the door to malicious users potentially sending bad data.

In many circumstances, this is not an issue as users can already generate bad data simply by using your application in an incorrect way, generating events with bad or invalid data. In circumstances where you absolutely cannot withstand bad event data, you should consider pushing the events server-side from a service under your control.

Finally, if a Push Key is compromised or being used maliciously, you can always reset it by resetting the master key.

Querying events securely

To query events, you must use an API key that has query permissions. By default, a Query Key has full access to all events in all collections in your project. If this key is exposed, a client could execute any type of query on your collections.

You have a number of options on querying events securely:

For internal querying or dashboards, you may consider it acceptable to expose the normal Query Key in client applications. Keep in mind that this key can execute any query on any collection in the project.
Generate a filtered key, which applies a specific set of filters to all queries executed by clients with the key.
Only allow clients to execute queries via a service you control, which in turn executes queries via the Connect API server-side.

Finally, if a Query Key is compromised or being used maliciously, you can always reset it by resetting the master key.

Resetting the master key

Resetting the Project Master Key will invalidate the previous key and generate a new, random key. This action will also reset all other keys for the project (including the push, query and any filter keys generated).

Doing this is irreversible and would prevent all applications with existing keys from pushing to or querying the project.

You can only reset the master key in the projects section of the admin console.

Filtered keys

Filtered keys allow you to create an API key that can either push or query, and in the case of querying, apply one or more filters to all queries executed with the key.

This allows you to have finer control over security and what data clients can access, especially in multi-tenant environments.

Filters are only applied to queries

Any filters specified in your filtered key only apply to querying. We currently do not support applying filters to restrict the pushing of events.

Filtered keys can only push or query (as you specify); they can never perform administrative functions or delete data.

Generating a filtered key

Filtered keys are generated and encrypted with the Project Master Key. You do not have to register the filtered key with the Connect service.

It is important that you never generate filtered keys client-side and always ensure they are generated by a secure, server-side service.

Create a JSON object describing the access allowed by the key, including any filters, for example:

{
    "filters": {
        "customerId": 1234
    },
    "canQuery": true,
    "canPush": false
}

Property	Type	Description
`filters`	`object`	The filters to apply to all queries executed when using the key. This uses the same specification for defining filters when querying normally.
`canQuery`	`boolean`	Whether or not the key can be used to execute queries. If `false`, the `filters` property is ignored (as it does not apply to pushing).
`canPush`	`boolean`	Whether or not the key can be used to push events.

Serialize the JSON object to a string.
Generate a random 128-bit initialization vector (IV).
Encrypt the JSON string using AES256-CBC with PKCS7 padding (128-bit block size) using the 128-bit random IV we generated in the previous step and the Project Master Key as the encryption key.
Convert the IV and cipher text to hex strings and combine them (IV first, followed by the cipher text), separating them with a hyphen (-). The resulting key should look something like this:
```
5D07A77D87D5B20FA5508303F748A43B-DDFA284A9341068A704A846E83ACF49069D960632C4F74A44B5EE330073F79A8324ADC91023F88F63AAE4507F3D119B5C7F31A2D7D9616408E9665EC6C1DEBE3
```

You can now use this API key to either push events or execute queries depending on the canPush and canQuery properties, respectively.
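
The steps above can be sketched in Java using only the standard library, intended to run in a server-side service. This is an illustration under stated assumptions rather than a definitive implementation: FilteredKeyExample, generate() and toHex() are hypothetical names, and it is an assumption that the Project Master Key is a 32-character string whose bytes are used directly as the 256-bit AES key. The master key in the example is a placeholder.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class FilteredKeyExample {

    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Encodes bytes as an upper-case hex string.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0x0F]).append(HEX[b & 0x0F]);
        }
        return sb.toString();
    }

    // Encrypts the key definition JSON and returns "<IV hex>-<cipher text hex>".
    // ASSUMPTION: masterKey is a 32-character string whose 32 bytes form the AES-256 key.
    static String generate(String keyDefinitionJson, String masterKey) {
        byte[] keyBytes = masterKey.getBytes(StandardCharsets.UTF_8);
        if (keyBytes.length != 32) {
            throw new IllegalArgumentException("expected a 256-bit (32-byte) master key");
        }
        byte[] iv = new byte[16]; // random 128-bit IV
        new SecureRandom().nextBytes(iv);
        try {
            // For AES, Java's "PKCS5Padding" performs PKCS7 padding with a 128-bit block size
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(keyBytes, "AES"), new IvParameterSpec(iv));
            byte[] cipherText = cipher.doFinal(keyDefinitionJson.getBytes(StandardCharsets.UTF_8));
            return toHex(iv) + "-" + toHex(cipherText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("could not encrypt the filtered key", e);
        }
    }

    public static void main(String[] args) {
        String json = "{\"filters\":{\"customerId\":1234},\"canQuery\":true,\"canPush\":false}";
        // Placeholder master key, not a real one
        System.out.println(generate(json, "0123456789abcdef0123456789abcdef"));
    }
}
```

Because the IV is random, generating a key twice from the same JSON produces two different strings.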

Finally, if a filtered key is compromised or being used maliciously, you can always reset it by resetting the master key.

Modeling your events

When using Connect to analyze and visualize your data, it is important to understand how best to model your events. The way you structure your events will directly affect your ability to answer questions with your data. It is therefore important to consider up-front the kind of questions you anticipate answering.

What is an event?

An event is an action that occurs at a specific point in time. To answer "why did this event occur?", our event needs to contain rich details about what the "world" looked like at that point in time.

Put simply, events = action + time + state.

For example, imagine you are writing an exercise activity tracker app and want to give its users the ability to analyze their performance over time. This is an event produced by this hypothetical activity tracker app:

HashMap<String, Object> myEvent = new HashMap<String, Object>();
myEvent.put("type", "cycling");
myEvent.put("timestamp", new Date());
myEvent.put("duration", 67);
myEvent.put("distance", 21255);
myEvent.put("caloriesBurned", 455);
myEvent.put("maxHeartRate", 182);

HashMap<String, Object> user = new HashMap<String, Object>();
user.put("id", 698396);
user.put("firstName", "Bruce");
user.put("lastName", "Jones");
user.put("age", 35);

myEvent.put("user", user);

Action

What happened? In the above example, the action is that an activity was completed.

In most circumstances, we group all events of the same action into a single collection. In this case, we could call our collection activityCompleted, or alternatively, just activity.

Time

When did it happen? In the above example, we specified the start time of the activity as the value of the timestamp property. The top-level timestamp property is a special property in Connect. This is because time is an essential property of event data - it's not optional.

When an event is pushed to Connect, the current time is assigned to the timestamp property if no value was provided by you.

State

What do we know about this action? What do we know about the entities associated with this action? What do we know about the "world" at this moment in time? Every property in our event, besides the timestamp and the name of the collection, serves to answer those questions. This is the most important aspect of our event - it's where all the answers live.

The richer the data you provide in your event, the more questions you can answer for your users, therefore it's important to enrich your events with as much information as possible. In stark contrast to the relational model where you would store this related information in separate tables and join at query time, in the event model this data is denormalized into each event, so as to know the state of the "world" at the point in time of the event.

Collections

It is important when modeling your events to consider how you intend to group those events into collections. This is a careful balance between events being broad enough to answer queries for your users, while specific enough to be manageable.

In our activity example, the activity contains different properties based on the type of activity. Our cycling activity contains properties associated with the bike that was used, while a kayaking activity may contain properties associated with the kayak that was used.

Because a kayaking event may have different properties to a running event, it might seem logical to put each of them in distinct collections. However, if we had distinct cycling, running and kayaking collections, we would lose the opportunity to query details that are common to all activities.

As a general rule, consider the common action among your events and decide if the specific variants of that action warrant grouping those events together.
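
As a rough illustration of this trade-off, the sketch below models a cycling and a kayaking event that would both go into a single activity collection. Each carries its own type-specific nested entity, while the shared distance property can still be aggregated across all activity types. The class and helper names, the kayak's length property and the kayaking distance are hypothetical, and the averaging is done locally only to show what the shared property makes possible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ActivityCollectionExample {

    // Creates an activity event with the properties common to every activity type.
    static Map<String, Object> activity(String type, int distance) {
        Map<String, Object> event = new HashMap<String, Object>();
        event.put("type", type);
        event.put("distance", distance);
        return event;
    }

    // Averages the shared "distance" property across events of any type.
    static double averageDistance(List<Map<String, Object>> events) {
        double total = 0;
        for (Map<String, Object> event : events) {
            total += ((Number) event.get("distance")).doubleValue();
        }
        return events.isEmpty() ? 0 : total / events.size();
    }

    public static void main(String[] args) {
        Map<String, Object> ride = activity("cycling", 21255);
        Map<String, Object> bike = new HashMap<String, Object>();
        bike.put("brand", "Specialized");
        ride.put("bike", bike);      // only cycling events carry a bike

        Map<String, Object> paddle = activity("kayaking", 8000);
        Map<String, Object> kayak = new HashMap<String, Object>();
        kayak.put("length", 4);      // hypothetical kayak property
        paddle.put("kayak", kayak);  // only kayaking events carry a kayak

        List<Map<String, Object>> activities = new ArrayList<Map<String, Object>>();
        activities.add(ride);
        activities.add(paddle);

        // Both events would be pushed to the same "activity" collection.
        System.out.println(averageDistance(activities)); // prints 14627.5
    }
}
```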

Structuring your events

Events have the following core properties:

Denormalized
Immutable
Rich/nested
Schemaless

It is also important to consider how to group events into collections to enable future queries to be answered.

Events are denormalized

Consider our example event again and notice the age property of the user:

HashMap<String, Object> myEvent = new HashMap<String, Object>();
myEvent.put("type", "cycling");
...

HashMap<String, Object> user = new HashMap<String, Object>();
user.put("id", 698396);
user.put("firstName", "Bruce");
user.put("lastName", "Jones");
user.put("age", 35);

myEvent.put("user", user);

The user's age is going to be duplicated in every activity they complete throughout the year. This may seem inefficient; however, remember that Connect is about analyzing. This denormalization is a real win for analysis; the key is that event data stores state over time, rather than merely current state. This helps us answer questions about why something happened, because we know what the "world" looked like at that point in time.

For example, imagine we wanted to chart the average distance cycled per ride, grouped by the age of the rider at the time of the ride. We could simply execute the following query:

var query = connect.query('activity')
  .select({ averageDistance: { avg: 'distance' } })
  .groupBy('user.age');

var chart = connect.chart(query, '#chart', {
    title: 'Average distance per activity by age',
    chart: { type: 'bar' }
});

It's this persistence of state over time that makes event data perfect for analysis.

Events are immutable

By their very nature, events cannot change, as they always record state at the point in time of the event. This is also the reason to record as much rich information about the event and "state of the world" as possible.

For example, in our example event above, while Bruce Jones may now be many years older, at the time he completed his bike ride, he was 35 years of age. By ensuring this event remains immutable, we can correctly analyze bike riding over time by 35-year-olds.

Consider events as recording history - as much as we'd occasionally like to, we can't change history!

Events are rich and nested

Events are rich in that they specify very detailed state. They specify details about the event itself, the entities involved and the state of the "world" at that point in time.

Consider our example activity event - the top level type property describes something about the activity itself (a run, a bike ride, a kayak etc.). The user property specifies rich information about the actor who performed the event. In this case it's the person who completed the activity, complete with their name and age.

In reality, though, we may decide to include a few other nested entities in our event, for example:

HashMap<String, Object> myEvent = new HashMap<String, Object>();
myEvent.put("type", "cycling");
...

HashMap<String, Object> user = new HashMap<String, Object>();
user.put("id", 698396);
user.put("firstName", "Bruce");
...

HashMap<String, Object> bike = new HashMap<String, Object>();
bike.put("id", 231806);
bike.put("brand", "Specialized");
bike.put("model", "S-Works Venge");

HashMap<String, Object> weather = new HashMap<String, Object>();
weather.put("condition", "Raining");
weather.put("temperature", 21);
weather.put("humidity", 99);
weather.put("wind", 17);

myEvent.put("user", user);
myEvent.put("bike", bike);
myEvent.put("weather", weather);

Note our event now includes details about the bike used and the weather conditions at the time of the activity. By adding this extra bike state information to our event, we have opened up extra possibilities for interrogating our data. For example, we can now query the average distance cycled by each model of bike that was built by "Specialized":

var query = connect.query('activity')
  .select({ averageDistance: { avg: 'distance' } })
  .groupBy('bike.model')
  .filter({
    'bike.brand': 'Specialized'
  });

The weather also provides us with exciting insights - what did the world look like at this point in time? What was the weather like? Storing this data allows us to answer yet more questions. We can test our hypothesis that "older people are less scared of riding in the rain" by simply charting the following query:

var query = connect.query('activity')
  .select({ averageDistance: { avg: 'distance' } })
  .groupBy(['user.age', 'weather.condition']);

As you can see, the richer and more denormalized the event, the more interesting answers can be derived when later querying.

Events are schemaless

Events in Connect should be considered semi-structured - that is, they have an inherent structure, but it is not defined. This means you can, and should, push as much detailed information about an event and the state of the "world" as possible. Moreover, this allows you to improve your schema over time and add extra information about new events as that information becomes available.

Restrictions

While you can post almost any event structure to Connect, there are a few restrictions by design.

Property names

You cannot have any property in the root document beginning with "tp_". This is because we prefix our own internal properties with this. Internally, we merge our properties into your events for performance at query time.
The property "_id" is reserved and cannot be pushed.
The properties "id" and "timestamp" have special purposes. These allow consumers to specify a unique ID per event and override the event's timestamp respectively. You cannot use the "id" property in queries. Refer to "reliability of events" and "timestamps" for information.
The length of property names can't exceed 255 characters. If you need property names longer than this, you probably need to reconsider the structure of your event!
Properties cannot include a dot in their names. This is because dots are used in querying to access nested properties. The following is an example of an invalid event property due to a dot in the name:

event.put("invalid.property", "value");
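
The property name restrictions above can be checked client-side before pushing. The following is a minimal sketch of such a pre-check, not part of the SDK: PropertyNameCheck and violations() are hypothetical names, and it is an assumption that the "tp_" prefix and "_id" rules apply to the root document only, while the dot and length rules apply at every nesting level.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PropertyNameCheck {

    // Returns a description of every property name that breaks a restriction listed above.
    static List<String> violations(Map<String, Object> event) {
        List<String> problems = new ArrayList<String>();
        collect(event, true, "", problems);
        return problems;
    }

    private static void collect(Map<?, ?> map, boolean isRoot, String path, List<String> problems) {
        for (Map.Entry<?, ?> entry : map.entrySet()) {
            String name = String.valueOf(entry.getKey());
            String label = path + "[" + name + "]";
            if (name.contains(".")) {
                problems.add(label + " contains a dot");
            }
            if (name.length() > 255) {
                problems.add(label + " is longer than 255 characters");
            }
            // ASSUMPTION: the "tp_" and "_id" rules are checked on the root document only
            if (isRoot && name.startsWith("tp_")) {
                problems.add(label + " starts with the reserved prefix tp_");
            }
            if (isRoot && name.equals("_id")) {
                problems.add(label + " is reserved");
            }
            // Recurse into nested maps so nested property names are checked too
            if (entry.getValue() instanceof Map) {
                collect((Map<?, ?>) entry.getValue(), false, label, problems);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> event = new HashMap<String, Object>();
        event.put("product", "banana");
        event.put("invalid.property", "value");
        System.out.println(violations(event));
    }
}
```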

Arrays

While you can create events with arrays, it is currently not possible to take advantage of these arrays at query time. Therefore, you should avoid using arrays in your events unless you plan to export the raw events.

Distinct count

Distinct count is currently not supported for querying, therefore you should consider how to structure your event if your application relies on this.