The following guide will help you to get started using Connect. Once you have your project ID and API key and you have decided how to model your events, you can start pushing events and executing queries.
Installing the SDK is easy if you're using either Gradle or Maven.
Add the following to your build.gradle:
repositories {
mavenCentral()
}
dependencies {
compile 'io.getconnect:connect-client-android:1.+'
}
Add the following dependency to your pom.xml:
<dependency>
<groupId>io.getconnect</groupId>
<artifactId>connect-client-android</artifactId>
<version>1.3</version>
</dependency>
You must ensure that your Android app has the INTERNET permission to allow the SDK to push events to the Connect API. Make sure you have specified this in your AndroidManifest.xml:
<uses-permission android:name="android.permission.INTERNET"/>
To start pushing or queuing events for delivery to Connect, you must create a Connect client:
AndroidConnectClient client = new AndroidConnectClient(getBaseContext(), "PROJECT_ID", "PUSH_API_KEY");
Each client is bound to a specific project. If you wish to push to multiple projects, simply construct multiple clients.
Once you have created a client, you can start pushing events.
To push a single event asynchronously to Connect:
// Construct the event
HashMap<String, Object> event = new HashMap<String, Object>();
event.put("product", "banana");
event.put("quantity", 5);
event.put("totalCost", 14.75);
// Push the event asynchronously to Connect
client.pushAsync("productsSold", event, new ConnectCallback() {
public void onSuccess() {
// Called if the event was successfully pushed
}
public void onFailure(ConnectException e) {
e.printStackTrace();
}
});
To push a single event synchronously to Connect:
// Construct the event
HashMap<String, Object> event = new HashMap<String, Object>();
event.put("product", "banana");
event.put("quantity", 5);
event.put("totalCost", 14.75);
// Push the event synchronously to Connect
client.push("productsSold", event);
Maps and arrays can be nested inside the root event map to allow for nesting of properties.
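As a minimal sketch of such nesting, the event below places a map and an array inside the root event map. The `customer` and `tags` property names are hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class NestedEventExample {
    /** Builds an event whose root map contains a nested map and a nested array. */
    static Map<String, Object> buildEvent() {
        // Nested entity (hypothetical property names)
        Map<String, Object> customer = new HashMap<String, Object>();
        customer.put("id", 1234);
        customer.put("country", "AU");

        Map<String, Object> event = new HashMap<String, Object>();
        event.put("product", "banana");
        event.put("quantity", 5);
        event.put("customer", customer);                       // nested map
        event.put("tags", new String[]{"fruit", "organic"});   // nested array
        return event;
    }
}
```

The resulting map can be passed to push(), pushAsync() or add() like any other event. Keep in mind that arrays cannot currently be used at query time, so nested maps are usually the more useful of the two.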
You can also queue events for pushing later to Connect. This is useful if you are collecting many events and wish to push them periodically or on a specific trigger.
Queuing events simply pushes the event into the configured EventStore (see Configuring event stores).
To queue an event:
// Construct the event
HashMap<String, Object> event = new HashMap<String, Object>();
event.put("product", "banana");
event.put("quantity", 5);
event.put("totalCost", 14.75);
// Add the event to the queue
client.add("productsSold", event);
Periodically or on a specific trigger, you must call pushPending() or pushPendingAsync(), which will push the queued events in a batch to Connect:
// Push the queued events to Connect
client.pushPendingAsync(new ConnectBatchCallback() {
public void onSuccess(Map<String, Iterable<EventPushResponse>> details) {
// Details will contain the success status of each event
}
public void onFailure(ConnectException e) {
e.printStackTrace();
}
});
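One possible "specific trigger" is the number of events queued since the last push. The sketch below is not part of the SDK: `QueueFlushTrigger` is a hypothetical helper that runs a caller-supplied action after every N queued events. In an app, that action would typically call client.pushPendingAsync(...).

```java
final class QueueFlushTrigger {
    private final int threshold;
    private final Runnable flushAction;
    private int pending = 0;

    /**
     * @param threshold   number of queued events that triggers a flush
     * @param flushAction what to do when the threshold is reached (e.g. push pending events)
     */
    QueueFlushTrigger(int threshold, Runnable flushAction) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.threshold = threshold;
        this.flushAction = flushAction;
    }

    /** Call after each event is queued; runs the flush action once the threshold is reached. */
    synchronized void eventQueued() {
        pending++;
        if (pending >= threshold) {
            pending = 0;
            flushAction.run();
        }
    }
}
```

You would call eventQueued() straight after each client.add(...). A time-based trigger or an app lifecycle callback would work equally well; the count-based approach is only one option.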
You can also push multiple events to multiple collections in a single call:
// Construct the events
HashMap<String, Object> event1 = new HashMap<String, Object>();
event1.put("product", "banana");
event1.put("quantity", 5);
event1.put("totalCost", 14.75);
HashMap<String, Object> event2 = new HashMap<String, Object>();
event2.put("product", "carrot");
event2.put("quantity", 2);
event2.put("totalCost", 4.00);
// Create the batch (collection name is the key)
HashMap<String, Map<String, Object>[]> batch = new HashMap<String, Map<String, Object>[]>();
batch.put("productsSold", new Map[]{event1, event2});
client.pushBatchAsync(batch, new ConnectBatchCallback() {
public void onSuccess(Map<String, Iterable<EventPushResponse>> details) {
// details will contain the success status of each event
}
public void onFailure(ConnectException e) {
e.printStackTrace();
}
});
When pushing events synchronously, exceptions may be thrown, so you should either ignore or handle those exceptions gracefully.
Specifically, the following exceptions can be thrown when pushing events synchronously:
InvalidEventException - the event being pushed is invalid (e.g. invalid event properties)
ServerException - a server-side exception occurred in Connect's API
ConnectException - a generic exception (e.g. a network failure)
To queue events, the SDK uses an EventStore to store and retrieve events for queuing and later pushing, respectively.
By default, AndroidConnectClient uses FileEventStore to store events temporarily on the filesystem (in Android's cache directory).
This store is persistent and will guarantee delivery even in the event of app/device failure.
Currently, this SDK does not support bulk importing events.
However, you can use the HTTP API to run bulk imports if you need.
There are a number of restrictions on the properties you can use in your events, as well as limitations on querying, which influence how you should structure your events.
Refer to restrictions in the modeling your events section.
You can ensure delivery of events reliably by queuing the events and configuring event stores.
You should then handle the response from pushPending() or pushPendingAsync() to verify that all the events were successfully pushed.
client.pushPendingAsync(new ConnectBatchCallback() {
public void onSuccess(Map<String, Iterable<EventPushResponse>> details) {
for (String collection : details.keySet()) {
for (EventPushResponse eventResponse : details.get(collection)) {
// eventResponse will contain the details about the success of the event.
}
}
}
public void onFailure(ConnectException e) {
e.printStackTrace();
}
});
Events also allow a custom ID to be sent in the event document which will prevent duplicates (i.e. guarantees idempotence even if the event is delivered multiple times). For example:
// Construct the event
HashMap<String, Object> event = new HashMap<String, Object>();
event.put("product", "banana");
event.put("quantity", 5);
event.put("totalCost", 14.75);
// Set an ID on the event to prevent duplicates
event.put("id", "1849506679");
// Push the event to Connect
client.pushAsync("productsSold", event, new ConnectCallback() {
public void onSuccess() {}
public void onFailure(ConnectException e) {
e.printStackTrace();
}
});
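For the ID to prevent duplicates, the same value has to accompany the event on every delivery attempt, so it is best assigned once when the event is created. The sketch below is a hypothetical helper, not part of the SDK. It assumes a random UUID string is an acceptable value for the "id" property.

```java
import java.util.Map;
import java.util.UUID;

final class EventIds {
    /**
     * Adds a random UUID as the "id" property unless the event already has one,
     * so that retries of the same event map reuse the same ID.
     */
    static Map<String, Object> withId(Map<String, Object> event) {
        if (!event.containsKey("id")) {
            event.put("id", UUID.randomUUID().toString());
        }
        return event;
    }
}
```

If your domain already has a natural unique identifier (an order number, for example), using that instead keeps the ID stable even if the event is rebuilt from scratch.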
All events have a single timestamp property which records when the event being pushed occurred. Events cannot have more than one date/time property. If you feel you need more than one date/time property, you probably need to reconsider how you're modeling your events.
Querying
You can only run time interval queries or timeframe filters on the timestamp property. No other date/time property in an event is supported for querying.
By default, if no timestamp property is sent with the event, the SDK will use the current date and time as the timestamp of the event.
The timestamp, however, can be overridden to, for example, accommodate historical events or maintain accuracy of event times when events are queued. For example:
// Construct the event
HashMap<String, Object> event = new HashMap<String, Object>();
event.put("product", "banana");
event.put("quantity", 5);
event.put("totalCost", 14.75);
// Set the event's timestamp
event.put("timestamp", new Date());
// Push the event asynchronously to Connect
client.pushAsync("productsSold", event, new ConnectCallback() {
public void onSuccess() {
// Called if the event was successfully pushed
}
public void onFailure(ConnectException e) {
e.printStackTrace();
}
});
Timezones
Timestamps are always recorded in UTC. If you supply a timestamp in a timezone other than UTC, it will be converted to UTC. When you query your events, you can specify a timezone so things like time intervals will be returned in local time.
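A java.util.Date represents an absolute instant (milliseconds since the epoch, UTC), so a local wall-clock time has to be interpreted in its time zone before it is stored as a timestamp. The sketch below shows one way to build such a Date for a historical event. `EventTimestamps` is a hypothetical helper, and how the SDK serializes the Date is not covered here.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class EventTimestamps {
    /**
     * Converts a wall-clock time in the given time zone to a Date (an absolute instant).
     * The month is 1-based here (1 = January).
     */
    static Date at(String timeZoneId, int year, int month, int day, int hour, int minute) {
        Calendar calendar = new GregorianCalendar(TimeZone.getTimeZone(timeZoneId));
        calendar.clear(); // reset all fields, including milliseconds
        calendar.set(year, month - 1, day, hour, minute, 0); // Calendar months are zero-based
        return calendar.getTime();
    }
}
```

For example, EventTimestamps.at("GMT+10:00", 2015, 1, 1, 10, 0) and EventTimestamps.at("UTC", 2015, 1, 1, 0, 0) describe the same instant, and either value could be put into the event's timestamp property.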
Currently, this SDK does not support querying events. However, you have the following options to query:
We're really looking forward to supporting querying in all SDKs soon!
Currently, this SDK does not support exporting events.
However, you can use the HTTP API to perform exports as required.
Currently, this SDK does not support deleting collections.
However, you can use one of the following methods to delete collections if required:
Connect allows you to manage multiple projects under a single account so that you can easily segregate your collections into logical projects.
You could use this to separate analytics for entire projects, or to implement separation between different environments (e.g. My Project (Prod) and My Project (Dev)).
To start pushing and querying your event data, you will need both a project ID and an API key. This information is available to you via the admin console inside each project under the "Keys" tab:
By default, you can choose from four different types of keys, each with their own specific use:
Push/Query Key - you can use this key to both push events and execute queries. You should only use this key in situations where it is not possible to isolate merely pushing or querying.
Push Key - you can only use this key to push events. You should use this key in your apps where you are tracking event data, but do not require querying.
Query Key - you can only use this key to execute queries. You should use this key in your reporting interfaces where you do not wish to track events.
Master Project Key - you can use this key to execute all types of operations on a project, including pushing, querying and deleting collections. Keep this key safe - it is intended for very limited use and definitely should not be included in your main apps.
You must use your project ID and desired key to begin using Connect:
AndroidConnectClient client = new AndroidConnectClient(getBaseContext(), "YOUR_PROJECT_ID", "YOUR_API_KEY");
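If you separate environments into different projects as described above, the project ID and key can be selected in one place. The sketch below is a hypothetical arrangement, not part of the SDK, and the IDs and keys are placeholders to be replaced with values from the "Keys" tab of each project.

```java
enum ConnectEnvironment {
    // Placeholder values: substitute the project ID and Push Key of each project
    DEV("DEV_PROJECT_ID", "DEV_PUSH_KEY"),
    PROD("PROD_PROJECT_ID", "PROD_PUSH_KEY");

    final String projectId;
    final String pushKey;

    ConnectEnvironment(String projectId, String pushKey) {
        this.projectId = projectId;
        this.pushKey = pushKey;
    }

    /** Picks the development project for debug builds and the production project otherwise. */
    static ConnectEnvironment forBuild(boolean isDebugBuild) {
        return isDebugBuild ? DEV : PROD;
    }
}
```

The selected values can then be passed to the client constructor shown above, for instance with the environment chosen from BuildConfig.DEBUG. Only a Push Key is stored here, since an embedded key should be treated as visible to anyone with the app.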
Security is a vital component to the Connect service and we take it very seriously. It is important to consider how to ensure your data remains secure.
API keys are the core security mechanisms by which you can push and query your data. It is important to keep these keys safe by controlling where these keys exist and who has access to them.
Each key can either push, query or both. The most important key is the Project Master Key, which can perform all of these actions, as well as administrative functions such as deleting data. Read more about the keys here.
You should carefully consider when and which API keys to expose to users.
Crucially, you should never expose your Project Master Key to users or embed it in client applications.
If this key does get compromised, you can reset it.
If you embed API keys in client applications, you should consider these keys as fully accessible to anyone having access to that client application. This includes both mobile and web applications.
While you can use a Push Key to prevent clients from querying events, you cannot restrict the collections or events clients can push to the API. Unfortunately, this is the nature of tracking events directly client-side and opens the door to malicious users potentially sending bad data.
In many circumstances, this is not an issue as users can already generate bad data simply by using your application in an incorrect way, generating events with bad or invalid data. In circumstances where you absolutely cannot withstand bad event data, you should consider pushing the events server-side from a service under your control.
Finally, if a Push Key is compromised or being used maliciously, you can always reset it by resetting the master key.
To query events, you must use an API key that has query permissions. By default, a Query Key has full access to all events in all collections in your project. If this key is exposed, a client could execute any type of query on your collections.
You have a number of options on querying events securely:
For internal querying or dashboards, you may consider it acceptable to expose the normal Query Key in client applications.
Keep in mind that this key can execute any query on any collection in the project.
Generate a filtered key, which applies a specific set of filters to all queries executed by clients with the key.
Only allow clients to execute queries via a service you control, which in turn executes queries via the Connect API server-side.
Finally, if a Query Key is compromised or being used maliciously, you can always reset it by resetting the master key.
Resetting the Project Master Key will invalidate the previous key and generate a new, random key. This action will also reset all other keys for the project (including the push, query and any filtered keys generated).
Doing this is irreversible and would prevent all applications with existing keys from pushing to or querying the project.
You can only reset the master key in the projects section of the admin console.
Filtered keys allow you to create an API key that can either push or query, and in the case of querying, apply one or more filters to all queries executed with the key.
This allows you to have finer control over security and what data clients can access, especially in multi-tenant environments.
Filters are only applied to queries
Any filters specified in your filtered key only apply to querying. We currently do not support applying filters to restrict the pushing of events.
Filtered keys can only push or query (as you specify), never administrative functions or deleting data.
Filtered keys are generated and encrypted with the Project Master Key. You do not have to register the filtered key with the Connect service.
It is important that you never generate filtered keys client-side and always ensure they are generated by a secure, server-side service.
Create a JSON object describing the access allowed by the key, including any filters, for example:
{
"filters": {
"customerId": 1234
},
"canQuery": true,
"canPush": false
}
Property | Type | Description |
---|---|---|
filters | object | The filters to apply to all queries executed when using the key. This uses the same specification as defining filters when querying normally. |
canQuery | boolean | Whether or not the key can be used to execute queries. If false, the filters property is ignored (as it does not apply to pushing). |
canPush | boolean | Whether or not the key can be used to push events. |
Serialize the JSON object to a string.
Generate a random 128-bit initialization vector (IV).
Encrypt the JSON string using AES256-CBC with PKCS7 padding (128-bit block size), using the 128-bit random IV generated in the previous step and the Project Master Key as the encryption key.
Convert the IV and cipher text to hex strings and combine them (IV first, followed by the cipher text), separating them with a hyphen (-). The resulting key should look something like this:
5D07A77D87D5B20FA5508303F748A43B-DDFA284A9341068A704A846E83ACF49069D960632C4F74A44B5EE330073F79A8324ADC91023F88F63AAE4507F3D119B5C7F31A2D7D9616408E9665EC6C1DEBE3
You can now use this API key to either push events or execute queries depending on the canPush and canQuery properties, respectively.
Finally, if a filtered key is compromised or being used maliciously, you can always reset it by resetting the master key.
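The key-generation steps above can be sketched in Java as follows, intended for a server-side service only. `FilteredKeyGenerator` is a hypothetical class, and two points are assumptions rather than documented behavior: the JSON string is encoded as UTF-8 before encryption, and the Project Master Key is supplied as 32 raw bytes (how the master key string maps to those bytes is left to the caller).

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class FilteredKeyGenerator {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /**
     * Encrypts the key definition JSON and returns "<IV hex>-<cipher text hex>".
     * ASSUMPTION: masterKeyBytes is the Project Master Key as 32 raw bytes.
     */
    static String generate(String keyDefinitionJson, byte[] masterKeyBytes) {
        if (masterKeyBytes.length != 32) {
            throw new IllegalArgumentException("AES-256 requires a 32-byte key");
        }
        try {
            // Random 128-bit initialization vector
            byte[] iv = new byte[16];
            new SecureRandom().nextBytes(iv);

            // Java's "PKCS5Padding" performs PKCS7-style padding for AES's 16-byte blocks
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE,
                    new SecretKeySpec(masterKeyBytes, "AES"),
                    new IvParameterSpec(iv));
            // ASSUMPTION: the JSON string is encoded as UTF-8
            byte[] cipherText = cipher.doFinal(keyDefinitionJson.getBytes(StandardCharsets.UTF_8));

            // IV first, then the cipher text, separated by a hyphen
            return toHex(iv) + "-" + toHex(cipherText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Unable to generate filtered key", e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(HEX[(b >> 4) & 0x0F]).append(HEX[b & 0x0F]);
        }
        return hex.toString();
    }
}
```

Because the IV is random, generating a key twice from the same JSON yields two different strings. Check a generated key against the Connect API before relying on the two assumptions noted above.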
When using Connect to analyze and visualize your data, it is important to understand how best to model your events. The way you structure your events will directly affect your ability to answer questions with your data. It is therefore important to consider up-front the kind of questions you anticipate answering.
An event is an action that occurs at a specific point in time. To answer "why did this event occur?", our event needs to contain rich details about what the "world" looked like at that point in time.
Put simply, events = action + time + state.
For example, imagine you are writing an exercise activity tracker app. We want to give users of your app the ability to analyze their performance over time. This is an event produced by our hypothetical activity tracker app:
HashMap<String, Object> myEvent = new HashMap<String, Object>();
myEvent.put("type", "cycling");
myEvent.put("timestamp", new Date());
myEvent.put("duration", 67);
myEvent.put("distance", 21255);
myEvent.put("caloriesBurned", 455);
myEvent.put("maxHeartRate", 182);
HashMap<String, Object> user = new HashMap<String, Object>();
user.put("id", 698396);
user.put("firstName", "Bruce");
user.put("lastName", "Jones");
user.put("age", 35);
myEvent.put("user", user);
What happened? In the above example, the action is that an activity was completed.
In most circumstances, we group all events of the same action into a single collection.
In this case, we could call our collection activityCompleted, or alternatively, just activity.
When did it happen? In the above example, we specified the start time of the activity as the value of the timestamp property. The top-level timestamp property is a special property in Connect. This is because time is an essential property of event data - it's not optional.
When an event is pushed to Connect, the current time is assigned to the timestamp property if no value was provided by you.
What do we know about this action? What do we know about the entities associated with this action? What do we know about the "world" at this moment in time? Every property in our event, besides the timestamp and the name of the collection, serves to answer those questions. This is the most important aspect of our event - it's where all the answers live.
The richer the data you provide in your event, the more questions you can answer for your users, therefore it's important to enrich your events with as much information as possible. In stark contrast to the relational model where you would store this related information in separate tables and join at query time, in the event model this data is denormalized into each event, so as to know the state of the "world" at the point in time of the event.
It is important when modeling your events to consider how you intend to group those events into collections. This is a careful balance between events being broad enough to answer queries for your users, while specific enough to be manageable.
In our activity example, the activity contains different properties based on the type of activity. Our cycling activity contains properties associated with the bike that was used, while a kayaking activity may contain properties associated with the kayak that was used.
Because a kayaking event may have different properties to a running event, it might seem logical to put each of them in distinct collections. However, if we had distinct cycling, running and kayaking collections, we would lose the opportunity to query details that are common to all activities.
As a general rule, consider the common action among your events and decide if the specific variants of that action warrant grouping those events together.
Events have the following core properties:
It is also important to consider how to group events into collections to enable future queries to be answered.
Consider our example event again and notice the age property of the user:
HashMap<String, Object> myEvent = new HashMap<String, Object>();
myEvent.put("type", "cycling");
...
HashMap<String, Object> user = new HashMap<String, Object>();
user.put("id", 698396);
user.put("firstName", "Bruce");
user.put("lastName", "Jones");
user.put("age", 35);
myEvent.put("user", user);
The user's age is going to be duplicated in every activity he/she completes throughout the year. This may seem inefficient; however, remember that Connect is about analyzing. This denormalization is a real win for analysis; the key is that event data stores state over time, rather than merely current state. This helps us answer questions about why something happened, because we know what the "world" looked like at that point of time.
For example, imagine we wanted to chart the average distance cycled per ride, grouped by the age of the rider at the time of the ride. We could simply execute the following query:
var query = connect.query('activity')
.select({ averageDistance: { avg: 'distance' } })
.groupBy('user.age');
var chart = connect.chart(query, '#chart', {
title: 'Average distance per activity by age',
chart: { type: 'bar' }
});
It's this persistence of state over time that makes event data perfect for analysis.
By their very nature, events cannot change, as they always record state at the point in time of the event. This is also the reason to record as much rich information about the event and "state of the world" as possible.
For example, in our example event above, while Bruce Jones may now be many years older, at the time he completed his bike ride, he was 35 years of age. By ensuring this event remains immutable, we can correctly analyze bike riding over time by 35-year-olds.
Consider events as recording history - as much as we'd occasionally like to, we can't change history!
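On the client side, one way to honor this is to copy an entity's current properties into the event at the moment it is created, rather than sharing a map that may be modified later (for example, while the event sits in a queue). The helper below is a hypothetical sketch of that idea.

```java
import java.util.HashMap;
import java.util.Map;

final class EventSnapshots {
    /**
     * Returns a copy of the entity's current properties, so that later changes
     * to the entity do not alter an event that has already been built.
     * Note: this is a shallow copy; nested maps would need copying as well.
     */
    static Map<String, Object> snapshot(Map<String, Object> entity) {
        return new HashMap<String, Object>(entity);
    }
}
```

With this in place, myEvent.put("user", EventSnapshots.snapshot(user)) records Bruce's age as it was when the ride was completed, even if the user map is updated afterwards.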
Events are rich in that they specify very detailed state. They specify details about the event itself, the entities involved and the state of the "world" at that point in time.
Consider our example activity event - the top level type property describes something about the activity itself (a run, a bike ride, a kayak etc.). The user property specifies rich information about the actor who performed the event. In this case it's the person who completed the activity, complete with their name and age.
In reality, though, we may decide to include a few other nested entities in our event, for example:
HashMap<String, Object> myEvent = new HashMap<String, Object>();
myEvent.put("type", "cycling");
...
HashMap<String, Object> user = new HashMap<String, Object>();
user.put("id", 698396);
user.put("firstName", "Bruce");
...
HashMap<String, Object> bike = new HashMap<String, Object>();
bike.put("id", 231806);
bike.put("brand", "Specialized");
bike.put("model", "S-Works Venge");
HashMap<String, Object> weather = new HashMap<String, Object>();
weather.put("condition", "Raining");
weather.put("temperature", 21);
weather.put("humidity", 99);
weather.put("wind", 17);
myEvent.put("user", user);
myEvent.put("bike", bike);
myEvent.put("weather", weather);
Note our event now includes details about the bike used and the weather conditions at the time of the activity. By adding this extra bike state information to our event, we have opened up extra possibilities for interrogating our data. For example, we can now query the average distance cycled by each model of bike that was built by "Specialized":
var query = connect.query('activity')
.select({ averageDistance: { avg: 'distance' } })
.groupBy('bike.model')
.filter({
'bike.brand': 'Specialized'
});
The weather also provides us with exciting insights - what did the world look like at this point in time? What was the weather like? Storing this data allows us to answer yet more questions. We can test our hypothesis that "older people are less scared of riding in the rain" by simply charting the following query:
var query = connect.query('activity')
.select({ averageDistance: { avg: 'distance' } })
.groupBy(['user.age', 'weather.condition']);
As you can see, the richer and more denormalized the event, the more interesting answers can be derived when later querying.
Events in Connect should be considered semi-structured - that is, they have an inherent structure, but it is not defined. This means you can, and should, push as much detailed information about an event and the state of the "world" as possible. Moreover, this allows you to improve your schema over time and add extra information about new events as that information becomes available.
While you can post almost any event structure to Connect, there are a few by-design restrictions.
You cannot have any property in the root document beginning with "tp_". This is because we prefix our own internal properties with this. Internally, we merge our properties into your events for performance at query time.
The property "_id" is reserved and cannot be pushed.
The properties "id" and "timestamp" have special purposes. These allow consumers to specify a unique ID per event and override the event's timestamp respectively. You cannot use the "id" property in queries. Refer to "reliability of events" and "timestamps" for more information.
The length of property names can't exceed 255 characters. If you need property names longer than this, you probably need to reconsider the structure of your event!
Properties cannot include a dot in their names. This is because dots are used in querying to access nested properties. The following is an example of an invalid event property due to a dot in the name:
event.put("invalid.property", "value");
While you can create events with arrays, it is currently not possible to take advantage of these arrays at query time. Therefore, you should avoid using arrays in your events unless you plan to export the raw events.
Distinct count is currently not supported for querying, therefore you should consider how to structure your event if your application relies on this.
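Several of the naming restrictions above can be checked locally before an event is pushed. The validator below is a hypothetical sketch, not part of the SDK. It assumes the "tp_" and "_id" rules apply to the root map only, and that the length and dot rules apply at every nesting level. The API remains the authority on what is accepted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class EventPropertyValidator {
    /** Returns a list of human-readable problems; an empty list means no issue was found. */
    static List<String> validate(Map<String, Object> event) {
        List<String> problems = new ArrayList<String>();
        check(event, true, problems);
        return problems;
    }

    private static void check(Map<?, ?> map, boolean isRoot, List<String> problems) {
        for (Map.Entry<?, ?> entry : map.entrySet()) {
            String name = String.valueOf(entry.getKey());
            // ASSUMPTION: these two rules are enforced for root properties only
            if (isRoot && name.startsWith("tp_")) {
                problems.add("Root property must not begin with \"tp_\": " + name);
            }
            if (isRoot && name.equals("_id")) {
                problems.add("The property \"_id\" is reserved");
            }
            // ASSUMPTION: these two rules are enforced at every nesting level
            if (name.length() > 255) {
                problems.add("Property name exceeds 255 characters: " + name.substring(0, 20) + "...");
            }
            if (name.indexOf('.') >= 0) {
                problems.add("Property name must not contain a dot: " + name);
            }
            // Recurse into nested maps
            if (entry.getValue() instanceof Map) {
                check((Map<?, ?>) entry.getValue(), false, problems);
            }
        }
    }
}
```

Running the validator before client.push(...) or client.add(...) lets you log or drop a malformed event early instead of discovering the problem from an InvalidEventException.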