Recipes  Built by Developers For Developers

At Data Applied, we take a pragmatic approach to programmability. We are committed to making analytics easily accessible not only to users but also to programs. We don't assume that our Web client does everything everyone expects it to. We also understand that different developers prefer different tools and environments. Our XML-based Web API opens things up while ensuring security. We empower developers to use analytics the way they want. On this page, we introduce key concepts and code recipes.

Understanding Entities

Entities represent objects such as workspaces, tables, fields, tasks, or task results. Entities can be transmitted using messages, and are serialized to XML when transmitted over HTTP. Some key entities are listed below.

  • WorkspaceInfo: Secure container encapsulating other objects.
  • TableInfo: Table (data set) uploaded by a user for analysis.
  • FieldInfo: Field (column) found in an uploaded data set.
  • TaskInfo: Task to execute against a data set.
  • TaskResultInfo: Analysis results obtained by executing tasks.
  • ChunkInfo: Chunk of data uploaded to create a new data set.

Understanding Messages

Messages represent actions such as create, retrieve, update, or delete. Messages can transmit entities, and are serialized to XML when transmitted as HTTP requests / responses. Some key messages are listed below.

  • CreateMessage: Creates a single specified entity.
  • UpdateMessage: Updates a single specified entity.
  • DeleteMessage: Deletes a single specified entity.
  • StartTaskMessage: Starts a previously created task.
  • RetrieveMessage: Retrieves / finds entities using condition restrictions.
  • RetrieveDataMessage: Retrieves uploaded data rows as XML.

Serializing Entities

Entities use simple XML serialization rules when transmitted as part of HTTP requests or responses. Entities are usually wrapped within messages, which are also serialized to XML. Entities and messages are serialized in a similar manner.

  • Consider entity "Entity" with a string property "Prop" set to value "hello"
  • Its representation is <Entity><Prop>hello</Prop></Entity>
  • Consider entity "Entity" with a nested property "Prop" set to entity "Nested"
  • Its representation is <Entity><Prop><Nested>...</Nested></Prop></Entity>
  • Here is how entity properties should be serialized to XML:
  • Boolean: <Entity><Prop>True</Prop></Entity>
  • Numeric: <Entity><Prop>97.37</Prop></Entity>
  • Enum: <Entity><Prop>1</Prop></Entity>
  • String: <Entity><Prop>hello</Prop></Entity>
  • Guid: <Entity><Prop>83e4e933-...-b11f-f7485e8eb83c</Prop></Entity>
  • Date: <Entity><Prop>12/29/2009 12:47:28 PM</Prop></Entity>
  • Nested: <Entity><Prop><Nested>…</Nested></Prop></Entity>
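
The rules above can be sketched in a few lines of Python. This is only an illustration, not the official library: the helper names (format_value, to_element, to_xml) are hypothetical, and the date format is inferred from the single example shown above (unpadded month/day/year with a 12-hour clock), which is an assumption.

```python
import datetime
import enum
import xml.etree.ElementTree as ET


def format_value(value):
    """Format a scalar property value following the rules listed above."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "True" if value else "False"
    if isinstance(value, enum.Enum):  # enums are written as their numeric value
        return str(int(value.value))
    if isinstance(value, datetime.datetime):
        # Assumed format, inferred from "12/29/2009 12:47:28 PM".
        hour12 = value.hour % 12 or 12
        suffix = "AM" if value.hour < 12 else "PM"
        return (f"{value.month}/{value.day}/{value.year} "
                f"{hour12}:{value.minute:02d}:{value.second:02d} {suffix}")
    return str(value)  # numbers, strings, and uuid.UUID values


def to_element(name, props):
    """Build <name>...</name> from a dict mapping property names to values.

    A value given as a (nested_name, nested_props) tuple is a nested entity.
    """
    element = ET.Element(name)
    for prop_name, value in props.items():
        child = ET.SubElement(element, prop_name)
        if isinstance(value, tuple):
            nested_name, nested_props = value
            child.append(to_element(nested_name, nested_props))
        else:
            child.text = format_value(value)
    return element


def to_xml(name, props):
    """Serialize an entity (or message) to an XML string."""
    return ET.tostring(to_element(name, props), encoding="unicode")
```

For example, to_xml("Entity", {"Prop": "hello"}) produces <Entity><Prop>hello</Prop></Entity>. Because messages follow the same rules, the same helper can be applied to a message name.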

Serializing Messages

Messages use simple XML serialization rules when transmitted as part of HTTP requests or responses. Messages can have nested properties which specify entities. Such entities are serialized to XML as nested properties.

  • Consider message "Msg" with a string property "Prop" set to value "hello"
  • Its representation is <Msg><Prop>hello</Prop></Msg>
  • Consider message "Msg" with a nested property "Prop" set to entity "Nested"
  • Its representation is <Msg><Prop><Nested>...</Nested></Prop></Msg>
  • Here is how message properties should be serialized to XML:
  • Boolean: <Msg><Prop>True</Prop></Msg>
  • Numeric: <Msg><Prop>97.37</Prop></Msg>
  • Enum: <Msg><Prop>1</Prop></Msg>
  • String: <Msg><Prop>hello</Prop></Msg>
  • Guid: <Msg><Prop>83e4e933-...-b11f-f7485e8eb83c</Prop></Msg>
  • Date: <Msg><Prop>12/29/2009 12:47:28 PM</Prop></Msg>
  • Nested: <Msg><Prop><Nested>…</Nested></Prop></Msg>

Submitting Requests

To submit a request, create a message, set its properties, and serialize it (including nested entities) to XML. Then send this XML to the execution URL; the response is message XML of the same type.

  • Serialize your message to XML, and submit bytes
  • All XML requests should be sent in UTF-8 format
  • All XML responses will be sent in UTF-8 format (with a BOM present)
  • All XML requests should be sent to %Base_URL%/Execute/default.aspx
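
A minimal sketch of these steps using only the Python standard library follows. Several details are assumptions not stated on this page: the HTTP method (POST), the Content-Type header, the placeholder BASE_URL, and the function names themselves. The "utf-8-sig" codec is used to drop the BOM from the response.

```python
import urllib.request

BASE_URL = "https://example.invalid"  # placeholder standing in for %Base_URL%


def build_request(message_xml, base_url=BASE_URL):
    """Wrap serialized message XML in an HTTP request to the execution URL."""
    body = message_xml.encode("utf-8")  # requests are sent as UTF-8 bytes
    return urllib.request.Request(
        base_url + "/Execute/default.aspx",
        data=body,
        method="POST",  # assumption: the page does not name the HTTP method
        headers={"Content-Type": "text/xml; charset=utf-8"},  # assumption
    )


def decode_response(raw_bytes):
    """Decode response bytes; 'utf-8-sig' strips the leading BOM if present."""
    return raw_bytes.decode("utf-8-sig")


def execute(message_xml, base_url=BASE_URL):
    """Send a message and return the response XML as text."""
    with urllib.request.urlopen(build_request(message_xml, base_url)) as response:
        return decode_response(response.read())
```

Keeping request construction and response decoding separate from the network call makes both easy to check without a server.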

Optimizing Requests

Requests can be optimized in terms of bandwidth usage and execution speed using different strategies. Those strategies include caching static data, requesting specific properties, batching requests, serializing efficiently, compressing data, etc.

  • Consider caching authentication tickets to avoid re-authentication
  • Consider caching static results (ex: fields associated with a data table)
  • Only update properties you need (set UpdateMessage.UpdateFieldsXml)
  • Only retrieve properties you need (set RetrieveMessage.RetrieveFieldsXml)
  • Only retrieve fields you need (set RetrieveDataMessage.RetrieveFieldsXml)
  • Specify a retrieve count when possible (set RetrieveMessage.Count)
  • Avoid unnecessary sorting and grouping clauses
  • Batch requests when possible (see "Batching Requests")
  • Consider compressing requests (see "Compressing Requests")

In addition, properties having the default value can be omitted from XML when serialized. For example, the following representations are equivalent: <Entity><Prop>False</Prop></Entity> and <Entity></Entity>.

  • Boolean properties have a default of false
  • Numeric properties have a default value of 0
  • Enum properties have a default of 0
  • String properties have a default of null
  • Guid properties have a default of the empty GUID
  • Date properties have a default of Jan 1, 0001
  • Nested entities have a default of null
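
As a sketch of how a client might apply these defaults before serializing, the hypothetical helper below drops every property whose value equals its default. The function names are illustrative only; the default values are the ones listed above.

```python
import datetime
import enum
import uuid

EMPTY_GUID = uuid.UUID(int=0)          # 00000000-0000-0000-0000-000000000000
MIN_DATE = datetime.datetime(1, 1, 1)  # Jan 1, 0001


def is_default(value):
    """Return True when a property value equals the default for its type."""
    if value is None:  # strings and nested entities default to null
        return True
    if isinstance(value, bool):
        return value is False
    if isinstance(value, enum.Enum):
        return value.value == 0
    if isinstance(value, (int, float)):
        return value == 0
    if isinstance(value, uuid.UUID):
        return value == EMPTY_GUID
    if isinstance(value, datetime.datetime):
        return value == MIN_DATE
    return False  # e.g. an empty string is not null, so it is kept


def drop_defaults(props):
    """Return a copy of props without default-valued properties."""
    return {name: value for name, value in props.items() if not is_default(value)}
```

For example, drop_defaults({"Prop": False}) returns an empty dict, which serializes to <Entity></Entity>.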

Retrieving Data

Entities and data rows are found and retrieved using retrieve requests. Condition restrictions can be used to specify selection, ordering, and grouping clauses. Callers can also specify which properties or fields to retrieve, and use paging.

  • Retrieve entities using RetrieveMessage requests
  • Retrieve data rows using RetrieveDataMessage requests
  • Retrieve tasks using RetrieveTaskMessage requests
  • Retrieve task results using RetrieveTaskResultMessage requests
  • Specify selection / ordering / grouping by setting a ConditionRestriction
  • Specify select clauses by adding child ValueRestriction instances
  • Specify ordering clauses by adding child Ordering instances
  • Specify grouping clauses by adding child Grouping instances
  • To use paged retrieval, see "Paging Retrieves"
  • To specify which properties to retrieve, see "Optimizing Requests"

Registering Users

User registration includes security checks designed to prevent abuse. User registrations can be performed programmatically, but require interaction to resolve a CAPTCHA challenge. In addition, activating a link received by e-mail may be required.

  • Create a CaptchaMessage message
  • Specify the desired account name and submit the CAPTCHA request
  • Receive a CAPTCHA response containing image bytes and a hash value
  • Display the image to the user and ask the user to read characters
  • Create a RegisterUserMessage message
  • Specify the hash value & read characters and submit the register request
  • A user account will be created if supplied characters are valid
  • In some cases, an e-mail with an activation link may be received
  • Activating the link is then required before log in can succeed

Intercepting Messages

Callouts are function callbacks which can be registered to intercept messages and alter normal processing. Pre-callouts execute before processing, while post-callouts execute after processing. Callouts must be implemented as .NET assemblies.

  • Callout assemblies are registered by adding an entry to an XML config file
  • Callouts are implemented as methods following a specific signature
  • Callouts can alter normal processing (ex: block, modify, etc.)
  • Callouts can perform auxiliary functions (ex: forward, log, etc.)

Setting Properties

Certain properties can be neither set nor modified, for security reasons (calculated). Other properties can be set once, but not subsequently modified (set once). Other properties must be set to ensure naming, identification, or location (mandatory).

  • Ex: the DateCreated property can never be set nor modified (calculated)
  • Ex: the Id property can be set but not modified (set once)
  • Ex: the Id property is required for delete messages to succeed (mandatory)
  • Specified properties which cannot be set are always ignored
  • Specified properties which cannot be updated are ignored or cause errors

Handling Errors

When requests succeed, the response message is of the same type as the request message. For example, successful RetrieveMessage requests result in RetrieveMessage responses. However, when an error occurs, an ErrorMessage response is sent instead.

  • ErrorMessage responses indicate an error occurred
  • The ErrorCode property contains a code used to identify the error type
  • The ErrorDetails property contains textual error details
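
A client can detect errors by inspecting the root element of the response. The sketch below assumes that ErrorCode and ErrorDetails are serialized as direct child elements of ErrorMessage, which follows the serialization rules described on this page but is not confirmed here. ApiError and parse_response are hypothetical names.

```python
import xml.etree.ElementTree as ET


class ApiError(Exception):
    """Raised when the server answers with an ErrorMessage response."""

    def __init__(self, code, details):
        super().__init__(f"{code}: {details}")
        self.code = code
        self.details = details


def parse_response(response_xml):
    """Parse response XML, raising ApiError for ErrorMessage responses."""
    root = ET.fromstring(response_xml)
    if root.tag == "ErrorMessage":
        # Assumed layout: <ErrorMessage><ErrorCode>..</ErrorCode>
        #                 <ErrorDetails>..</ErrorDetails></ErrorMessage>
        raise ApiError(root.findtext("ErrorCode"), root.findtext("ErrorDetails"))
    return root
```

Turning ErrorMessage responses into exceptions lets calling code treat every returned element as a successful response of the requested type.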

Compressing Requests

Requests can optionally be compressed using the GZip format. Because XML is highly compressible, this is useful both when uploading large amounts of data and when sending large message requests. We recommend compressing all submitted data if your client has sufficient computing power.

  • Compression is applied to the entire request XML
  • Normally, requests are serialized to XML, and UTF-8 bytes submitted
  • To compress, zip UTF-8 bytes before submission
  • To compress, also set the "Zipped" HTTP header
  • This header can be set to any value (ex: "1") to indicate compression
  • To compress with .NET, use System.IO.Compression.GZipStream
  • To compress with Java, use java.util.zip.GZIPOutputStream
  • Responses may also be compressed by the server
  • However, responses are compressed directly at the HTTP level
  • Most Web proxies make this type of compression transparent
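
In Python, the request side of these steps might look like the sketch below. The function name is hypothetical; the header name "Zipped" and the example value "1" come from the list above.

```python
import gzip


def compress_request(message_xml):
    """Return (body, headers) for a GZip-compressed request."""
    utf8_bytes = message_xml.encode("utf-8")  # serialize to UTF-8 first
    body = gzip.compress(utf8_bytes)          # then compress the whole request
    headers = {"Zipped": "1"}                 # any value signals compression
    return body, headers
```

The returned body and headers can then be passed to whatever HTTP client submits the request.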

Using Joins

In some cases, data cannot be retrieved using simple condition restrictions: joins must be used. For example, suppose that, after executing a clustering task, we want to retrieve all data rows which are members of a given cluster. A join restriction must be used as a bridge between task results and data rows.

  • Create a RetrieveDataMessage message
  • Specify a data table to retrieve data rows from
  • Create a TaskResultJoinRestriction instance
  • On this instance, specify the task which generated task results to retrieve
  • On this instance, specify the type of task result to retrieve
  • Set the instance on the RetrieveDataMessage.JoinRestriction property
  • Specify data fields & task result properties to retrieve
  • Specify any condition restriction to restrict results
  • Submit the data retrieve request
  • Receive a data retrieve response containing task properties & data rows

Leveraging Library Code

Library code allows developers to quickly develop applications without having to manually compose or parse XML. The library code is available for download, along with a comparison between code written using the library and raw XML messages.

Authenticating Requests

Each request is authenticated by attaching an authentication ticket, which can be obtained by presenting valid user credentials. An authentication ticket consists of a user ID, followed by a cryptographic hash.

  • Create a LogonMessage message
  • Specify the user name and password and submit the log in request
  • Receive a log in response containing a ticket and account info
  • Attach the ticket to subsequent requests to authenticate them
  • Consider caching tickets to avoid repeat authentication
  • To get account info from any ticket, use WhoAmIMessage requests
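
Ticket caching can be as simple as the sketch below. TicketCache is a hypothetical helper; the log_on callable stands in for whatever code submits the LogonMessage request and extracts the ticket from the response.

```python
class TicketCache:
    """Obtain a ticket once and reuse it for subsequent requests."""

    def __init__(self, log_on):
        self._log_on = log_on  # callable returning a fresh ticket
        self._ticket = None

    def get(self):
        """Return the cached ticket, logging on only when none is cached."""
        if self._ticket is None:
            self._ticket = self._log_on()
        return self._ticket

    def invalidate(self):
        """Forget the ticket, e.g. after the server rejects it."""
        self._ticket = None
```

A client would call get() before each request and invalidate() when a request fails for authentication reasons.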

Uploading Data

To create a new data table, chunks of data (base64-encoded bytes) must be uploaded. Once all chunks have been uploaded, a data import task must be created & started to import chunk data, gather statistics, index fields, create the data table, etc.

  • Generate a single unique table info ID (TableInfoId)
  • Read blocks of bytes from a target CSV or Excel file
  • Base64-encode each block of bytes
  • Create a ChunkInfo entity for each block
  • Set the same TableInfoId property on each ChunkInfo
  • Create a CreateMessage message for each ChunkInfo
  • Submit the create request for each ChunkInfo
  • Create & start a RootDataUploadTaskInfo task (see "Executing Tasks")
  • Wait for the task to complete (see "Executing Tasks")
  • A new data table will be created with its ID set to the specified TableInfoId
  • Consider compressing requests (see "Compressing Requests")
  • Consider specifying the separator & culture when uploading CSV data
  • For example, US CSV files may specify dates in MM/DD/YY format
  • However, CSV files generated in other locales may use a different format
  • Consider specifying the FileType property when uploading data
  • Consider specifying the Context property when uploading Excel files
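
The chunking steps at the start of this list can be sketched as follows. The block size and the "Base64Data" key are assumptions made for illustration (this page does not name the ChunkInfo property that carries the bytes, nor a recommended block size); only TableInfoId is taken from the list above.

```python
import base64
import uuid


def make_chunks(data, block_size=64 * 1024):
    """Split raw file bytes into base64-encoded blocks sharing one table ID.

    Returns (table_info_id, chunks), where each chunk is a dict describing
    one ChunkInfo entity to create.
    """
    table_info_id = str(uuid.uuid4())  # one unique ID for the whole upload
    chunks = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        chunks.append({
            "TableInfoId": table_info_id,
            # "Base64Data" is a hypothetical property name.
            "Base64Data": base64.b64encode(block).decode("ascii"),
        })
    return table_info_id, chunks
```

Each returned dict would then be wrapped in its own CreateMessage and submitted before the upload task is created and started.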

Transforming Data

To transform data, a source table must be selected. Then, a data transformation task must be created and started to execute steps such as: convert a field, filter rows, sample rows, rank rows, scramble fields, set fields to calculated values, etc.

  • Identify the data set to copy and optionally transform (SourceTableInfoId)
  • Generate a single unique table info ID (TargetTableInfoId)
  • Create a TransformSequence sequence
  • On this sequence, set a list of transformation steps to execute
  • Serialize the sequence to XML
  • Create & start a RootDataTransformTaskInfo task (see "Executing Tasks")
  • Wait for the task to complete (see "Executing Tasks")
  • A new data table will be created with its ID set to the specified one

Executing Tasks

To execute a task, a target data table and a type of task must be selected. The task must then be created and started. To check if a task has completed, polling can be used. Once the task has completed, analysis results can be retrieved.

  • Select the type of task to execute (ex: a RootAssociationTaskInfo)
  • Create an instance of this type of task
  • Create a CreateMessage message
  • Specify the task instance and submit the create request
  • Create a StartTaskMessage message
  • Specify the task instance and submit the start request
  • Create a RetrieveTaskMessage message
  • Submit the retrieve task request
  • Receive a retrieve task response and check status for completion
  • Create a RetrieveTaskResultMessage message
  • Specify the task and submit the retrieve task result request
  • Receive retrieve task result responses, including analysis results
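
The polling step can be sketched as a small loop. Here is_complete is a hypothetical callable that submits a RetrieveTaskMessage request and reports whether the returned status indicates completion; the interval and timeout values are arbitrary assumptions.

```python
import time


def wait_for_task(is_complete, interval_seconds=2.0, timeout_seconds=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until is_complete() returns True or the timeout elapses."""
    deadline = clock() + timeout_seconds
    while not is_complete():
        if clock() >= deadline:
            raise TimeoutError("task did not complete in time")
        sleep(interval_seconds)  # wait before polling again
```

The sleep and clock parameters exist only so the loop can be exercised without real delays. Once wait_for_task returns, the task results can be retrieved.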

Paging Retrieves

Sometimes, large amounts of data must be retrieved, and not all results may fit in one HTTP response. Data Applied limits the number of entities or data rows returned by a single request to 1000. To retrieve more entities or data rows, you must use paging.

  • Create a retrieve message (ex: RetrieveMessage, RetrieveDataMessage)
  • Specify the max count of entities or data rows to retrieve (ex: 500)
  • Specify the offset from which to retrieve data (ex: offset 0, then 500, etc.)
  • Preferably, specify an order in the retrieve message's condition restriction
  • For example, order by ID to ensure consistency between paged retrievals
  • Submit the retrieve request and aggregate results
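
The paging loop itself is independent of the message format and might look like this sketch. fetch_page is a hypothetical callable that submits one retrieve request with the given offset and count, and returns the list of entities or data rows in the response.

```python
MAX_PAGE_SIZE = 1000  # per-request limit stated above


def retrieve_all(fetch_page, page_size=500):
    """Aggregate results page by page until a short page is returned."""
    if not 0 < page_size <= MAX_PAGE_SIZE:
        raise ValueError("page_size must be between 1 and 1000")
    results = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        results.extend(page)
        if len(page) < page_size:  # a short (or empty) page means we are done
            return results
        offset += page_size
```

Results are only consistent across pages if every request uses the same ordering, which is why ordering by ID is recommended above.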

Batching Requests

Sometimes, executing a sequence of requests using a single request can be beneficial. For example, a developer may want to create, start, and retrieve a task using a single HTTP request, so as to reduce latency and traffic.

  • Create a ComplexMessage message
  • Append a sequence of messages to execute
  • Attach a ticket to the complex request
  • Indicate if sequence execution should stop upon failure
  • Submit the complex request
  • Receive a complex response containing multiple responses

Understanding Rights

Rights determine which actions users can perform on a workspace. Each right maps to a user, a workspace, and a right level (read / write / manage). Users without any right to a workspace cannot perform any action (not even find the workspace).

Manage Level:

  • View all tables, tasks, images, and comments in the workspace
  • Create / update / delete all workspace tables
  • Create / update / delete all workspace tasks
  • Create / update / delete all images & comments
  • Grant / revoke workspace rights & view security logs

Write Level:

  • View all tables, tasks, images, and comments in the workspace
  • Create workspace tables, but only update / delete owned ones
  • Create workspace tasks, but only update / delete owned ones
  • Create images & comments, but only update / delete owned ones
  • Cannot grant / revoke workspace rights or view security logs

Read Level:

  • View all tables, tasks, images, and comments in the workspace
  • Cannot create / update / delete workspace tables
  • Cannot create / update / delete workspace tasks
  • Cannot create / update / delete images or comments
  • Cannot grant / revoke workspace rights or view security logs
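
The three levels can be summarized as a small decision function. This is only a client-side illustration of the rules above (the server enforces the actual checks); the action names, including "administer" for granting / revoking rights and viewing security logs, are hypothetical.

```python
LEVELS = ("read", "write", "manage")


def is_allowed(level, action, owns_object=False):
    """Return True if a user with the given right level may perform action.

    level is "read", "write", "manage", or None for a user without any right.
    action is "view", "create", "update", "delete", or "administer".
    """
    if level is None:  # no right at all: nothing is allowed
        return False
    if level not in LEVELS:
        raise ValueError("unknown right level: %r" % (level,))
    if action == "view":
        return True
    if action == "create":
        return level in ("write", "manage")
    if action in ("update", "delete"):
        # Write-level users may only change objects they own.
        return level == "manage" or (level == "write" and owns_object)
    if action == "administer":
        return level == "manage"
    raise ValueError("unknown action: %r" % (action,))
```

For example, is_allowed("write", "delete", owns_object=True) is True, while is_allowed("write", "delete") is False.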

Managing Rights

Rights can be managed by users having manage-level rights on a workspace. Right management actions include granting, revoking, and changing rights. Users who are granted rights must approve them before they become effective.

  • Determine current rights using RetrieveMessage requests
  • Change rights using UpdateMessage requests (specify a new right level)
  • Revoke rights using DeleteMessage requests
  • Grant rights using CreateMessage requests (specify a user & right level)
  • Granted rights will be considered pending until approved by recipients
  • As a recipient, approve pending rights using UpdateMessage requests
  • As a recipient, reject pending rights using DeleteMessage requests
  • Delegate management responsibility by granting manage rights to others
  • By default, retrieve requests ignore workspaces with pending rights
  • Set the IncludePendingApproval property to override this behavior

Restricting Access

Access can be restricted using rights (see "Managing Rights"), but also using restricted tickets. With restricted tickets, users can securely allow third parties to perform operations on their behalf. This is convenient if the third party is not a user of the system.

  • Create a RestrictAccessMessage message
  • Specify a valid ticket and a set of access restrictions to apply to it
  • Obtain a ticket which is a restricted version of the original
  • Use RightLevelRestriction to lower the overall right level
  • For example, use this to lower any right you have to no more than "read"
  • Use InvocationRestriction to restrict allowed web service calls
  • For example, use this to only allow Read operations on TableInfo entities
  • Use WorkspaceInfoRestriction to restrict access scope
  • For example, use this to restrict access to a specific workspace
  • Use ExpirationRestriction to control ticket expiration
  • For example, use this to issue short-lived tickets valid for 5 minutes
  • Note that you can only decrease access using restrictions (never increase it)

Sampling & Identification

When data is uploaded, a random value is assigned to each row, while a field is registered to keep track of the additional column. For large data sets, tasks can be instructed to only process a random sample, or only take specific fields into consideration to run faster.

  • Random values are used for sampling but also data row identification
  • To specify a task's sample size, set the SampleSize property
  • To specify a task's field subset, set the FieldInfoFilterXml property
  • When all data is retrieved from a data table, the random field is included
  • Task results often specify random values to identify data rows
  • The random field is the one with a FieldPurpose set to RandomIdentity
  • To retrieve matching data rows, reference this field in a condition restriction

Versioning Messages

Because the XML-based Web API may change over time, it includes a version control mechanism. Requests can specify a version number. If it is omitted, the server assumes that requests are up to date and do not need special handling. If it is specified, the server may be able to perform special handling for a legacy format.

  • Requests can specify the version of the API used to format XML
  • Responses always indicate the latest version implemented by the server
  • Specify the Version property if you use (or may be using) a legacy format
  • Submit any request and read the Version property to get the server's version

Understanding Licensing

Licenses specify usage restrictions. For example, licenses affect how many data tables a user can create, how many rows can be imported, etc. Org licenses apply to all users, while user licenses are user-specific. Org licenses are cumulative: each added org license unlocks new capabilities for all. User licenses however are restrictive: they impose usage restrictions overriding org licenses.

  • Add org licenses using CreateMessage requests
  • Delete org licenses using DeleteMessage requests
  • Replace user licenses using UpdateMessage requests
  • User licenses can never be removed, only replaced
  • Org licenses can never be removed or modified, only added
  • License keys are always verified using cryptographic methods