Alterra.ai Phraser API

Phraser is a semantic intent classifier for natural language questions and commands.

Using this API you may build your own conversational virtual agents, messenger bots, Alexa skills and Actions on Google. You may voice-enable mobile apps and IoT devices. You may build a fully autonomous solution or pair it up with humans.

Phraser is a query classifier. Given a set of queries and a set of classes (intents), it determines which class a particular query belongs to.

Users may ask the same question or make the same command in a multitude of semantically equivalent ways – Phraser will reduce these paraphrases to one canonical form (assign to a pre-defined class). Your program may then reply with text or call any other function.

An intent is a group of semantically similar natural language queries. E.g., queries like “How can I call you?”, “What is your phone number?”, “I need a contact number to call”, etc. belong to one “your phone number” intent.

You define the intents as you think fit.

Powered by Deep Learning algorithms, the system requires a training corpus of historic user queries with the correct intents assigned to them. The bigger the corpus, the higher the classification quality. You upload your training corpus to the system via this API, too.

The system keeps a log of queries it receives. Incoming queries may be assigned correct intents by humans and added to the training corpus.

At the core of Phraser is phrase2vec, a sequence embedding algorithm our company has come up with. It’s like word2vec, but for multiple-word questions and commands. It maps short phrases to vectors in a multi-dimensional space, so that semantically similar phrases cluster together. These vectors are also available via this API.

Cosine similarity between vectors defines how semantically close the underlying phrases are. Unlike edit (Levenshtein) distance, it does not depend on shared words or characters. E.g., phrases like “How can I call you?” and “What is your phone number?” will be very close to each other, even though they have no words in common and the edit distance between them is therefore large.
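As a rough illustration of the comparison described above, the following sketch computes cosine similarity between two phrase vectors in plain Python. The function name `cosine_similarity` is ours, not part of the API, and the vectors are assumed to be plain lists of numbers such as those in the `vector` field of a Vector response.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)
```

Values close to 1 indicate phrases that point in nearly the same direction; values near 0 indicate unrelated phrases.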

The API has three main parts:

Performing intent classification
Defining intents
Working with the training corpus and query log

The first part is used at serving time. It performs the actual classification. The user enters a query; you pass it to this API, and it returns the intent candidates, ordered by matching score.

The other two parts are used ahead of time, to upload and edit data and to train the machine learning models.

The API may return more than one result. If your application is a fully automated bot you may display only the first result. You may also display several results. If you have a human in the loop you may display several results to the agent and let them manually select which one to send to the end-user.

All APIs defined here follow the REST paradigm.

All methods require an API key. The API key is passed in the “Authorization” header of the request. You automatically receive your API key when you self-register for the service on Alterra’s website.

All GET methods take arguments as CGI parameters in the URL.

All POST and PUT requests take arguments in the request body, which should be in JSON format.

All methods return JSON. It may be empty if the only result is an operation status which is reported as HTTP status code.

The end-point for this API is at http://next.alterra.ai/api/phraser/v1/
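The conventions above (API key in the “Authorization” header, GET arguments in the URL, JSON bodies for POST and PUT) can be sketched with Python's standard library as follows. The helper names `build_request` and `call` are hypothetical. Sending the bare key as the header value and setting a `Content-Type` header are assumptions; the text above does not specify either.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://next.alterra.ai/api/phraser/v1/"


def build_request(method, path, api_key, params=None, body=None):
    """Build (but do not send) a request for a Phraser endpoint."""
    url = BASE_URL + path.lstrip("/")
    if params:
        # GET arguments travel as URL query parameters.
        url += "?" + urllib.parse.urlencode(params)
    headers = {"Authorization": api_key}  # assumption: bare key, no scheme prefix
    data = None
    if body is not None:
        # POST and PUT arguments travel as a JSON request body.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"  # assumption
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(request):
    """Send the request and decode the JSON reply (None if the body is empty)."""
    with urllib.request.urlopen(request) as response:
        raw = response.read()
    return json.loads(raw) if raw else None
```

For example, `build_request("GET", "search", "MY_KEY", params={"query": "How can I contact you"})` prepares the search request shown in the Search API section; passing it to `call` would send it.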

Search API

This is the main part. It is used at serving time and performs the actual classification. The user queries your application, and you pass the query to this API, which returns the search results, i.e. the list of candidate intents, ordered by matching score (relevancy).

Search: `GET /api/phraser/v1/search`

Given a user query, find relevant intent

Arguments

Field name	In	Type	Always present	Description
`query`	query	string	Y	user query

Example:

GET /api/phraser/v1/search?query=How+can+I+contact+you

Server reply

Type	Description
Search response object	A search response object (list of search results, ordered by relevancy)

Example:

{
  "search_id": "73b61636-29b1-4bee-8845-3bc0ffe9a86a",
  "results":
    [
        {
            "intent_id": "42",
            "title": "send_email"
        },
        {
            "intent_id": "43",
            "title": "order_pizza"
        }
    ]
}
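A minimal sketch of consuming such a reply: take the first result for a fully automated bot, or the first few for a human agent to choose from. The function name `top_intents` is hypothetical. Because the example above labels the intent name `title` while the Search result entity calls it `intent_name`, the sketch accepts either key; that tolerance is our assumption, not documented behavior.

```python
def top_intents(search_response, k=1):
    """Return up to k (intent_id, name) pairs, most relevant first."""
    picked = []
    for result in search_response.get("results", [])[:k]:
        # Accept either field name for the intent's name (assumption).
        name = result.get("intent_name", result.get("title"))
        picked.append((int(result["intent_id"]), name))
    return picked


sample_reply = {
    "search_id": "73b61636-29b1-4bee-8845-3bc0ffe9a86a",
    "results": [
        {"intent_id": "42", "title": "send_email"},
        {"intent_id": "43", "title": "order_pizza"},
    ],
}

best = top_intents(sample_reply)            # one result for an automated bot
shortlist = top_intents(sample_reply, k=3)  # several results for a human agent
```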

Phrase2Vec: `GET /api/phraser/v1/vector`

Given a phrase, get DNN result vector

Arguments

Field name	In	Type	Always present	Description
`text`	query	string	Y	phrase text

Example:

GET /api/phraser/v1/vector?text=Hello+world

Server reply

Type	Description
Vector response object	A vector response object

Train search algo on updated corpus: `POST /api/phraser/v1/train`

Arguments

No arguments

Server reply

Train successful

Intents API

This part is used for uploading and editing the intents.

Each POST, PUT or DELETE endpoint may be used with an optional CGI parameter train=true to re-train the ML models after the API call succeeds. If you plan a run of consecutive API calls that update the corpus, it is advised not to use train=true, but rather to invoke the train API call once, after you have finished editing the training corpus.
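The difference between the two approaches can be sketched as a small planning function that lists the calls to make. `plan_corpus_update` is a hypothetical helper; it only builds method and path pairs and sends nothing.

```python
from urllib.parse import urlencode

API_PREFIX = "/api/phraser/v1/"


def plan_corpus_update(steps, train_each=False):
    """Return the (method, path) calls for a batch of corpus edits.

    steps is a list of (method, endpoint) pairs such as ("POST", "intents").
    With train_each=False, edits are sent without train=true and a single
    POST to the train endpoint is appended at the end.
    """
    suffix = "?" + urlencode({"train": "true"}) if train_each else ""
    planned = [(method, API_PREFIX + endpoint + suffix) for method, endpoint in steps]
    if steps and not train_each:
        planned.append(("POST", API_PREFIX + "train"))
    return planned
```

With two edits, `train_each=True` would trigger two re-trainings, while the default plan triggers one.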

Get all intents: `GET /api/phraser/v1/intents`

Get a list of intents. With pagination.

Arguments

Field name	In	Type	Always present	Description
`offset`	query	integer	N	start offset
`limit`	query	integer	N	maximum number of intents to return
`queries`	query	boolean	N	Return queries with intents
`common`	query	string	N	Common corpus name

Example:

GET /api/phraser/v1/intents?offset=42&limit=2

Server reply

Type	Description
List of Intent objects	List of intents

Example:

[
    {
        "name": "send_email",
        "id": "42"
    },
    {
        "name": "order_pizza",
        "id": "43"
    }
]
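Paginated endpoints such as this one can be walked with a small generator. The sketch below takes a caller-supplied `fetch_page(offset, limit)` function (hypothetical; in practice it would issue the GET request) and assumes that a page shorter than `limit` marks the end of the list, which the reference does not state explicitly.

```python
def iter_pages(fetch_page, limit=100):
    """Yield every item from a paginated endpoint, one page at a time."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:  # assumption: a short page means no more data
            break
        offset += limit


# In-memory stand-in for the server, used here only to show the call pattern.
stored_intents = [{"id": i, "name": "intent_%d" % i} for i in range(1, 6)]


def fake_fetch(offset, limit):
    return stored_intents[offset:offset + limit]


all_intents = list(iter_pages(fake_fetch, limit=2))
```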

Add intents to the corpus: `POST /api/phraser/v1/intents`

Add new intents

Arguments

Field name	In	Type	Always present	Description
`body`	body	List of Intent objects	Y	List of intents. If no IDs are specified, they will be generated by the server. Otherwise IDs must be distinct positive integers
`replace`	query	boolean	N	Replace all intents with new ones
`partial`	query	boolean	N	Skip input intents with unavailable ids

Example:

POST /api/phraser/v1/intents
[
    {
        "name": "send_email",
        "id": "42"
    },
    {
        "name": "order_pizza",
        "id": "54"
    }
]

Server reply

Type	Description
List of integers	List of added intent ids

Example:

[42, 54]
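Before uploading, it may help to check the ID rule stated above (explicit IDs must be distinct positive integers) on the client side. The following is a sketch only; `validate_intents` is a hypothetical helper, and the server's own validation may differ.

```python
def validate_intents(intents):
    """Check a list of Intent objects before POSTing it; return it unchanged."""
    seen = set()
    for intent in intents:
        if not intent.get("name"):
            raise ValueError("every intent needs a name")
        if "id" not in intent:
            continue  # the server generates an ID for this intent
        intent_id = int(intent["id"])
        if intent_id <= 0:
            raise ValueError("intent id must be a positive integer: %r" % intent_id)
        if intent_id in seen:
            raise ValueError("duplicate intent id: %r" % intent_id)
        seen.add(intent_id)
    return intents
```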

Replace intents in the corpus: `PUT /api/phraser/v1/intents`

Replace existing intents

Arguments

Field name	In	Type	Always present	Description
`body`	body	List of Intent objects	Y	List of intents. You must fill id fields
`partial`	query	boolean	N	Skip input intents with unknown ids

Example:

PUT /api/phraser/v1/intents
[
    {
        "name": "send_email",
        "id": "42"
    },
    {
        "name": "order_pizza",
        "id": "54"
    }
]

Server reply

Type	Description
List of integers	List of replaced intent ids

Example:

[42, 54]

Delete all intents: `DELETE /api/phraser/v1/intents`

Delete all intents in the corpus

Arguments

No arguments

Server reply

Intents successfully deleted

Get intent by ID: `GET /api/phraser/v1/intents/{intentId}`

Returns an intent by given ID

Arguments

Field name	In	Type	Always present	Description
`intentId`	path	integer	Y	ID of the intent that needs to be fetched
`queries`	query	boolean	N	Return queries with intent

Server reply

Type	Description
Intent object	successful operation

Example:

{
    "name": "send_email",
    "id": "42"
}

Delete an intent: `DELETE /api/phraser/v1/intents/{intentId}`

Delete intent

Arguments

Field name	In	Type	Always present	Description
`intentId`	path	integer	Y	Intent id to delete

Server reply

Intent successfully deleted

Queries API

This part is used for working with the query log and training corpus for Machine Learning.

There are two types of queries:

All queries entered by users in the past – see Log entries
Queries admitted to the training corpus – see Queries

This API deals with the latter.

These two sets overlap heavily but may not be equal. Indeed, only legitimate user queries shall be added to the training corpus. Garbage and spam shall be discarded. On the other hand, some queries in the training corpus may come from sources other than logged user queries (e.g. other public training corpora).

Get all queries: `GET /api/phraser/v1/queries`

Get a list of queries associated with intents. With pagination.

Arguments

Field name	In	Type	Always present	Description
`offset`	query	integer	N	start offset
`limit`	query	integer	N	maximum number of queries to return
`common`	query	string	N	Common corpus name

Example:

GET /api/phraser/v1/queries?offset=1138&limit=1

Server reply

Type	Description
List of Query objects	List of queries

Example:

[
    {
        "hash": "23db4",
        "text": "Who should I contact about my booking?",
        "intent_id": 2
    }
]

Add queries: `POST /api/phraser/v1/queries`

Add queries associated with intents. Only queries not already in the corpus are actually added. The result contains one query hash per input query, in input order. The intent_id field must be set in every query.

Arguments

Field name	In	Type	Always present	Description
`body`	body	List of Query objects	Y	List of queries. You shouldn’t fill hash fields, they will be generated by the server
`partial`	query	boolean	N	Skip input queries with unknown intent ids

Example:

POST /api/phraser/v1/queries
[
    {
        "text": "Who should I contact about my booking?",
        "intent_id": 2
    },
    {
        "text": "Where is my confirmation?",
        "intent_id": 5
    },
    {
        "text": "WHERE IS MY CONFIRMATION???",
        "intent_id": 5
    }
]

Server reply

Type	Description
List of strings	List of query hashes

Example:

["67F227A57F1A496F", "47E25DF1D9BAF663", "47E25DF1D9BAF663"]
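Because the reply lists one hash per input query, the two lists can be zipped to see which inputs the server treats as the same query. In the example above, two texts that differ only in case and punctuation share a hash, which is consistent with the hash being computed from normalized text. The helper `group_by_hash` below is a hypothetical sketch.

```python
def group_by_hash(queries, hashes):
    """Map each returned hash to the input texts that produced it."""
    if len(queries) != len(hashes):
        raise ValueError("expected one hash per input query")
    groups = {}
    for query, query_hash in zip(queries, hashes):
        groups.setdefault(query_hash, []).append(query["text"])
    return groups


sent = [
    {"text": "Who should I contact about my booking?", "intent_id": 2},
    {"text": "Where is my confirmation?", "intent_id": 5},
    {"text": "WHERE IS MY CONFIRMATION???", "intent_id": 5},
]
reply = ["67F227A57F1A496F", "47E25DF1D9BAF663", "47E25DF1D9BAF663"]
groups = group_by_hash(sent, reply)
```

Any hash that maps to more than one text marks inputs that collapsed into a single stored query.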

Delete all queries: `DELETE /api/phraser/v1/queries`

Delete all queries associated with intents

Arguments

No arguments

Server reply

Queries successfully deleted

Get query by ID: `GET /api/phraser/v1/queries/{queryHash}`

Returns a query by the given hash

Arguments

Field name	In	Type	Always present	Description
`queryHash`	path	string	Y	Hash of the query that needs to be fetched

Server reply

Type	Description
Query object	successful operation

Example:

{
    "hash": "34512",
    "text": "Where is my confirmation?",
    "intent_id": 13
}

Deletes a query: `DELETE /api/phraser/v1/queries/{queryHash}`

Arguments

Field name	In	Type	Always present	Description
`queryHash`	path	string	Y	Hash of the query to delete

Server reply

Query successfully deleted

Get a list of the intent’s queries: `GET /api/phraser/v1/intents/{intentId}/queries`

Get a list of queries attached to the specific intent. With pagination.

Arguments

Field name	In	Type	Always present	Description
`intentId`	path	integer	Y	ID of the intent whose queries need to be fetched
`offset`	query	integer	N	start offset
`limit`	query	integer	N	maximum number of queries to return

Example:

GET /api/phraser/v1/intents/1/queries?offset=1138&limit=2

Server reply

Type	Description
List of Query objects	List of intent queries

Example:

[
    {
        "hash": "123e4",
        "intent_id": 1,
        "text": "Where's the money Lebowski?"
    },
    {
        "hash": "9ffc94",
        "intent_id": 1,
        "text": "Who should I contact about my booking?"
    }
]

Add queries to the intent: `POST /api/phraser/v1/intents/{intentId}/queries`

Add new queries to the intent. Only queries not already in the corpus are actually added. The result contains one query hash per input query, in input order. The intent_id field in the body is ignored.

Arguments

Field name	In	Type	Always present	Description
`intentId`	path	integer	Y	ID of the intent to add queries to
`body`	body	List of Query objects	Y	List of queries. intent_id fields are ignored

Example:

[
    {
        "text": "Where is my booking confirmation?"
    }
]

Server reply

Type	Description
List of strings	List of query hashes

Example:

["706CB0285892CF2E"]

Delete all intent queries: `DELETE /api/phraser/v1/intents/{intentId}/queries`

Delete all queries associated with the intent

Arguments

Field name	In	Type	Always present	Description
`intentId`	path	integer	Y	ID of the intent whose queries are to be deleted

Server reply

Queries successfully deleted

Get query by ID: `GET /api/phraser/v1/intents/{intentId}/queries/{queryHash}`

Returns a query by the given hash

Arguments

Field name	In	Type	Always present	Description
`intentId`	path	integer	Y	ID of the intent whose query needs to be fetched
`queryHash`	path	string	Y	Hash of the query that needs to be fetched

Server reply

Type	Description
Query object	successful operation

Delete a query from the intent: `DELETE /api/phraser/v1/intents/{intentId}/queries/{queryHash}`

Arguments

Field name	In	Type	Always present	Description
`intentId`	path	integer	Y	ID of the intent whose query needs to be deleted
`queryHash`	path	string	Y	Hash of the query to delete

Server reply

Query successfully detached from the intent

Logs API

This part is used for working with the raw log of all queries entered by end-users in the past – see Log entries. The set of queries in this log may not fully coincide with the Queries in the training corpus. Indeed, only legitimate user queries shall be added to the training corpus. Garbage and spam may be discarded.

The intended use of this API is as follows. The system logs all end-user queries. You retrieve them, one-by-one or in bulk, make humans assign the right intent id to each query, add them to the training set (via Queries API), and invoke re-training of the ML models. You may use dedicated (hired) labelers or rely on the end-users or the community feedback.
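The labeling workflow described above can be sketched as a conversion from log entries to Query objects ready for the Queries API. `labeled_queries` and the `labels` mapping (search ID to the intent ID chosen by a human) are hypothetical names used for illustration.

```python
def labeled_queries(log_entries, labels):
    """Turn reviewed log entries into Query objects for the training corpus.

    labels maps a log entry's search_id to the intent id a human assigned.
    Entries without a label are skipped, since they have not been reviewed.
    """
    queries = []
    for entry in log_entries:
        intent_id = labels.get(entry["search_id"])
        if intent_id is None:
            continue
        queries.append({"text": entry["query"], "intent_id": intent_id})
    return queries


log = [
    {
        "query": "How can I pay?",
        "search_id": "73b61636-29b1-4bee-8845-3bc0ffe9a86a",
        "intent_ids": [3, 5, 8],
    },
    {"query": "asdf", "search_id": "not-reviewed-yet", "intent_ids": [3]},
]
human_labels = {"73b61636-29b1-4bee-8845-3bc0ffe9a86a": 5}
to_upload = labeled_queries(log, human_labels)
```

The resulting list would form the body of a POST to the queries endpoint, followed by a re-training call.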

For you, the raw query log is read-only. The system logs all end-user queries, as is. You may retrieve them, but not modify them.

Log entries: `GET /api/phraser/v1/log`

Get a list of log entries. With pagination.

Arguments

Field name	In	Type	Always present	Description
`offset`	query	integer	N	start offset
`limit`	query	integer	N	maximum number of entries to return

Example:

GET /api/phraser/v1/log?timestamp=2017-03-21&offset=4238&limit=1

Server reply

Type	Description
List of Log entry objects	List of log entries

Example:

[
    {
        "query": "How can I pay?",
        "search_id": "73b61636-29b1-4bee-8845-3bc0ffe9a86a",
        "query_hash": "01C5A0BBD64963CC",
        "timestamp": "2017-03-21T13:23:53+00:00",
        "intent_ids": [ 3, 5, 8 ]
    }
]

Get log entry by search ID: `GET /api/phraser/v1/log/{searchId}`

Returns a log entry by given ID

Arguments

Field name	In	Type	Always present	Description
`searchId`	path	string	Y	Search ID string

Server reply

Type	Description
Log entry object	successful operation

Example:

{
    "query": "How can I pay?",
    "search_id": "73b61636-29b1-4bee-8845-3bc0ffe9a86a",
    "query_hash": "01C5A0BBD64963CC",
    "timestamp": "2017-03-21T13:23:53+00:00",
    "intent_ids": [ 3, 5, 8 ]
}

Entities

Below is the description of entities used in this API. Each entity is represented by a JSON object.

Intent

One intent

Field name	Type	Always present	Description
`id`	integer	N	ID of the intent. Positive integer
`name`	string	Y	Name of the intent
`comment`	string	N	Private comment, not visible to end-users
`queries`	List of strings	N	List of queries attached to the intent
`do_not_index`	boolean	N	Do not index this intent, only match regexps

Example:

{
    "id": 42,
    "name": "send_email"
}

Intent objects are used in Intents API.

Query

A user query stored in the system and used for training.

Field name	Type	Always present	Description
`hash`	string	N	Hash of a normalized query text, used as query identifier
`intent_id`	integer	N	ID of the intent that answers this question, if known. A special value of -1 is used when there is no good answer in the corpus (and the query should be forwarded to a human)
`text`	string	Y	Query text

intent_id shall be trusted; it is assigned by a human labeler and used for training the algorithm. intent_id=0 can be used to store not-yet-labeled queries.

Example:

{
    "text": "Where's the money Lebowski?",
    "hash": "1138",
    "intent_id": 0
}

Query objects are used in Queries API.

Together, Intents and Queries form the training corpus for Machine Learning.

Log entry

A user query that the system saw in the past.

Field name	Type	Always present	Description
`query`	string	Y	User query text
`query_hash`	string	N	User query hash
`search_id`	string	Y	The ID of the search assigned when the search was performed
`timestamp`	string	Y	Search date and time in ISO 8601 (RFC 3339) format
`intent_ids`	List of integers	Y	List of intent_ids returned by the search algorithm as the response to this query, in the order of relevancy (see Search Response)
`selected_id`	integer	N

Unlike intent ids in Queries, these intent ids cannot be trusted. They are assigned by the search algorithm and may contain errors.

Example:

{
    "query": "How do one discover new bots?",
    "query_hash": "F95C6BB30EC32F55",
    "search_id": "05F4F53B-4F5D8162-7852A351-4B90F22E",
    "timestamp": "2017-03-29T12:00:35Z",
    "intent_ids": [42, 17, 5]
}

Log entry objects are used in Logs API.

Search result

One result of search, represented by one intent.

Field name	Type	Always present	Description
`intent_id`	integer	Y	Intent ID of the result
`intent_name`	string	Y	Intent name
`score`	number	N	Match probability

Example:

{
    "intent_id": 42,
    "intent_name": "send_email"
}

Search response object is returned by the Search API.

Search response

Set of all results matching given search query, ordered by relevancy

Field name	Type	Always present	Description
`search_id`	string	Y	The ID of the search performed (for logging/debugging)
`timestamp`	string	N	Search date and time in ISO 8601 (RFC 3339) format
`query_hash`	string	Y	The hash of searched query
`results`	List of Search result objects	Y	Results themselves (see Search result)

Example:

{
  "search_id": "05F4F53B-4F5D8162-7852A351-4B90F22E",
  "query_hash": "F95C6BB30EC32F55",
  "timestamp": "2017-03-29T12:00:35Z",
  "results": [
    {
      "intent_id": 42,
      "intent_name": "send_email"
    },
    {
      "intent_id": 17,
      "intent_name": "get_contacts"
    },
    {
      "intent_id": 5,
      "intent_name": "your_address"
    }
  ]
}

Vector response

Model version and phrase vector itself

Field name	Type	Always present	Description
`model_version`	string	Y	Version of the model used to generate a vector
`timestamp`	string	N	Search date and time in ISO 8601 (RFC 3339) format
`text`	string	N	The input text
`text_hash`	string	N	The hash of given text
`vector`	List of numbers	Y	Vector data

Example:

{
  "model_version": "1",
  "text": "Give me a vector",
  "text_hash": "D95D4477D2BE0412",
  "timestamp": "2017-11-01T12:00:35Z",
  "vector": [
    0.1,
    -0.2,
    0.3
  ]
}

We may change the underlying Deep Learning model without notice. Thus, the model_version is always included in the response. Only vectors created with the same version of the model are comparable.
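A client that stores vectors for later comparison may want to guard against mixing model versions. The following sketch shows one way to do so; `require_same_model` is a hypothetical helper operating on Vector response objects as described above.

```python
def require_same_model(response_a, response_b):
    """Return both vectors, or raise if they come from different model versions."""
    version_a = response_a["model_version"]
    version_b = response_b["model_version"]
    if version_a != version_b:
        raise ValueError(
            "vectors are not comparable: model_version %r vs %r" % (version_a, version_b)
        )
    return response_a["vector"], response_b["vector"]
```

If the check fails, the older phrase would need to be re-submitted to the Vector API so that both vectors come from the current model.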

Vector response object is returned by the Vector API.

Pre-defined intents

There are a number of pre-defined intents (classes) to treat user queries that don’t belong to any of the intents you defined. Pre-defined intents are editable and can be manually deleted from the corpus.

Title	ID	Indexed	Description
Wrong Language	`-4`	`No`	The user query is detected to be non-English
Garbage	`-5`	`No`	Completely useless queries. Garbage. Trash. Spam. Not worth answering, ever.
Ignore	`-3`	`No`	Meaningful queries worth answering that however shall not be added to your corpus: off-topic, one-off, too ambiguous, too short, too long and complex, etc. You may forward them to humans.
To do	`-2`	`No`	Meaningful on-topic queries that don’t have an intent in the current corpus, but should. You may create new intents and then re-assign these queries to them. Thus, the “to-do” name.

Wrong Language: this API only supports English. The engine includes a language identification algorithm. When it detects a non-English query it returns id=-4.

Garbage, Ignore, To do: use these three classes for questions that don’t have intents in your corpus. (The table above describes the differences between the three.) If the user query is classified as one of these, you may display a “No results found” message to the user.
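A minimal sketch of that behavior, assuming the application keeps its own mapping from intent IDs to reply texts. The names `reply_for` and `answers` are hypothetical; the IDs come from the table above.

```python
# IDs of the pre-defined intents, as listed in the table above.
PREDEFINED_INTENTS = {-4: "Wrong Language", -5: "Garbage", -3: "Ignore", -2: "To do"}

NO_RESULTS = "No results found"


def reply_for(search_response, answers):
    """Pick the reply text for the top search result.

    answers maps your own intent ids to reply texts (application data).
    """
    results = search_response.get("results", [])
    if not results:
        return NO_RESULTS
    top_id = int(results[0]["intent_id"])
    if top_id in PREDEFINED_INTENTS:
        return NO_RESULTS
    return answers.get(top_id, NO_RESULTS)
```

An application with a human in the loop might instead forward queries classified as Ignore (`-3`) to an agent rather than showing the fallback message.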

Since the queries that fall under these categories are quite different from legitimate ones, the classifier may not be as accurate on them as on good queries. Garbage in – garbage out.

Therefore, you have an option to deactivate some or all of these classes. You do it by setting ‘do_not_index = true’. (In fact, it is the default value.) If you do so, the ML system will not use the respective classes for training and will not reply with the “No results found” message. Instead, it will attempt to find a matching intent in the current corpus. Most likely, it will be incorrect. Garbage in – garbage out.

However, even if you decide to deactivate these classes, you should still use them when labeling the query log. Otherwise, if you try to label these “bad” queries as legitimate it will wreak havoc on the algorithms. Besides, you may later decide to activate these classes — your training corpus will be ready. You may always activate it by changing ‘do_not_index’ from ‘true’ to ‘false’.
