Channel: Intel Developer Zone Articles

Saffron Technology™ Cognitive API FAQ

What are synchronous and asynchronous APIs?

Saffron’s Thought Processes (THOPs) are user-defined functions that allow you to tie together various Saffron MemoryBase (SMB) and other capabilities using a scripting language. Thought Processes can help perform a wide variety of actions such as: simplifying the execution of complex sequential queries, calling outside applications to be used in conjunction with Saffron reasoning functions, or converting query results into data suitable for UI widgets.

A key feature of Saffron's thought processes is that they can run synchronously or asynchronously.

By default, Saffron APIs run synchronous thought processes. A synchronous process typically runs via an HTTP GET call from the calling client, which then waits for the result. Use synchronous THOPs when you use the default (single-threaded) WebService engine, which is known as THOPs 1.0. This process returns results one at a time; thus, it is slower than asynchronous processes. Still, it is better suited to developers who need to troubleshoot or debug issues. Typically, synchronous thought processes (THOPs 1.0) are used for the following operations:

  • simple single queries
  • fast operations
  • operations that do not need asynchronous execution
  • troubleshooting and debugging in a development environment

Saffron APIs can also run asynchronous thought processes. These processes communicate with calling clients through messages in real time and can operate as long-running operations. Asynchronous APIs are available only with the latest Saffron WebService engine, known as THOPs 2.0. These processes are much faster than synchronous processes. Typically, asynchronous thought processes (THOPs 2.0) are used for the following operations:

  • complex queries that cannot be expressed in a single query
  • business logic when writing apps based on SUIT and THOPs
  • integrating Saffron APIs and third-party APIs
  • using a stored procedure in a relational database
  • deploying code at runtime

Learn more about thought processes.

What are Batch APIs?

Batch APIs allow you to run the same API (or set of APIs) repeatedly over a large number of items. The Batch API collects a large number of items such as records, rows, ids, and attributes. For each item, a Batch API calls one of the core APIs (such as similarity, classification, recommendation) to complete the process.
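
As a rough sketch of this pattern, assuming nothing about the real Batch API, the loop below applies one core operation to every item in a collection. The names run_batch and toy_classify are hypothetical, and the lookup table stands in for a real classification call.

```python
from typing import Callable, Dict, Iterable


def run_batch(items: Iterable[str], core_api: Callable[[str], str]) -> Dict[str, str]:
    """Call the same core operation once per item and collect results by item."""
    return {item: core_api(item) for item in items}


def toy_classify(item: str) -> str:
    # Hypothetical core API: a lookup table standing in for a classification call.
    known = {"animal:bear": "mammal", "animal:trout": "fish"}
    return known.get(item, "unknown")


batch_result = run_batch(["animal:bear", "animal:trout", "animal:moss"], toy_classify)
```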

A key component of batch APIs is the thought process under which they run. Thought processes (THOPs) are stored procedures that can run synchronously or asynchronously.

By default, Saffron APIs run synchronous thought processes. A synchronous process typically runs via an HTTP GET call from the calling client, which then waits for the result. Use synchronous THOPs when you use the default (single-threaded) WebService engine, which is known as THOPs 1.0. This process returns results one at a time; thus, it is slower than asynchronous processes. Still, it is better suited to developers who need to troubleshoot or debug issues.

Example Synchronous APIs:

Saffron APIs can also run asynchronous thought processes. These processes communicate with calling clients through messages in real time and can operate as long-running operations. Asynchronous APIs are available only with the latest Saffron WebService engine, known as THOPs 2.0. These processes are much faster than synchronous processes.

Example Asynchronous APIs:

What is a signature?

A signature is a list of attributes (category:value) that best characterizes a query item. It represents the most informative and relevant attributes for that item. Once the signature is found, it can be used to provide useful and relevant comparison data.

What is the difference between the Classify Item API and the Nearest Neighbor API?

Both APIs can find the classification of a query item. For example, assume that we want to find out the classification (type) of animal:bear. The way to find the answer differs between the two APIs.

The Classify Item API gathers a list of attributes (signature) that best represents the animal:bear. Next, it finds classifications (or types) that are similar to the bear by comparing the attributes of the classifications against the signature of the bear. It then returns the top classification values based on these similar items.

The Nearest Neighbor API also gathers a list of attributes (signature) that best represents the animal:bear. It is different in that it uses the similarity feature to find similar animals (as opposed to finding similar classifications). From the top list of animals that are the most similar to the bear, the API initiates a voting method to return the top classification values.
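
The two strategies can be contrasted on a toy dataset. The sketch below is an illustration only and does not reproduce Saffron's algorithms: the four animals, their attributes, the Jaccard overlap measure, and the function names classify_item and nearest_neighbor are all assumptions made for the example.

```python
from collections import Counter

# Toy data (hypothetical): name -> (attributes, classification)
ANIMALS = {
    "dog": ({"produces:milk", "has:fur", "behaves:breathes"}, "mammal"),
    "whale": ({"produces:milk", "lives:water", "behaves:breathes"}, "mammal"),
    "trout": ({"has:scales", "lives:water", "behaves:breathes"}, "fish"),
    "snake": ({"has:scales", "lays:eggs", "behaves:breathes"}, "reptile"),
}


def overlap(a, b):
    """Jaccard overlap between two attribute sets (an assumed similarity measure)."""
    return len(a & b) / len(a | b)


def classify_item(signature):
    # Compare the signature against the pooled attributes of each classification.
    pooled = {}
    for attrs, label in ANIMALS.values():
        pooled.setdefault(label, set()).update(attrs)
    return max(pooled, key=lambda label: overlap(signature, pooled[label]))


def nearest_neighbor(signature, k=3):
    # Find the k most similar animals, then let them vote on the classification.
    ranked = sorted(
        ANIMALS, key=lambda name: overlap(signature, ANIMALS[name][0]), reverse=True
    )
    votes = Counter(ANIMALS[name][1] for name in ranked[:k])
    return votes.most_common(1)[0][0]


bear_signature = {"produces:milk", "has:fur", "behaves:breathes"}
by_classification = classify_item(bear_signature)
by_neighbors = nearest_neighbor(bear_signature)
```

On this data both routes label the bear a mammal, but they get there differently: the first compares the signature to classifications, the second compares it to individual animals and votes.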

When should the Classify Item API be used versus the Nearest Neighbor API?

The decision to use the Classify Item API or the Nearest Neighbor API depends on the available ingested data. Datasets that contain a high percentage of one particular classification negatively affect both the algorithm and the resulting probability if the Classify Item API is used. Because the data is skewed towards one type, the query item could be incorrectly labeled. In this situation, the Nearest Neighbor API can offset the imbalance by finding individual neighbors that are similar to the query item. Even if it finds only one neighbor, that could be enough to get a correct label.

For example, assume that a dataset contains 100 animals. Of these, 60% are classified as invertebrates and 20% are classified as mammals. In spite of the weighted list, we can use the Nearest Neighbor API to find the classification of animal:bear by finding another animal that shares the attribute produces:milk. Since mammals are the only animals that produce milk, we can accurately conclude that the bear is a mammal.
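
The numbers in this example can be played through in a short sketch. It is a simplified illustration, not the behavior of either API: the attribute sets are invented, and the remaining 20% of animals are assumed to be reptiles because the example does not say what they are.

```python
from collections import Counter

# 100 toy animals as (attributes, classification) pairs.
dataset = (
    [({"has:exoskeleton"}, "invertebrate")] * 60
    + [({"produces:milk", "has:fur"}, "mammal")] * 20
    + [({"has:scales"}, "reptile")] * 20  # assumption: not stated in the example
)

# The most frequent classification in the skewed dataset.
majority_label = Counter(label for _, label in dataset).most_common(1)[0][0]

# Neighbors of the bear: animals that share the produces:milk attribute.
bear = {"produces:milk", "has:fur"}
neighbor_labels = [
    label for attrs, label in dataset if "produces:milk" in (attrs & bear)
]
neighbor_label = Counter(neighbor_labels).most_common(1)[0][0]
```

Here majority_label is "invertebrate", while the vote among the milk-producing neighbors yields "mammal", regardless of how large the invertebrate share is.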

What does "confidence" measure?

Confidence is a measure in the Classification API suite that indicates how confident the algorithm is in a classification decision (for example, "I am 99% confident that the bear can be classified as a mammal"). It is the algorithm's self-assessment (or rating) of a prediction based on the amount of evidence it has. Typically, low confidence indicates a small amount of evidence in the dataset. Examples of evidence might include similarity strength, homogeneity of the neighborhood, information strength, and/or disambiguation level between classes.

The Classification APIs use confidence to:

  • flag low-confidence records for automatic removal or human intervention
  • correct human mistakes
  • detect anomalies
  • better extrapolate overall accuracy from the "truth" set to a "training" set

Note: Do not confuse confidence with real accuracy or with Statistical Confidence.

How do "percent" and "similarity" influence "confidence" when using the Nearest Neighbor API to classify an item?

Confidence is the ultimate metric in that it indicates how confident we are that a query item is properly classified. Percent and similarity are used as evidence to compute confidence. Similarity indicates how similar a query item is to its nearest neighbors, and percent shows how many of the neighbors have the same classification (or type). So, in a case where a query item has many closely similar nearest neighbors and those neighbors are the same type, we can conclude with a high level of confidence that the query item shares the same classification as its nearest neighbors.

Confidence levels decrease as the percent and/or similarity values decrease. A lower percentage indicates that not all of the nearest neighbors share the same classification. A lower similarity score indicates that some of the attributes of the nearest neighbors do not closely match the query item. It also indicates that some of the attributes have low "score" values, which means that they are not as relevant to selecting a classification.
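
The direction of this relationship can be shown with a toy calculation. The formula used here (percent multiplied by mean similarity) is purely an assumption for illustration; the actual way the Nearest Neighbor API combines its evidence is not described in this FAQ. The function name toy_confidence is hypothetical.

```python
from collections import Counter


def toy_confidence(neighbors):
    """neighbors: list of (label, similarity) pairs, similarity in [0, 1].

    Returns (top_label, percent, mean_similarity, confidence), where
    confidence = percent * mean_similarity is an assumed toy formula.
    """
    votes = Counter(label for label, _ in neighbors)
    top_label, top_count = votes.most_common(1)[0]
    percent = top_count / len(neighbors)
    mean_similarity = sum(sim for _, sim in neighbors) / len(neighbors)
    return top_label, percent, mean_similarity, percent * mean_similarity


# All neighbors agree and are very similar to the query item.
strong = toy_confidence([("mammal", 0.75), ("mammal", 0.75), ("mammal", 0.75)])
# Same similarity, but only half of the neighbors agree (lower percent).
mixed = toy_confidence(
    [("mammal", 0.75), ("mammal", 0.75), ("fish", 0.75), ("reptile", 0.75)]
)
# All neighbors agree, but they are only weakly similar (lower similarity).
weak = toy_confidence([("mammal", 0.25), ("mammal", 0.25), ("mammal", 0.25)])
```

In this sketch the confidence of the mixed and weak cases is lower than that of the strong case, matching the description above.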

What is the metric score in a signature? Why is it important?

For classification APIs, the metric score measures the relevance of an attribute (in a signature) for predicting the classification of a query item. A higher metric score (closer to 1) means an attribute has a higher predictive value for the label of the query item.

For example, assume that we are attempting to classify animal:bear. The classification API returns a list of attributes (signature) that characterizes the bear in hopes that we can find similar attributes that will help us classify it. The attribute behaves:breathes has a lower metric score (.5) because it does not help us narrow down the classification of the bear (mammals, reptiles, amphibians, and other types have the same attribute). The attribute produces:milk has a higher metric score (1) because it provides very useful and accurate information that can help us properly classify the bear. Since our data indicates that all animals with the produces:milk attribute are mammals, we can also label the bear as a mammal.

The higher a metric score is for attributes in a signature, the greater the chances of making an accurate classification. For similarity, a higher score means a better chance of finding similar items.
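
To make the bear example concrete, the sketch below scores an attribute by the largest share of a single classification among the animals that have it. This definition is an assumption chosen so that the toy data reproduces the values in the example (0.5 and 1); it is not Saffron's actual metric, and toy_metric_score is a hypothetical name.

```python
from collections import Counter

# Toy data (hypothetical): (attributes, classification) pairs.
ANIMALS = [
    ({"behaves:breathes", "produces:milk"}, "mammal"),
    ({"behaves:breathes", "produces:milk"}, "mammal"),
    ({"behaves:breathes"}, "reptile"),
    ({"behaves:breathes"}, "amphibian"),
]


def toy_metric_score(attribute):
    """Share of the most common classification among animals with this attribute."""
    labels = Counter(label for attrs, label in ANIMALS if attribute in attrs)
    return max(labels.values()) / sum(labels.values())


breathes_score = toy_metric_score("behaves:breathes")
milk_score = toy_metric_score("produces:milk")
```

With this data, behaves:breathes scores 0.5 because it is spread across several classifications, while produces:milk scores 1.0 because every animal that has it is a mammal.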

How can I learn more about APIs?

Refer to our API section of SMB documentation.

How can I learn more about thought processes (THOPs)?

Saffron’s Thought Processes (THOPs) are user-defined functions that allow you to tie together various Saffron MemoryBase (SMB) and other capabilities using a scripting language. Thought Processes can help perform a wide variety of actions such as: simplifying the execution of complex sequential queries, calling outside applications to be used in conjunction with Saffron reasoning functions, or converting query results into data suitable for UI widgets. THOPs are written in one of the supported script languages (in v10 of SMB, only JavaScript is available).

If you are familiar with other database products, think of Thought Processes as stored procedures. THOPs can be created via the REST API or by using the developer tools in Saffron Admin. Once a Thought Process is defined, it becomes callable through the standard REST API.

Learn more about thought processes.

