Prebuilt Dashboards design overview


Monitored activity

Prebuilt Dashboards is intended to provide actionable dashboards that reflect the activity passing through the API Gateways.

Key capabilities are:

  • Real-time analytics on service performance with impact and root-cause analysis
  • Real-time analytics on consumer experience with respect to service performance
  • Usage and consumption trends based on historical data
  • Real-time analytics on resource availability

In typical configurations, API calls are generally emitted by client applications and carried out by front-end API Gateways, then passed to back-end API Gateways to be processed by back-end services. Transactions correspond to the processing of an API call by a single API Gateway instance. That means that an API call processed by both a front-end API Gateway and a back-end API Gateway will lead to 2 transactions (1 per API Gateway).

Prebuilt Dashboards mainly focuses on the activity observed on the front-end side. As the solution can also collect data from the back-end API Gateways, it is always possible to enrich the application to address an additional scope focusing on back-end services.

At this stage, because separate transactions cannot be consolidated around a single API call, end-to-end monitoring between the front-end layer and the back-end layer is not available.

Observation model

Entity Description Data Type
Global
  • 'Global'

Singleton used to consolidate data at a global level.

Configuration data
Gateway Instance Type
  • 'Front-End'
  • 'Back-End'

Determines whether the related API Gateway instance acts as a front-end ("proxy") or a back-end.

Configuration data
Observation period
  • 'Last 1 hour'
  • 'Last 24 hours'

Enables the user to switch from one observation period to another on API Health dashboards.

Configuration data
Transaction status
  • 'success'
  • 'failure'
  • 'exception'

Status of transaction when completed.

Configuration data
Organization Organization to which the client application belongs. Reference data
Client Application Client application as a consumer of the API services. Reference data
API Logical set of services provided by the company. Reference data
API Method API method called by the consumer. Reference data
Protocol Protocol used by the call emitter. Reference data
Gateway Instance API Gateway instance carrying out the transaction on the front-end or back-end side. Reference data
Group Deployment unit of services on a set of API Gateway instances. Reference data
Remote host Remote host or back-end service that is called to process the transaction. Reference data
User User of the monitoring solution. Reference data
Transaction Processing of an API Call on a single API Gateway instance. Transactional data
Transaction Leg Outbound legs of the transaction. Transactional data

Configuration data correspond to data whose static values are considered part of the application, regardless of the customer context in which the solution is deployed. As such, configuration data are space-related and are automatically retrieved when importing the application.

Reference data correspond to the customer's reference data. Instances are created on the fly by the transaction event log integration.

Transactional data correspond to events that have a specific time dimension (life cycle).

 

Transactional data life cycle

A transaction is composed of legs:

  • The first leg (leg 0) is always the inbound leg; its duration value corresponds to the overall transaction duration
  • Subsequent legs are outbound calls; their duration values therefore correspond to back-end transactions
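As a minimal sketch of the leg structure described above, the following Python snippet separates the inbound leg from the outbound legs of one transaction event. The function name `split_legs` is hypothetical; the field names `legs`, `leg`, `timestamp`, and `duration` are those found in the transaction event log.

```python
from typing import Any


def split_legs(transaction: dict[str, Any]) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Return (inbound_leg, outbound_legs) for one transaction event."""
    legs = sorted(transaction.get("legs", []), key=lambda leg: leg["leg"])
    if not legs or legs[0]["leg"] != 0:
        raise ValueError("leg 0 (the inbound leg) is missing")
    return legs[0], legs[1:]


# Illustrative event, reduced to the fields used here.
event = {
    "correlationId": "4038f4540400788ebe4f84ca",
    "legs": [
        {"leg": 1, "timestamp": 1425291329916, "duration": 566},
        {"leg": 0, "timestamp": 1425291328660, "duration": 1843},
    ],
}

inbound, outbound = split_legs(event)
overall_duration_ms = inbound["duration"]  # overall transaction duration
backend_durations_ms = [leg["duration"] for leg in outbound]  # one per outbound call
```

Sorting by the `leg` index is a defensive choice of this sketch; it does not imply that the log ever lists legs out of order.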

Below is an example of a transaction generated by an API call and processed by a front-end API Gateway:

 

 

Data integration

The monitoring solution relies on the transaction event log as the main data source to collect events passing through the API Gateway instances. 

Data collection

Data collection consists of streaming the transaction event log files as the API Gateway instances write them to disk. Filebeat, a log streamer, is an external component configured on each API Gateway to forward the event logs to ADI as they are being written.

See Configure API Gateway for more detailed information.

Data parsing and absorption

Transaction event log processing

Data parsing and absorption consists of receiving the events sent by Filebeat, extracting the data from the JSON format, and storing it.

Each event log file is composed of:

  • One single header event entry that contains details about the creation of the log file. For example, this includes when the log file is created, and on which host, domain, group, instance, and so on.
  • One or several system entries that contain details about the API Gateway system. For example, this includes details such as the amounts of disk space, memory, and CPU.
  • One or several transaction entries that contain details about a specific message transaction. For example, this includes details such as the protocol, method, bytes sent and received, IP addresses, ports, service name, and so on.
  • One or several alert entries that contain details about a specific system alert.
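The entry types listed above can be told apart by their `type` field. The sketch below, which is only an illustration and not the routing context used by the solution, reads a string of consecutive JSON objects and groups them by type. The helper names `read_events` and `group_by_type` are hypothetical, and the sample entries are shortened.

```python
import json
from collections import defaultdict


def read_events(text):
    """Yield each JSON object found in a string of consecutive JSON objects."""
    decoder = json.JSONDecoder()
    position = 0
    while True:
        # raw_decode does not skip leading whitespace, so skip it here.
        while position < len(text) and text[position].isspace():
            position += 1
        if position == len(text):
            return
        event, position = decoder.raw_decode(text, position)
        yield event


def group_by_type(events):
    """Group entries by their 'type' field (header, system, transaction, alert)."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.get("type", "unknown")].append(event)
    return grouped


sample = """
{"type": "header", "hostname": "vespaapiprdif101vm", "groupId": "group-2"}
{"type": "system", "diskUsed": 28, "instCpu": 1}
{"type": "transaction", "correlationId": "4038f4540400788ebe4f84ca", "status": "success"}
"""
grouped = group_by_type(read_events(sample))
```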

This task is carried out by the collectAndProcessEvents routing context, which is located in the 04-API-Integration space.


 

Reference data are created on the fly if they do not already exist (for example, when the create close transaction mapping call raises an exception because a transaction uses reference data that does not exist).
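The create-on-exception behavior can be pictured with a small in-memory sketch. Everything in it is hypothetical (the `ReferenceStore` class, the `MissingReference` exception, and the `absorb` function); it only illustrates the retry pattern, not the actual mapping calls of the solution.

```python
class MissingReference(Exception):
    """Raised when a transaction points at reference data that does not exist."""

    def __init__(self, entity, instance_id):
        super().__init__(f"{entity} '{instance_id}' does not exist")
        self.entity = entity
        self.instance_id = instance_id


class ReferenceStore:
    """Hypothetical in-memory stand-in for the application database."""

    def __init__(self):
        self.references = {}    # entity name -> set of instance ids
        self.transactions = {}  # correlation id -> references used

    def create_reference(self, entity, instance_id):
        self.references.setdefault(entity, set()).add(instance_id)

    def create_transaction(self, correlation_id, references):
        for entity, instance_id in references.items():
            if instance_id not in self.references.get(entity, set()):
                raise MissingReference(entity, instance_id)
        self.transactions[correlation_id] = dict(references)


def absorb(store, correlation_id, references):
    """Store a transaction, creating missing reference data on the fly."""
    # Each failed attempt creates one missing reference, so at most
    # len(references) retries are needed before the call succeeds.
    for _ in range(len(references) + 1):
        try:
            store.create_transaction(correlation_id, references)
            return
        except MissingReference as missing:
            store.create_reference(missing.entity, missing.instance_id)
    raise RuntimeError("transaction could not be stored")


store = ReferenceStore()
absorb(store, "4038f4540400788ebe4f84ca", {"API": "StockQuote", "API Method": "GetQuote"})
```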

The original message body is reused as much as possible and enriched by the routes when some values cannot be directly used by the mapping. The setMapValue instruction is used to create those additional entries to the message body.

SEDA endpoint

SEDA endpoints allow a faster absorption of incoming events since they allow multi-threaded absorption. 
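As an analogy in plain Python, and not the routing framework's own API, the idea behind a multi-threaded in-memory stage can be sketched with a queue served by several worker threads. The function name `run_stage` and the worker count are assumptions made for the illustration.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to exit


def run_stage(events, handler, workers=4):
    """Process events through one in-memory queue served by several threads."""
    pending = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = pending.get()
            if item is _STOP:
                return
            outcome = handler(item)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for event in events:
        pending.put(event)
    for _ in threads:
        pending.put(_STOP)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results


# Results may come back in any order because workers run concurrently.
doubled = run_stage(range(10), lambda n: n * 2, workers=3)
```

The producer returns as soon as an event is queued, which is why this kind of stage absorbs bursts faster than a single-threaded consumer; the price is that processing order is no longer guaranteed.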

See Routes tips, tricks and gotchas for more detailed information.

Data mapping rules

The application database is provisioned with data sent by the routes and originally coming from two sources:

  • the transaction event logs data
  • the contextual Filebeat metadata associated with each API Gateway instance (id, group, type) and configured when deploying Filebeat (see Install and configure Filebeat)

 

Transactional data
Entity / Relation Data source / Data location Instance Id Valid time
Transaction

Transaction Event Log

  • transaction

Filebeat configuration file

Transaction.correlationId
  • Start time: leg[0].timestamp
  • End time: leg[0].timestamp + leg[0].duration

client application

  • Transaction.ServiceContext[0].app
 

Transaction valid time

API method

  • Transaction.Leg[0].operation
  Transaction valid time

API

  • Transaction.Leg[0].serviceName
  Transaction valid time

status

  • Transaction.status
  Transaction valid time

gateway instance

  • Filebeat - group
  • Filebeat - instance
  Transaction valid time

gateway instance type

  • Filebeat - type
  Transaction valid time
Transaction Leg

Transaction Event Log

  • transaction
  • Transaction.correlationId
  • Leg.leg
 

transaction

  • Transaction.correlationId
  Leg valid time

Transaction event log example

{
	"type": "header",
	"logCreationTime": "2016-05-01 01:20:01.295",
	"hostname": "vespaapiprdif101vm",
	"domainId": "312e6168-18ad-4182-ba3d-b4a7eea9f658",
	"groupId": "group-2",
	"groupName": "VESPA_Group_0",
	"serviceId": "instance-1",
	"serviceName": "VESPA_Instance_1",
	"version": "v7.4.0-Internal"
}
{
	"type": "system",
	"time": 1462058459160,
	"diskUsed": 28,
	"instCpu": 1,
	"sysCpu": 1,
	"instMem": 874056,
	"sysMem": 7539200,
	"sysMemTotal": 8054040
}
{
	"type": "transaction",
	"time": 1425291330502,
	"path": "/stockquote.asmx",
	"protocol": "http",
	"protocolSrc": "8080",
	"duration": 1842,
	"status": "success",
	"serviceContexts": [{
		"service": "StockQuote",
		"monitor": true,
		"client": null,
		"org": null,
		"app": null,
		"method": "GetQuote",
		"status": "success",
		"duration": 1824
	}],
	"customMsgAtts": {
		
	},
	"correlationId": "4038f4540400788ebe4f84ca",
	"legs": [{
		"uri": "/stockquote.asmx",
		"status": 200,
		"statustext": "OK",
		"method": "POST",
		"vhost": null,
		"wafStatus": 0,
		"bytesSent": 1278,
		"bytesReceived": 612,
		"remoteName": "127.0.0.1",
		"remoteAddr": "127.0.0.1",
		"localAddr": "127.0.0.1",
		"remotePort": "49104",
		"localPort": "8080",
		"sslsubject": null,
		"leg": 0,
		"timestamp": 1425291328660,
		"duration": 1843,
		"serviceName": "StockQuote",
		"subject": null,
		"operation": "GetQuote",
		"type": "http",
		"finalStatus": "Pass"
	},
	{
		"uri": "/stockquote.asmx",
		"status": 200,
		"statustext": "OK",
		"method": "POST",
		"vhost": null,
		"wafStatus": 0,
		"bytesSent": 736,
		"bytesReceived": 1202,
		"remoteName": "www.webservicex.net",
		"remoteAddr": "173.201.44.188",
		"localAddr": "10.142.10.142",
		"remotePort": "80",
		"localPort": "49438",
		"sslsubject": null,
		"leg": 1,
		"timestamp": 1425291329916,
		"duration": 566,
		"serviceName": "StockQuote",
		"subject": null,
		"operation": "GetQuote",
		"type": "http",
		"finalStatus": null
	}]
}
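To make the transactional mapping rules concrete, here is a hedged sketch that derives a Transaction record and its Transaction Leg identifiers from one event. The functions `map_transaction` and `map_legs`, the output key names, and the `filebeat_fields` argument are hypothetical. The code reads the JSON key `serviceContexts` as it appears in the log example, which the mapping table writes as `ServiceContext`.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_ms(milliseconds):
    """Convert epoch milliseconds to an aware UTC datetime without float rounding."""
    return EPOCH + timedelta(milliseconds=milliseconds)


def map_transaction(event, filebeat_fields):
    """Build a Transaction record following the mapping rules for transactional data."""
    leg0 = next(leg for leg in event["legs"] if leg["leg"] == 0)
    context = (event.get("serviceContexts") or [{}])[0]
    return {
        "instance_id": event["correlationId"],
        "start_time": from_epoch_ms(leg0["timestamp"]),
        "end_time": from_epoch_ms(leg0["timestamp"] + leg0["duration"]),
        "client_application": context.get("app"),
        "api_method": leg0.get("operation"),
        "api": leg0.get("serviceName"),
        "status": event["status"],
        "gateway_instance": (filebeat_fields["group"], filebeat_fields["instance"]),
        "gateway_instance_type": filebeat_fields["type"],
    }


def map_legs(event):
    """Identify each leg by the pair (correlationId, leg index)."""
    return [(event["correlationId"], leg["leg"]) for leg in event["legs"]]


# Reduced version of the example above; the Filebeat fields are illustrative.
event = {
    "status": "success",
    "correlationId": "4038f4540400788ebe4f84ca",
    "serviceContexts": [{"app": None, "org": None}],
    "legs": [
        {"leg": 0, "timestamp": 1425291328660, "duration": 1843,
         "serviceName": "StockQuote", "operation": "GetQuote"},
        {"leg": 1, "timestamp": 1425291329916, "duration": 566,
         "serviceName": "StockQuote", "operation": "GetQuote"},
    ],
}
record = map_transaction(event, {"group": 1, "instance": 1, "type": "Front-End"})
leg_ids = map_legs(event)
```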

 

Reference data
Entity / Relation Data source / Data location Instance Id Valid time
Gateway Instance

Transaction event Log

  • system
  • header
Filebeat configuration file
  • Filebeat - group
  • Filebeat - instance
Start time: first activity date from transaction event log

type

Filebeat - gateway type

  Gateway instance valid time
API

Transaction event Log

  • transaction
  • Filebeat - gateway instance type
  • Transaction.Leg[0].serviceName

Uncategorized service if empty

Start time: first activity date from transaction event log

gateway instance type

Filebeat - gateway type   API instance valid time
API Method

Transaction event Log

  • transaction


  • Filebeat - gateway instance type
  • Transaction.Leg[0].serviceName
  • Transaction.Leg[0].operation

Uncategorized method if empty

Start time: first activity date from transaction event log

gateway instance type

Filebeat - gateway type   API Method instance valid time

API

Transaction.Leg[0].serviceName

  API Method instance valid time
Client Application

Transaction event Log

  • transaction

Transaction.ServiceContext[0].app

Uncategorized application if empty

Start time: first activity date from transaction event log

organization

Transaction.ServiceContext[0].org    Client Application instance valid time
Organization

Transaction event Log

  • transaction


Transaction.ServiceContext[0].org

Uncategorized organization if empty

Start time: first activity date from transaction event log
Remote host

Transaction event Log

  • transaction
  • Transaction.Leg[1+].remoteName
  • Transaction.Leg[1+].remotePort
Start time: first activity date from transaction event log
Group

Transaction event Log

  • header

Filebeat configuration file

Filebeat - group

Start time: first activity date from transaction event log
Protocol

Transaction event Log 

  • transaction
 Transaction.protocol Start time: first activity date from transaction event log
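The reference data rules above, including the "Uncategorized" fallbacks, can be summarized in a short sketch. The function `reference_instances` and the shape of its result are assumptions made for the illustration; the fallback labels and source fields come from the table.

```python
def reference_instances(event, filebeat_fields):
    """Derive the reference data instance ids that one transaction event touches."""
    legs = event.get("legs", [])
    leg0 = next(leg for leg in legs if leg["leg"] == 0)
    context = (event.get("serviceContexts") or [{}])[0]
    gateway_type = filebeat_fields["type"]

    # Empty or null values fall back to the "Uncategorized" labels.
    service = leg0.get("serviceName") or "Uncategorized service"
    operation = leg0.get("operation") or "Uncategorized method"

    return {
        "Gateway Instance": (filebeat_fields["group"], filebeat_fields["instance"]),
        "Group": filebeat_fields["group"],
        "API": (gateway_type, service),
        "API Method": (gateway_type, service, operation),
        "Client Application": context.get("app") or "Uncategorized application",
        "Organization": context.get("org") or "Uncategorized organization",
        "Protocol": event.get("protocol"),
        # Remote hosts come from the outbound legs only (leg 1 and above).
        "Remote hosts": [
            (leg["remoteName"], leg["remotePort"]) for leg in legs if leg["leg"] >= 1
        ],
    }


event = {
    "protocol": "http",
    "serviceContexts": [{"app": None, "org": None}],
    "legs": [
        {"leg": 0, "serviceName": "StockQuote", "operation": "GetQuote",
         "remoteName": "127.0.0.1", "remotePort": "49104"},
        {"leg": 1, "serviceName": "StockQuote", "operation": "GetQuote",
         "remoteName": "www.webservicex.net", "remotePort": "80"},
    ],
}
references = reference_instances(event, {"group": 1, "instance": 1, "type": "Front-End"})
```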

 Filebeat configuration file example

filebeat:
  prospectors:
    - paths:
        - /opt/Axway/API/apigateway/events/group-1_instance-1.log 
        - /opt/Axway/API/apigateway/events/processed/group-1_instance-1_*.log.PROCESSED
      scan_frequency: 10s
      backoff: 1s
      close_inactive: 1h
      ignore_older: 24h
      clean_inactive: 48h
      fields:
        group: 1
        instance: 1
        type: Front-End
    - paths:
        - /opt/Axway/API/apigateway/events/group-1_instance-2.log
        - /opt/Axway/API/apigateway/events/processed/group-1_instance-2_*.log.PROCESSED
      scan_frequency: 10s
      backoff: 1s
      close_inactive: 1h
      ignore_older: 24h
      clean_inactive: 48h
      fields:
        group: 1
        instance: 2
        type: Front-End
  registry_file: /opt/Axway/filebeat/.filebeat
output:
  logstash:
    hosts: ["adi.collector.node.ip:5044"]
    compression_level: 0
    ssl:
      certificate_authorities: ["/opt/Axway/filebeat/myCA.pem"]
      certificate: "/opt/Axway/filebeat/cert.pem"
      key: "/opt/Axway/filebeat/certkey.pem"
      key_passphrase: "axway*"
      supported_protocols: [TLSv1.2]
      cipher_suites: [ECDHE-RSA-AES-256-GCM-SHA384]



Metrics and evaluations

Key metrics

Transactions and legs have a very short lifetime and are logged in the transaction event log file after they are completed.

The purpose of monitoring is therefore to analyze the transactions completed over a given period of time and to evaluate the situation from their characteristics and their final status. The finest granularity of calculation relies on 3 dimensions: API method x Client application x Transaction status. The basic metrics are calculated over the last 5 minutes (the finest calculation period for this application) and then re-aggregated to cover a longer period or a higher aggregation level.
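A rough sketch of this two-step calculation follows: transactions are first counted per 5-minute bucket and per dimension, then the counts are summed again at a coarser level. The record keys (`end_ms`, `api_method`, `client_application`, `status`) and the function names are hypothetical, and aligning buckets on epoch time is an assumption of the sketch.

```python
from collections import Counter

BUCKET_MS = 5 * 60 * 1000  # finest calculation period: 5 minutes


def bucket_start(timestamp_ms):
    """Return the start of the 5-minute bucket containing the timestamp."""
    return timestamp_ms - timestamp_ms % BUCKET_MS


def count_completed(transactions):
    """Count completed transactions per (bucket, API method, client application, status)."""
    counts = Counter()
    for transaction in transactions:
        key = (
            bucket_start(transaction["end_ms"]),
            transaction["api_method"],
            transaction["client_application"],
            transaction["status"],
        )
        counts[key] += 1
    return counts


def reaggregate(counts, key_of):
    """Sum fine-grained counts under a coarser key computed by key_of."""
    totals = Counter()
    for key, count in counts.items():
        totals[key_of(key)] += count
    return totals


transactions = [
    {"end_ms": 1425291330503, "api_method": "GetQuote", "client_application": "A", "status": "success"},
    {"end_ms": 1425291331000, "api_method": "GetQuote", "client_application": "A", "status": "success"},
    {"end_ms": 1425291600001, "api_method": "GetQuote", "client_application": "A", "status": "failure"},
]
fine = count_completed(transactions)
per_method = reaggregate(fine, lambda key: key[1])           # all periods, all statuses
per_status = reaggregate(fine, lambda key: (key[1], key[3])) # all periods, per status
```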

Basic key metrics are the following:

Metric Description
#transactions completed ∑ transactions completed
average response time ∑ response time (age) of transactions / ∑ transactions completed
status rate ∑ transactions completed (with status as a dimension) / ∑ transactions completed (without status as a dimension)
TPS ∑ transactions completed / observation period in seconds
total bytes sent ∑ bytes sent
total bytes received ∑ bytes received
total bytes exchanged ∑ bytes exchanged
traffic variation (∑ transactions completed over the last period - ∑ transactions completed over the previous period) / ∑ transactions completed over the previous period
average latency ∑ latency / ∑ transactions completed. Transaction latency is calculated by data integration.
disk usage latest valid value found on the API Gateway instance
cpu usage latest valid value found on the API Gateway instance
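The formulas in the table translate directly into code. The following is a sketch under stated assumptions: the record keys (`duration_ms`, `status`, `bytes_sent`, `bytes_received`) are hypothetical, "bytes exchanged" is assumed to mean sent plus received, and ratios with a zero denominator are returned as `None`. Average latency is left out because its inputs are produced by data integration.

```python
from collections import Counter


def ratio(numerator, denominator):
    """Divide, returning None when the denominator is zero."""
    return numerator / denominator if denominator else None


def basic_metrics(transactions, period_seconds, previous_completed):
    """Compute the basic key metrics for one observation period."""
    completed = len(transactions)
    by_status = Counter(t["status"] for t in transactions)
    sent = sum(t["bytes_sent"] for t in transactions)
    received = sum(t["bytes_received"] for t in transactions)
    return {
        "transactions_completed": completed,
        "average_response_time_ms": ratio(sum(t["duration_ms"] for t in transactions), completed),
        "status_rate": {status: ratio(n, completed) for status, n in by_status.items()},
        "tps": ratio(completed, period_seconds),
        "total_bytes_sent": sent,
        "total_bytes_received": received,
        "total_bytes_exchanged": sent + received,  # assumption: sent + received
        "traffic_variation": ratio(completed - previous_completed, previous_completed),
    }


sample = [
    {"duration_ms": 100, "status": "success", "bytes_sent": 10, "bytes_received": 1},
    {"duration_ms": 200, "status": "success", "bytes_sent": 20, "bytes_received": 2},
    {"duration_ms": 300, "status": "failure", "bytes_sent": 30, "bytes_received": 3},
]
metrics = basic_metrics(sample, period_seconds=300, previous_completed=2)
```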

Evaluations

Key metrics are evaluated against fixed threshold values that can be modified using configuration files. See Configure thresholds for more detailed information.

Evaluation Description
High response time average response time evaluation against the corresponding API Method threshold value
High failure rate status ('failure') rate evaluation against the corresponding API Method threshold value
High exception count #transactions completed evaluation against the corresponding API Method threshold value
High TPS TPS evaluation against the corresponding API Method threshold value
High disk usage disk usage evaluation against the corresponding API Gateway instance threshold value
High CPU usage CPU usage evaluation against the corresponding API Gateway instance threshold value
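A threshold evaluation of this kind could look like the sketch below. The metric key names, the threshold values, and the use of a strict "greater than" comparison are all assumptions; the real thresholds live in the configuration files mentioned above. The exception count is assumed here to be the number of completed transactions with the 'exception' status.

```python
# Evaluation name -> metric it checks (hypothetical metric key names).
API_METHOD_RULES = {
    "High response time": "average_response_time_ms",
    "High failure rate": "failure_rate",
    "High exception count": "exception_count",
    "High TPS": "tps",
}
GATEWAY_INSTANCE_RULES = {
    "High disk usage": "disk_usage",
    "High CPU usage": "cpu_usage",
}


def evaluate(metrics, thresholds, rules):
    """Return the evaluations whose metric is strictly above its threshold."""
    return [
        name
        for name, metric in rules.items()
        if metric in metrics and metric in thresholds and metrics[metric] > thresholds[metric]
    ]


# Illustrative values only.
method_alerts = evaluate(
    {"average_response_time_ms": 1843, "failure_rate": 0.02, "exception_count": 0, "tps": 12.5},
    {"average_response_time_ms": 1000, "failure_rate": 0.05, "exception_count": 10, "tps": 50},
    API_METHOD_RULES,
)
instance_alerts = evaluate(
    {"disk_usage": 28, "cpu_usage": 1},
    {"disk_usage": 80, "cpu_usage": 90},
    GATEWAY_INSTANCE_RULES,
)
```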

 

Manage end user rights

User rights management is built around two main concepts:

  • Application Perspective: the monitoring solution offers a range of perspectives depending on the profiles of users the monitoring application can handle. An Application Perspective is a navigation context composed of a home dashboard and a subset of other dashboards, depending on the profile the user will have.
  • User Data Filtering: the monitoring solution offers an assignment system that enables an API Product Manager/Relationship Manager to monitor a specific subset of APIs/Organizations.

Application perspectives

In the context of the Prebuilt Dashboards, the following corresponding application perspectives will be predefined:

Application perspective User type Domain of concern Filtering rule
API Administrator API Administrator API Health None
Relationship manager (Admin) Relationship Manager Client Application Health None
Relationship manager (User) Relationship Manager Client Application Health Client applications related to the Organization the user is assigned to.
API Product manager (Admin) API Product Manager API Usage None
API Product manager (User) API Product Manager API Usage APIs the user is assigned to.
API Infrastructure Administrator API Infrastructure Administrator API Infrastructure Health None

User data filtering

 

User data filtering is handled by the Observation Model.

From a functional standpoint:

  • The Client Application Health domain can be filtered using the relationship between the User and Organization entities: this way the connected User will have visibility for the Client Applications relating to the Organizations this user is assigned to.
  • The API Usage domain can be filtered using the relationship between the User and API entities: this way the connected User will have visibility only for the APIs this User is assigned to.

Each one of these relationships will be effective as of a given applicability date. 

Thus, assigning an Organization/API to a user consists of creating a relationship with an applicability date between this user and the Organization/API. Removing the access rights consists of closing the relationship (that is, setting an end date on it).
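The assignment mechanism can be modelled as a relationship carrying a start date and an optional end date. The sketch below is illustrative only: the `Assignment` class, the helper functions, and the choice to treat the end date as exclusive are assumptions, not the Observation Model's actual definition.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Assignment:
    """Relationship between a user and an Organization or API."""
    user: str
    target: str
    start: date
    end: Optional[date] = None  # None means the relationship is still open

    def is_effective(self, on: date) -> bool:
        # Assumption: the end date is exclusive.
        return self.start <= on and (self.end is None or on < self.end)


def visible_targets(assignments, user, on):
    """Return the Organizations/APIs the user can see on a given date."""
    return sorted({a.target for a in assignments if a.user == user and a.is_effective(on)})


def revoke(assignment, end):
    """Remove access by closing the relationship with an end date."""
    assignment.end = end


assignments = [
    Assignment("alice", "StockQuote", start=date(2016, 1, 1)),
    Assignment("alice", "Weather", start=date(2016, 3, 1)),
]
before = visible_targets(assignments, "alice", date(2016, 2, 1))
revoke(assignments[0], date(2016, 4, 1))
after = visible_targets(assignments, "alice", date(2016, 5, 1))
```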

Dashboards - Noteworthy features

Dashboard states

Dashboard states allow you to change the layout and the content of a single dashboard by switching from one parameter value to another. This mechanism can be considered an alternative to using separate dashboards when the topics to address are similar apart from a few differences. Typically, in the context of the API Health Analytics domain, dashboard states have been used to handle different observation periods.

The Observation period entity has been created to be used as a dashboard parameter, on which dashboard rules and dashboard states rely. See Dashboard states for a dynamic dashboard layout for more detailed information.

Custom observation periods

You can configure a custom observation period for API Usage dashboards.

See Configure the API Usage observation period for further information.
