There are of course two key aspects to the IoT: the devices themselves and the server-side architecture that supports them.

Often, a third category is a low-power gateway that performs aggregation, event processing, bridging, etc., and sits between the device and the wider Internet.

In both cases, the devices probably have intermittent connections based on factors such as GPRS connectivity, battery discharging, radio interference, or simply being switched off.

There are effectively three classes of devices:

The smallest devices have embedded 8-bit System-on-Chip (SoC) controllers. A good example of this is the open source hardware platform Arduino, e.g. the Arduino Uno platform and other 8-bit Arduinos. These typically have no operating system.

The next level up are the systems based on Atheros and ARM chips that have a very limited 32-bit architecture. These often include small home routers and derivatives of those devices. Commonly, these run a cut-down or embedded Linux platform, such as OpenWRT, or dedicated embedded operating systems. In some cases they may not use an OS, e.g. the Arduino Zero, or the Arduino Yun.

The most capable IoT platforms are full 32-bit or 64-bit computing platforms. These systems, such as the Raspberry Pi or the BeagleBone, may run a full Linux OS or another suitable Operating System, such as Android. In many cases these are either mobile phones or based on mobile-phone technology. These devices may also act as gateways or bridges for smaller devices, e.g. if a wearable connects via Bluetooth Low Energy to a mobile phone or Raspberry Pi, which then bridges that onto the wider Internet.

The communication between devices and the Internet or to a gateway includes many different models:

Direct Ethernet or Wi-Fi connectivity using TCP or UDP (we will look at protocols for this later)

Bluetooth Low Energy

Near Field Communication (NFC)

Zigbee or other mesh radio networks

SRF and point-to-point radio links

UART or serial lines

SPI or I2C wired buses

Figure 1 below illustrates the two major modes of connectivity.

IoT devices are inherently connected – we need a way of interacting with them, often with firewalls, network address translation (NAT) and other obstacles in the way.

There are billions of these devices already and the number is growing quickly; we need an architecture for scalability. In addition, these devices are typically interacting 24x7, so we need a highly-available (HA) approach that supports deployment across data centers to allow disaster recovery (DR).

The devices may not have UIs and are certainly designed for “everyday” usage, so we need to support automatic and managed updates, as well as the ability to manage these devices remotely.

IoT devices are very commonly used for collecting and analyzing personal data. A model for managing the identity and access control for IoT devices and the data they publish and consume is a key requirement.

Our aim is to provide an architecture that supports integration between systems and devices.

There are some specific requirements for IoT that are unique to IoT devices and the environments that support them, e.g. many requirements emerge from the limited form factors and power available to IoT devices.

Other requirements come from the way in which IoT devices are manufactured and used. The approaches are much more like traditional consumer product designs than existing Internet approaches. Of course there are a number of existing best practices for the server-side and Internet connectivity that need to be remembered and factored in.

We can summarize the overall requirements into some key categories:

Connectivity and communications

Device management

Data collection, analysis, and actuation

Scalability

Security

High Availability

Predictive analysis

Integration

Existing protocols, such as HTTP, have a very important place for many devices.

Even an 8-bit controller can create simple GET and POST requests and HTTP provides an important unified (and uniform) connectivity.

However, the overhead of HTTP and some other traditional Internet protocols can be an issue for two main reasons. Firstly, the memory size of the program can be an issue on small devices.

However, the bigger issue is the power requirements. In order to meet these requirements, we need a simple, small and binary protocol. We will look at this in more detail below. We also require the ability to cross firewalls.
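
To make the overhead argument concrete, the following minimal sketch compares the size of one sensor reading sent as a bare HTTP POST with the same reading packed into a fixed binary layout. The host name, path, form fields, and binary field layout are all illustrative assumptions and not part of the reference architecture.

```python
import struct


def http_post_bytes(device_id: int, temperature: float, timestamp: int) -> bytes:
    """Build a minimal HTTP/1.1 POST carrying the reading as a form body."""
    body = f"id={device_id}&t={temperature}&ts={timestamp}"
    request = (
        "POST /readings HTTP/1.1\r\n"  # path is an assumption
        "Host: iot.example.com\r\n"    # host is an assumption
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )
    return request.encode("ascii")


def binary_bytes(device_id: int, temperature: float, timestamp: int) -> bytes:
    """Pack the reading as uint16 id, float32 temperature, uint32 timestamp.

    "!" selects network byte order with no padding, so the result is
    always 2 + 4 + 4 = 10 bytes.
    """
    return struct.pack("!HfI", device_id, temperature, timestamp)


# Hypothetical reading: device 42, 21.5 degrees C, Unix timestamp.
text_size = len(http_post_bytes(42, 21.5, 1_700_000_000))
binary_size = len(binary_bytes(42, 21.5, 1_700_000_000))
print(f"HTTP request: {text_size} bytes, binary payload: {binary_size} bytes")
```

The comparison ignores transport framing on both sides, so it only illustrates the relative cost of the application-level encoding.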

In addition, there are devices that connect directly and those that connect via gateways. The devices that connect via a gateway potentially require two protocols: one to connect to the gateway, and then another from the gateway to the cloud.

Finally, there is obviously a requirement for our architecture to support transport and protocol bridging, e.g. we may wish to offer a binary protocol to the device, but allow an HTTP-based API to control the device that we expose to third parties.

While many IoT devices are not actively managed, this is not necessarily ideal. We have seen active management of PCs, mobile phones, and other devices become increasingly important, and the same trajectory is both likely and desirable for IoT devices. What are the requirements for IoT device management? The following list covers some widely desirable requirements:

The ability to disconnect a rogue or stolen device

The ability to update the software on a device

Updating security credentials

Remotely enabling or disabling certain hardware capabilities

Locating a lost device

Wiping secure data from a stolen device

Remotely re-configuring Wi-Fi, GPRS, or network parameters

The list is not exhaustive, and conversely covers aspects that may not be required or possible for certain devices.

A few IoT devices have some form of UI, but in general IoT devices are focused on offering one or more sensors, one or more actuators, or a combination of both. The requirements of the system are that we can collect data from very large numbers of devices, store it, analyze it, and then act upon it.

The reference architecture is designed to manage very large numbers of devices. If these devices are creating constant streams of data, then this creates a significant amount of data. The requirement is for a highly scalable storage system, which can handle diverse data and high volumes.

The action may happen in near real time, so there is a strong requirement for real-time analytics. In addition, the device needs to be able to analyze and act on data. In some cases this will be simple, embedded logic. On more powerful devices we can also utilize more powerful engines for event processing and action.

Any server-side architecture would ideally be highly scalable, and be able to support millions of devices all constantly sending, receiving, and acting on data. However, many “high-scalability architectures” have come with an equally high price in hardware, software, and complexity.

An important requirement for this architecture is to support scaling from a small deployment to a very large number of devices.

Elastic scalability and the ability to deploy in a cloud infrastructure are essential.

The ability to scale the server side out on small, cheap servers is an important requirement to make this an affordable architecture for small deployments as well as large ones.

Security is one of the most important aspects for IoT. IoT devices are often collecting highly personal data, and by their nature are bringing the real world onto the Internet (and vice versa). This brings three categories of risks:

Risks that are inherent in any Internet system, but that product/IoT designers may not be aware of

Specific risks that are unique to IoT devices

Safety to ensure no harm is caused by, for instance, misusing actuators

The first category includes simple things such as locking down open ports on devices (like the Internet-attached fridge that had an unsecured SMTP server and was being used to send spam).

The second category includes issues specifically related to IoT hardware, e.g. the device may have its secure information read. For example, many IoT devices are too small to support proper asymmetric encryption. Another specific example is the ability for someone to attack the hardware to understand security.

Another example is the university security researchers who famously reverse-engineered and broke the Mifare Classic RFID card solution [3]. These sorts of reverse-engineering attacks are an issue compared with pure web solutions, where there is often no available code to attack (i.e. a completely server-side implementation).

Two very important specific issues for IoT security are the concerns about identity and access management. Identity is an issue where there are often poor practices implemented. For example, the use of clear text or Base64-encoded user IDs/passwords with devices and machine-to-machine (M2M) communication is a common mistake. Ideally these should be replaced with managed tokens such as those provided by OAuth/OAuth2 [4].
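
The weakness of Base64-encoded credentials is easy to demonstrate: Base64 is an encoding, not encryption, so anyone who observes the header can reverse it without a key. The sketch below uses only the Python standard library; the function names are hypothetical and chosen for illustration.

```python
import base64
from typing import Tuple


def basic_auth_header(user_id: str, password: str) -> str:
    """Build an HTTP Basic authorization value from a user ID and password."""
    raw = f"{user_id}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def recover_credentials(header_value: str) -> Tuple[str, str]:
    """Reverse the encoding, as any eavesdropper could, with no secret needed."""
    encoded = header_value.split(" ", 1)[1]
    user_id, password = base64.b64decode(encoded).decode("utf-8").split(":", 1)
    return user_id, password


header = basic_auth_header("user", "pass")
print(header)                       # Basic dXNlcjpwYXNz
print(recover_credentials(header))  # ('user', 'pass')
```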

Another common issue is to hard-code access management rules into either client- or server-side code. A much more flexible and powerful approach is to utilize models such as "Attribute Based Access Control" and "Policy Based Access Control". The most well known of these approaches is that provided by the XACML standard [5]. Such approaches remove access control decisions from hard-coded logic and externalize them into policies, which enables the following:

More powerful and appropriate decisions;

Can potentially be based on contexts such as location, or which network is being used, or the time of day;

Access control can be analyzed and audited; and

Policies can be updated and changed, even dynamically, without recoding or modifying devices.
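
The idea of externalizing access decisions into policies can be sketched as follows. This is a deliberately simplified, first-match rule evaluator written for illustration; it is not XACML and does not use any XACML library, and the attribute names (role, action, network, hour) and the rules themselves are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Attributes = Dict[str, Any]


@dataclass
class Rule:
    description: str
    effect: str  # "Permit" or "Deny"
    condition: Callable[[Attributes], bool]


def evaluate(rules: List[Rule], request: Attributes) -> str:
    """Return the effect of the first matching rule; deny when none match."""
    for rule in rules:
        if rule.condition(request):
            return rule.effect
    return "Deny"


# Hypothetical policy: the rules are data, so they can be audited and
# replaced at run time without touching device or application code.
POLICY = [
    Rule("No actuator commands from outside the home network", "Deny",
         lambda r: r["action"] == "actuate" and r["network"] != "home"),
    Rule("Owners may do anything during the day", "Permit",
         lambda r: r["role"] == "owner" and 6 <= r["hour"] < 22),
    Rule("Guests may read sensor data", "Permit",
         lambda r: r["role"] == "guest" and r["action"] == "read"),
]

request = {"role": "owner", "action": "actuate", "network": "home", "hour": 10}
print(evaluate(POLICY, request))  # Permit
```

Because the decision depends on context attributes such as the network and the time of day, the same caller can be permitted in one situation and denied in another.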

Our security requirements therefore should support:

Encryption on devices that are powerful enough;

A modern identity model based on tokens and not userids/passwords;

The management of keys and tokens as smoothly/remotely as possible; and

Policy-based and user-managed access control for the system based on XACML.

The reference architecture consists of a set of components. Layers can be realized by means of specific technologies, and we will discuss options for realizing each component. There are also some cross-cutting/vertical layers such as access/identity management.

The layers are

Client/external communications - Web/Portal, Dashboard, APIs

Event processing and analytics (including data storage)

Aggregation/bus layer – ESB and message broker

Relevant transports - MQTT/HTTP/XMPP/CoAP/AMQP, etc.

Devices

The cross-cutting layers are:

Device manager

Identity and access management

The bottom layer of the architecture is the device layer. Devices can be of various types, but in order to be considered as IoT devices, they must have some communications that either indirectly or directly attaches to the Internet. Examples of direct connections are

Arduino with Arduino Ethernet connection

Arduino Yun with a Wi-Fi connection

Raspberry Pi connected via Ethernet or Wi-Fi

Intel Galileo connected via Ethernet or Wi-Fi

Examples of indirectly connected devices include

ZigBee devices connected via a ZigBee gateway

Bluetooth or Bluetooth Low Energy devices connecting via a mobile phone

Devices communicating via low power radios to a Raspberry Pi

Each device typically needs an identity. The identity may be one of the following:

A unique identifier (UUID) burnt into the device (typically part of the System-on-Chip, or provided by a secondary chip)

A UUID provided by the radio subsystem (e.g. Bluetooth identifier, Wi-Fi MAC address)

An OAuth2 Refresh/Bearer Token (this may be in addition to one of the above)

An identifier stored in nonvolatile memory such as EEPROM

For the reference architecture we recommend that every device has a UUID (preferably an unchangeable ID provided by the core hardware) as well as an OAuth2 Refresh and Bearer token stored in EEPROM.

The OAuth2 specification is based on HTTP; however, as we will discuss in the communications section, the reference architecture also supports these flows over MQTT.
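
The recommended device identity can be pictured as a small record holding the UUID and the two tokens. The sketch below is illustrative only: the namespace, the field names, and the refresh margin are assumptions, and deriving the UUID from a MAC address stands in for the case where the radio subsystem supplies the identifier rather than the core hardware.

```python
import uuid
from dataclasses import dataclass

# Hypothetical namespace so that derived UUIDs are stable across reboots.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "devices.example.com")


def uuid_from_mac(mac: str) -> uuid.UUID:
    """Derive a deterministic version-5 UUID from a MAC address."""
    normalized = mac.lower().replace("-", ":")
    return uuid.uuid5(NAMESPACE, normalized)


@dataclass
class DeviceIdentity:
    """What a device would keep in non-volatile memory such as EEPROM."""
    device_uuid: uuid.UUID
    refresh_token: str
    bearer_token: str
    bearer_expires_at: float  # Unix time at which the bearer token expires

    def needs_refresh(self, now: float, margin_s: float = 60.0) -> bool:
        """True when the bearer token is expired or about to expire."""
        return now >= self.bearer_expires_at - margin_s


identity = DeviceIdentity(
    device_uuid=uuid_from_mac("AA-BB-CC-00-11-22"),
    refresh_token="example-refresh-token",  # placeholder value
    bearer_token="example-bearer-token",    # placeholder value
    bearer_expires_at=1000.0,
)
print(identity.device_uuid, identity.needs_refresh(now=950.0))
```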

The communication layer supports the connectivity of the devices. There are multiple potential protocols for communication between the devices and the cloud. The three best-known potential protocols are

HTTP/HTTPS

MQTT 3.1/3.1.1

Constrained application protocol (CoAP)

HTTP is well known, and there are many libraries that support it. Because it is a simple text-based protocol, many small devices such as 8-bit controllers can only partially support the protocol – for example enough code to POST or GET a resource. The larger 32-bit based devices can utilize full HTTP client libraries that properly implement the whole protocol.

MQTT is a publish-subscribe messaging system based on a broker model. The protocol has a very small overhead (as little as 2 bytes per message), and was designed to support lossy and intermittently connected networks. MQTT was designed to flow over TCP. In addition there is an associated specification designed for ZigBee-style networks called MQTT-SN (Sensor Networks).
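
The two-byte figure corresponds to the MQTT fixed header: one byte for the packet type and flags, followed by a variable-length "remaining length" field that occupies a single byte for small messages. The sketch below hand-encodes a QoS 0 PUBLISH packet following the MQTT 3.1.1 packet layout to show where the bytes go. It is for illustration only (a real device would use a client library), and the topic and payload are made-up examples.

```python
def encode_remaining_length(length: int) -> bytes:
    """Encode the MQTT remaining-length field (7 bits per byte, MSB = more)."""
    if not 0 <= length <= 268_435_455:
        raise ValueError("remaining length out of range")
    encoded = bytearray()
    while True:
        digit, length = length % 128, length // 128
        if length > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        encoded.append(digit)
        if length == 0:
            return bytes(encoded)


def encode_publish_qos0(topic: str, payload: bytes) -> bytes:
    """Build a PUBLISH packet with QoS 0, no DUP flag, and no RETAIN flag."""
    topic_bytes = topic.encode("utf-8")
    # Variable header: 2-byte big-endian topic length, then the topic name.
    variable_header = len(topic_bytes).to_bytes(2, "big") + topic_bytes
    remaining = variable_header + payload
    return bytes([0x30]) + encode_remaining_length(len(remaining)) + remaining


# PINGREQ has no variable header or payload, so it is exactly two bytes.
PINGREQ = bytes([0xC0, 0x00])

packet = encode_publish_qos0("a/b", b"21.5")
print(packet.hex(" "))  # 30 09 00 03 61 2f 62 32 31 2e 35
```

For this example the whole packet is 11 bytes: 2 bytes of fixed header, 2 bytes of topic length, 3 bytes of topic, and 4 bytes of payload.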

CoAP is a protocol from the IETF that is designed to provide a RESTful application protocol modeled on HTTP semantics, but with a much smaller footprint and a binary rather than a text-based approach. CoAP is a more traditional client-server approach rather than a brokered approach. CoAP is designed to be used over UDP.
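
CoAP's small footprint comes largely from its fixed four-byte binary header, defined in RFC 7252: a 2-bit version, a 2-bit message type, a 4-bit token length, an 8-bit code, and a 16-bit message ID. The sketch below packs that header for a confirmable GET. It omits options such as Uri-Path and any payload, and the function name and message ID are illustrative assumptions.

```python
import struct

COAP_VERSION = 1
TYPE_CONFIRMABLE = 0
CODE_GET = 0x01  # code 0.01: class 0 (request), detail 01 (GET)


def coap_header(msg_type: int, code: int, message_id: int, token: bytes = b"") -> bytes:
    """Pack the fixed CoAP header, followed by the optional token."""
    if len(token) > 8:
        raise ValueError("CoAP tokens are at most 8 bytes long")
    first_byte = (COAP_VERSION << 6) | (msg_type << 4) | len(token)
    return struct.pack("!BBH", first_byte, code, message_id) + token


header = coap_header(TYPE_CONFIRMABLE, CODE_GET, 0x1234)
print(header.hex(" "))  # 40 01 12 34
```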

In order to support MQTT we need to have an MQTT broker in the architecture as well as device libraries.

One important aspect of IoT devices is not just for the device to send data to the cloud/server, but also the reverse. This is one of the benefits of the MQTT specification: because it is a brokered model, clients make an outbound connection to the broker, whether the device is acting as a publisher or a subscriber. This usually avoids firewall problems because this approach works even behind firewalls or via NAT.

In the case where the main communication is based on HTTP, the traditional approach for sending data to the device would be to use HTTP Polling. This is very inefficient and costly, both in terms of network traffic as well as power requirements.
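
A rough calculation shows why polling is costly. The numbers below are assumptions chosen for illustration (a 30-second poll interval and five commands per day), and the model assumes each command is picked up by a separate poll.

```python
SECONDS_PER_DAY = 86_400


def polling_requests_per_day(interval_s: int) -> int:
    """Number of poll requests a device sends per day at a fixed interval."""
    return SECONDS_PER_DAY // interval_s


def wasted_polls(interval_s: int, commands_per_day: int) -> int:
    """Polls per day that return nothing because no command was waiting."""
    return max(polling_requests_per_day(interval_s) - commands_per_day, 0)


print(polling_requests_per_day(30))  # 2880
print(wasted_polls(30, 5))           # 2875
```

Under these assumptions, almost every request wakes the radio and crosses the network only to learn that nothing has changed, whereas a device subscribed to a broker receives only the five messages that matter.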

An important layer of the architecture is the layer that aggregates and brokers communications. This is an important layer for three reasons:

The ability to support an HTTP server and/or an MQTT broker to talk to the devices;

The ability to aggregate and combine communications from different devices and to route communications to a specific device (possibly via a gateway)

The ability to bridge and transform between different protocols, e.g. to offer HTTP-based APIs that are mediated into an MQTT message going to the device.

The aggregation/bus layer provides these capabilities as well as adapting into legacy protocols. The bus layer may also provide some simple correlation and mapping from different correlation models (e.g. mapping a device ID into an owner’s ID or vice-versa).
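
As an illustration of bridging and correlation, the sketch below maps an HTTP-style API call onto an MQTT topic and payload and adds the owner's ID looked up from the device ID. The URL scheme, topic layout, ownership table, and function name are all hypothetical; a real bus layer would be configured rather than hand-coded like this.

```python
import json
from typing import Dict, Tuple

# Hypothetical correlation table: device ID -> owner ID.
DEVICE_OWNERS = {"dev-001": "alice"}


def http_to_mqtt(method: str, path: str, body: Dict[str, object]) -> Tuple[str, bytes]:
    """Translate POST /devices/<id>/commands into an MQTT topic and payload."""
    parts = path.strip("/").split("/")
    if method != "POST" or len(parts) != 3 or parts[0] != "devices" or parts[2] != "commands":
        raise ValueError("unsupported request")
    device_id = parts[1]
    if device_id not in DEVICE_OWNERS:
        raise KeyError(f"unknown device: {device_id}")
    # Correlation step: enrich the message with the owner of the device.
    message = dict(body, owner=DEVICE_OWNERS[device_id])
    payload = json.dumps(message, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return f"devices/{device_id}/commands", payload


topic, payload = http_to_mqtt("POST", "/devices/dev-001/commands", {"led": "on"})
print(topic, payload)
```

The returned topic and payload would then be handed to whatever MQTT client the bus uses to publish to the broker.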

Finally the aggregation/bus layer needs to perform two key security roles. It must be able to act as an OAuth2 Resource Server (validating Bearer Tokens and associated resource access scopes). It must also be able to act as a policy enforcement point (PEP) for policy-based access.
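
The resource-server role can be sketched as a check that a presented bearer token is known, unexpired, and carries the scope required by the requested resource. The in-memory token table, scope names, and function name below are assumptions for illustration; in practice the bus would validate tokens against the authorization server rather than a local dictionary.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class TokenInfo:
    device_id: str
    scopes: FrozenSet[str]
    expires_at: float  # Unix time


# Hypothetical token store standing in for the authorization server.
TOKENS: Dict[str, TokenInfo] = {
    "token-abc": TokenInfo("dev-001", frozenset({"telemetry:write"}), 2000.0),
}


def authorize(authorization_header: str, required_scope: str, now: float) -> bool:
    """Accept the request only for a valid bearer token with the right scope."""
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    info = TOKENS.get(token)
    if info is None or now >= info.expires_at:
        return False
    return required_scope in info.scopes


print(authorize("Bearer token-abc", "telemetry:write", now=1000.0))  # True
```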

This layer takes the events from the bus and provides the ability to process and act upon these events. A core capability here is the requirement to store the data into a database.

This may happen in three forms. The traditional model here would be to write a server-side application, e.g. this could be a JAX-RS application backed by a database. However, there are more agile alternatives. The first of these is to use a big data analytics platform. This is a cloud-scalable platform that supports technologies such as Apache Hadoop to provide highly scalable MapReduce analytics on the data coming from the devices.

The second approach is to support complex event processing to initiate near real-time activities and actions based on data from the devices and from the rest of the system.
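
A complex event processing rule typically correlates several events over a time window rather than reacting to a single reading. The sketch below is a hand-written stand-in for such a rule, not the API of any event processing engine: it raises an action when a given number of over-threshold readings arrive within a sliding window. The class name, thresholds, and the "shutdown" action are assumptions.

```python
from collections import deque
from typing import Deque, Optional


class OverheatRule:
    """Fire when min_events hot readings occur within window_s seconds."""

    def __init__(self, threshold_c: float, window_s: float, min_events: int) -> None:
        self.threshold_c = threshold_c
        self.window_s = window_s
        self.min_events = min_events
        self._hot_times: Deque[float] = deque()  # timestamps of hot readings

    def on_reading(self, timestamp: float, temperature_c: float) -> Optional[str]:
        if temperature_c > self.threshold_c:
            self._hot_times.append(timestamp)
        # Discard hot readings that have fallen out of the sliding window.
        while self._hot_times and timestamp - self._hot_times[0] > self.window_s:
            self._hot_times.popleft()
        if len(self._hot_times) >= self.min_events:
            self._hot_times.clear()  # reset so the action is not repeated
            return "shutdown"
        return None


rule = OverheatRule(threshold_c=80.0, window_s=10.0, min_events=3)
readings = [(0.0, 85.0), (1.0, 70.0), (4.0, 90.0), (8.0, 95.0)]
actions = [rule.on_reading(t, temp) for t, temp in readings]
print(actions)  # [None, None, None, 'shutdown']
```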

The reference architecture needs to provide a way for these devices to communicate outside of the device-oriented system. This includes three main approaches.

Firstly, we need the ability to create web-based front-ends and portals that interact with devices and with the event-processing layer.

Secondly, we need the ability to create dashboards that offer views into analytics and event processing.

Finally, we need to be able to interact with systems outside this network using machine-to-machine communications (APIs). These APIs need to be managed and controlled and this happens in an API management system.

The recommended approach to building the web front end is to utilize a modular front-end architecture, such as a portal, which allows simple fast composition of useful UIs.

Of course the architecture also supports existing Web server-side technology, such as Java Servlets/JSP, PHP, Python, Ruby, etc. Our recommended approach is based on the Java framework and the most popular Java-based web server, Apache Tomcat.

The dashboard is a re-usable system focused on creating graphs and other visualizations of data coming from the devices and the event processing layer.

Device management (DM) is handled by two components. A server-side system (the device manager) communicates with devices via various protocols and provides both individual and bulk control of devices.

It also remotely manages software and applications deployed on the device. It can lock and/or wipe the device if necessary. The device manager works in conjunction with the device management agents. There are multiple different agents for different platforms and device types.

The device manager also needs to maintain the list of device identities and map these into owners. It must also work with the identity and access management layer to manage access controls over devices (e.g. who else can manage the device apart from the owner, how much control does the owner have vs. the administrator, etc.)

There are three levels of device: non-managed, semi-managed and fully managed (NM, SM, FM).

Fully managed devices are those that run a full DM agent. A full DM agent supports:

Managing the software on the device

Enabling/disabling features of the device (e.g. camera, hardware, etc.)

Management of security controls and identifiers

Monitoring the availability of the device

Maintaining a record of the device’s location if available

Locking or wiping the device remotely if the device is compromised, etc.

Non-managed devices can communicate with the rest of the network, but have no agent involved. These may include 8-bit devices where the constraints are too small to support the agent. The device manager may still maintain information on the availability and location of the device if this is available.

Semi-managed devices are those that implement some parts of the DM (e.g. feature control, but not software management).
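
The three management levels can be expressed as a simple classification over the capabilities a device's agent reports. The capability names below are invented labels for the features listed for a full DM agent, and treating location tracking as mandatory for the fully managed level is a simplifying assumption.

```python
from enum import Enum
from typing import Iterable


class ManagementLevel(Enum):
    NON_MANAGED = "NM"
    SEMI_MANAGED = "SM"
    FULLY_MANAGED = "FM"


# Hypothetical labels for the capabilities of a full DM agent.
FULL_AGENT_CAPABILITIES = frozenset({
    "software_management",
    "feature_control",
    "security_management",
    "availability_monitoring",
    "location_tracking",
    "remote_lock_wipe",
})


def classify(agent_capabilities: Iterable[str]) -> ManagementLevel:
    """Map the capabilities reported by a device's agent to a level."""
    supported = set(agent_capabilities) & FULL_AGENT_CAPABILITIES
    if not supported:
        return ManagementLevel.NON_MANAGED    # no agent, e.g. an 8-bit device
    if supported == FULL_AGENT_CAPABILITIES:
        return ManagementLevel.FULLY_MANAGED  # runs a full DM agent
    return ManagementLevel.SEMI_MANAGED       # implements only part of the DM


print(classify([]).value)                   # NM
print(classify(["feature_control"]).value)  # SM
```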

Key aspects of IoT

A Reference Architecture For The Internet of Things

Requirements for a Reference Architecture

Connectivity and Communications

Device Management

Data Collection, Analysis, and Actuation

Scalability

Security

The Architecture

The Device Layer

The Communications Layer

The Aggregation/Bus Layer

The Event Processing and Analytics Layer

Client/External Communications Layer

Device Management