Reversibility and Low Impact Change are the non-functional requirements for Agile Technology Change

Reversibility

I have attended and chaired many Change Review Boards at large organisations covering the introduction of new systems and changes to existing systems. I have also overseen many development programmes, taken difficult decisions and helped stakeholders make decisions. The factors I’ve seen most affect the speed of a change are its “reversibility” and its “impact”.

The reversibility of a change is the ease (in time, risk and cost) with which it can be reversed, along with any side effects. The impact of a change is how it will affect the organisation, the customers, the shareholders, the public and the environment. This impact could be financial, reputational or legal, and in some cases it can be real, perceived or imagined.

For the most part, software code is the easiest to change back (or roll back), data is harder and infrastructure is very hard. Front-end software is generally easier to roll back than back-end software because of the dependencies involved.
Changes which make permanent changes to databases, however, require extensive testing to ensure the change will perform as expected, in addition to backups to restore the original state, which may incur costs and lost time. Equally, the most damaging changes are those which affect upstream and downstream systems that cannot be restored to a previous state; for these, parallel running is advisable, which is the most expensive of the solutions.

When deciding how to authorise a change, I used the reversibility/impact matrix to help judge the ease with which the change could be achieved. The easily reversible, low-impact change is easily dispatched if the assessment looks reasonable. The high-impact, easily reversible change requires more planning and techniques in the implementation (Blue-Green Deployment, for one) to make sure it creeps out rather than floods out. The low-impact, hard-to-reverse change needs robust challenge over the level of impact the change will actually incur. The high-impact, hard-to-reverse change needs the most preparation in terms of regression testing, swing environments or even parallel running.
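To make the matrix concrete, here is a minimal Python sketch; the enum names, advice strings and the mapping are illustrative of the approach described above, not a prescribed implementation:

```python
from enum import Enum

class Reversibility(Enum):
    EASY = "easily reversible"
    HARD = "hard to reverse"

class Impact(Enum):
    LOW = "low impact"
    HIGH = "high impact"

def change_approach(reversibility: Reversibility, impact: Impact) -> str:
    """Place a change on the reversibility/impact matrix and suggest how to handle it."""
    if reversibility is Reversibility.EASY and impact is Impact.LOW:
        return "Approve on a reasonable assessment; standard rollback plan."
    if reversibility is Reversibility.EASY and impact is Impact.HIGH:
        return "Plan a gradual release (e.g. Blue-Green Deployment) so the change creeps out rather than floods out."
    if reversibility is Reversibility.HARD and impact is Impact.LOW:
        return "Challenge the impact assessment; confirm it really is low before committing."
    return "Full preparation: regression testing, swing environments, possibly parallel running."

# Example: a schema migration that is hard to reverse but judged low impact
print(change_approach(Reversibility.HARD, Impact.LOW))
```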

Therefore low-impact and highly reversible change is the perfect combination for agile delivery, sometimes termed ‘fail fast’ (which I usually change to ‘fail fast and safely’ to reassure stakeholders).

Cloud technologies, containerisation, orchestration and schema-less databases are so popular amongst architects and developers because they create environments that are easy to create, easy to test and easy to change, all with the minimum of impact.

GDPR and the Streetlight effect

There is a lot of heated debate about the GDPR (General Data Protection Regulation) – is it a Y2K damp squib or PPI/Libor/Forex all over again?

The General Data Protection Regulation is certainly much more than Information Security and Privacy-by-Design, which are just table stakes for the future.

The Regulation covers:

  • the right to be informed of personally identifiable information gathered and processed
  • the right of access to personally identifiable information
  • the right to rectification of mistakes
  • the right to erasure of personally identifiable information
  • the right to restrict processing personally identifiable information
  • the right to data portability to a new supplier
  • the right to object to marketing, scientific and historical research
  • the right to not be subject to automated decision making and profiling.

I agree that the maximum fine will probably never be levied, but the range is sufficient to put it at the same impact level as the fines that banks have suffered in the last decade. The fine will be based on the ability to pay and the perceived damage caused by a breach. There is also the threat of private civil action by individuals who can claim damages for breaches of the regulation.

The challenge for technologists is to provide compliant holistic solutions that meet the very inexact wording of the regulation around what are reasonable measures. These requirements themselves will be driven by newly formed governance bodies, internal and external compliance, legal counsel, marketers, external suppliers and data processors. A technology programme with a multitude of stakeholders and vague requirements is seldom a success.

The personally identifiable information itself is usually scattered among a number of front office, back office and supplier systems, often without good links to relate the information. There is also the risk that the data resides in the Shadow IT of the organisation, away from the control of the technology department.

There are also technical challenges in the solutions proposed. For example, the regulation says that data must be encrypted, but as any infosec expert will attest, encryption is only as good as the algorithm and where you store the encryption keys. As for the right of erasure, that will be the topic of a future post on the technical challenges of achieving that requirement.
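As an illustration of why key handling matters more than the choice of algorithm, here is a minimal sketch using the Python cryptography package; the library choice and example values are assumptions for illustration, as the regulation does not mandate any particular tooling:

```python
# Requires: pip install cryptography  (library choice is illustrative, not mandated by GDPR)
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # the key, not the ciphertext, is the asset to protect
cipher = Fernet(key)

token = cipher.encrypt(b"subject-email@example.com")  # encrypted PII at rest
print(token)

# Anyone holding the key can recover the plaintext, so storing the key alongside
# the data (same database, same config file) defeats the control. In practice the
# key would live in an HSM or a managed key vault, separate from the data store.
print(cipher.decrypt(token))
```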

I applaud vendors such as Microsoft for certifying this week that their solutions can support making an organisation GDPR compliant, but that shouldn’t take the focus off the overall organisational and technological change required.

At the moment, it seems we suffer from the streetlight effect, only looking for the problems that are perceived to be the easiest to fix.

GDPR – Federated Search Technology Pattern

The EU General Data Protection Regulation requires that European citizens’ personally identifiable information is controlled, secured, available on request and able to be erased by 25 May 2018. Additionally, customers have the right to easily change their marketing contact preferences within a reasonable time. For most organisations this is a challenge when they are dealing with legacy systems under departmental controls and the only data warehouse is provisioned purely for analytical purposes.

One way of achieving compliance is to create an enterprise-wide Data Lake containing all of the organisation’s customer information, sourced from the different business units’ operational systems and core systems such as email, intranet and shared drives. The Data Lake also contains all of the original systems’ meta-data (data about the data) plus provenance information such as back-links to the systems of record in the individual business units. These back-links to the operational systems allow the right to erasure to be exercised if requested by a customer. The meta-data contains the creation and last-accessed dates from the system of record along with security information, allowing the correct access controls to be applied to the Data Lake.
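A minimal sketch of what such a Data Lake record might look like, assuming a simple Python data structure; all field and function names here are illustrative rather than a reference schema:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class LakeRecord:
    """One customer record in the Data Lake, with provenance back to its system of record."""
    customer_id: str          # lake-wide identifier assigned after matching/linking
    attributes: dict          # the copied customer data itself
    source_system: str        # system of record, e.g. "crm-emea"
    source_record_id: str     # back-link used to action a right-to-erasure request
    created_at: datetime      # creation date from the system of record
    last_accessed_at: datetime  # last-accessed date from the system of record
    access_groups: list = field(default_factory=list)  # security info used to reapply controls in the lake

def erasure_targets(records: list[LakeRecord], customer_id: str) -> list[tuple[str, str]]:
    """Resolve which operational records must be deleted to honour an erasure request."""
    return [(r.source_system, r.source_record_id) for r in records if r.customer_id == customer_id]
```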

Ensuring a single view of the customer is much easier if all the enterprise data is within the Data Lake, including data from third-party SaaS (Software-as-a-Service) providers such as Salesforce or Microsoft Dynamics. The addition of Master Data Management technologies from Informatica or IBM can provide data cleansing either before the data reaches the Data Lake or during a search on the Data Lake. The Data Lake model also allows enterprises to resolve the issues around marketing contact preferences, which can be different in each customer relationship management system or account. Allowing the customer to change their contact status or contact channels becomes easier if they can be found in the Data Lake.
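A hypothetical sketch of how a preference change could be written back through the back-links described above; the connector objects and their set_preferences method are assumptions for illustration:

```python
def update_contact_preferences(records, customer_id, new_preferences, connectors):
    """Propagate a customer's new contact preferences to every system that holds them.

    `records` are the Data Lake entries for the customer (with back-links) and
    `connectors` maps a source system name to a client able to write back to it.
    """
    updated = []
    for record in records:
        if record.customer_id != customer_id:
            continue
        client = connectors[record.source_system]
        # Write the change back to the system of record via its back-link
        client.set_preferences(record.source_record_id, new_preferences)
        updated.append(record.source_system)
    return updated
```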

However, creating Data Lakes can be very challenging if the security or data models are heavily embedded within the operational systems, or if local jurisdictional systems have to be used for access control and monitoring.

The alternative model is Federated Search, which for some organisations is a better solution. Federated Search also allows a minimal amount of sharing, as it uses a ‘merge on query’ approach to inter-departmental data, which allows potentially greater compliance with ‘privacy-by-design’ constraints on the systems. Additionally, a Federated Search can cross the organisational boundary into external data processors’ systems in real time.

The Federated Search model requires each departmental operational system to provide a full-text search index, either re-using an existing index technology or deploying a bespoke search capability using, for instance, Apache Solr or Elasticsearch. An Enterprise Search Service provides a central service or portal from which queries can be made. The Enterprise Search Service cascades any customer lookup queries down to departmental federated query engines, which then search for the data in their local index. If customer data is found then a specific query is constructed against the operational system. The returned data is correlated, matched and linked to provide a single view of the customer’s data. The actual search uses the security credentials of the user requesting the information through the Enterprise Search Service, so the security controls and logging are preserved. In addition, the business unit data owner retains real-time control over access to the data and can see the data access patterns within their existing context. Another benefit is that the local index protects the operational system from unexpected load or logging, as the resulting queries from the federated query engines can be optimised for extracting specific information rather than searching.
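A rough sketch of the ‘merge on query’ flow, assuming plain HTTPS/JSON between the Enterprise Search Service and the departmental engines; the endpoint URLs, payload shape and requests-based client are illustrative assumptions, not a defined protocol:

```python
import concurrent.futures

import requests  # assumption: plain HTTPS + JSON between the services

# Departmental endpoints are illustrative; each business unit exposes its own
# federated query engine behind the agreed Customer Search API.
DEPARTMENTAL_ENGINES = {
    "retail": "https://retail.example.internal/customer-search",
    "insurance": "https://insurance.example.internal/customer-search",
}

def query_engine(name: str, url: str, customer_query: str, user_token: str):
    """Send the lookup to one departmental engine, passing through the caller's credentials."""
    response = requests.post(
        url,
        json={"query": customer_query},
        # The caller's own credentials are forwarded, so the business unit's
        # access controls and logging still apply.
        headers={"Authorization": f"Bearer {user_token}"},
        timeout=10,
    )
    response.raise_for_status()
    return name, response.json().get("matches", [])

def enterprise_search(customer_query: str, user_token: str) -> dict:
    """Cascade the query to every departmental index and merge the results ('merge on query')."""
    merged = {}
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(query_engine, name, url, customer_query, user_token)
            for name, url in DEPARTMENTAL_ENGINES.items()
        ]
        for future in concurrent.futures.as_completed(futures):
            name, matches = future.result()
            merged[name] = matches  # correlation, matching and linking would happen here
    return merged
```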

From a delivery perspective, the Federated Search option means the organisation is not running a big programme at the centre of the organisation, with the attendant issues of communication, governance, funding and additional dependencies on an already stretched enterprise. The individual business units have the freedom to define the indexing technologies and subsequent queries, and only need to comply with a well-defined API for data query and security authentication and authorisation information. The system owner for the Active Directory (or equivalent identity and access management service) is not required to implement consolidation of permissions from various systems. The Security Operations Centre does not need to take on new feeds from a new system and try to correlate them with the existing operational systems to determine access patterns.

The central technology programme is therefore responsible for defining the Customer Search API, the Federation Services and the API definitions, alongside an on-boarding plan that can match the pace of the overall organisation and the individual business units.
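As a sketch of what that centrally defined contract might look like, here is a simple Python interface that each business unit’s federated query engine could implement; the class, method and field names are illustrative assumptions rather than the actual API:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class CustomerMatch:
    """A single hit returned by a departmental engine (field names are illustrative)."""
    source_system: str       # system of record holding the data
    source_record_id: str    # identifier within that system
    attributes: dict         # customer attributes surfaced to the central service

class CustomerSearchAPI(ABC):
    """The contract each business unit's federated query engine implements.

    The central programme owns this definition; how each unit indexes and queries
    its own data (Solr, Elasticsearch, an existing index) is left to the unit.
    """

    @abstractmethod
    def search(self, query: str, caller_token: str) -> list[CustomerMatch]:
        """Look up a customer in the local index using the caller's own credentials."""
```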

Data Lakes are a powerful technology for organisations to deploy; however, with the deadline for GDPR compliance (25 May 2018) looming, some organisations may need to take a more expedient approach.

For more information about GDPR, please see the UK Information Commissioner’s Office website: https://ico.org.uk/for-organisations/data-protection-reform/guidance-what-to-expect-and-when/

I encourage you to watch the video and provide feedback in the comments with suggestions, improvements, alternative approaches and critique.